AlarmManagement July2012 CEP

The document discusses the importance of implementing an effective alarm management program in chemical processing plants, emphasizing the need for adherence to the ISA-18.2 Standard. It outlines the lifecycle approach to alarm management, which includes stages such as alarm philosophy, identification, rationalization, and detailed design. The article highlights common alarm issues and provides guidance on how to design, implement, and maintain a robust alarm system to improve safety and operational efficiency.


Reprinted with permission from CEP (Chemical Engineering Progress), July 2012.

Copyright © 2012 American Institute of Chemical Engineers (AIChE).


Instrumentation

Implement an Effective Alarm Management Program

Todd Stauffer, P.E.
exida

Apply the ISA-18.2 Standard on Alarm Management to design, implement, and maintain an effective alarm system.
Alarms are used in chemical processing plants to draw the operator’s attention to an abnormal condition that, if disregarded, could lead to poor product quality, unplanned downtime, damaged assets, personnel injury, or a catastrophic accident. When employed appropriately, alarms help the operator to safely run the process within normal operating conditions. They are one of the first layers of protection to prevent the escalation of a hazard into an accident (Figure 1).

Alarm management has become increasingly important as chemical plants look for ways to reduce costs, increase productivity, and deal with the loss of experienced operators. It has also become more challenging due to the adoption of the modern distributed control system (DCS). Alarm systems of the past consisted of panelboard control rooms, where the number of alarms was limited by the finite wall space, and there was an actual cost to hard-wire the system into the process (approximately $1,000 per alarm) (1). Today, alarms are considered free because they are implemented via software. Consequently, less thought goes into deciding which points should be alarmed and why. This has led to an epidemic of alarm management issues including:
• nuisance alarms (chattering alarms and standing/stale alarms)
• alarms identified with the incorrect priority level
• alarms that require no operator response
• alarms that occur frequently (“bad actors”)
• alarm overload during normal conditions
• alarm floods during process upsets
• improper alarm suppression.

Figure 1. Alarms are one layer of protection to prevent the escalation of hazardous situations. (Layers shown: process design; process control loop; alarm with operator intervention; safety instrumented system trip; active protection, e.g., relief valve, rupture disk; passive protection, e.g., bund/dike; plant emergency response; community emergency response.)

CEP July 2012 www.aiche.org/cep 19


Recognizing the increased importance of alarm management, the International Society of Automation (ISA) issued a new standard in 2009, ANSI/ISA-18.2, “Management of Alarm Systems for the Process Industries” (CEP, Mar. 2011, p. 14). This article provides an overview of the standard and how it can be used to eliminate common alarm issues.

The basics of the standard
ISA-18.2, developed by a committee composed of suppliers, consultants, government representatives, and end users of automation systems, provides a framework for the successful design, implementation, operation, and management of alarm systems (2). It contains guidance to help prevent and eliminate the most common alarm management problems, as well as a methodology for measuring, analyzing, and improving the performance of the alarm system.

The standard builds on a guide published by the Engineering Equipment and Materials Users Association (EEMUA), “Alarm Systems: A Guide to Design, Management and Procurement” (3), which was the primary reference for alarm management before the publication of ISA-18.2. The International Electrotechnical Commission (IEC) is using ISA-18.2 as the basis for an international alarm management standard (IEC-62682).

The ISA-18.2 standard takes a lifecycle approach to alarm management (Figure 2) that encompasses design, training, operation, maintenance, monitoring, and change management. Key activities are executed in the various stages of the lifecycle, and the products of one stage are the inputs for the next stage, as shown in Table 1.

Figure 2. The alarm management lifecycle (2) consists of ten stages: 1. Philosophy; 2. Identification; 3. Rationalization; 4. Detailed Design; 5. Implementation; 6. Operation; 7. Maintenance; 8. Monitoring and Assessment; 9. Management of Change; 10. Audit.

What is an alarm?
The ISA-18.2 standard defines common terminology that can be used by all plant personnel when talking about alarms. Although this may seem rather insignificant, it is actually one of the most important accomplishments of the standard. An alarm is:
• an audible and/or visible means of indicating: for something to be considered an alarm, it must provide some sort of warning signal (a control device can be configured with limits that trigger control actions or data collection yet not be an alarm)
• to the operator: the indication must be directed toward the operator, not merely be a means to provide information to an engineer, maintenance technician, or manager
• an equipment malfunction, process deviation, or abnormal condition: the alarm must indicate a problem, and not a normal process condition (such as an expected valve closure or pump stoppage)
• requiring a response: a specific operator response (other than acknowledging the alarm) to correct the abnormal condition and bring the process back to a safe and/or productive state must be necessary; if the operator does not need to respond, then the condition should not initiate an alarm.
Many alarm management issues are caused by alarms that do not meet these requirements.

Stage 1: Alarm philosophy
The cornerstone of an effective alarm management program is the alarm philosophy document, which establishes guidelines for addressing all aspects of alarm management, including the criteria for determining what should be alarmed, roles and responsibilities, human-machine interface (HMI) design, alarm prioritization, management of change (MOC), and key performance indicators (KPIs). This document is critical for helping plant staff maintain an alarm system over time and for driving consistency.

It is important to establish the methodology for alarm prioritization and classification before beginning alarm rationalization, i.e., the process used to ensure that every alarm is valid and necessary. Priority is used to indicate how critical the alarm is and to help the operator know which alarms to respond to first. To ensure consistency, alarms should be prioritized based on the severity of the potential consequences and the time available for the operator to respond.

Alarm classification organizes alarms based on common characteristics and requirements (e.g., testing, training, MOC, reporting). Certainly, an alarm that is identified as a safeguard in a hazard and operability (HAZOP) study or as an independent protection layer (IPL) will have more-stringent requirements for testing and operator training than the average process alarm. A good philosophy provides a listing of relevant alarm classes (e.g., critical for personnel safety, quality, environmental protection, process safety, compliance with the U.S. Occupational Safety and Health Administration (OSHA) process safety management (PSM) standards) and their requirements.
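As a rough illustration, the four-part definition given under "What is an alarm?" can be treated as a checklist for screening candidate alarms. The Python sketch below is only one possible reading: the `Candidate` class and its field names are hypothetical and are not terminology from ISA-18.2.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """Hypothetical record for a candidate alarm (field names are illustrative)."""
    tag: str
    annunciated: bool           # gives an audible and/or visible indication
    directed_to_operator: bool  # aimed at the operator, not engineering or maintenance
    abnormal_condition: bool    # malfunction, process deviation, or abnormal condition
    response_required: bool     # operator must do more than acknowledge


def unmet_criteria(c: Candidate) -> list[str]:
    """Return the parts of the alarm definition that the candidate fails."""
    checks = {
        "audible and/or visible indication": c.annunciated,
        "directed to the operator": c.directed_to_operator,
        "indicates an abnormal condition": c.abnormal_condition,
        "requires an operator response": c.response_required,
    }
    return [name for name, ok in checks.items() if not ok]


def is_alarm(c: Candidate) -> bool:
    """A candidate qualifies as an alarm only if all four criteria are met."""
    return not unmet_criteria(c)
```

For instance, an indication that a pump stopped as part of a planned shutdown would fail the last two checks, which suggests it belongs on a status display rather than in the alarm list.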



Table 1. The alarm management lifecycle consists of ten stages that direct the design and implementation of an effective alarm system.

Stage 1: Philosophy
Activity: Document the objectives, guidelines, and work processes for the alarm system
Inputs: Objectives and standards
Outputs: Alarm philosophy document, alarm system requirement specification (ASRS)

Stage 2: Identification
Activity: Determine potential alarms
Inputs: Process hazard analysis (PHA) report, safety requirements specification (SRS), piping and instrumentation diagrams (P&IDs), operating procedures, etc.
Outputs: List of potential alarms

Stage 3: Rationalization
Activity: Determine which alarms are necessary, establish their design settings (e.g., priority, setpoint, classification), and document their basis (cause, consequence, corrective action, time to respond, etc.) in a master alarm database
Inputs: Alarm philosophy and list of potential alarms
Outputs: Master alarm database (MADB), alarm design requirements

Stage 4: Detailed Design
Activity: Design the system to meet the requirements defined in rationalization and philosophy; includes basic alarm design, human-machine interface (HMI) design, and advanced alarming design
Inputs: MADB, alarm design requirements
Outputs: Completed alarm design

Stage 5: Implementation
Activity: Put the alarm system into operation (installation and commissioning, initial testing, and initial training)
Inputs: Completed alarm design and MADB
Outputs: Operational alarms, alarm response procedures

Stage 6: Operation
Activity: Alarm system is functional. Operators use available tools (e.g., shelving and alarm response procedures) to diagnose and respond to alarms
Inputs: Operational alarms, alarm response procedures
Outputs: Alarm data

Stage 7: Maintenance
Activity: Alarms are taken out of service for repair and replacement, and periodic testing
Inputs: Alarm monitoring reports and alarm philosophy
Outputs: Alarm data

Stage 8: Monitoring and Assessment
Activity: Measure alarm system performance and compare to key performance indicators (KPIs) defined in the alarm philosophy; identify problem alarms (nuisance alarms, frequently occurring alarms, etc.)
Inputs: Alarm data and alarm philosophy
Outputs: Alarm monitoring reports, proposed changes

Stage 9: Management of Change
Activity: Process to authorize additions, modifications, and deletions of alarms
Inputs: Alarm philosophy, proposed changes
Outputs: Authorized alarm changes

Stage 10: Audit
Activity: Periodically evaluate alarm management processes (e.g., comparing control system alarm settings to the MADB)
Inputs: Standards, alarm philosophy, and audit protocol
Outputs: Recommendations for improvement


The philosophy stage also includes preparation of the alarm system requirements specification (ASRS), which identifies the alarm system’s functional requirements. The ASRS can be used to support vendor selection, serve as the basis for system testing, and help in determining whether any advanced/enhanced alarming techniques, such as customization or third-party products, are needed.

Stage 2: Identification
Potential alarms are identified by reviewing plant and process documentation. This documentation includes process (or piping) and instrumentation diagrams (P&IDs), process hazard analyses (PHAs), operating procedures, product quality reviews, layer-of-protection analyses, safe operating limits, failure modes and effects analyses, environmental permits, and the existing control system configuration.

Candidate alarms should not be considered valid until they have successfully gone through the rationalization process (discussed next). Even alarms that have been identified as safeguards in a HAZOP analysis must be rationalized. The criteria for determining whether an alarm is valid should be applied when the alarm is first identified (e.g., during a hazard analysis), and its basis (e.g., purpose, cause, potential consequence, and time to respond) should be documented. These forward-thinking activities will improve the quality and amount of information available for this evaluation.

Stage 3: Rationalization
The modern DCS makes it easy to add alarms without significant effort, cost, or justification. To avoid unnecessary alarms, alarm rationalization aims to identify the minimum set of alarms needed to keep the process safe and within its normal operating range, and to ensure that every alarm is valid and necessary. This is a multistep process that includes defining and documenting the design attributes (e.g., priority, setpoint, type, and classification), as well as the cause, consequence, time to respond, and recommended operator response in a master alarm database (MADB). It is a team activity (similar to a HAZOP study) involving production and process engineers, process control engineers, experienced operators, and other personnel as needed.

Alarm validity. The first step in the rationalization process is to verify the validity of the alarm based on the criteria set forth in the philosophy document. If the candidate alarm does not meet the criteria (e.g., it does not represent an abnormal situation, it is not unique, it does not require a timely operator response, etc.), it can be removed from consideration.

Consequences. Next, the consequences of inaction, that is, the direct and immediate consequences of failing to manage each individual alarm, are identified. This step is not concerned with what could happen if all protection layers fail (the ultimate consequence, as defined in a HAZOP). If inaction does not generate significant consequences, for example if the only consequence is the generation of another alarm, the alarm may not be needed.

Operator response. Another important step in identifying and eliminating unnecessary alarms is documenting the steps to be taken by the operator to correct the abnormal situation, such as closing a valve or starting a backup pump. If an operator response cannot be defined, then the alarm is not valid and can be removed from consideration. If multiple alarm conditions share the same operator action, this may indicate redundant alarms, and one or more can be eliminated.

Response time. After determining how the operator should respond, the time available to take this action is estimated. Operator response time is defined as the time between the activation of the alarm and the last moment the operator can act to prevent the consequence; thus, it represents the time available to the operator to fix the problem. If the available time is insufficient, the alarm should be redesigned (because it will not be reliable) or replaced with an automated response (i.e., an interlock).

Alarm priority. Alarm priority is established based on operator response time and severity of the consequences, which are assessed against predefined thresholds in areas such as safety, environmental impact, and cost. ISA-18.2 recommends a maximum of three or four different priorities. To help operators respond effectively to the most critical alarms, only a small fraction should be set to high priority (e.g., 5%), with the remainder set to medium (15%) or low (80%) priority.

Alarm class. Alarm class is assigned based on the type of consequences and the method used to identify the hazard and consequences (e.g., a HAZOP analysis). Alarms can be assigned to more than one classification.

Setpoints. Alarm setpoints (limits) should be defined far enough away from the consequence threshold to give the operator adequate time to respond, yet not so close to normal operating conditions that nuisance alarms are triggered as a result of normal process variation. A common mistake is to configure setpoints based on rules of thumb relative to the range of a process variable. An example is configuring the setpoints for high-high, high, low, and low-low as 90%, 80%, 20%, and 10% of range, respectively.

Advanced alarm handling. Lastly, one should evaluate the need for advanced alarm handling by documenting states, conditions, steps, phases, or products for which the alarm limit or priority should be different from steady state, or the alarm should be suppressed from the operator. This helps to ensure that an alarm is always relevant when it is presented to the operator.
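The alarm priority step described above can be pictured as a lookup on consequence severity and time available to respond. The following Python sketch is only an illustration: the severity categories, the time bands (under 5 min, 5 to 30 min, over 30 min), and the matrix entries are hypothetical placeholders, not values taken from ISA-18.2. A real site would take them from its own alarm philosophy.

```python
# Hypothetical priority matrix. Keys are consequence severity categories;
# each tuple is indexed by urgency band:
#   0 = more than 30 min available, 1 = 5 to 30 min, 2 = less than 5 min.
PRIORITY_MATRIX = {
    "minor":  ("low", "low", "medium"),
    "major":  ("low", "medium", "high"),
    "severe": ("medium", "high", "high"),
}


def urgency_band(minutes_to_respond: float) -> int:
    """Map the time available to respond onto an illustrative urgency band."""
    if minutes_to_respond < 5:
        return 2
    if minutes_to_respond <= 30:
        return 1
    return 0


def assign_priority(severity: str, minutes_to_respond: float) -> str:
    """Look up the priority for a given severity and time to respond."""
    if severity not in PRIORITY_MATRIX:
        raise ValueError(f"unknown severity: {severity!r}")
    return PRIORITY_MATRIX[severity][urgency_band(minutes_to_respond)]
```

Keeping the matrix as data rather than nested conditionals makes it easy to review against the philosophy document and to check that the resulting mix of priorities stays near the intended distribution.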

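One way to sanity-check a high-alarm setpoint against the setpoint guidance in Stage 3 is to estimate how much time it leaves the operator. The Python sketch below assumes, purely for illustration, that the process variable rises at a constant worst-case rate; the function names and the example numbers are hypothetical.

```python
def time_available_min(setpoint: float, consequence_threshold: float,
                       rate_per_min: float) -> float:
    """Minutes between a high alarm activating and the consequence threshold.

    Assumes the variable rises at a constant rate (an assumption of this sketch).
    """
    if rate_per_min <= 0:
        raise ValueError("rate_per_min must be positive for a rising variable")
    return (consequence_threshold - setpoint) / rate_per_min


def setpoint_is_adequate(setpoint: float, consequence_threshold: float,
                         rate_per_min: float, required_response_min: float,
                         normal_operating_max: float) -> bool:
    """True if the setpoint leaves enough time and sits above normal variation."""
    enough_time = time_available_min(
        setpoint, consequence_threshold, rate_per_min
    ) >= required_response_min
    clear_of_normal = setpoint > normal_operating_max
    return enough_time and clear_of_normal
```

As a worked example with made-up numbers: a tank level alarm at 80% with overflow at 95% and a worst-case fill rate of 1.5% per minute leaves (95 − 80) / 1.5 = 10 min. Moving the setpoint to 90% would leave only about 3.3 min, which may be too short for a manual response.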


The results of this rationalization process are recorded in a master alarm database (MADB), which can range from a user-developed spreadsheet to a commercially available tool (Figure 3).

Figure 3. The rationalization stage identifies (among other things) the cause, consequence, and corrective action for each alarm, and records this information in the master alarm database. (Source: SILAlarm, © exida 2012)

Stage 4: Detailed alarm design
Basic alarm design. In basic alarm design, alarms and alarm components are designed and configured based on the requirements identified in the rationalization stage. This includes the establishment of alarm deadbands and on/off delays, as well as basic logic to define when the alarm should be active. For example, in some plants, motors and pumps generate a nuisance alarm whenever they are not running, instead of alarming only when they stop unexpectedly. Nuisance alarms are defined as alarms that activate excessively or unnecessarily, or that do not return to normal after the correct response is taken.

The alarm deadband compensates for fluctuations in the process variable, reducing the number of times an alarm triggers for a given abnormal condition, which should be only once. Deadband adds an offset to the alarm limits to prevent an alarm from returning to normal until the process variable clears the limit by this additional amount.

The deadband should be set wide enough to accommodate the expected noise level in the variable’s measurement, but narrow enough to ensure that the alarm is meaningful. This will minimize chattering alarms (i.e., alarms that repeatedly transition between the alarm state and the normal state in a short period of time). On/off delays can also prevent chattering alarms. Industry studies have demonstrated that following recommended practices for use of alarm deadbands and on/off delays (like those in ISA-18.2) can reduce the alarm load on the operator by up to 90% (4).

Human-machine interface (HMI) design. An effective HMI design maximizes the operator’s situation awareness, helping him or her see the big picture and proactively address process deviations before they become more serious. Graphic displays should provide an appropriate level of process and equipment information for the operator to verify or confirm the existence of an alarm. A well-designed HMI enhances the operator’s ability to detect new alarms quickly, diagnose the cause of the problem, and respond with the appropriate corrective action.

HMI graphic displays should be designed so that alarms “jump off the page,” drawing the operator’s attention to the alarm rather than less-important information (e.g., pump status). The level of visibility of information should be related to its operational importance: background information should have low visibility, normal plant measurements medium visibility, and abnormal conditions (values and states) the highest visibility.

The appropriate use of color, text, and patterns helps the operator detect the presence of an alarm and determine the order of priority. Certain colors should be reserved for alarms and not be used for other functions within the HMI (such as equipment status or process piping). Alarm colors should reflect the priority of the alarm. In addition to color, symbols, patterns, and/or text should also be used to indicate alarm status, because approximately 8%–12% of the male population is color-blind.

Enhanced and advanced alarming. Overloading the operator with stale alarms (alarms that remain activated for an extended period of time, e.g., more than 24 h) or alarm floods (more than 10 alarms in 10 min) can lead to increased operator stress, missed alarms, and/or operator error. An effective alarm system manages the number of alarms presented to the operator and ensures that they are presented only when they are relevant and when they require a response. Transient plant conditions, the use of different feedstocks, production of different products, idled equipment, and unplanned process upsets can make this a challenge. In batch processes, for example, a large number of nuisance alarms result from not suppressing alarms during steps in which they are not applicable. The U.S. Chemical Safety and Hazard Investigation Board (CSB) investigation of the accident in Belle, WV, (5) found that the control system was not engineered to suppress nuisance alarms originating from idled process equipment.

In advanced alarming, additional layers of logic, programming, or modeling are used to modify alarm attributes such as setpoint, priority, or suppression status based on the state of the process and/or equipment. Alarm suppression (preventing the alarm from activating when the base alarm condition, i.e., the condition that would normally generate the alarm, is present) is a common technique. ISA-18.2 defines three types of suppression (although the terminology and functionality vary among different control systems):
• designed (automatic) suppression: suppresses alarms based on operating conditions or plant states, for instance when equipment is out of service or in response to an event (e.g., a compressor trip) that would otherwise lead to an alarm flood; this is controlled by the logic that determines the relevance of the alarm
• shelving (manual) suppression: a mechanism, typically initiated by the operator, to temporarily suppress an alarm
• out of service: the state of an alarm during which the alarm indication is suppressed, typically manually, for reasons such as maintenance.

ISA-18.2 defines other types of advanced and enhanced alarming methods, including time-varying alarm attributes, redirection of alarms (e.g., via pagers) to personnel outside the control room, and techniques for automatically determining the cause of abnormal situations.

Advanced alarming could be applied, for example, to a reactor and its associated temperature, pressure, level, and flow alarms. When the reactor is in operation, alarm limits could be set differently depending on the product that is being made or the step of the batch recipe that is underway. When the reactor is idle or offline for maintenance, most of the alarms will not be useful and some might be triggered unnecessarily. Alarm suppression can hide these unnecessary alarms, which would otherwise remain active until the equipment is put back into service, thus becoming stale alarms.

Before suppressing an alarm, it is important to consider whether it is needed to detect a hazardous condition even when the process or equipment is out of service. The alarms for reactor high pressure and flow might be required to detect a leak (which would indicate a loss of isolation from the process). Thus, these alarms should not be suppressed and their limits should be set to detect the abnormal condition.

Stage 5: Implementation
The alarms are put into service in the implementation stage. This stage includes commissioning, training, and testing, all of which are ongoing activities that result from process design changes or the addition of new instrumentation.

For alarms to be effective, the operator must know how to respond to each alarm. An effective training program covers all realistic operational situations, including:
• system functionality and features such as sorting/filtering, navigation, and shelving
• principles of the process to ensure a full understanding of why the alarm is created as well as what could happen if the alarm is disregarded
• procedures that should be followed to shelve an alarm or take it out of service.

Training is particularly important for safety-related alarms, such as those identified as a safeguard, as an independent protection layer, or as part of an OSHA PSM mechanical integrity program. These alarms do not occur often, and typically occur only in periods of high operator stress, such as during a major plant upset.

Stage 6: Operation
During the operation stage, alarms perform their function of notifying the operator of an abnormal situation. A useful system provides tools, such as shelving and alarm-response procedures, to help the operator handle alarms.

Shelving is critical to responding effectively during a plant upset, as it allows the operator to manually hide less-important alarms on a temporary basis. In some systems, shelved alarms reappear automatically after a preset time period so that they are not forgotten.

The alarm philosophy should specify which alarms can be shelved and by whom, as well as which alarms cannot be shelved (e.g., those that are of the highest priority or related to personnel safety). Systems that support shelving must let the operator view a list of all shelved alarms at any time, such as during shift change.

A key best practice is providing operators with alarm-response procedures. Alarm-response procedures contain process knowledge that was captured during rationalization (e.g., cause, consequence, corrective action, and time to respond), typically based on input from senior operators. This information, provided in context to the operator from within the HMI (Figure 4), can be indispensable for helping operators (especially junior operators) respond to alarms more quickly and consistently.

Figure 4. The alarm-response procedure can be integrated into the HMI to give operators easy access to critical information. Image courtesy of Emerson Process Management.
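To make the deadband and on-delay ideas from basic alarm design (Stage 4) concrete, here is a minimal Python sketch of a high alarm evaluated on sampled data. It is an illustrative assumption, not how any particular control system implements these features; the class name, parameters, and sample values are hypothetical.

```python
class HighAlarm:
    """High alarm with deadband and on-delay, evaluated once per sample."""

    def __init__(self, limit: float, deadband: float = 0.0,
                 on_delay_samples: int = 0):
        self.limit = limit
        self.deadband = deadband
        self.on_delay_samples = on_delay_samples
        self.active = False
        self._above_count = 0  # consecutive samples above the limit

    def update(self, value: float) -> bool:
        """Process one sample and return the resulting alarm state."""
        if not self.active:
            # Activate only after the value has stayed above the limit
            # for more than on_delay_samples consecutive samples.
            self._above_count = self._above_count + 1 if value > self.limit else 0
            if self._above_count > self.on_delay_samples:
                self.active = True
        elif value < self.limit - self.deadband:
            # Return to normal only once the value clears the limit
            # by the deadband amount.
            self.active = False
            self._above_count = 0
        return self.active


def count_activations(alarm: HighAlarm, samples) -> int:
    """Count transitions from normal to alarm over a sequence of samples."""
    count = 0
    for value in samples:
        was_active = alarm.active
        if alarm.update(value) and not was_active:
            count += 1
    return count
```

With a noisy signal hovering around a limit of 100, such as `[99, 101, 99.5, 100.5, 99.8, 101, 97, 96]`, an alarm with no deadband activates three times, while the same alarm with a deadband of 2 activates once.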

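The shelving behavior described under Stage 6 (temporary, reviewable, and restricted by the alarm philosophy) might be modeled as below. This Python sketch is a simplified assumption rather than a vendor feature: the `Shelf` class, the `non_shelvable` set, the alarm tags, and the use of plain numeric timestamps in seconds are all hypothetical.

```python
class Shelf:
    """Tracks shelved alarms; each shelving expires after a bounded duration."""

    def __init__(self, non_shelvable=(), max_duration_s: float = 3600.0):
        self.non_shelvable = set(non_shelvable)  # tags that may never be shelved
        self.max_duration_s = max_duration_s
        self._entries = {}  # tag -> (expiry time, operator who shelved it)

    def shelve(self, tag: str, operator: str, now: float,
               duration_s: float) -> None:
        """Shelve an alarm, capping the duration at the configured maximum."""
        if tag in self.non_shelvable:
            raise PermissionError(f"{tag} may not be shelved")
        duration_s = min(duration_s, self.max_duration_s)
        self._entries[tag] = (now + duration_s, operator)

    def is_shelved(self, tag: str, now: float) -> bool:
        """Return True while the shelving is in effect; expire it afterward."""
        entry = self._entries.get(tag)
        if entry is None:
            return False
        if now >= entry[0]:
            del self._entries[tag]  # shelving expired, so the alarm reappears
            return False
        return True

    def shelved_list(self, now: float):
        """Return (tag, operator, seconds remaining) for review, e.g., at shift change."""
        return sorted(
            (tag, operator, expiry - now)
            for tag, (expiry, operator) in self._entries.items()
            if expiry > now
        )
```

The expiry check stands in for the automatic reappearance of shelved alarms, and `shelved_list` corresponds to the list an operator would review at shift change.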


Stage 7: Maintenance terminal in Hertfordshire, England (6), was that the design
The maintenance stage is concerned with alarms that and location of the failed independent high-level safety
are out of service, typically for equipment repair, replace- switch made it difficult to test, and its integrity could
ment, or testing. The out-of-service state is not a function not be verified.
of the process equipment, but describes an administrative
process of suppressing (i.e., bypassing) an alarm using a Stage 8: Monitoring and assessment
permit system. During the monitoring and assessment stage, plant
The ISA-18.2 standard provides recommendations personnel measure the performance of the alarm system and
on what should be contained in a procedure to remove compare it to the KPIs identified in the philosophy docu-
an alarm from, and return it into, service. Recommenda- ment. Results are analyzed to identify issues such as nui-
tions include documenting why an alarm was removed sance alarms, bad actors, and alarm overload. All of these
from service, assessing the impact on safety, and defining can clutter the operator’s display — making it more difficult
what testing is required before putting an alarm back into to detect a new alarm and increasing the chances that the
service. Prompt repair of hardware failures is important operator will respond incorrectly or miss a critical alarm.
to minimize alarms associated with the failures, as these A key metric to consider during this assessment is
alarms can quickly become stale or nuisance alarms that the rate at which the alarms are presented to the operator.
interfere with the operator’s ability to detect new alarms. If In order to provide adequate time to respond, an operator
prolonged out-of-service periods are required, then interim
Table 2. ISA-18.2 recommends these targets
alarms may be necessary.
for the number of alarms presented to the operator during
Periodic testing of alarms is an important maintenance each 10-min period, each hour, and each 24-h day.
activity for verifying alarm integrity. The frequency of
Target Value
testing is typically dictated by the alarm’s classification Number of Annunciated
Alarms per Operating Likely to be Maximum
and expected frequency of activation. For example, IPL
Position per … Acceptable Manageable
alarms should be proof-tested at a rate based on their
Day ~ 150 ~ 300
expected level of risk reduction, whereas alarms that are
part of an OSHA PSM mechanical integrity (MI) program Hour ~ 6* ~ 12*
should be tested according to the MI program’s require- 10 minutes ~ 1* ~ 2*
ments. One of the findings of the Buncefield investigation * For these metrics, averages should be calculated based on at least 30
of the fire and explosion at the Hertfordshire oil storage days’ data.

Table 3. ISA-18.2 recommends these targets for performance and diagnostic metrics.
Metric Target Value
Percentage of hours containing more than 30 alarms <1%
Percentage of 10-min periods containing more than 10 alarms <1%
Maximum number of alarms in a 10-min period ≤10
Percentage of time the alarm system is in a flood condition <1%
Percentage contribution of the top 10 most frequent alarms to <1% (target), with a maximum of 5%
the overall alarm load Action plans are required to address deficiencies
Number of chattering and fleeting alarms 0
Action plans are required to correct any that occur
Number of stale alarms Less than 5 present on any day
Action plans are required to address excess alarms
Distribution of priorities of annunciated alarms 3 priorities: ~80% Low, ~15% Medium, ~5% High
4 priorities: ~80% Low, ~15% Medium, ~5% High, <1% Highest
Other special-purpose priorities are excluded when
calculating the value of this metric
Number of unauthorized alarm suppressions 0
(i.e., outside of controlled or approved methodologies)
Number of unauthorized changes to alarm attributes 0
(i.e., outside of approved methodologies or MOC)

CEP July 2012 www.aiche.org/cep 25


Instrumentation

should be presented no more than one to two alarms every 10 min. A related metric is the percentage of 10-min intervals during which the operator receives more than 10 alarms, which indicates the presence of an alarm flood. The ISA-18.2 standard’s recommended targets for performance and diagnostic metrics are shown in Tables 2 and 3.

Performance targets are approximate and are based primarily on what an operator is capable of handling. The use of these targets as metrics for a particular plant, and the maximum acceptable numbers, will depend on many factors, including the type of process, operator skill level, HMI design, degree of automation, operating environment, and types and significance of the alarms generated. For example, acceptable rates for alarms related to safety or product quality in certain industries (e.g., nuclear, pharmaceutical) are likely to be close to zero.

One of the most beneficial analyses is to routinely review the top 10 or 20 most frequently occurring alarms. In the absence of an effective alarm management program, these bad actors may contribute 50%–80% of the overall alarm load on the operator. Fixing these alarms represents low-hanging fruit for improving performance.

Analyzing alarm system performance by class can provide valuable information. For example, it can identify whether any safety-critical alarms are being suppressed or behaving as nuisance alarms, both of which are indicators of a dangerous situation. One of the contributing causes to the accident at the DuPont Belle, WV, plant was the frequent false (nuisance) alarms generated by a burst disc sensor. The alarms from this sensor, which had been designated as OSHA-PSM-critical equipment, were ignored by operators because they had become accustomed to it behaving as a nuisance alarm (5).

Alarm management is a continuous process that is never finished. Measuring alarm system performance and taking action on the findings is an important ongoing activity and is critical to continuous improvement. An effective alarm philosophy documents the KPIs in a format that clearly defines target vs. unacceptable levels, the frequency of measurement and review, and the personnel responsible for taking action based on the results.

[Figure: Monthly Performance Review and Ongoing Alarm Rationalization. Monthly review/update cycle with stages Measure, Analyze, Improve, and Implement. Activities shown: monthly review of alarm system performance (identify bad actors and evaluate alarm load on the operator); tracking of rationalization status by unit (Boiler, Steam Turbine, Feedwater, Condenser, Stack, Ash Extraction); operations feedback; delta alarm rationalization (e.g., 1 week per month) focused on bad actors and on the highest-priority unit that has not yet been rationalized; update master alarm database; update control system configuration.]
Figure 5. Ongoing alarm management should include a periodic review of alarm-system performance, followed by corrective actions when necessary.

Stage 9: Management of change
Even the most well-designed alarm system can experience problems if changes to it are not strictly controlled. Management of change ensures that modifications to the alarm system, such as changing a setpoint or adding/removing an alarm, are reviewed and approved prior to implementation. An effective MOC process balances the need for rigor and traceability with the need to make changes promptly to avoid impacts on production. For example, changing the limit for a safety-critical alarm may require a different level of review and authorization than changing the deadband of a general process alarm. Once a change is approved, the master alarm database should be updated and operators should be trained on the impact of the change.

The alarm philosophy should define the level of MOC that is required based on the type of change and the alarm’s classification or priority. A contributing factor to the Deepwater Horizon drilling rig accident was the practice of disabling the annunciation of the general master alarm designed to notify personnel of danger (fire or explosive/toxic gas), in order to prevent false alarms from waking personnel in the middle of the night (7). Perhaps if this



alarm had been classified as personnel-safety-critical, the proper controls would have been in place to prevent it from being disabled.

Stage 10: Audit
During the audit phase, plant personnel conduct periodic reviews to assess actual alarm management work practices against the designed work practices outlined in the alarm philosophy. The goal is to maintain the integrity of the alarm system and to identify areas of improvement. Audit also includes a review of system performance, which may reveal gaps not apparent from alarm performance monitoring.

Operator interviews should be conducted to assess system performance from a human perspective, for instance, to verify that alarm priority is applied consistently. A recommended best practice is to periodically compare the running alarm system configuration with the master alarm database to ensure that unauthorized configuration changes have not been made.

Create an effective alarm management program
The hardest part of creating an effective alarm management program is getting started. Brownfield facilities (those with existing control systems) should start with either the monitoring and assessment or the audit stage. Facilities with new control systems (greenfield sites) should start by creating an alarm philosophy document and obtaining management approval.

Another critical success factor is structuring an alarm-management program that is realistic: one that emphasizes the ongoing nature of alarm management and that key personnel can commit to. Ideally, existing operational plants would complete alarm rationalization early in this effort, but the time and personnel requirements may preclude this. In some cases, it may be necessary to implement rationalization in stages.

An effective ongoing program includes a periodic review of alarm-system performance (e.g., monthly), followed by prompt action to address any alarm system performance issues that are identified (Figure 5). Plants should constantly strive to improve performance as part of a continuous improvement initiative.

For More Information
Performance-based standards like ISA-18.2 define the “what” (requirements and recommendations), but not the “how.” While this article touched briefly on the how, additional guidance can be found in the six technical reports (TRs) created to supplement the standard:
• TR1: Alarm Philosophy
• TR2: Alarm Identification and Rationalization
• TR3: Basic Alarm Design
• TR4: Enhanced and Advanced Alarm Design
• TR5: Alarm Monitoring, Assessment, and Audit
• TR6: Alarm Systems for Batch and Discrete Processes
These technical reports are in the process of being developed. Three have been completed and should soon be available for download from the ISA website, www.isa.org.

TODD STAUFFER, P.E., is exida’s director of alarm management services and product manager for the SILAlarm rationalization tool (64 N. Main St., Sellersville, PA 18960; Phone: (215) 453-1720; Email: tstauffer@exida.com; Website: www.exida.com). He has over 20 years of process control experience in the chemical, pulp and paper, food and beverage, and life sciences industries. He is an editor and voting member of the ISA-18.2 standard committee and co-chairs the development of the ISA-18 technical report 3, “Basic Alarm Design.” He received his BS from Pennsylvania State Univ. and MS from the Univ. of Pennsylvania, both in mechanical engineering. He is a registered P.E. in the Commonwealth of Pennsylvania.

Literature Cited
1. O’Brien, L., and D. Woll, “Alarm Management Strategies,” ARC Advisory Group, Boston, MA (Nov. 2004).
2. The International Society of Automation, “Management of Alarm Systems for the Process Industries (ISA-18.2),” ANSI/ISA 18.2–2009, ISA, Research Triangle Park, NC (June 2009).
3. Engineering Equipment and Materials Users Association, “Alarm Systems: A Guide to Design, Management and Procurement,” 2nd ed., EEMUA 191, Engineering Equipment and Materials Users Association, London, U.K. (2007).
4. Zapata, R., and P. Andow, “Reducing the Severity of Alarm Floods,” Proceedings of the Honeywell Users Group Americas Symposium 2008, Honeywell, Phoenix, AZ (2008).
5. U.S. Chemical Safety and Hazard Investigation Board, “E. I. DuPont de Nemours & Co., Inc.,” Investigation Report 2010-6-I-WV, CSB, Washington, DC (Sept. 2011).
6. Buncefield Major Incident Investigation Board, “The Buncefield Incident 11 December 2005: The final reports of the Major Incident Investigation Board,” Vol. 1, Buncefield Major Incident Investigation Board, London, U.K. (Dec. 2008).
7. Muskus, J., “Deepwater Horizon Alarm System Was Partly Disabled Prior To Explosion, Technician Tells Congress,” Huffington Post, www.huffingtonpost.com/2010/07/23/deepwater-horizon-alarm-s_n_657143.html (Jul. 23, 2010).

Additional Reading
Stauffer, T., et al., “Managing Alarms Using Rationalization,” Control Engineering, 58 (3), pp. 30–35 (Mar. 2011).
Stauffer, T., et al., “Alarm Management and ISA-18 - A Journey, Not a Destination,” Texas A&M Instrumentation Symposium, Available at www.exida.com/index.php/resources/whitepapers (Jan. 2010).
Stauffer, T., et al., “Get a Life(cycle)! Connecting Alarm Management and Safety Instrumented Systems,” ISA Safety and Security Symposium, Available at www.exida.com/index.php/resources/whitepapers (Apr. 2010).
