Operator response to alarms is an important layer of protection (LOP). When implemented with good design, engineering and maintenance practices, an alarm can help reduce the required safety integrity level (SIL) of a safety instrumented function (SIF).
The ISA84 standard covers the functional safety life cycle requirements of safety instrumented systems (SISs) and has been recognized by the US Occupational Safety and Health Administration (OSHA) as a generally accepted good engineering practice that can be used to comply with the process safety management (PSM) standard 29 CFR 1910.119 for SIS. In the initial phase of the safety life cycle, hazard and risk assessments identify the risks associated with the process. The risk is reduced to a tolerable level, as defined by the corporate standard, by implementing safety functions that act as LOP.
Layers of protection analysis (LOPA) is the technique most commonly used in the process industry to determine the amount of risk reduction required from each protection layer. If the risk is not reduced to a tolerable level using non-instrumented safety functions such as pressure safety valves (PSVs), mechanical stops and process design, then additional risk reduction is necessary by means of a SIF implemented in a SIS. Often, the performance requirement of a SIF can be reduced by one order of magnitude by implementing an operator response to an alarm as a protection layer. Because the cost of implementation and maintenance rises with the SIL, reducing SIL requirements (e.g., from SIL3 to SIL2) offers large capital and maintenance cost savings.
The ISA18.2 standard describes the alarm management life cycle. However, the standard does not give enough guidance on the alarms that have been credited in LOPA for risk reduction. The standard discusses alarm classification and defines “highly managed alarms” as a special class with requirements similar to those for an alarm for which credit has been taken in LOPA for risk reduction. The rules for taking credit for an operator response to an alarm are defined in the ISA84 standard. Teams performing alarm engineering may not be familiar with the requirements of ISA84; therefore, those requirements are outlined here. If alarm management and functional safety teams come up with separate lists and databases, it becomes difficult for operations to maintain the lists and keep them up to date. The best practices for maintaining a common list as a master alarm database will be discussed here.
The ISA18.2 alarm management standard provides a life cycle approach to managing alarms, from alarm philosophy and rationalization through operations and maintenance. It states the mandatory and nonmandatory requirements for alarm engineering in different stages of the life cycle. The standard refers to ISA84 for the requirements of alarms related to process safety. In many organizations, the teams that perform the alarm management life cycle steps are different from those that work on SISs and may not be familiar with the requirements of ISA84, which imposes specific requirements on the alarms related to process safety. The LOP model identifies the process alarm as one of the protection layers in reducing the demand rate on SISs. Although casual verbal or written communication refers to alarms as a protection layer, the correct term is “operator response to alarm as protection layer.”
LOPA. To illustrate the concept, consider a process example consisting of a high-pressure (HP) knockout drum, as shown in FIG. 1. The product gas at high pressure and temperature exiting the reformer is cooled in a process gas cooler, which condenses the water in the gas. The condensate is removed in the knockout drum V-100. The liquid level in the knockout drum is controlled by a level control loop, LC-100. The normal operating range of 35%–50% has been established to maintain a liquid level blanket in the knockout drum. If the level goes high, the separation will not occur, as the liquid will be carried over into the product stream. If the level drops too low, there is a risk of the HP gas entering the low-pressure (LP) system downstream of the control valve LV-100. Maintaining the liquid level blanket to avoid this hazardous event is very important.
[FIG. 1 diagram labels: product gas; HP product gas + condensate; knockout drum V-100; level transmitters LT-100A, LT-100B and LT-100C; level controller LC-100 (SP, PV, OP) with regulating valve LV-100; low alarm LAL-100; SIF 1 with low-level trip LALL-100 and on/off valve XV-100; V-101; LP storage tank. Level scale: normal operating range 35%–50%, low alarm 30%, low level trip 10%.]
66 MARCH 2013 | HydrocarbonProcessing.com
Safety Developments
The low alarm limit for the pre-trip safety-related alarm is set to 30%. A SIF is implemented to prevent the HP gas from entering the LP system. If the level falls below the trip limit of 10%, then the on/off valve XV-100 closes.
During hazard and risk assessments, all possible process risk scenarios are identified and documented. The hazard and operability (HAZOP) study is one of the most commonly used techniques for performing process hazard assessments. The scenarios are risk ranked based on the risk matrix developed by the organization. The risk matrix typically defines different levels depending on the severity of the consequence and the likelihood of the hazardous event. The scenarios with risks higher than a predefined threshold level are considered for LOPA to ensure that adequate LOP exist. For the process scenario described previously, an extract from the LOPA is presented in TABLE 1. The safety functions are assigned to LOP during LOPA.
Two independent protection layers are identified in the LOPA example above. The first protection layer is the operator response to an alarm with a risk reduction factor (RRF) of 10. The RRF can also be expressed as the average probability of failure on demand (PFDavg). In this instance, $\mathrm{PFD}_{avg} = 1/\mathrm{RRF} = 1 \times 10^{-1}$. As per ISA84, the maximum risk reduction that can be assigned to operator response to an alarm implemented in a basic process control system (BPCS) is 10.
The second protection layer is a SIF with an RRF greater than 100. A SIF is a combination of sensor, logic solver and final control element. The logic solver is often referred to as the SIS. The logic of a SIF is executed in the SIS. The amount of risk reduction provided by the SIF determines the SIL of the SIF. As shown in TABLE 2, there are four SILs defined in the ISA84 standard that offer different bands of risk reduction.
In the LOPA example, the SIF must offer an RRF of greater than 100 to reduce the frequency of the mitigated consequence to less than $1 \times 10^{-5}$ per year, and thus the SIF has a SIL requirement of SIL2. Without the alarm, the SIF must have an RRF greater than 1,000 (or $\mathrm{PFD}_{avg} < 1 \times 10^{-3}$), making it a SIL3 SIF. A SIL3 SIF requires considerably higher capital cost to implement. For example, it may require two on/off valves in a double block and bleed arrangement, an additional pressure transmitter for the pressure between the two block valves and a logic solver with much more stringent failure-rate requirements.
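The arithmetic behind this comparison can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, and the input values are the ones listed in TABLE 1 (initiating event, ignition probability, operator presence, tolerable risk and the alarm PFD).

```python
# Hypothetical helper names; the numbers are taken from TABLE 1.

def unmitigated_frequency(initiating_event, *conditional_modifiers):
    """Frequency per year of the hazardous event before any IPL is credited."""
    frequency = initiating_event
    for modifier in conditional_modifiers:
        frequency *= modifier
    return frequency


def required_sif_rrf(unmitigated, tolerable, other_ipl_pfds=()):
    """RRF the SIF must supply once the other protection layers are credited."""
    frequency = unmitigated
    for pfd in other_ipl_pfds:
        frequency *= pfd
    return frequency / tolerable


# Control loop failure (1 in 10), ignition probability (1), operator present (1 in 10)
f_unmitigated = unmitigated_frequency(1e-1, 1.0, 1e-1)

# Tolerable risk is 1e-5 per year; the operator response to alarm has PFDavg = 1e-1
with_alarm = required_sif_rrf(f_unmitigated, 1e-5, other_ipl_pfds=[1e-1])
without_alarm = required_sif_rrf(f_unmitigated, 1e-5)

print(f"Unmitigated frequency: {f_unmitigated:.0e} per year")
print(f"Required SIF RRF with alarm credit: {with_alarm:.0f}")
print(f"Required SIF RRF without alarm credit: {without_alarm:.0f}")
```

With the alarm credited, the required RRF is about 100; without it, about 1,000, which matches the one-order-of-magnitude difference described in the text.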
Safety-related alarm. An alarm is called a safety-related alarm (SRA) when operator response to the alarm is used as a protection layer to reduce the overall risk of hazardous process events. This alarm is identified in LOPA and a risk reduction credit is taken for the operator response to the alarm. Other terms used to describe this alarm type are: critical alarm, safety-critical alarm and process safety alarm. The ISA18.2 standard discusses an alarm class called highly managed alarms (HMAs). The ISA84 life cycle requirements of SRAs are similar to those of highly managed alarms, although many people do not like to use the term “highly managed alarms” to classify safety-related alarms.
Considerations for using operator response to an alarm as a protection layer are summarized as:
• The sensor used for the alarm system is not used for control purposes where loss of control would lead to a demand on the SIF
• The sensor used for the alarm system is not used as part of the SIS
• Limitations have been taken into account with respect to risk reduction that can be claimed for the BPCS and common cause issues
• Risk reduction claimed is not more than a factor of 10
• There is sufficient time for the operator to take corrective action
• Documented description of the response to the alarm (corrective action) is available, and rationalization has been performed
• The operator has been trained to take preventive actions
• Performance shaping factors have been considered
• Human ergonomic factors have been considered
• The test and maintenance requirements are the same as any other independent protection layer
TABLE 1. An example of LOPA

Hazard description: Loss of primary containment of the product gas upon failure of level control. Loss of liquid blanket in KO drum V-100 and HP flammable gas entering the LP system through KO drum bottom with a possibility of explosion and fatality.

| Description | Probability (frequency per year) |
|---|---|
| Tolerable risk (defined by organization) [1 in 100,000] | $1 \times 10^{-5}$ |
| Likelihood of initiating event (control loop failure) [1 in 10] | $1 \times 10^{-1}$ |
| Probability of ignition | 1 |
| Likelihood of operator present near the vessel | $1 \times 10^{-1}$ |
| Frequency of unmitigated consequence | $1 \times 10^{-2}$ |

Independent protection layers

| IPL | Description | Probability of failure |
|---|---|---|
| IPL-1 | Operator response to low alarm in KO drum | $1 \times 10^{-1}$ |
| IPL-2 | SIF to close the valve XV-100 upon low-low level alarm | $1 \times 10^{-2}$ |
| | Frequency of mitigated consequence | $1 \times 10^{-5}$ |
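TABLE 2. SIL determination

| SIL | Target PFDavg (demand mode) | Target RRF (demand mode) |
|---|---|---|
| 1 | $\geq 10^{-2}$ to $< 10^{-1}$ | $> 10$ to $\leq 100$ |
| 2 | $\geq 10^{-3}$ to $< 10^{-2}$ | $> 100$ to $\leq 1{,}000$ |
| 3 | $\geq 10^{-4}$ to $< 10^{-3}$ | $> 1{,}000$ to $\leq 10{,}000$ |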
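The RRF bands in TABLE 2 can be read as a simple lookup. The sketch below is illustrative only: the function name `sil_for_rrf` is hypothetical, and it covers only the three bands listed in the table.

```python
# RRF bands copied from TABLE 2: lower bound exclusive, upper bound inclusive.
SIL_BANDS = [
    (1, 10, 100),
    (2, 100, 1_000),
    (3, 1_000, 10_000),
]


def sil_for_rrf(rrf):
    """Return the SIL whose RRF band contains rrf, or None if outside TABLE 2."""
    for sil, low, high in SIL_BANDS:
        if low < rrf <= high:
            return sil
    return None


# An RRF just above 100 falls in the SIL2 band; just above 1,000 falls in SIL3.
print(sil_for_rrf(101))
print(sil_for_rrf(1_001))
```

A lookup like this makes it easy to see how crediting the operator response to an alarm moves the same scenario from one band to the next lower one.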
• The sensor used for generating alarms should be tested at a proof test interval established in the safety requirements specifications
• The identity of the person who performed the tests and any maintenance carried out should be documented and archived for the records
• Access control: Access to make changes to alarm parameters such as alarm setpoint, priority and filter constant is restricted, and the proper management of change (MOC) process is followed to make any changes once the system is put in service.
Independency requirements. ISA84 clause 11.2.10 states that a device should not be shared between a SIF and a control function where failure of the device will cause the BPCS loop to place a demand on the SIS and simultaneously cause the SIF to fail in a dangerous state. Therefore, the sensor used for generating an alarm cannot be shared with the control function in a BPCS and cannot be shared with the SIF implemented in the SIS. When independent sensors are available for each function, they can be configured in a fault-tolerant mode to achieve higher reliability and to increase the diagnostic coverage. Each owner-operator company is responsible for performing the analysis to ensure that its configurations are valid and satisfy the independency criteria of ISA84.
Standalone SRA. When the operator response to an alarm is identified in LOPA as a protection layer, and there are no other safety instrumented functions or control functions associated with the measurement, then the transmitter is wired to a BPCS, as shown in FIG. 2.
SRA and control function. SRAs and control functions are both implemented in BPCSs. When the same process measurement is required for both functions, separate transmitters should be used for alarm and control. Each transmitter should be wired to separate cards in the controller or, preferably, to separate controllers of a BPCS.
The two transmitters could be configured in a BPCS, as shown on the right of FIG. 3, to improve availability, facilitate maintenance and improve diagnostic coverage. The right side of FIG. 3 also illustrates that a software switch (HS-100x) is provided for the operator to manually change the source of input for the alarm, as well as for the control function. This allows one of the transmitters to be taken out of service for maintenance or proof testing. Depending on the functionality in the BPCS, a deviation alarm should be configured for the maintenance technician. If the difference between the readings of the two transmitters is more than a preset threshold, a low-priority deviation alarm is generated. The operator action for this alarm is typically to generate a maintenance work order for the instrument technician to check the transmitters and correct the situation.
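As a rough illustration of the deviation alarm described above, the following Python sketch compares two transmitter readings against a threshold. The function name and the 2% threshold are assumptions made for the example; the actual block and setting depend on the BPCS and on the site alarm philosophy.

```python
def deviation_alarm(reading_a, reading_b, threshold):
    """Return True when two redundant readings differ by more than the threshold."""
    return abs(reading_a - reading_b) > threshold


# Hypothetical level readings in percent of instrument range, with a 2% threshold
print(deviation_alarm(42.0, 41.2, threshold=2.0))  # readings agree within 2%
print(deviation_alarm(42.0, 37.5, threshold=2.0))  # readings differ by 4.5%
```

When the check returns True, the low-priority alarm would prompt a maintenance work order rather than an immediate process action.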
When the hand switch is used to switch the input to another source, a timer KS-100x should be started with alarm KAH-100x. If the time in the switched input mode exceeds the preconfigured limit, an alarm should be generated. The preconfigured timeout limit to generate a warning alarm should be less than the mean time to repair (MTTR) of the transmitter. If the time in the switched state exceeds the MTTR, then the SIF should initiate the action to put the process in a safe state.
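The timer logic described above can be sketched as a small classification function. This is a minimal illustration, not a BPCS implementation: the function name, the status strings and the hour values are all hypothetical.

```python
def switched_input_status(elapsed_hours, warning_limit_hours, mttr_hours):
    """Classify how long the hand switch has been on the alternate input source."""
    if warning_limit_hours >= mttr_hours:
        raise ValueError("warning limit should be less than the MTTR")
    if elapsed_hours > mttr_hours:
        return "trip"      # SIF puts the process in a safe state
    if elapsed_hours > warning_limit_hours:
        return "warning"   # timer alarm (KAH-100x in the text) is generated
    return "normal"


# Hypothetical settings: warning after 6 hours, MTTR of 8 hours
for elapsed in (2.0, 7.0, 9.0):
    print(elapsed, switched_input_status(elapsed, warning_limit_hours=6.0, mttr_hours=8.0))
```

Rejecting a warning limit that is not below the MTTR keeps the configuration consistent with the rule stated in the text.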
SRAs, control functions and SIFs all need the same process measurement. When the available instrumentation is adequate to meet the independency criteria of each function, it is beneficial to wire it in a fault-tolerant configuration to improve reliability. A typical implementation is shown in FIG. 4. In this example, the three differential pressure transmitters using independent taps for process connection are wired to a SIS through the safety-certified current-loop isolator and repeater. The HART transmitters are powered by a SIS and the isolators have the capability to pass through the HART signals on each channel.
A SIL1 or SIL2 SIF is implemented in a SIS with two-out-of-three (2oo3) voting logic for the level input signals. Depending on the BPCS, the actual implementation may differ. FIG. 4 shows a generic representation where a middle-of-3 selector block is used in a BPCS that has an SRA configured. Some BPCSs have a standard 2oo3 block that can be used as well.
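The two selection schemes mentioned here can be illustrated with a short Python sketch. The function names are hypothetical and the readings are invented for the example; real SIS and BPCS blocks also handle signal quality and bypass states, which are omitted.

```python
def vote_2oo3(trip_a, trip_b, trip_c):
    """2oo3 voting: demand a trip when at least two of three channels demand it."""
    return (bool(trip_a) + bool(trip_b) + bool(trip_c)) >= 2


def middle_of_3(a, b, c):
    """Middle-of-3 selector: return the median of three analog readings."""
    return sorted((a, b, c))[1]


# Hypothetical level readings in percent; the trip limit of 10% is from the text
readings = (9.0, 31.0, 8.5)
trip_limit = 10.0

channel_trips = [level < trip_limit for level in readings]
print(vote_2oo3(*channel_trips))   # two of three channels are below the trip limit
print(middle_of_3(*readings))      # the median reading is passed to the BPCS alarm
```

In both schemes a single failed transmitter, whether reading high or low, neither causes a spurious action nor masks a real demand.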
When sharing the transmitters between a BPCS and SIS, several considerations should be taken into account:
• Failure of any hardware or software outside the SIS should not prevent any SIF from operating correctly
FIG. 2. P&ID representation of a standalone SRA.
FIG. 3. SRA and control functions are both implemented in BPCSs.
FIG. 4. SRAs, SIFs and control functions are implemented to achieve high reliability.
• Failure of a BPCS component should not result in both the initiating cause for the process hazard and the failure (or defeat/bypass) of the SIF that protects against the specific scenario under evaluation
• The probability of common mode, common cause or dependent failures has been adequately evaluated and determined to be sufficiently low; it is often recommended to use diverse measurement technology to reduce common cause failures (a combination of differential pressure and guided wave radar transmitters is an example of using diverse technologies to measure the same process value)
• The shared components are managed according to ISA84, including proof testing, access security and management of change
• The sensor is powered by the SIS
• The signal is transmitted to the BPCS by an optical isolator or other means to ensure that no failure of the BPCS affects the functionality of the SIS.
Time considerations. Time available for the operator to take corrective action is an important factor in alarm engineering. In the alarm rationalization process, the alarm priority is determined by the severity of the consequence of inaction to alarms and the time available for the operator. Alarm philosophy documents typically recommend the use of a rationalization matrix of consequence severity and maximum time to respond.
Process safety time (PST) is the difference between the time at which the unacceptable condition occurs ($T_{CONDITION}$) and the time at which the unwanted event occurs ($T_{EVENT}$).

$$\text{Process safety time} = T_{EVENT} - T_{CONDITION}$$
In the previous example, there are two protection layers. The process safety time for the alarm is the time from when the level reaches 30% until it falls to the trip setpoint of 10%. The time can be calculated by dividing the volume of the knockout drum corresponding to 20% of the instrument range (the difference between the alarm setpoint and the trip setpoint) by the worst-case flowrate of condensate when the level control valve LV-100 stays wide open. The process safety time for the SIF can be calculated in a similar manner. It is the time from when the level reaches the trip point of 10% until it falls to 0%.
Typically, the process safety time calculation is done by process engineers. Such calculations become easy when the process model is available.
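The calculation described above can be sketched as follows. The numbers are purely hypothetical (0.05 m³ of drum volume per percent of level range and a worst-case outflow of 0.1 m³/min), and the sketch assumes that volume is linear in level over the range of interest, which may not hold for every vessel geometry.

```python
def process_safety_time_minutes(volume_per_percent_m3, from_level_pct,
                                to_level_pct, outflow_m3_per_min):
    """Time for the level to fall between two setpoints at a constant outflow."""
    if outflow_m3_per_min <= 0:
        raise ValueError("outflow must be positive")
    volume_m3 = volume_per_percent_m3 * (from_level_pct - to_level_pct)
    return volume_m3 / outflow_m3_per_min


# Alarm layer: from the 30% alarm setpoint down to the 10% trip setpoint
pst_alarm = process_safety_time_minutes(0.05, 30.0, 10.0, 0.1)

# SIF layer: from the 10% trip setpoint down to 0%
pst_sif = process_safety_time_minutes(0.05, 10.0, 0.0, 0.1)

print(f"PST for alarm: {pst_alarm:.1f} min")
print(f"PST for SIF: {pst_sif:.1f} min")
```

With these assumed values, the alarm layer has about 10 minutes and the SIF about 5 minutes; a process model would replace the constant-outflow assumption with a more realistic profile.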
Alarm response time is the difference between the time at which the alarm condition occurs and the time when the process starts responding in the direction to correct the alarm condition. It includes the sensor lag, BPCS lag, operator response time and any process lag. Process dead time is the amount of time it takes for the process to begin reacting after corrective action.
$$\text{Alarm response time} = \text{Sensor delay} + \text{BPCS delay} + \text{Operator response time} + \text{Process dead time}$$
The process safety time for the alarm must be greater than the alarm response time. These different time elements are shown in FIG. 5.
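A minimal sketch of this check in Python is shown below. The function names and all time values are assumptions chosen for illustration; real values come from instrument data, BPCS scan rates, the alarm philosophy and process studies.

```python
def alarm_response_time(sensor_delay, bpcs_delay, operator_response, process_dead_time):
    """Sum of the four delay elements; all arguments use the same time unit."""
    return sensor_delay + bpcs_delay + operator_response + process_dead_time


def response_fits(process_safety_time, response_time):
    """True when the process safety time is greater than the alarm response time."""
    return process_safety_time > response_time


# Hypothetical values in minutes
response = alarm_response_time(sensor_delay=0.1, bpcs_delay=0.05,
                               operator_response=5.0, process_dead_time=1.0)

print(f"Alarm response time: {response:.2f} min")
print(response_fits(process_safety_time=10.0, response_time=response))
print(response_fits(process_safety_time=5.0, response_time=response))
```

In the second case the assumed process safety time is shorter than the response time, so the operator response could not be relied on as a protection layer.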
Operator response time is affected by human factors and ergonomics, which are collectively called performance shaping factors. As per the ISA18.2 feedback model of operator process interaction, the operator response time comprises the following human interactions:
Detect: The operator becomes aware of the deviation from the desired condition. The design of the alarm system and the operator interface impact detection of deviation.
Diagnose: The operator uses knowledge and skills to interpret the information and diagnose the situation when determining the corrective action to take in response.
Respond: The operator takes corrective action in response to the deviation.
Minimum time to respond is the shortest time in which an operator can go through the detect, diagnose and respond steps. It should be defined in the alarm philosophy document. It is not physically practical to take the necessary corrective actions in less than this time. Three to 10 minutes is the most commonly used value as a minimum time to respond.
If the required operator response is faster than the minimum time to respond, then no credit can be taken for the operator response to alarm as a protection layer. This requirement is applicable to not just the SRA, but also to any alarm configured in the system. In such situations, various options should be reviewed to allow sufficient time for operator response. In the above process example, the simplest option is to check if a low