IPF

(1)

SIL Assessment & SIS design

for non Functional Safety Experts

Revision : 0 April 2004

Author: Jan Wiegerinck


(2)

Shell Global Solutions

What everybody¹ should know about IPF

Presenter: …

Author: Jan Wiegerinck - Shell Global Solutions Int.

¹ Especially process engineers and operation superintendents

The title was intended to be “IPF for Dummies”. However “…. For Dummies” is a registered trademark of Wiley Publishing Ltd, the well known U.S. publishing company. Therefore we could not use that title.

This presentation and hand-outs are intended for process engineers, operational personnel and others who are involved in the process of IPF classification and testing. It is made for those who need to know the basics and essentials of IPF classification without having to know all the details, ifs and buts.

This presentation aims to provide appreciation of the IPF method (e.g. why an IPF study needs to be done) as well as buy-in to the conclusions and the resulting IPF design and test effort.

IPF means Instrumented Protective Function, i.e. a protective function that is realised by instruments. So a Relief Valve (RV) is not an IPF, nor is a non-return valve (NRV). One could also apply risk based design and maintenance techniques to RVs and NRVs. These methods, however, are still under development.

(3)


All about Risk

• Instrumented Protective Functions (IPF) are used to reduce risk.
• If there is no process risk, there is no need for an IPF.
• If the risk is high, the risk needs to be reduced a lot; if small, the risk only needs to be reduced a ‘little’.
• The IPF Class or Safety Integrity Level (SIL) is a measure of the amount of risk reduction required.

IPFs are all about risks. IPFs are intended to reduce the risk using instrumentation. IEC 61511, the relevant international standard, refers to the risk reduction achieved by instrumentation as “functional safety”.

IPF methodology is intended to allow the design and maintenance of trip systems to be based on the risk to be reduced. The higher the risk, the more effort we have to make to keep the remaining risk acceptable.

E.g. if a certain process hazard may occur once every 10 years (e.g. the failure of a control loop in the dangerous direction) and the consequence is that a large compressor is exposed to liquid carry-over from the inlet scrubber, we can assess the risk. If, when it happens, we have to repair the compressor and the resulting cost of repair and lost revenue from downtime is 5 million $, we can estimate the risk at 0.1 per year x 5M$ = 500K$ per year. This is not acceptable and needs to be reduced. By installing a high level switch that trips the compressor, we can avoid the consequences (the ‘hazardous event’). This IPF should reduce the risk from 500K$ to, say, 5K$ per year.

The IPF in the above example reduces the risk by a factor of at least 100. Instead of referring to the risk reduction to be achieved, we refer to the SIL as per IEC 61511 …
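The arithmetic of the compressor example can be written out as a small sketch. The numbers are those quoted in the notes; the function and variable names are illustrative assumptions, not part of any IPF tool.

```python
# Sketch of the compressor example. Numbers come from the notes above;
# all names are illustrative assumptions.

def annual_risk(frequency_per_year: float, consequence_usd: float) -> float:
    """Risk = likelihood (events per year) x severity of consequences ($)."""
    return frequency_per_year * consequence_usd


def required_risk_reduction(initial_risk: float, target_risk: float) -> float:
    """Factor by which the IPF has to reduce the initial risk."""
    return initial_risk / target_risk


# Once every 10 years, 5 million $ per event, target of 5K$ per year.
initial_risk = annual_risk(frequency_per_year=0.1, consequence_usd=5_000_000)
reduction_factor = required_risk_reduction(initial_risk, target_risk=5_000)

print(f"initial risk: {initial_risk:,.0f} $ per year")
print(f"required risk reduction factor: {reduction_factor:.0f}")
```

The resulting factor is what the SIL expresses as a category instead of as an exact number.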

(4)

What is Risk?

• Risk is the likelihood of an event times the severity of the consequences.
• The likelihood is expressed as a frequency (e.g. 0.2 times per year).
• In Shell the severity of consequences is expressed in terms of consequences to people, environment and the business ($).
• For IPFs the risks are assessed for each hazardous event to be protected against. E.g. burner flame-out leads to furnace explosion. Flame-out happens about once per 5 years; the consequence will be 5M$ + possible casualties.

Because an IPF is intended to reduce the risk, we first have to assess the risk to be reduced.

What is risk?

Risk in the process industry is commonly expressed as the frequency at which the problem may occur multiplied by the severity of the consequences if it is not stopped by any protective measure.

The severity of the consequences is expressed as the consequences to people, environment and assets (repair costs and production losses).

In the IPF method, only the risk associated with the specific hazardous situation that the IPF is protecting against is assessed. So the hazardous situations are taken one by one. The totalised risk of operating an LNG plant is not calculated. Where such total remaining risk is a concern, other techniques are applied (e.g. QRA).

Only where the cumulative risk may be reduced by very obvious measures does the IPF methodology recognise the situation and improve the trip system design. This is the so-called ‘adding rule’, which is not discussed in this presentation.

(5)

What is Risk? (2)

• Risk can be mapped on a graph

[Figure: likelihood against severity of consequences, showing lines of equal risk and the direction of increasing risk.]

As discussed, risk is expressed as the product of frequency of occurrence (the likelihood expressed as a frequency) and the severity of consequences.

We can make a graph with the 2 parameters as axes and draw lines of equal product, i.e. lines of equal risk.

Risk increases from the lower left corner to the upper right corner of the graph. One could now try to assess the risk by plotting the likelihood and the severity of consequences and establish the risk as a dot (the intersection) on the graph. However assessing the risk accurately is very difficult….

(6)

Semi-quantified Risk assessment

• Risk can be semi-quantified in a matrix
• This is handy because likelihood and consequence severity are difficult/impossible to assess accurately.

[Figure: risk matrix of likelihood against consequence, ranging from Low Risk to High Risk.]

It would be much easier if we only needed to assess in which category the likelihood and consequence severity falls.

E.g. “I do not know the likelihood but it is between once per year and once per 10 years. I do not know about the consequences but I do know that they are between 1 and 10 M$.”

By doing so I can relatively quickly assess the risk category (e.g. High or Medium High).

This technique is the basis for the Shell Hazards and Effect Management Process (HEMP) matrix that is also used by all Shell OUs.
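The idea of assessing only the category can be sketched as follows. The band boundaries and the ranking by adding band numbers are hypothetical choices made for illustration; they are not the HEMP matrix or its calibration.

```python
import bisect

# Hypothetical band boundaries, for illustration only.
LIKELIHOOD_BOUNDS_PER_YEAR = [0.01, 0.1, 1.0]   # bands 0..3
CONSEQUENCE_BOUNDS_MUSD = [0.1, 1.0, 10.0]      # bands 0..3


def band(value: float, bounds: list) -> int:
    """Index of the band that a value falls into."""
    return bisect.bisect_right(bounds, value)


def risk_rank(events_per_year: float, consequence_musd: float) -> int:
    """Coarse risk rank: the sum of the likelihood and consequence bands."""
    return (band(events_per_year, LIKELIHOOD_BOUNDS_PER_YEAR)
            + band(consequence_musd, CONSEQUENCE_BOUNDS_MUSD))


# "Between once per year and once per 10 years, between 1 and 10 M$":
# any two estimates inside those bands give the same rank.
print(risk_rank(0.3, 4.0))
print(risk_rank(0.5, 7.0))
```

The point of the sketch is that the exact estimate does not matter as long as it stays within the same band.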

(7)

Risk reduction

Preventive and Mitigating IPF effects

[Figure: risk matrix of likelihood against consequence, from Low Risk to High Risk, showing the Base Risk, the effect of a preventative (normal) IPF and the effect of a mitigating IPF (F&G), with consequence levels CQ1 and CQ2.]

$\text{Base Risk} = \text{Demand rate} \times \text{consequence} = DR \times CQ_1$

$\text{End Risk} = DR \times PFD_{target} \times CQ_1$

$\text{End Risk} = DR \times PFD_{target} \times CQ_2$

As discussed, an IPF is intended to reduce risk, but we need to know how and how much.

‘Normal’ IPFs prevent the hazardous situation from developing into an event with undesired consequences. Sometimes the IPF may fail, such that the undesired consequences occur after all. However, the frequency at which these events occur is reduced dramatically. So normal IPFs move the risk downwards on the risk matrix.

Some IPFs cannot reduce the frequency of occurrence of the event. E.g. a fire detector cannot reduce the frequency at which the fire occurs. However, it can reduce the severity of the consequences by e.g. initiating a sprinkler system.

(8)

Tolerable and Acceptable risks

[Figure: risk scale running from ‘intolerable’ through ‘tolerable’ to ‘broadly acceptable’.]

• SIL at least required to make the risk ‘tolerable’: the minimum solution, e.g. SIL 1
• SIL required to make the risk more ‘tolerable’: an intermediate solution, e.g. SIL 2
• SIL required to make the risk ‘acceptable’: the normal solution (if ALARP), e.g. SIL 3

IPF classification aims to reduce the risk to “broadly acceptable”

According to the Shell group HEMP, risks should be reduced to a level where they are either as low as reasonably practicable (‘ALARP’) or so low that there is no longer a need to demonstrate that the risk is ALARP. However, in all cases we should strive towards further risk reduction (especially of personal and environmental risk) as soon as suitable techniques become available and as society's acceptance of risks changes. Some risks are so high that HEMP classifies them as ‘intolerable’. No matter what it takes, we have to do something about them.

In the ‘ALARP’ region we would need to demonstrate either that the risk can be reduced further (e.g. with IPFs) or that the efforts (and money) required to reduce the risk further would be disproportionate to the risk reduction gained. If the latter is the case, the risk is ‘ALARP’.

E.g. if a risk is $50,000 per year and further reduction would also cost $50,000 per year, the risk does not need to be reduced further and is ‘ALARP’.

Normally IPFs are not that expensive, and using the normal IPF risk graph (see slides 12 to 14) will result in a remaining risk level that Shell considers ‘broadly acceptable’, i.e. there is no need to demonstrate ALARP.

Only in cases where IPF testing needs to be waived, ALARP considerations may be used to justify a waiver.

(9)

SIL Classes

IPF Class  SIL  PFD              Risk Reduction  Typical Implementation
I          a1   No requirements  No minimum      (alarm only)
II         a2   No requirements  No minimum      (DCS action)
III        1    < 0.1            > 10            Trip separate from DCS
IV         2    < 0.01           > 100           Trip separate from DCS
V          3    < 0.001          > 1000          Redundant trip separate from DCS
VI         3    < 0.001          > 1000          Redundant/diverse trip separate from DCS
N/A        4    < 0.0001         > 10000         Dual redundant trip separate from DCS

As discussed IPF Classes are used as categories of IPFs that achieve a certain risk reduction.

Below IPF Class III (PFD < 0.1) there are no requirements with regard to the risk reduction to be achieved; however, there may still be a requirement/opportunity to reduce the risk further by having an alarm or an automated DCS action.

For SIL 4 IPFs there is no equivalent IPF Class. Indeed a risk reduction better than 10,000 is very difficult to achieve and seeking alternative risk reducing measures is often a better option.

A High Integrity Pressure Protection System (HIPPS) is the only practical example of SIL 4 IPFs known. E.g. PDO’s Main Oil Line has a few.
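As a rough illustration of how the SIL bands on this slide relate to a required risk reduction, the sketch below picks the lowest SIL whose risk reduction covers a given factor. The function name and the treatment of the band edges (a required factor of exactly 100 maps to SIL 2, as in the examples in these notes) are assumptions made for the sketch.

```python
# Sketch only: maps a required risk reduction factor onto the SIL bands
# of the table above. Names and edge handling are assumptions.

def required_sil(risk_reduction_factor: float) -> int:
    """Lowest SIL whose risk reduction covers the required factor.

    Returns 0 when no risk reduction is required (classes a1/a2).
    """
    if risk_reduction_factor <= 1:
        return 0
    for sil in (1, 2, 3, 4):
        # SIL n delivers a risk reduction of more than 10**n.
        if risk_reduction_factor <= 10 ** sil:
            return sil
    raise ValueError("more than SIL 4: seek alternative risk reducing measures")


for factor in (5, 100, 1_000, 10_000):
    print(factor, "-> SIL", required_sil(factor))
```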

(10)

Risk Reduction with IPF/SIFs

[Figure: risk matrix of likelihood (per year: 1, 10^-1, 10^-2) against consequence, from Low Risk to High Risk, with regions of intolerable, tolerable and broadly acceptable risk. A risk reduction of a factor >100 (=> SIL 2) moves the initial risk to the remaining risk.]

When the initial risk has been mapped on the risk graph/matrix, and the areas of tolerable and acceptable risks are known, one can determine how much risk reduction is needed.

In the example above, the risk reduction required to get into the ‘broadly acceptable’ area is 100.

These kinds of considerations form the basis of the calibration of a risk matrix that yields the required SIL directly….

(11)

A Risk Assessment Matrix (RAM; example only)

[Figure: example risk matrix of likelihood (per year: 1, 10^-1, 10^-2) against consequence, from Low Risk to High Risk, with regions of intolerable, tolerable and broadly acceptable risk. Cell values as extracted, row by row: 1 2 3 4 / a 1 2 3 / a a 1 2 / - a a 1.]

The required SIL (to make the risk broadly acceptable) can directly be entered in the cell that represents the initial risk.

As seen in the previous slide, each cell of the risk matrix requires a certain risk reduction to achieve ‘broadly acceptable risks’. So we can immediately put the required SIL in each cell such that – after the implementation of the IPF – the risk becomes broadly acceptable.

This has been done in the risk graph above. Please note that the above example is just an example and should not be used for any risk or IPF study!

(12)

RAM calibration

• For every RAM, the calibration is extremely important as it embeds acceptable remaining risk criteria
• Assumptions and guidelines for use are critical, e.g.:
  • Average consequences or potential consequences?
  • Credit for post top event mitigation layers built in or not? (RRM RAM does include, SOPUS and SIC RAM do not)
  • How to assess likelihood? Include which non-IPF protection layers?
  • Etc.

For those of you with special interest in risk assessment and differences in graphs used in and outside Shell:

For those of you that might have been exposed to different risk graphs and matrices, please note that the road to a calibrated risk matrix is full of pitfalls and assumptions that should be clarified and enforced when it is used.

E.g. some matrices (like the RRM-IPF RAM and the 1996 IPF DEP risk graph) assume potential credible consequences where others assume average consequences.

The RRM-IPF RAM as well as the 1996 IPF DEP risk graph take credit for other ‘post top event’ (see slide 16) mitigation layers such that the user does not need to specifically take them into account. This makes the matrix/graph easy to use but creates seemingly high remaining risks, especially for personnel safety.

E.g. if a hazardous situation occurs every 10 years and a casualty may result, both the RRM-IPF RAM and the 1996 IPF DEP risk graph require an IPF Class IV / SIL 2 (risk reduction of 100). This means that the casualty is now experienced once per 1000 years. This is too much as per common corporate acceptable risk criteria (less than once per 10,000 years per hazardous situation). However, if the embedded credit for other ‘post top event’ (see slide 16) mitigation layers is taken into account, the remaining risk becomes better than once per 10,000 years.
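The casualty example can be checked with a few lines of arithmetic. The sketch below only restates the numbers from the notes; the variable names are illustrative, and the size of the embedded ‘post top event’ credit is derived from those numbers rather than taken from any RAM documentation.

```python
# Numbers from the example above; names are illustrative assumptions.
demand_interval_years = 10          # hazardous situation once per 10 years
ipf_risk_reduction = 100            # IPF Class IV / SIL 2
criterion_interval_years = 10_000   # corporate criterion quoted in the notes

# Interval between casualties with the IPF alone.
remaining_interval_years = demand_interval_years * ipf_risk_reduction

# Additional reduction that the embedded 'post top event' layers would
# have to deliver to meet the criterion.
extra_reduction_needed = criterion_interval_years / remaining_interval_years

print(f"with IPF only: once per {remaining_interval_years} years")
print(f"extra reduction needed from other layers: {extra_reduction_needed:.0f}")
```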

(13)

Risk Reduction: the effect of over/under engineering

[Figure: risk against trip system complexity, with under-engineering and over-engineering regions and the labels ALARP and LOPA. Annotation: “SIFpro optimizes the design into this area”.]

‘Every advantage has its disadvantage’ also applies to installing SIFs in a process plant.

By installing a SIF a new situation is created that may create new hazardous situations. If the instruments fail spuriously, economic losses are incurred and the event often results in flaring (environmental consequences).

So the risks associated with the original hazardous situations are reduced and new ones created such that at some stage the total risks again increase. At this point the plant becomes over engineered.

Therefore, to arrive at a fit for purpose SIS, also the risks associated with spurious trips (safe failures of instruments) need to be studied.

Tools such as Layer of Protection analysis (LOPA) and ALARP evaluation help to prevent over and under engineering.

LOPA helps to estimate the ‘unmitigated event frequency’ (the hazardous event frequency if the SIF were not realised) more accurately.

An ALARP evaluation also considers the new risk created by the various SIF designs planned.

Therefore SIFpro™ includes both tools to help to arrive at a design that is fit for purpose.

(14)

Fundamentals of IEC 61508 / 61511

• Know your hazardous situations
• Evaluate the acceptability of the risks of those hazardous situations.
• Classify the required Safety Integrity of the protective measures (establish the Safety Integrity Level, SIL)
• Implementation and testing to be based on SIL
• Implement and maintain a Safety Management System
• Documentation
• Auditing (assessment and verification)
• Procedures & Planning
• Control of Human Factors

The fundamentals of safety are at the heart of IEC 61508 and IEC 61511. They concentrate on the following:

When designing and planning your process, you have to evaluate all your potential hazards. This may be done using HAZOP or any other method that arrives at a similar result.

For each hazard, one should establish whether the hazard is acceptable without additional measures or whether safeguards may be required. These may be procedural, changes in the design, mechanical (RVs etc.) or instrumented.

For instruments, you have to classify the safety functions into safety integrity levels (SIL) that essentially give a measure of the degree of risk reduction these functions should offer. This risk reduction is expressed as ‘probability of failure on demand’.

Of course the instruments should be able to bring the process to a safe state! Once the SIL has been established, one should design and maintain the instruments to ensure that the requirements of the SIL are met. Moreover, these design, construction, testing, commissioning and maintenance activities shall be planned and auditable (documentation).

(15)

Chain of events

[Figure: chain of events, with the stages: process under control; process deviation or disturbance; process out of control; hazardous situation; released hazard; hazardous event; consequences. Annotations: IPF; demand scenario; design intent: “prevent <released hazard>”; consequences of failure on demand.]

Considering a situation where the process is perfectly under control up to a situation that a hazardous event has taken place with serious consequences, the stages as depicted above can be distinguished.

Obviously, the intention of all kinds of safeguarding measures is to prevent or mitigate the impact and consequences of a hazardous event. It is essential that these safeguarding measures indeed realize their goal and together lead to an acceptably safe operating process installation. Therefore, these safeguarding measures need to function properly and need to be reliable enough. Safeguarding measures can only be defined adequately if their design intent is fully understood.

An Instrumented Protective Function (IPF) is defined as a function implemented by means of instruments, and intended to achieve or maintain a safe state for the process or mitigate consequences, in respect of a specific hazardous event. The slide above also illustrates the terms ‘demand scenario’, ‘design intent’ and ‘Consequences of Failure on Demand’.

(16)

Layers of Protection (the onion model)

[Figure: the bow-tie, with threats on the left, consequences on the right and independent barriers in between: alarms, preventive IPFs and mitigative IPFs.]

The risk of a scenario is reduced by applying multiple, diverse safeguarding layers. This has often been referred to as the “onion model” (Guidelines for Safe Automation of Chemical Processes, CCPS 1993).

We have often illustrated the same principle by the bow-tie. At the left hand side of the bow-tie the protection layers are shown that reduce the frequency of the ‘top event’ (e.g. loss of containment). Because the likelihood decreases after each protection layer, the height of the triangle reduces. At the right hand side of the top event, the protection layers are shown that try to mitigate (reduce) the extent of the consequences. However, each time a mitigative layer fails, the severity of the consequences increases, hence the height of the triangle increases.

IPFs form part of the overall protection system and when doing an IPF study, the presence and effectiveness of the other layers is taken into account when

(17)

IPF: Criticality analysis – RAM

[Figure: RAM with the axes ‘Demand rate’ (how often is the IPF/SIF required; what is the frequency of the hazardous situation to be protected against) and ‘Consequences of failure on demand’ (of the hazard); the cells give the criticality.]

This is the RAM used in SIFpro. Either by direct selection or by doing a LOPA, the unmitigated event frequency is established. The unmitigated event frequency is often referred to as the demand rate, although this term is essentially misleading.

The risk to be reduced by the SIF also depends on other protection layers that would act in case the SIF fails on demand, i.e. act after the SIF had its chance (e.g. a non-return valve as part of a backflow protection system). This means that the hazardous event (e.g. actual backflow) does not necessarily occur at the same frequency at which the SIF is demanded to work.

Next, the consequence severity is established. Depending on the consequence category, different questionnaires are available to help assess the severity.

The highest consequence severity and the ‘demand rate’ establish the initial risk or ‘criticality’.

SIFpro™ allows the RAM to be calibrated and therefore rates the initial risk (using letters like L,M,H, etc.) and the SIL is mapped against each cell in the RAM.
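How a LOPA arrives at an unmitigated event frequency can be sketched generically: the frequency of the initiating cause is multiplied by the PFD of each independent protection layer other than the SIF. This is the general LOPA principle, not a description of how SIFpro implements it, and the layer values below are assumed purely for illustration.

```python
from math import prod

# Generic LOPA sketch; not the SIFpro implementation. All values assumed.

def unmitigated_event_frequency(initiating_frequency_per_year: float,
                                other_layer_pfds: list) -> float:
    """Event frequency if the SIF were not realised.

    Each independent non-SIF protection layer lets the event through
    with a probability equal to its PFD.
    """
    return initiating_frequency_per_year * prod(other_layer_pfds)


# Hypothetical case: control loop fails once per 10 years, and an
# independent alarm with operator response fails 1 time in 10.
frequency = unmitigated_event_frequency(0.1, [0.1])
print(f"unmitigated event frequency: {frequency:.3f} per year")
```

With no other layers credited, the result equals the initiating frequency, which is the ‘direct selection’ case.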

(18)

Design of an IPF

• The SIL is a measure of the risk reduction expected to be delivered by the IPF.
• Two requirements for each IPF:
  1. The IPF shall meet the required degree of fault tolerance
  2. The IPF shall meet the required PFD

In order to comply with IEC 61508 and IEC 61511 the IPF methodology requires the design of an IPF to comply with both the following requirements:

The deterministic requirements (minimum degree of fault tolerance). E.g. for a SIL 3, at least a 1oo2 voting architecture is required. Detailed rules etc. would be too much detail for this slide pack.

This rule is intended to protect the designer against over-optimistic probabilistic assumptions in cases of high risks (‘lies, damned lies and statistics’).

The probabilistic requirements (meet the maximum PFD of the SIL; see slide 8). E.g. for a SIL 3 the overall PFD should be better than 1E-3. See next slides for further details.

Additionally the designer of the trip system shall ensure that:

The IPF meets the performance requirements (response time, TSO, accuracy).

The documentation etc. is in order.

(19)

Meeting the PFD

$PFD_{IPF} = PFD_{initiator} + PFD_{logic\_solver} + PFD_{final\_element}$

If the same PFD is assigned to sensor and final element, the target PFD of the sensor or the final element is calculated as follows:

$PFD_{target} = \dfrac{PFD_{SIL} - PFD_{logic\_solver}}{2}$

The following slides aim to introduce the statistical calculations that should demonstrate that the probabilistic requirements for the SIL have been met.

In general the PFD of the IPF is the sum of the PFD of all independent components like the initiators (the sensors), the logic solver (e.g. the safety PLC) and the final elements (the valves, etc.). Invariably the field devices (sensors and final elements) are the weakest part of the IPF.

Many IPFs share initiators and final elements. If the test effort (see next slides) is optimised for one IPF, it will influence other IPFs as well. Optimising all test efforts of all components of the trip system is quite a calculation task!

To simplify calculations, the PFD budget of an initiator or final element is often established by subtracting the PFD of the logic solver from the ‘available’ PFD of the complete IPF and dividing the remainder equally between the initiator and the final element. This is the approach taken by the RRM-IPF software. SIFpro on the other hand optimises the whole function and takes the complete available PFD into account to optimise initiator and final element testing.
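The equal-split budget described above can be sketched as follows. The logic solver PFD used here is an assumed value for illustration only, and the function names are not taken from the RRM-IPF software.

```python
# Sketch of the equal-split PFD budget. Names and the logic solver PFD
# are assumptions made for illustration.

def field_device_pfd_target(pfd_sil: float, pfd_logic_solver: float) -> float:
    """PFD budget for the initiator and for the final element."""
    if pfd_logic_solver >= pfd_sil:
        raise ValueError("logic solver alone uses up the available PFD")
    return (pfd_sil - pfd_logic_solver) / 2


def ipf_pfd(pfd_initiator: float, pfd_logic_solver: float,
            pfd_final_element: float) -> float:
    """PFD of the complete IPF: the sum over its independent parts."""
    return pfd_initiator + pfd_logic_solver + pfd_final_element


PFD_SIL_2 = 0.01           # maximum PFD for SIL 2
PFD_LOGIC_SOLVER = 0.0001  # assumed value

target = field_device_pfd_target(PFD_SIL_2, PFD_LOGIC_SOLVER)
total = ipf_pfd(target, PFD_LOGIC_SOLVER, target)

print(f"budget per field device: {target:.5f}")
print(f"resulting IPF PFD: {total:.4f}")
```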

(20)

Instruments do fail sometimes!

[Figure: frequency of failure (per year) against time, showing early life failures (‘infant mortality’), late life failures (‘ageing’) and the combined curve (‘the bath tub curve’).]

The calculation of the likelihood of failure of a safety related (IPF) instrument at the moment it is demanded to act (the probability of failure on demand or PFD) is based on the assumption that the failure behaviour of instruments is generally random. This assumption is illustrated above and on the next slide.

Instruments are initially exposed to ‘early life failures’ caused by manufacturing defects, application and commissioning problems. The likelihood or frequency of occurrence decreases rapidly over time (the green curve).

On the other hand, instruments are also subject to ageing caused by corrosion, erosion, fatigue, effects of a possibly stressful environment (UV, RFI etc.), etc. These age related failures tend to rise slowly over time until wear-out sets in and the likelihood rises rapidly. This is shown with the red curve. E.g. for ESD valves used in refineries, statistics from Exxon suggest that this effect sets in after 10 years or so.

(21)

Instruments fail randomly..

[Figure: frequency of failure (per year) against time, showing the ‘testing & commissioning’ phase, the ‘mission time’ and replacement/overhaul.]

Failure rate is regarded constant and random during mission time (e.g. λdu = 4E-2 per year)

During the initial phase of the life of an instrument, it is not really used for its safety mission yet. The purpose of testing and commissioning is to find systematic (wiring, configuration, integration etc. problems) and early life failures.

After commissioning the instrument is really used but before old age is taking its toll, it is either replaced or overhauled to re-instate the ‘as new’ condition.

In the mission time, the failure rate (the frequency at which a failure occurs) remains practically constant.

The failure rate could e.g. be 2E-2 per year. Obviously an instrument cannot fail for 2%. It fails or it doesn’t. A failure rate of 2E-2 should be interpreted as 2 out of 100 instruments failing in one year. Which instrument and when in the year is taken as random.

(22)

Probability of failure

• Imagine… a bucket with 95 black and 5 red balls
• Every year I take one ball and put it back if it is black. If it is red I keep it and stop sampling.
• A red ball indicates that the instrument failed dangerously but I do not know (unrevealed).
• What is the chance that I have a red ball after 1 year? (5%)
• What is the chance that I have a red ball after 2 years? (0.05 + 0.05*0.95 = 9.75%). Etc.
• The chance of having a red ball increases over time until it is 100%.

The probability of failure on demand is the probability that I will find an instrument failed at the moment it is actually required to work properly as caused by a demand on the IPF. So we can compare it with an experiment with red and black balls in a bucket.

The bucket contains 100 balls of which 5 are red. Each year I take one ball (blindfolded) and check the colour. If it is black there is no failure and I put it back. If it is red, it symbolises a failed instrument. Once the instrument has failed, it cannot really fail again and therefore I stop taking samples once the red ball is taken.

After one year, I check the colour of the ball. What is the chance that it is red? 5% of course. What is the chance after 2 years that it is red? This is the probability that it is red after 1 year + the probability that I take a red ball the next year. For the 2nd year the chance is equal to the probability that it was black the 1st year (95%) times the chance that it is red the 2nd year (5%).

If this experiment is done during many many years, the probability that there is a red ball becomes 100%. The probability over the years is shown in the next slide.
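The bucket experiment can be reproduced in a few lines. The sketch assumes the same 5% chance of drawing the red ball in every year; the function name is illustrative.

```python
# Bucket experiment: 5 red balls out of 100, one draw per year, stop at
# the first red ball. The function name is an illustrative assumption.

def prob_red_by_year(p_red: float, years: int) -> float:
    """Chance that the red ball has been drawn within the given years.

    Equals 1 minus the chance of drawing black every single year.
    """
    return 1.0 - (1.0 - p_red) ** years


for years in (1, 2, 5, 10, 40, 200):
    print(f"after {years:3d} years: {prob_red_by_year(0.05, years):.4f}")
```

For the first few years the result stays close to 0.05 times the number of years, and it approaches 1 as the number of years grows.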

(23)

Probability of failure as function of time

[Figure: PFDt (0 to 0.9) against time (0 to 40 years).]

First few years PFDt is about linear: PFDt ~ λd * t

The first few years, the probability of failure on demand (PFD) rises almost linearly with time. This is shown as the purple line on the slide above.

(24)

IPF testing

• Imagine… After a while I check if I have a red ball. If I do, I put it back and start over again.
• In other words I check if the instrument failed dangerously and unrevealed. If it did, I will repair.
• The PFDt is now reset to zero after the test because:
  • I am sure it did not fail yet (PFD = zero)
  • I repair if failed (PFD is zero again after the repair)
• Suppose I test every 2 years…

Testing means verifying whether or not I have a red ball and, if I do, putting it back into the bucket.

(25)

PFD as function of time with testing

[Figure: PFDt (0 to 0.12) against time (0 to 25 years), reset to zero at each test, with the PFDavg level marked.]

Because a demand may occur any time we are interested in the average risk reduction, i.e. the PFDavg

The PFD over time is now reset to zero every 2 years.

Because for real IPFs, the demand may come at any time, we are interested in the

(26)

PFDavg of an instrument

• As can be seen PFDavg ~ ½ λdu T
• Where:
  • λdu is the random dangerous unrevealed failure rate
  • T is the test interval.
• This assumes perfect testing and no unavailability during test, no unavailability due to repairs etc.
• If test is not perfect there is a remnant PFD …

From the previous slide one can see that the PFDavg is about ½ λ T.

Another way of imagining the effect of testing is the following:

The instrument may fail at any moment. Some failures are noticed immediately because the plant trips, some are noticed because of diagnostics, some are not dangerous (e.g. upward drift of the instrument for a high trip will cause a trip too early), and some are dangerous and will not be noticed.

In IPF terminology the last kind of failures are called unrevealed dangerous.

An unrevealed dangerous fault may occur any time in between tests. On average it would occur half-way through the test interval, if it occurs at all.

So the fraction of time the instrument is failed dangerous and unrevealed is the frequency of failure times half the test interval.

The fraction of time the instrument failed dangerous and unrevealed is also the probability that I will find it failed when there is a demand because the demand may occur at any time.
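The approximation PFDavg ~ ½ λdu T can be compared with the exact time average for a constant failure rate. The sketch assumes perfect testing and an exponential failure model, PFD(t) = 1 - exp(-λdu t) within each test interval; the numerical values are examples only.

```python
import math

# Assumes a constant failure rate (exponential model) and perfect tests.

def pfd_avg_approx(lambda_du: float, test_interval: float) -> float:
    """Rule of thumb: half the failure rate times the test interval."""
    return 0.5 * lambda_du * test_interval


def pfd_avg_exact(lambda_du: float, test_interval: float) -> float:
    """Time average of 1 - exp(-lambda_du * t) over one test interval."""
    x = lambda_du * test_interval
    return 1.0 - (1.0 - math.exp(-x)) / x


# Example values: 5% per year, tested every 2 years.
approx = pfd_avg_approx(0.05, 2.0)
exact = pfd_avg_exact(0.05, 2.0)
print(f"approximation: {approx:.4f}, exact average: {exact:.4f}")
```

The two values stay close as long as λdu times T is small; for long test intervals the rule of thumb overestimates the average.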

(27)

PFD as function of time with imperfect testing

[Figure: PFDt (0 to 0.16) against time (0 to 25 years) with imperfect testing, showing the testable PFDt, the untestable PFDt, the overall PFDt and the PFDavg.]

If the test is not perfect (i.e. the probability that a dangerous fault, if it is there, will be found by the test is not 100%), there is a remaining probability that a dangerous unrevealed failure is left after the test.

Every time the test is carried out, there is an aspect of the instrument that is not looked at by the test. The probability that this part of the instrument develops a dangerous problem increases over time. This is the purple line.

The resulting overall PFDt rises over time and hence the PFDavg is higher than in the situation with perfect testing.

This implies that the ‘test coverage’ (how good is the test?) has an effect on the
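A common simplified way to account for imperfect testing is to split the failure rate into a part that the test covers, which is averaged over the test interval, and a part that it does not, which is averaged over the whole mission time. The sketch below uses that simplification; it is not taken from this slide pack, and the coverage, failure rate and mission time are assumed values.

```python
# Simplified imperfect-test model; an assumption, not the slide's formula.

def pfd_avg_imperfect(lambda_du: float, test_interval: float,
                      test_coverage: float, mission_time: float) -> float:
    """PFDavg with a test that finds only a fraction of dangerous faults.

    Covered part: reset at every test, so averaged over the test interval.
    Uncovered part: never reset, so averaged over the mission time.
    """
    testable = 0.5 * test_coverage * lambda_du * test_interval
    untestable = 0.5 * (1.0 - test_coverage) * lambda_du * mission_time
    return testable + untestable


# Assumed values: 4E-2 per year, test every 2 years, 90% test coverage,
# 20 years mission time.
perfect = pfd_avg_imperfect(0.04, 2.0, test_coverage=1.0, mission_time=20.0)
imperfect = pfd_avg_imperfect(0.04, 2.0, test_coverage=0.9, mission_time=20.0)
print(f"perfect test: {perfect:.3f}, 90% coverage: {imperfect:.3f}")
```

Even a small uncovered fraction can dominate the result when the mission time is much longer than the test interval.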

(28)

Factors that affect the PFDavg

• Dangerous failure rate
• Diagnostic coverage (turning dangerous failures into detected dangerous failures)
• Test interval
• Test coverage (how good is the test)
• Test duration (if the device is tested on line and not available during test)
• Overall failure rate (revealed + unrevealed) in combination with ..
• Repair time (if the device is repaired on line and not available during repair)

The above slide summarises the parameters that affect the PFDavg of an instrument.

The list is self explanatory.

Obviously diagnostics are very powerful because a dangerous failure that would otherwise be left unnoticed until the next test, will be detected and alarmed. Repairs are initiated immediately resulting in a much improved ‘fractional dead time’. The fraction of the time that the instrument has a dangerous failure is much reduced because we do not wait until a next test is carried out.

(29)

What if PFD is not achieved..

• Add unrevealed failure robustness
  • 1oo2 voting: PFDavg ~ ⅓ λdu² T²
  • 2oo3 voting: PFDavg ~ λdu² T²
• Add/improve diagnostics
  • Diagnostics reveal dangerous failures that would otherwise keep ‘lurking in the dark’ until tested: λdu = λd * (1 - DCF)
  • Diagnostic coverage factor (DCF): the higher the better
• Improve λdu
  • Buy instruments and hook-ups with low failure rates
  • Do PMs such that age related failures ‘do not hurt’ (where applicable).

If the instrument is used in redundant configurations, the overall PFD is different. Some simplified formulae are given above.

If instruments (like valves) are used in severe service such that they are exposed to accelerated wear and tear, the age related failures will occur much earlier and the instrument no longer behaves with random failures. These age related failure modes (e.g. valves getting stuck because of excessive fouling) should be ‘taken out of the equation’ by having PM tasks that prevent the failure mode from occurring (e.g. clean out or move the valve regularly to prevent it getting stuck).
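The simplified formulae for redundant configurations and the effect of diagnostics can be tried out with the sketch below. It ignores common cause failures, test duration and repair time, takes the diagnostic coverage factor as the fraction of dangerous failures that diagnostics reveal, and uses assumed example values throughout.

```python
# Simplified voting formulae; no common cause, test or repair effects.
# Function names and example values are assumptions.

def lambda_du(lambda_d: float, dcf: float) -> float:
    """Dangerous failures that diagnostics do not reveal."""
    return lambda_d * (1.0 - dcf)


def pfd_avg_1oo1(l_du: float, test_interval: float) -> float:
    return l_du * test_interval / 2


def pfd_avg_1oo2(l_du: float, test_interval: float) -> float:
    return (l_du * test_interval) ** 2 / 3


def pfd_avg_2oo3(l_du: float, test_interval: float) -> float:
    return (l_du * test_interval) ** 2


# Example: 4E-2 dangerous failures per year, tested every 2 years.
without_diagnostics = lambda_du(0.04, dcf=0.0)
with_diagnostics = lambda_du(0.04, dcf=0.5)

for label, rate in (("no diagnostics", without_diagnostics),
                    ("DCF = 0.5", with_diagnostics)):
    print(label,
          f"1oo1={pfd_avg_1oo1(rate, 2.0):.4f}",
          f"1oo2={pfd_avg_1oo2(rate, 2.0):.5f}",
          f"2oo3={pfd_avg_2oo3(rate, 2.0):.5f}")
```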

(30)

Learnings… (1)

• IPF testing effectively reduces the time a dangerous undetected failure remains ‘lurking in the dark’: reduces PFD, reduces risk.
• IPF testing is dictated by the risk reduction (= PFDavg) to be achieved. (PFDavg ~ ½ λdu T)
• Required risk reduction is dictated by the initial risk.

(31)

Learnings… (2)

• Unrevealed failure robustness dramatically improves PFDavg.
• Diagnostics dramatically reduce manual testing efforts.
• MVC is an effective way to diagnose transmitters.
• Reducing the test interval by a factor 2 reduces the PFDavg by a factor 2 and thus reduces the remaining risk by a factor 2.
• The initiator(s), logic solver and the final element(s) should all successfully work to avert the hazardous event. Hence:

$PFD_{IPF} = PFD_{initiator} + PFD_{logic\_solver} + PFD_{final\_element}$

(32)

Quiz…

• What is risk?
• What are the Shell risk criteria?
• What is safety?
• Do we need an IPF if the initial risk is ‘acceptable’?
• What happens to a risk if an IPF is installed as classified using the corporate risk graph?
• What happens to the risk if tests are postponed or waived?
• Why does testing reduce the PFDavg?
• How can I improve the PFDavg of an instrument without testing more?

What is risk? For the process industry (IEC 61511) it is defined as the product of the event frequency and severity of consequence. Unit is consequence per time (e.g. 0.1 casualty per year)

What are the Shell risk criteria? Discussion…..

What is safety? The absence of unacceptable risk (Class discussion…..not discussed in this slide pack!).

Do we need an IPF if the initial risk is ‘acceptable’? No.

What happens to a risk if an IPF is installed as classified using the corporate risk graph? It becomes broadly acceptable.

What happens to the risk if IPF tests are postponed or waived? The risk increases and will likely become ‘tolerable’. ALARP should be demonstrated (according to HEMP). It is not expected that the risk becomes intolerable because that would require the test interval to increase by more than a factor of 10 (inferred from the HEMP).

Why does testing reduce the PFDavg? Because it reduces the time an undetected dangerous failure may be present, i.e. it reduces the ‘fractional dead time’, the fraction of time the device is not available to carry out its safety mission.

How can I improve the PFDavg of an instrument without testing more? Add unrevealed failure robustness, improve diagnostics or improve the dangerous failure rate.

(33)

Shell Global Solutions

Jan A.M. Wiegerinck

Senior consultant instrumentation & plant automation.

E-mail: [email protected]

Tel: +31 70 3772083 Fax: +31 70 3771950