r2A Risk and Reliability 5th_Edition

Risk & Reliability Associates Pty Ltd, Consulting Engineers.

5th Edition Cover by Peter Anderson

5th Edition Co-ordination and review by Kris Francis. 5th Edition editing by Cherilyn Tillman and Bob Browning. Printed and Bound in Australia by Imscam Pty Ltd, Melbourne.

This text is copyright. Apart from any fair dealing for the purpose of private study, research, criticism or review or as otherwise permitted under the Copyright Act, no part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means electronic, optic, mechanical, photocopying, recording or otherwise without the prior written permission from the publisher, Risk and Reliability Associates Pty Ltd.

ISBN 0-9585241-3-0

RRP AUD $298.00 (including GST). Postage and handling extra.

Published by:

Risk & Reliability Associates Pty Ltd
ACN: 072 114473 ABN: 98 072114473
Consulting Engineers
Level 2 56 Hardware Lane
MELBOURNE AUSTRALIA 3000
e-mail: [email protected]
web: http://www.r2a.com.au
fax: +61 3 9670 5278
voice: +61 3 9602 4747

Also in Sydney and Wellington.

This text is intended to provide general information concerning the concepts and applications of risk and reliability theory. The text is used by R2A in its training courses on risk and reliability assessment. The examples and templates are provided as examples of the analytical tools used in assessing and managing risk. They should not be used as a substitute for obtaining professional advice or assistance. The authors accept no responsibility for any errors or omissions in the material, or for the results of any actions taken as a result of using these examples or templates.

R2A Document Control

Risk & Reliability – An Introductory Text

Edn. Date Section Issue/Nature of Revision Prepared: Reviewed:

1.0 04/96 First Edition RMR KJA
2.0 02/97 Second Edition RMR KJA
3.0 01/98 Third Edition RMR
3.1 07/99 Third Edition, Revised RMR
3.2 06/00 Third Edition, Second Revision GEF RMR
3.3 01/01 Third Edition, Third Revision LS RMR
4.0 02/02 Fourth Edition GEF, CJT, RWB RMR
5.0 02/03/04 Fifth Edition RMR, KJA, CJT, RWB CJT, RWB
15/03/04 Typos and layout RMR KNF
23/03/04 Chapter 16 & Index CJT, RWB RMR
04/04/04 Chapters 17 & 18. RWB MK, FS, RMR
19/04/04 Typos & Index RWB RMR

Contributors to earlier editions and revisions include: Teresa Alam, John Bellhouse, Keith Hart, Matthew Lambert, Simon Meiers, Paul Rees, PM Strickland.

TABLE OF CONTENTS

PART 1 – GENERAL PRINCIPLES

1. INTRODUCTION TO RISK AND RELIABILITY CONCEPTS
1.1 The Nature of Risk
1.2 Types of Risk
1.3 Risk Management Evolution
1.4 Historical Perspectives of Risk
1.5 Reliability
1.6 Quality

2. RISK PARADIGMS & MODELS
2.1 The Rule of Law
2.2 Insurance
2.3 Asset Management
2.4 Threats and Vulnerabilities
2.5 Risk as Variance
2.6 Best Practice
2.7 Simulation
2.8 Culture
2.9 Paradigm Integration
2.10 Risk Models

3. RISK AND GOVERNANCE
3.1 Risk Management’s Role in Good Governance
3.2 Corporate Governance Systems
3.3 Origins of the Good Governance Movement
3.4 The Rise of the Risk Society
3.5 Governance and Non-Financial Risk
3.6 Public Sector Governance and Risk
3.7 Risk and Corporate Citizenship
3.8 Fallout Severity
3.9 Basic Principles of Good Corporate Urban Governance

4. LIABILITY
4.1 Criminal vs Civil Standard
4.2 Common Law Criteria
4.3 On Juries and Justice
4.4 Due Diligence
4.5 Safety Cases
4.6 Adversarial Legal System Contradictions
4.7 Risk Auditing Systems

5. CAUSATION
5.1 Paradigms
5.2 Biological Metaphors
5.3 Discrete State Concepts
5.4 Time Sequence
5.5 Energy Damage
5.6 Energy Damage Models
5.7 Latent Conditions

6. RISK CRITERIA
6.1 Legal Criteria
6.2 Individual Risk Criteria
6.3 Societal Risk Criteria
6.4 Environmental Risk Criteria
6.5 Insurance Criteria

PART 2 – TECHNIQUES

7. TOP DOWN TECHNIQUES
7.1 SWOT Assessments
7.2 Upside and Downside Risk
7.3 Vulnerability Assessments
7.4 Enterprise Risk Profiling
7.5 Project Risk Profiling

8. RANKING TECHNIQUES
8.1 Risk Registers
8.2 Ranking Acute OH&S Hazards
8.3 Ranking Property Loss Prevention Hazards
8.4 Integrated Investment Ranking

9. MODELLING TECHNIQUES
9.1 Trees
9.2 Blocks
9.3 Integrated Presentation Models
9.4 Common Cause Failures
9.5 Human Error Rates
9.6 Equipment Fault Rates
9.7 System Safety Assurance

10. BOTTOM UP TECHNIQUES
10.2 RCM
10.3 HazOps
10.4 Common Mode Failures
10.5 Risk Management and the Project Life Cycle
10.6 QRA
10.7 HACCP

11. GENERATIVE TECHNIQUES
11.1 James Reason et al
11.2 Transparent Independent Rapid Risk Reporting
11.3 Generative Interview Technique
11.4 Generative Solutions Technique

12. RISK & RELIABILITY MATHEMATICS
12.1 Discrete Event Mathematics
12.2 Breakdown Failure Mathematics
12.3 State Theory Mathematics

PART 3 – THEMES, APPLICATIONS AND CASE STUDIES

13. PROCESS INDUSTRY MODELLING
13.1 Safety Cases
13.2 Context (Top Down)
13.3 Quantitative Risk Assessment (QRA)
13.4 Fire Modelling
13.5 Pool Fires
13.6 Jet Flames
13.7 Explosions
13.8 Toxic Gas Clouds
13.9 Fire Safety Studies
13.10 Risk Criteria Used in Australia and New Zealand

14. CRISIS MANAGEMENT
14.1 Intention
14.2 Lessons in Fallout Management
14.3 Design Stage
14.4 Case Studies
14.5 Conclusion

15. INDUSTRY BASED CASE STUDIES
15.1 Airspace Risk Assessment
15.2 Train Operations Rail Model
15.3 Fire Risk Management (in buildings)
15.4 Transmission Line Risk Management
15.5 Bushfire Risk Management
15.6 Tunnel Risk Management

16. OCCUPATIONAL HEALTH & SAFETY
16.1 Legislative Framework
16.2 OH & S Risk Assessment
16.3 Performance Indicators
16.4 Information Structures
16.5 Audit & Safety Management Systems

17. FINANCIAL RISK
17.1 Risk and Opportunity
17.2 Terms
17.3 Utility and Risk
17.4 Models
17.5 Market Risk Mathematics

18. SECURITY
18.1 Security and Risk Management
18.2 Security Terms
18.3 Basic Elements of Security Management
18.4 The Terrorist Threat

Preface to the 5th Edition

This is the 5th Edition of Risk and Reliability - An Introductory Text. Risk and Reliability Associates Pty Ltd published the first edition of this Text in April 1996.

Presently the Text has three parts. Parts 1-2 are based on the very successful 2-day risk management short courses presented by R2A director Richard Robinson for EEA (Engineering Education Australia). Part 3 summarises published R2A practice experience.

R2A’s intention is to extend the Text to four parts so as to include material based on the System Safety Assurance Course presented by R2A Director Kevin Anderson for EEA. This course presently uses the 4th Edition as background reading, but work on the 6th edition is scheduled for later in 2004.

The evolving nature of risk and risk management in the contemporary globalising environment, sometimes described as the Risk Society, necessitates frequent revision and additions. The recent spate of high profile, local and overseas corporate failures, for example, has created unprecedented interest in corporate governance. The evident vulnerabilities flowing from large-scale technology require scrutiny in respect of both accidental and deliberate actions. And liability is increasingly ubiquitous.

An integration of top down and bottom up risk management concepts and techniques as explained in Parts 1-2 becomes necessary to cope with the widening range and severity of modern risk. Part 3 comprises technical explanations of the practical applications of these concepts and techniques.

The addition of Part 4 to the planned 6th Edition will address risks resulting from the rise of computer systems, and how, in the context of human frailty, such risks can be managed.

R W Browning

Hardware Lane, Melbourne March 2004

A Short Dictionary of Risk & Reliability Terms and Acronyms

The dictionary below defines the usage of key terms in the R2A Text.

Given the multi-disciplinary nature of risk management, different specialist groups often attribute different meanings to commonly used terms and different terms are often used for similar or near identical concepts.

Items underlined are referenced as a separate entry in the R2A dictionary. For simplicity, acronyms have been included rather than giving them a separate listing.

The list is adapted from an earlier list presented in a paper by R M Robinson and D B L Viner (1983).

Accountability The property that ensures that the actions of an entity can be traced.

ALARA As Low as Reasonably Achievable.

ALARP As Low As Reasonably Practicable.

Algorithm An explicit and finite step-by-step procedure for solving a problem or achieving a required end.

Asset In engineering and commerce, usually a capital cost item. In security, insurance and loss control, usually refers to an item that if (accidentally) lost would cause a loss.

Audit An inspection or checking of methods of doing business.

Audit Trail Data collected and potentially used to facilitate an audit.

Availability The ratio of the total system or entity ‘up time’ to system or entity elapsed time, the latter being the sum of the total ‘up time’ and ‘down time’. It is therefore a function of reliability and repair time.

Business Interruption In insurance terms, the loss of profits over a defined period, typically a year; otherwise any production or sales stoppage.

Common Law The unwritten law derived from the traditional law of England as developed by judicial precedence, interpretation, expansion and modification (Butterworths (1998). Concise Australian Legal Dictionary. Butterworth, Australia).

Common Mode Failure Common Mode Failures refer to the simultaneous failure of multiple components or systems due to a single, normally external, cause such as an earthquake or fire. The term is used to distinguish these from discrete failures of individual components or systems due to a defect arising locally within that component or system.

In commercial terms it refers to threats whose occurrence would simultaneously affect multiple inputs to any equation, for example, the advent of a third world war, change in interest rates, raw material sources and the like.

Controls The most common term used in safety and in this context means to hold in check or to restrain. It encompasses a large range of measures taken to reduce the likelihood and consequences of adverse outcomes. Controls can encompass both protection and precautions. For example, personal protective equipment is generally protection. The usual hierarchy of controls is:

Elimination, that is, removal of the hazard or risk

Engineering controls, that is, those that design out the hazard or reduce it

Substitution of a less hazardous substance or equipment or process

Administrative controls such as job rotation to reduce exposure time to the hazard

Personal protective equipment, for example, dust masks, hearing protectors, gloves etc

Critical Control Point (CCP) A point, step or procedure at which control can be applied and a food safety hazard can be prevented, eliminated or reduced to acceptable levels.

Damage Control Procedures designed to minimise the severity of loss.

Diversity The same performance of a function by two or more independent and dissimilar means (of particular reference to software) (Smith D J (1993) Reliability, Maintainability and Risk. Practical Methods for Engineers. 4th Edition. Butterworth Heinemann, Oxford).

Due Diligence A minimum standard of behaviour involving a system which provides against contravention of relevant regulatory provisions and adequate supervision ensuring that the system is properly carried out (Butterworth (1998). Concise Australian Legal Dictionary. Butterworth, Australia).

A statutory defence to a charge of causing or permitting environmental harm or pollution (Butterworth (1998). Concise Australian Legal Dictionary. Butterworth, Australia).

Engineering Those activities devoted to changing the material world to a desired state (Robinson Richard M (1981). An Outline of the Philosophy of Engineering and its Consequences, General Engineering Transactions, Engineers Australia, Vol. GE5, No.1, July 1981 pp.35-41).

ERA Environmental Risk Assessment.

ERRF External Risk Reduction Facility.

Engineers Australia The trading name of The Institution of Engineers, Australia

Environmental Hazard An event or continuing process, which if realised, will lead to circumstances having a potential to degrade, directly or indirectly, the quality of the environment in the short or long term. (Wright N H (1993). Development of Environmental Risk Assessment (ERA) in Norway. Norske Shell Exploration and Production).

Environmental Risk A measure of potential threats to the environment, which combines the probability that the events will cause, or lead to degradation of the environment and the severity of that degradation (Wright N H (1993)).

Event Tree Analysis A hazard identification and frequency analysis technique, which employs inductive reasoning to translate different initiating events into possible outcomes. (AS 3931:1998 Risk Analysis of Technological Systems – Applications Guide). These are displayed graphically.

Failure (risk) A cessation of function that has consequences (usually meaning death, injury or damage) beyond a component or entity merely becoming unavailable to perform its function. It can also be referred to as a ‘hazardous’ failure (Smith D J (1993) Reliability, Maintainability and Risk. Practical methods for Engineers. 4th Edition. Butterworth Heinemann, Oxford).

Failure (reliability) See Fault.

Fault The inability of an entity to perform its required function, resulting in unavailability. Non-performance to some defined performance criterion (Smith D J (1993) Reliability, Maintainability and Risk. Practical methods for Engineers. 4th Edition. Butterworth Heinemann, Oxford). It can also be referred to as a breakdown failure.

FDT Fractional Dead Time (a form of unavailability). The fraction of any time period that a defence or control system is ‘dead’ (cannot operate correctly). It is therefore a function of audit frequency and the time to revive/restore the control system.

FMEA Fault Modes and Effects Analysis. (AS 3931:1998 Risk Analysis of Technological Systems – Applications Guide).

FMECA Fault Modes, Effects and Criticality Analysis. (AS 3931:1998 Risk Analysis of Technological Systems – Applications Guide).

Frequency The rate at which something occurs per unit time.

FTA Fault Tree Analysis. A hazard identification and frequency analysis technique, which starts with the undesired event and determines all the ways in which it could occur. These are displayed graphically. (AS 3931:1998 Risk Analysis of Technological Systems – Applications Guide).

Group Risk See Societal Risk

HACCP Hazard Analysis and Critical Control Point. An approach to identifying, evaluating and controlling safety hazards in food processes.

Hazard A source of potentially damaging energy that can give rise to a loss. The term is used extensively by engineers and physical scientists. To be compared to a vulnerability. A source of potential harm or a situation with a potential to cause loss. (AS 3931:1998 Risk Analysis of Technological Systems – Applications Guide and AS 4360:1999 Risk Management).

A situation that could occur during the lifetime of a product system or plant that has the potential for damage to the environment.

Hazard Identification Process of recognising that a hazard exists and defining its characteristics. (AS 3931:1998 Risk Analysis of Technological Systems – Applications Guide).

HazOp HAZard and OPerability study. A formal analysis of a process or plant by the application of guidewords.

Heuristic Proceeding to a solution in the absence of an algorithm, by incremental exploration using conceptual devices such as ideal types, models and working hypotheses which are intended to provide solutions rather than explain facts.

HPR Highly Protected Risk. US engineering term used to describe a level of loss control excellence.

HRA Human Reliability Assessment.

IChemE The Institution of Chemical Engineers (UK).

IPENZ The Institution of Professional Engineers, New Zealand

Incident An event or situation, which occurs in a particular place during a particular interval of time which should provide an alert to the risk management system. This can be a failure of a control system or a near miss.

Individual Risk The frequency at which an individual may be expected to sustain a given level of harm from the realisation of specified hazards (Institution of Chemical Engineers (1985). Nomenclature for Hazard and Risk Assessment in the Process Industries. IChemE, Rugby, Warwickshire).

Insurance A method of transferring risk by financial means.

Integrity A property of an object or data that has not been modified and is fit for the purpose for which it is to be used.

IRR Internal Rate of Return.

JSA Job Safety Analysis.

Latent Condition A failure which is not detected and/or annunciated when it occurs. (SAE ARP 4781:1998 Guidelines and Methods for Conducting the Safety Assessment Process on Civil Airborne Systems and Equipment).

Liability A person’s present or prospective legal responsibility, duty, or obligation (Butterworth (1998) Concise Australian Legal Dictionary. Butterworth, Australia).

Life Cycle Costing Life cycle costing provides a method for determining the total cost of a system over its entire life cycle and is used to establish the cost effectiveness of alternative asset solutions. Cost effectiveness is defined as the ratio of systems effectiveness to life cycle cost (Blanchard (1991) Systems Engineering Management. Prentice Hall; Blanchard and Fabrycky (1990). Systems Engineering and Analysis. 2nd Edition, Prentice Hall International; Aslaksen and Belcher (1992). Systems Engineering. Prentice Hall).

Likelihood A term to describe the probability or frequency of an occurrence.

Loss The embarrassment, harm, financial loss, legal or other damage which could occur due to a loss event. Any negative consequence, financial or otherwise.

Maintainability The set of technical processes that apply maintainability theory to establish system maintainability requirements, allocate these requirements down to system elements, and predict and verify system maintainability performance (Blanchard and Fabrycky (1990). Systems Engineering and Analysis. 2nd Edition, Prentice Hall International).

MDT Mean Down Time.

Mitigation The act of reducing the severity of the potential adverse outcome. In the context of the types of controls listed above mitigation of risk could be achieved by any bar the first, that is, elimination.

Monitor To check, supervise, observe critically, or record the progress of an activity, action or system on a regular basis in order to identify change. (AS 4360:1999 Risk Management)

Monte-Carlo Simulation A frequency analysis technique, which uses a model of the system to evaluate variations in input conditions and assumptions. (AS 3931:1998 Risk Analysis of Technological Systems – Applications Guide)

MORT Management Oversight and Risk Tree.

MTBF Mean Time Between Failure.

MTTF Mean Time To Failure.

MTTR Mean Time To Repair.

Occurrence A sequence of events leading to damage or injury.

P&ID Process (or Piping) and Instrumentation Diagram.

Paradigm A universally recognised knowledge system that for a time provides model problems and solutions to a community of practitioners (Kuhn T S (1970). The Structure of Scientific Revolutions. 2nd Edition, enlarged, sixth impression. University of Chicago Press).

Pathogen In the risk context Reason (1993) has defined pathogens as analogous to latent failure in technical systems, similar to resident pathogens in the human body. (Managing the Management Risk: New Approaches to Organisational Safety, Chapter 1 of Reliability and Safety in Hazardous Work Systems: Approaches to Analysis and Design.).

Perceived Risk That risk thought by an individual or group to be present in a given situation (Institution of Chemical Engineers (1985). Nomenclature for Hazard and Risk Assessment in the Process Industries. IChemE, Rugby, Warwickshire).

Precautions Measures taken beforehand to ward off possible adverse events. In the context of risk management precautions are the result of prudent foresight, that is, due diligence. In the context of a cause-consequence model, precautions act before the loss of control point.

Probability The likelihood of an event occurring. A number in a scale from 0 to 1 that expresses the likelihood that one event will succeed another (Institution of Chemical Engineers (1985). Nomenclature for Hazard and Risk Assessment in the Process Industries. IChemE, Rugby, Warwickshire).

Protection Protection has many meanings. However, in the context of risk management it is the state of being protected, or something that protects, or preservation from injury or harm. In the context of a cause-consequence model, protection usually acts after the loss of control point, as does much fire protection equipment.

QRA Quantified Risk Assessment. The estimation of a given risk by logical and analytical modelling techniques, or using statistical information from historical data from circumstances similar to existing or planned operations.

Quality Conformance to a set of requirements that, if met, results in an organisation, service or product that is fit for its intended purpose. Totality of characteristics of an entity that bear on its ability to satisfy stated and implied needs (AS/NZS 9000.1:1994 Model for Quality Assurance in Design, Development, Production, Installation and Servicing).

RAROC Risk Adjusted Return On Capital.

RBD Reliability Block diagram. A frequency analysis technique that creates a model of the system and its redundancies to evaluate the overall system reliability. (AS 3931:1998 Risk Analysis of Technological Systems – Applications Guide)

RCM Reliability Centred Maintenance.

Recovery Restoration of a system to its desired state following a fault or failure.

Reliability The probability that a device will satisfactorily perform a specified function, under given operating conditions, for a specified period of time (Smith David J (1993). Reliability, Maintainability and Risk. Practical Methods for Engineers. Fourth Edition. Butterworth Heinemann, Oxford.).

Reliability Engineering The set of technical processes that apply reliability theory to establish system reliability requirements, allocate these requirements down to system elements, predict and verify system reliability performance and establish reliability growth programs (US MIL-HDBK-338-1A).

Residual Risk The remaining level of (pure) risk after risk treatment measures have been taken. (AS 4360:1999 Risk Management)

Resource/s The human, physical and financial assets of an organisation.

Risk The chance of something happening that will have an adverse impact upon objectives. It is measured in terms of consequences and likelihood. (AS 4360:1999 Risk Management)

Risk (Pure) The potential realisation of the unwanted consequences of an event from which there is no prospect of gain.

Risk (Speculative) Generally, risk deliberately undertaken for a perceived benefit.

Risk Analysis A systematic use of available information to determine how often specified events might occur and the magnitude of their consequences. (AS

Risk Engineering The application of engineering techniques to the risk management process.

Risk Evaluation The process used to determine risk management priorities by comparing the level of risk against predetermined standards, target risk levels or other criteria. (AS 4360:1999 Risk Management).

Risk Financing The methods applied to fund risk treatment and the financial consequences of risk. Note: in some industries risk financing relates to the funding of the financial consequences of risk. (AS 4360:1999 Risk Management)

Risk Identification The observation and identification of new risk parameters (Rowe W D (1977). An Anatomy of Risk. Wiley Interscience, New York). The process of determining what can happen, why and how. (AS 4360:1999 Risk Management)

Risk Management The process of planning, organising, directing and controlling the resources and activities of an organisation in order to minimise the adverse effects of accidental losses to that organisation at least possible cost (Head E L (1978). The Risk Management Process. The Risk & Insurance Management Society Incorporated New York. Page 8)

Safe An acceptably low or tolerable level of risk. The opposite of dangerous.

SafetyMAP Safety Management Achievement Program. Term coined by the Victorian WorkCover Authority.

Security The combination of availability, confidentiality and integrity.

Sensitivity Analysis Examines how the results of a calculation or model vary as individual assumptions are changed. (AS 4360:1999 Risk Management).

Severity The measure of the absolute consequences of a loss, hazard or vulnerability, ignoring likelihood. In insurance terms the absolute magnitude of the dollars associated with a single (potential) loss event.

Societal Risk The relationship between frequency and the number of people suffering from a specified level of harm in a given population from the realisation of specified hazards (Institution of Chemical Engineers (1985). Nomenclature for Hazard and Risk Assessment in the Process Industries. IChemE, Rugby, Warwickshire). Sometimes referred to as Group Risk.

Stakeholders Those people and organisations who may affect, be affected by, or perceive themselves to be affected by, a decision or activity. (AS 4360:1999 Risk Management).

Statute Law Law created by legislation, that is, made by Parliament (Butterworth (1998) Concise Australian Legal Dictionary. Butterworth, Australia)

System Safety A set of technical processes that apply risk management theory to establish system safety requirements, allocate these requirements down to the system elements, predict and verify system safety performance, and direct actions to prevent and/or reduce unacceptable levels of identified safety hazards (Blanchard B (1991). Systems Engineering Management. Wiley Interscience)

SRS Safety Related System.

Tolerable Risk Risk that is not regarded as negligible or something that can be ignored, but must be kept under review and reduced further still (Health and Safety Executive (1988). The Tolerability of Risk From Nuclear Power Stations. HMSO, London).

VAR Value At Risk. A concept similar to that of Loss Expectancy.

Vulnerability A weakness with regard to a threat. To be compared to a hazard.

Vulnerability Analysis A method of 'completeness' checking for a defined scenario.
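As an illustration of the Availability and FDT entries in this dictionary, the following Python sketch computes both ratios directly from their definitions. It is a minimal sketch only: the function names and the hour figures are hypothetical and are not taken from the Text.

```python
def availability(up_time: float, down_time: float) -> float:
    """Ratio of up time to elapsed time, where elapsed = up time + down time."""
    elapsed = up_time + down_time
    if elapsed <= 0:
        raise ValueError("elapsed time must be positive")
    return up_time / elapsed


def fractional_dead_time(dead_time: float, period: float) -> float:
    """Fraction of a period during which a control system cannot operate."""
    if period <= 0:
        raise ValueError("period must be positive")
    return dead_time / period


# Hypothetical figures: 8,700 hours up and 60 hours down in one year.
a = availability(8700.0, 60.0)
# Hypothetical figures: a protective system found 'dead' for 44 of 8,760 hours.
fdt = fractional_dead_time(44.0, 8760.0)
print(f"availability = {a:.4f}")
print(f"fractional dead time = {fdt:.4f}")
```

Both quantities are dimensionless fractions of elapsed time, so a shorter repair or restoration time improves each of them even when the underlying fault rate is unchanged.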

1. Introduction to Risk and Reliability Concepts

1.1 The Nature of Risk

Risk means different things to different people at different times. However, one element that is common to all concepts of risk is the notion of uncertainty. If we knew what would happen next, there would be no risk.

If immortal and omnipotent beings existed, the concept of risk would be incomprehensible to them. But in the world of finite beings, all face uncertain, possibly precarious futures. Risk, and what to do about it, are vital human concerns.

Decision-making processes, whether of statutory regulators, court judges, business managers or ordinary individuals, reflect human concern to improve safety and security, and the reliability and efficacy of their endeavours in the face of ever present uncertainty.

1.2 Types of Risk

Risk is generally divided into two broad types: Pure Risk and Speculative or Business Risk.

If the likely consequences of a risk are considered to be always bad, offering no prospect of gain, it is designated pure risk. The possible events or situations that pure risk poses are treated as hazards or vulnerabilities.

If the possible consequences of a risk are considered potentially desirable, that risk is designated as speculative or business risk, and is treated as an opportunity.

Consequently, risk is assessed according both to its estimated likelihood or probability (how often it is likely to occur) and the value of its estimated consequences (how desirable or undesirable its impact may be).
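One common convention, which this section does not itself prescribe, is to combine the two estimates as a product of likelihood and consequence value. The Python sketch below illustrates that convention for one pure risk and one speculative risk. The names RiskItem and expected_value, and all of the figures, are assumptions made purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class RiskItem:
    name: str
    likelihood_per_year: float  # estimated frequency of occurrence per year
    consequence: float          # estimated value: negative = loss, positive = gain


def expected_value(item: RiskItem) -> float:
    # Assumed combination rule: frequency multiplied by consequence value.
    return item.likelihood_per_year * item.consequence


# Hypothetical entries, one of each type of risk described above.
items = [
    RiskItem("warehouse fire (pure risk)", 0.01, -2_000_000.0),
    RiskItem("new product launch (speculative risk)", 0.5, 300_000.0),
]

for item in items:
    print(f"{item.name}: {expected_value(item):,.0f} per year")
```

A single product hides the difference between a frequent minor loss and a rare severe one, which is why likelihood and consequence are usually also reported separately.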

1.3 Risk Management Evolution

USER | OBJECTIVES | LIMITATIONS
Insurance Broker | Maximise new clients; Maximise profits | Affordable services only; Conflict of objectives
Insurance Company | Maximise underwriting profits | Conflict of objectives; Narrow approach
Safety Manager | Maximise safety budget; Minimise loss | Loss reduction may not be cost effective
Risk Manager | Maximise corporate profits | Lacks knowledge of specialised disciplines; Not line management
Line Manager | Meet production objectives; Maximise profits | May not understand contribution of risk management to results
Investment Manager | Maximise investment returns; Minimise Risk | Risk and profit do not directly accrue to adviser
Auditors | Confirm reality matches reports | Historical analysis; the past may not reflect the future
Legal Advisors/Lawyers | Manage (potential) conflicts; Win court cases | Disputes = prosperity; Sign off is difficult
Board Members | Maximise corporate profits; Minimise personal liability | Lacks knowledge of specialised disciplines

Users of the term "Risk Management" (Adapted from Blombery, 1982)

Several large international insurance brokers introduced both the concept and the term "risk management" into Australia in the 1970s. The move derived largely from a marketing strategy to gain new clients. Subsequently, others outside the insurance industry took up the term, using it to serve various purposes.

Because the term risk management is now used in many different ways by different groups of professionals, confusion often arises as to what precisely is being referred to. Blombery (1982) suggests that the best way to avoid misinterpreting intentions is to examine what the main professional users of the term customarily imply when they refer to risk management, as shown in the table above.

NB: Recently the financial investment industry also adopted the concept, developing a new lexicon in the process. An example is VAR (Value At Risk), a variation on the more traditional term Loss Expectancy, which historically has been used by the insurance industry (Taylor, 1996).

1.4 Historical Perspectives of Risk

What we think about risk and how we address it depends on the way we perceive that risk and what, at different times, we believe to be its cause. For example:

1.4.1 The Plague

When a society believes that the reason many are dying from the plague is because God is punishing people for their sins, it will manage the risk differently from a society that believes in viruses and bacteria. The following illustrates some early attempts to control the plague (Nohl, 1926):

SPEYER 1347
A strict prohibition against gambling in churchyards.

COUNCIL OF TOURNAI
All concubines to be expelled or married; Sundays to be strictly observed; manufacture, sale and use of dice completely suppressed. (Dice factories turned to making rosary beads).

ROUEN (France) 1507
'No gambling, cursing, drinking or excesses'.

1.4.2 United Kingdom - Public Health Reforms in the 1840s

A particularly interesting risk management issue arose with the control of epidemics in the UK in the 1830s and 40s (Winslow 1967). Note that at this time viruses and bacteria were not known. The then theory of contagion related to miasmas or clouds of noxious, odious gases.

Chadwick's Report on the Sanitary Conditions of the Working Classes (1842) recognised that disease struck where there was work and urban congestion. By providing clean water, sanitation and reasonable housing, the problem would be contained, if not solved. In part, his concept was a flow on from the Crimean war and Florence Nightingale, that “cleanliness is indeed next to Godliness”.

To quote from Chadwick's report:

...That the expense of public drainage, of supplies of water laid on in houses, and of means of improved cleansing would be a pecuniary gain, by diminishing the existing charges attended on sickness and premature mortality.


Chadwick’s arguments to justify his risk management recommendations appealed to humanitarian-public interest benefits as well as cost savings over time. This did not achieve the immediate acceptance and success one might expect in today’s more democratic society with greater capacity for public scrutiny, accountability, and liability. There were many with vested interests that could not see, or did not agree, that the very expensive fresh water and sewerage treatment was necessary or even effective. Today, passive smoking may be considered in this same context.

1.4.3 The 1840 North American Factory Mutual System

In the early 1800s, cotton mills were a notorious source of fire and burned down regularly. A major part of the problem was the need to extract the cotton seeds from the cotton bolls, which generated a significant amount of friction in a highly combustible medium.

Zachariah Allen, a factory owner in the 1840s, decided to build a superior mill. He fire-isolated the cotton gins, provided massive construction, and taught his people how to respond to a fire appropriately, using hoses and sand buckets. He then went to his existing underwriter and asked for a discount. The underwriter responded, “No, the good pay for the bad”.

He then approached other owners who had built superior facilities and suggested that they pool the premiums they were paying to existing underwriters. As they should have fewer losses, they could then pay back a profit after a few years. This was a great success and was the forerunner of the Factory Mutual System and the "Highly Protected Risk" (HPR) concept.

Such an engineering-underwriter viewpoint contrasts dramatically with a wholly financial view of insurance. With the Factory Mutual concept, only those plants that meet certain minimum design and management system requirements can join the premium pool. The loss rate will therefore remain static over time with minimal influence from market forces. With a purely financial approach a burning building can be insured if sufficient premium is paid.

1.4.4 Tripartite Risk Control Philosophies

For Health and Safety policy particularly, Australia adopted the philosophies of the United Kingdom, following from the work of the Robens Committee (Creighton, 1996). The general concept is that there are three key parties to the risk control process: those who own the industry, those who work there, and the government. Each party is of equal status. This particularly applies to the development of codes of practice and regulations.

While the tripartite concept has driven traditional approaches to OH&S risk control processes, the emerging legal environment puts increasing emphasis on a fourth party. Attention is swinging to stakeholders. Stakeholders range from consumers of products such as food or pharmaceuticals to the public and communities disaffected by industrial pollution or corporate governance failures.

1.4.5 Bipartite Philosophies

An alternative is what might be called the bipartite approach apparently adopted by Germany, arising from industry based insurance efforts started by Bismarck in the 1880s.

A bipartite guild (Berufsgenossenschaft) is established for appropriate industries. The government’s role is confined to ensuring that the process occurs; specifically that the industry guild exists, that it functions to determine what the acceptable levels of risk are for that industry, and that the consequences of this target are appropriately funded by industry based insurance.


1.5 Reliability

Reliability is a risk-related concept, and a specific area of professional activity. The main concern of reliability-focussed professionals is to ensure that systems or system components work the first time they are required, and every time thereafter.

The military has always had a very specific interest in this in both organisational and technological terms. The beginnings of the 20th century arms race in Europe can be traced to the involvement of industrial technology in the production of HMS Warrior in 1861.

World War I provided the impetus for the development of aircraft and armoured vehicles and the beginning of increasingly capable military equipment. World War II brought the development of electronics and a dramatic increase in the complexity of increasingly accurate and destructive weapons. Such systems often consumed enormous resources yet failed to deliver effective service to the customers. As might be expected, the sophisticated valve based electronic systems used in the emerging fighter jet industry proved very unreliable in the 1950s.

1.5.1 Failure Modes

Until the mid 1970s, reliability-focussed professionals saw system components as exhibiting a standard failure profile consisting of three separate characteristics:

An infant mortality period due to quality of product failures.

A useful life period with only random stress related failures.

A wear out period due to increasingly rapid conditional deterioration resulting from use or environmental degradation.

These are shown in the figure below.

The consequence of such beliefs was that equipment was taken out of service and maintained at particular intervals, regardless of whether it was exhibiting signs of wear or not.

Bathtub Failure Curve (figure: Failure Rate plotted against Time, showing the Infant Mortality, Useful Life and Wear Out periods)

However, actuarial studies of aircraft equipment failure data conducted in the early 1970s identified a more complex relationship between age and the probability of failure, as shown below. This understanding evolved in the private airline industry, primarily through the activities of the Maintenance Steering Group of the International Air Transport Association. The final report of the Maintenance Steering Group in 1980, titled MSG-3, provided the backbone of the logic processes contained in the referenced texts and RCM analysis (Moubray, 1992).


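Failure Rate Curves (figure: six patterns of failure rate against age, with the percentage of items following each)

Wear-in to Random to Wear Out: 4%
Random then Wear Out: 2%
Steadily Increasing: 5%
Increasing during Wear-in and then Random: 7%
Random over measurable life: 14%
Wear-in then Random: 68%

The last three patterns together account for 89%.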

Specifically, it was discovered that the bathtub curve is one of the least common failure patterns and that periodic maintenance could increase the likelihood of failure. This led to the idea that the maintenance regime ought to be based on the reliability of the components and the required level of availability of the system as a whole.
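As a rough arithmetic check on the Failure Rate Curves figure, the following Python sketch groups the six patterns by whether the failure rate is still rising at the end of life. The pairing of each label with a percentage follows the order in which they appear in the figure, which is an assumption, and the names PATTERNS and share are illustrative only.

```python
# Percentage of items following each failure pattern.
# Label and percentage are paired in the order shown in the figure
# (an assumption). The third field marks patterns whose failure rate
# is still rising at the end of life.
PATTERNS = [
    ("Wear-in to Random to Wear Out (bathtub)", 4, True),
    ("Random then Wear Out", 2, True),
    ("Steadily Increasing", 5, True),
    ("Increasing during Wear-in and then Random", 7, False),
    ("Random over measurable life", 14, False),
    ("Wear-in then Random", 68, False),
]


def share(ends_rising: bool) -> int:
    """Sum the percentages of the patterns in one group."""
    return sum(pct for _, pct, flag in PATTERNS if flag is ends_rising)


rising = share(True)       # 4 + 2 + 5
not_rising = share(False)  # 7 + 14 + 68

print(f"Failure rate rising at end of life: {rising}%")
print(f"No rise at end of life:             {not_rising}%")
```

On these figures only about one item in nine would be expected to benefit from maintenance scheduled purely by age, which is consistent with the move towards reliability based maintenance regimes.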


1.6 Quality

Davis (2001) reviews a large number of contributors to the quality movement. Although there are differences in approach, there appear to be six common principles, namely: management commitment, measurement to determine current position and goals, quality teamwork in the workforce, system based tools, prevention is better than inspection, and customer focus.

1.6.1 W Edwards Deming (US circa 1948)

Defines quality as a predictable degree of uniformity and dependability at low cost and suited to the market. The objective of his approach is to reduce variability by continuous improvement, the "PDCA Cycle" (Plan, Do, Check, Act). Management is responsible for 94% of quality problems.

1.6.2 Joseph M Juran (US)

Defines quality as fitness for use. He has a 10-step process to quality improvement. Like Deming, Juran believes that senior management are largely responsible for quality, with less than 20% of quality issues being due to workers. However, quality improvements are not free.

1.6.3 Philip B Crosby (US)

Believes that quality is conformance to requirements. He introduced the concept of “zero defects” within the framework of his “four quality absolutes”. The cost of quality is the cost incurred due to non-conformance and therefore quality is free.

1.6.4 William E Conway (US)

Has similar beliefs to Deming and indicates that quality increases productivity and lowers costs. He has a 6-tool process for quality improvement and advocates the use of simple statistical methods to identify problems and point to solutions.

1.6.5 Kaoru Ishikawa (Japan circa 1949)

Focussed on seven basic tools for quality improvement, quality circles and company wide quality control (CWQC) from top to bottom. Cause and effect diagrams are used extensively (see section 5.4).

1.6.6 Shigeru Mizuno (Japan)

Promoted 7 tools for quality management: relations diagram, KJ or affinity diagram, systemic/tree diagram, matrix diagram, matrix data-analysis, process decision program chart, and arrow plan.

1.6.7 Masaaki Imai (Japan)

Kaizen process to develop logical systemic thinking. Has an expanded form of the PDCA cycle.

1.6.8 Genichi Taguchi (Japan)

Restates the Japanese view of investing first and not last. That is, design should be superior.

1.6.9 Shigeo Shingo (Japan)

Promoted just in time manufacturing and defects = 0 (Poka-Yoke).

1.6.10 Armand V Feigenbaum (US)

Holds that total quality management (TQM) is the way to completely manage an organisation.

1.6.11 Tom Peters (US)

He has a focus on leadership and customer satisfaction rather than management. He includes tools like management by walking about (MBWA).

1.6.12 Claus Møller (Denmark)


REFERENCES

Blombery R I (1982). Risk Management Origins, Objectives and Directions. Proceedings of the Victorian Industrial Safety Convention, Vol. 1, 1982, pp.39-48.

Chadwick E L (1842). Report on the Sanitary Condition of the Labouring Population of Great Britain. Presented to Both Houses of Parliament, London.

Creighton W B (1996). Understanding Occupational Health and Safety in Victoria. 2nd edition, Federation Press.

Davis, Dr Elwyn C (2001). The quality gurus: What have we learnt from them? Reprinted in Engineering World. December 2001 / January 2002. pp15-19.

Moubray, John (1992). RCM II Reliability Centred Maintenance. Butterworth Heinemann.

Nohl J (1926). The Black Death, a Chronicle of Plague. George Allen & Unwin Ltd, London.

Taylor R T and W A MacDonald (1996). The Future of Market Risk Management. Article in Financial Derivatives & Risk Management. Issue 6, June 1996. IFR Publishing

Winslow C E A (1967). The Conquest of Epidemic Disease. The Hafner Publishing Company, New York, New York. The particularly relevant chapter is Chapter XII, the Great Sanitary Awakening.

READING

Beck Ulrich (1986). Risk Society: Towards a New Modernity. Translated © Sage Publications, London. Reprinted 1998.

Head E L (1978). The Risk Management Process. The Risk & Insurance Management Society Incorporated New York. Page 8

McCabe FM (1978). Risk Management and the Australian Safety Practitioner. Marsh & McLennan Pty Ltd, Melbourne, Australia.

Robinson R M, D B L Viner and M A Muspratt (1985). National and Public Risk: Risk Control Strategy – Some Fundamentals. Paper presented at the ANZAAS Festival of Science, Monash University.


2.0 Risk Paradigms and Models

Efforts to demonstrate how risk should best be managed have given rise to a number of risk management paradigms.

A paradigm is a universally recognised knowledge system that for a time provides model problems and solutions to a community of practitioners (after Kuhn, 1970). New paradigms based on more comprehensive or convincing theories may supersede older ones or exist conjointly with them. The following describes a number of the most common paradigms including some of the advantages and disadvantages of each:

The paradigms are:

i) The rule of law.

ii) Traditional risk management historically typified by the Lloyds Insurance and the Factory Mutual Highly Protected Risk (HPR) approaches.

iii) Asset based risk management, typified by engineering based Failure Modes, Effects and Criticality Analysis (FMECA), Hazard and Operability (HazOp) and Quantified Risk Assessment (QRA) 'bottom-up' approaches.

iv) Threat-based risk management typified by Strengths, Weaknesses, Opportunities and Threats (SWOT) and vulnerability type 'top-down' analyses.

v) The comparatively recent market based risk management, which uses the notion of the risk being equal to variance with an equivalent risk of gain as well as risk of loss.

vi) Solution-based ‘best practice’ risk management rather than hazard based risk management.

vii) The development of biological, systemic mutual feedback loop paradigms, practically manifested in hyper-reality computer based simulations.

viii) The development of risk culture concepts including quality type approaches.

Many proprietary risk management systems integrate several of these approaches.

2.1 The Rule of Law

When everything else fails, the ultimate appeal is generally to the rule of law. In a very real sense, all the other paradigms represent methods of satisfying legal outcomes in the event of an adverse outcome. As a consequence, asking lawyers which paradigm is applicable to ensure ‘due diligence’ generates a response that all paradigms, once they are explained, are necessary. The diagram below shows a pathogen based cause-consequence diagram in a legal context, with LOC indicating loss of control.

The power of the legal approach is that it is time-tested and proven. If the judiciary is independent of political and commercial interests of the day, then an independent and potentially fair resolution of otherwise potentially catastrophic social dislocation can occur. Perhaps this is why it works: both the political and judicial systems must simultaneously fail before social breakdown occurs.


Pathogen Cause-Consequence Model in Legal Context (figure labels: Cradle, Whole of Life, Grave; Pathogens; Immune System; LOC; Hit; Miss; Event Horizon; the questions WHAT, WRONG, WHY NOT and WHAT IF; the tests CAUSATION, FORESEEABILITY, PREVENTABILITY and REASONABLENESS)

In the common law tests of negligence the four key words are Causation, Foreseeability, Preventability and Reasonableness. This Rule of Law underpins the ALARP principle that risks shall be demonstrated to be “As Low As Reasonably Practicable”. It also provides a focus for other risk management principles including “not less safe”, “continuous improvement” and “best practice”.

(i) Define WHAT we are talking about: CAUSATION

(ii) Identify what could go WRONG: FORESEEABILITY

(iii) Control WHY it will not happen: PREVENTABILITY

(iv) Assess balance of Precautions to the Consequences IF it did: REASONABLENESS

Common Law is covered in more depth in Chapter 4.

2.2 Insurance Based Risk Management

The Lloyds Insurance and the Factory Mutual Highly Protected Risk (HPR) approaches historically typify this. Both consider empirical history to be the source of wisdom. By looking at past incidents and losses and comparing these to existing plants and facilities, judgements can be made about risk. The difference is that one approach, Lloyds', has a financial focus, whereas the Factory Mutual focus is on a target level of engineered and management excellence.

The power of the process is the very tangible nature of history and in a sense the results represent the ultimate Darwinian ‘what if’ analysis. Its weakness is that in the modern rapidly changing world empirical history has become an increasingly less certain method of predicting the future.

2.3 Asset Based Risk Management

Asset based risk management is typified by engineering based FMECA, HazOp and QRA 'bottom-up' approaches.

Any bottom up method has problems with common cause or common mode failures. A detailed assessment from individual components or sub-systems such as HazOp or FMECA examines how that component or sub-system can fail under normal operating conditions. It does not examine how a catastrophic failure elsewhere might affect this component or the others around it. One attempts to address such ‘knock on’ effects in HazOps by a series of general questions after the detailed review is completed, but it nevertheless remains difficult to use a HazOp to determine credible worst-case scenarios. FMECA and QRA have the same problems.


The power of bottom up techniques lies in the detailed intense scrutiny of complex systems and the provision of closely coupled solutions to identified problems. Any proposed risk control solutions are focussed and specific. They can be easily considered for cost/benefit results. The resulting risk registers are powerful decision making tools.

2.4 Threats and Vulnerabilities

Threat based risk management is typified by SWOT and vulnerability type 'top-down' analyses. These methods mostly identify areas of general strategic concern rather than solutions to particular problems. A very simple example of a Threat and Vulnerability analysis is shown in the table below. Again this focuses on areas of concern rather than precise solutions.

Sample Vulnerability Matrix

                                    Critical Success Factors
Threats                             Reputation   Operability   Staff
Technical                           xx           xx            xx
Community                           -            -             xx
Political (change of government)    x            x             x
Financial                           xxx          xxx           xxx
Natural Events                      x            xxx           x

Scores

xxx   Critical potential vulnerability that must be addressed.
xx    Moderate potential vulnerability.
x     Minor potential vulnerability.
-     No noticeable vulnerability.

The intersections of a threat with a "critical success factor" or "asset" are termed vulnerabilities. The SWOT analysis interpreted from a risk perspective provides insight into vulnerabilities or the risk of loss and value addeds, or the risk of gain. This is shown in the figure below.

(Figure labels: External / Internal Factors; Strategy; Organisation; Opportunities; Strengths; Value Addeds; Threats; Weaknesses; Vulnerabilities)
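The sample vulnerability matrix above can also be held as a small data structure so that the intersections of threats and critical success factors can be filtered by score. The Python sketch below is illustrative only: the numeric values assigned to the marks (0 to 3) and the names SCORES, MATRIX and vulnerabilities are assumptions, not part of the original table.

```python
# Numeric value for each mark in the matrix (assumed scale).
SCORES = {"-": 0, "x": 1, "xx": 2, "xxx": 3}

FACTORS = ("Reputation", "Operability", "Staff")

# Rows copied from the Sample Vulnerability Matrix.
MATRIX = {
    "Technical": ("xx", "xx", "xx"),
    "Community": ("-", "-", "xx"),
    "Political (change of government)": ("x", "x", "x"),
    "Financial": ("xxx", "xxx", "xxx"),
    "Natural Events": ("x", "xxx", "x"),
}


def vulnerabilities(min_score):
    """Return (threat, factor) pairs scored at or above min_score."""
    return [
        (threat, factor)
        for threat, row in MATRIX.items()
        for factor, mark in zip(FACTORS, row)
        if SCORES[mark] >= min_score
    ]


critical = vulnerabilities(3)
for threat, factor in critical:
    print(f"Critical: {threat} threat to {factor}")
```

Filtering in this way still only points to areas of concern; it does not by itself suggest a solution to any of them.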


2.5 Risk as Variance

The comparatively recent market based risk management stems from the notion of risk being equal to variance with an equivalent risk of gain as well as risk of loss (see figure below). In finance, risk is normally assumed to be symmetric. This is not absolutely true, but by making such an assumption many of the tools of statistics become available, most notably the normal distribution, which is symmetric about its mean value. This is the principal strength of the approach.

Standard Distribution showing the Mean and Variance (figure: distribution of Rate of Return, with the standard deviation deemed to equal risk, and Pure Risk and Speculative Risk marked)
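The idea of risk as variance can be illustrated with a few lines of Python. This is a minimal sketch under stated assumptions: the rates of return are invented for illustration, the population standard deviation is used as the measure of risk, and the function name risk_as_variance is hypothetical.

```python
from statistics import mean, pstdev


def risk_as_variance(returns):
    """Return the mean, the standard deviation (deemed to equal risk),
    and the deviations below and above the mean."""
    mu = mean(returns)
    sigma = pstdev(returns)
    below = [r - mu for r in returns if r < mu]  # risk of loss
    above = [r - mu for r in returns if r > mu]  # risk of gain
    return mu, sigma, below, above


# Hypothetical annual rates of return.
returns = [0.02, 0.08, 0.05, 0.11, -0.01]
mu, sigma, below, above = risk_as_variance(returns)

print(f"Mean rate of return: {mu:.3f}")
print(f"Standard deviation:  {sigma:.3f}")
print(f"Sum of deviations below the mean: {sum(below):.3f}")
print(f"Sum of deviations above the mean: {sum(above):.3f}")
```

The deviations below and above the mean always cancel, which is the sense in which this view treats the risk of gain as equivalent to the risk of loss.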

However, from a systems engineering perspective at least, this should really be known as the "boom/bust" model since, if everyone uses the same model, mutual feedback loops are inevitable. If pure risk only is assumed, then self-dampening effects are likely, which is the position adopted by most engineers and technologists.

Business risk is usually considered to be the sum of both pure risk and speculative risk.

2.6 Best Practice

So far all paradigms considered have been hazard based, that is, looking for problems and then solutions. In health & safety, a hazard is defined as a source of potentially damaging energy, which can give rise to a loss. In more general terms a hazard is a source of potential harm or a situation with a potential to cause loss. In this sense it is analogous to vulnerability, that is, the potential impact of a threat upon an asset.

Most risk systems, such as the Australian/New Zealand Risk Management Standard AS/NZS 4360:1999, suggest a process of hazard identification, risk assessment, control option development and then implementation. An alternative to this is solution based 'best practice' risk management. The best practice risk management approach simply looks at all the good ideas other people in an industry use and asks whether there is any reason why such ideas ought not to be applied at your own site. In the figure below this means starting on the right rather than at the top or the left.


Best Practice Approaches (TLS = Target Level of Safety) (figure labels: Hazard Assessment of Credible Hazards, Vulnerabilities or Pathogens, with Assess Consequences and Estimate Likelihoods; Risk; Judgements against Statute, TLS, ALARP, Common Law Due Diligence etc; Control Options, with Mitigate Consequences and Decrease Likelihood; Actions and Residual Risk Allocation)

The best practice approach is particularly powerful in a common law ‘due diligence’ sense. The hazard assessment approach implies that statutes may be satisfied, target levels of safety met or 'As Low As Reasonably Practicable' (ALARP) arguments fulfilled. But if there were a simple solution to a trivial problem implemented at a competitor's facility then common law negligence could arise if something went wrong at the facility in question. A best practice process is one of the few approaches that target this difficulty.

In a sense, this is confirming the view that liability arises when there are unimplemented good ideas rather than the existence of hazards or vulnerabilities in themselves.

2.7 Simulation

Biological/Computer Simulation Paradigms are derived from the application of evolutionary concepts developed in virtual reality. The most practical manifestation of biological paradigms is in computer simulations. This amounts to modelling a complex system in a virtual reality environment and playing endless “what if” scenarios.

For example, oil rigs and process plants are generally modelled in 3D before construction so that designers and operators can ‘walk around them’ and in many ways ‘try them out’. If every component (or at least all those containing or controlling major energy sources) is identified and has its risk and reliability properties assigned to it then the designer can play ‘god’. Continuing this example, suppose every vessel in the plant ‘knows’ what over temperature or overpressure it can withstand before rupture, and after having ruptured under such conditions can ‘project’ and ‘communicate’ its thermal and pressure energies to adjacent vessels, which then respond accordingly.
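The vessel example can be sketched as a toy simulation in Python. Everything numeric here is hypothetical: the vessels sit on a single line, the overpressure each one projects is assumed to fall off with the square of distance (a placeholder, not a blast model), and the names Vessel, overpressure_kpa, simulate and plant are illustrative only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Vessel:
    name: str
    position_m: float       # position along a single line, metres
    withstand_kpa: float    # overpressure tolerated before rupture
    blast_kpa_at_1m: float  # overpressure projected at 1 m on rupture


def overpressure_kpa(source, target):
    """Assumed inverse-square decay with distance (placeholder only)."""
    distance = max(abs(source.position_m - target.position_m), 1.0)
    return source.blast_kpa_at_1m / distance ** 2


def simulate(vessels, first_name):
    """Return vessel names in the order in which they rupture."""
    by_name = {v.name: v for v in vessels}
    ruptured = [first_name]
    pending = [first_name]
    while pending:
        source = by_name[pending.pop(0)]
        for target in vessels:
            if target.name in ruptured:
                continue
            if overpressure_kpa(source, target) > target.withstand_kpa:
                ruptured.append(target.name)
                pending.append(target.name)
    return ruptured


def plant(spacing_m):
    """Three identical hypothetical vessels at equal spacing."""
    return [Vessel(n, i * spacing_m, 30.0, 1000.0) for i, n in enumerate("ABC")]


close = simulate(plant(5.0), "A")    # vessels 5 m apart
spread = simulate(plant(10.0), "A")  # vessels 10 m apart

print("5 m spacing: ", close)
print("10 m spacing:", spread)
```

With these invented numbers the rupture of vessel A spreads to B and then C at 5 m spacing but stops at A when the spacing is doubled, showing how the outcome depends on separation distance.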

If the designer then told one to explode, a chain reaction may result. This would depend on separation distances, the force of the explosion and very many other factors. But by resetting the computer


2.8 Culture

James Reason (1997) develops a cultural paradigm model in several ways (he is a psychologist by training). He notes three types of risk culture:

Pathological Culture
Don't want to know
Messengers are 'shot'
Responsibility is shirked
Failure is punished
New ideas actively discouraged

Bureaucratic Culture
May not find out
Messengers are listened to if they arrive
Responsibility is compartmentalised
Failures lead to local repairs
New ideas often present new problems

Generative Culture
Actively seek it
Messengers are trained and rewarded
Responsibility is shared
Failures lead to far reaching reforms
New ideas are welcomed

Three Risk Cultures after Reason (1997)

To some extent, those dealing with technological risks have generally suffered a decline in influence as business risks and associated risk management techniques have come to the fore over the past ten years. However, culture has now been identified as central to effective risk management suggesting a new focus has been emerging in the last five years as shown in the figure below. Reason's Pathogen model is discussed in Chapter 5.

Movement from Technological to Business to Risk Culture (figure: Hazards and Technological Risks, then Vulnerabilities and Business Risks, then Pathogens and Risk Culture)


2.8.1 Safety Culture

An interesting application of the cultural risk paradigm arises when considering safety in Australian industry. A major study endeavouring to determine why Australia has a good commercial aviation safety record documented aspects of Australian culture that affect safety performance (Braithwaite et al, 1997). The graph below reflects the answers that staff gave to a request from their manager to help paint his house. Australians have the highest likelihood (up to 95%) of any of the interviewed nations of saying, “No”.

(Figure: percentage of “No” responses, on a scale of 0 to 100, for Australia, Netherlands, UK, West Germany, USA, Italy, Japan, Canada, Poland, Pakistan, Mexico, Hong Kong, Malaysia, Egypt, Singapore, Indonesia, Nepal and China)

“No” Responses to the question "Would you help paint your manager's house?"

Australians tend to be individualistic and to have a low “power-distance”. That is, actions or instructions from others have a comparatively limited effect on the way in which they act. They perceive a relatively flat power gradient between manager and subordinate.

For example, on aircraft flight decks junior crew members feel able to speak up without loss of face to the senior crew or other repercussions, if they think an error has occurred. This facilitates initiation of effective additional checks.

In industries with different management styles, difficulties can arise. If a person being directed does not believe that the directive is either practical or safe, then that person will tend to assess the situation and do it his/her own way. The person may do so without declaring his/her intention or discussing the intended change to procedures with management.


2.9 Paradigm Integration

The figure below describes an understanding of how the different paradigms presented in this section fit within a large organisation.

An Integrated Risk Paradigm Framework (figure: organisational levels from Board and CEO (Policy) through Strategic and Tactical to Operations & Maintenance, marked Top Down, set against Pre-event, Event Horizon and Post-event. Top left box: Vulnerability Analyses, SWOT Analyses etc, Audits, Underwriting Assessments, Availability Assessments. Bottom left box: QRA, HazOps, FMECA, RCM, Job Safety Analysis, Cause Consequence Modelling etc. Right hand side: Crisis and Fallout Management; Losses, Incidents and Breakdowns; Fire Fighting, First Aid, Legal Actions, Insurance Payments. Other labels: a grid marked A to E and 1 to 5; AS4360; IEC (AS) 61508; Courts)

The top left hand box shows those paradigms that would be expected to apply strategically at the higher levels of an organisation, whilst those in the bottom left hand box could generally be applied at the operational level. On the right hand side are the tactical issues that are faced post-event. The objective of risk management is to stay on the left hand side of the event horizon but a complete risk management framework must provide for the post-event scenarios.

There are a number of risk techniques available but only three generic methods by which organisations can proceed with strategic tasks to address the concept of risk. These are:

i) Expert knowledge provided from experts, literature and research

ii) Facilitated workshops of experts and interested parties

iii) Interviews with selected players.

Each of these methods has different strengths and weaknesses depending on the culture of the organisation and the nature of a particular task.


The best methodologies to use in the implementation of each of the paradigms are illustrated in the following table:

Risk Management Paradigm | Expert reviews | Facilitated workshops | Selective interviews

1. The rule of law | Yes (Legal opinions) | Yes (Arbitration, moot courts) | Yes (Royal Commissions)
2. Insurance approaches | Yes (Risk surveys, actuarial studies) | Yes (Risk profiling sessions) | Yes (especially moral risk)
3. Asset based, 'bottom-up' approaches | Yes (QRA, availability & reliability audits) | Yes (HazOps, FMECAs etc) | Difficult
4. Threat based 'top-down' approaches | Difficult in isolation | Yes (SWOT & vulnerability) | Yes (Interviews)
5. Business (upside AND downside) approaches | Yes (Actuarial studies) | Difficult in isolation | Yes (Fact finding tours)
6. Solution based ‘best practice’ approaches | Difficult to be comprehensive | Difficult to be comprehensive | Yes (Fact finding tours)
7. Simulation | Yes (Computer simulations) | Yes (Crisis simulations) | Difficult
8. Risk culture concepts | Yes (Quality audits) | Difficult | Yes (Interviews)

Risk Management Paradigm - Technique Matrix

The concept of a Safety Case, which is logically prior to and supports the Business Case for an enterprise, is one interesting development. Those techniques and paradigms highlighted in the table could be used in developing a safety case.


2.10 Models

2.10.1 Risk and Reliability Diagrams

A particularly useful way of examining (pure) risk and reliability in an organisational sense is via a risk diagram. A risk diagram is fundamentally a plot of the likelihood of events occurring against the severity of the outcomes. This can be done in different ways depending on the industry or organisation that is being examined. The frequency (events per year, events per kilometre, events per passenger mile, or events per any other denominator) is plotted against consequence severity in down time, dollars, lives lost, working days lost, or days lost to the community.

Organisation Risk Diagram (figure: Relative Likelihood plotted against Relative Severity of Consequence. Labels on the curve: Service, Maintenance Breakdowns, OH&S, Personal Injury, Safety, Fire & Explosion, Catastrophic. Other labels: Staff Complaints; Public Criticism; Protest Pickets; Industrial Stoppage; Product Boycott; High technology and high hazard system failures, Class Actions, Market Collapse; Reliability Engineering, FMECA and RCM, Defence Industry driven; Risk Engineering, HazOp and FTA, Aerospace & Nuclear Industry driven)

In organisational terms the risk diagram describes the relationship between the different technical and commercial areas of activity and the relationship between risk and reliability. Plotted on normal axes, the curve typically takes the form of a hyperbola as shown. If the plot is likelihood against severity in dollars, then the area under the graph represents the size of the economic loss. Typically, the greatest area is at the maintenance end, then the OH&S or personal injury area, then the fire and explosion zone and lastly the catastrophic event region.

The Maintenance region, being the largest, therefore provides the greatest returns for good management and is the target of such programs as Reliability Centred Maintenance (RCM). However, the other regions, which deal with damage, injury and death, also have a legal dimension. One view is to suggest that failure to optimise the maintenance region can send an organisation broke, but failure to deal with the legal dimension can send directors to gaol. Certainly, both are important.
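The relationship between likelihood, severity and economic loss can be illustrated with a short Python sketch. The frequencies and dollar severities below are invented for illustration and are not taken from the risk diagram; the names REGIONS and expected_annual_loss are likewise hypothetical. Each region's contribution is approximated as likelihood multiplied by severity.

```python
# Hypothetical regions of a risk diagram:
# name -> (events per year, dollars per event).
REGIONS = {
    "Maintenance breakdowns": (200, 2_000),
    "OH&S / personal injury": (5, 50_000),
    "Fire and explosion": (0.05, 2_000_000),
    "Catastrophic": (0.001, 50_000_000),
}


def expected_annual_loss(events_per_year, dollars_per_event):
    """Approximate economic loss as likelihood times severity."""
    return events_per_year * dollars_per_event


losses = {
    name: expected_annual_loss(freq, severity)
    for name, (freq, severity) in REGIONS.items()
}

# Rank the regions from largest to smallest expected annual loss.
ranked = sorted(losses, key=losses.get, reverse=True)

for name in ranked:
    print(f"{name:<26} ${losses[name]:>12,.0f} per year")
print(f"{'Total':<26} ${sum(losses.values()):>12,.0f} per year")
```

With these assumed figures the maintenance region dominates the total, in line with the ordering described above, even though each individual breakdown is minor.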


2.10.2 Asset Management and the Costs of Ownership

Asset management is more than ownership, accountability and demand management after the assets are in place; it is a whole of life approach to management. Asset management is about all those actions, from the first stirrings of a need to the final recycling of the disposed asset, which ensure that an asset achieves the business objectives of:

i) Being safe for operators, users and the public.

ii) Not adversely impacting on the environment during its use, maintenance or disposal.

iii) Providing the service for which it was procured.

iv) Achieving the above at minimum cost of ownership over its life.

The cost of ownership includes at least:

a) the initial capital cost, plus

b) the whole of life cost of operation and maintenance, plus

c) the whole of life cost of risk (the cost of prevention plus the cost of loss).
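The three components of the cost of ownership listed above can be added up in a few lines of Python. This is a minimal sketch under stated assumptions: annual costs are taken as constant over the life of the asset and are not discounted, all dollar figures are invented, and the function and parameter names are hypothetical.

```python
def cost_of_ownership(capital, annual_operation_maintenance,
                      annual_prevention, annual_expected_loss, life_years):
    """Whole of life cost of ownership, undiscounted (an assumption)."""
    operation = annual_operation_maintenance * life_years
    # Cost of risk = cost of prevention plus cost of loss.
    risk = (annual_prevention + annual_expected_loss) * life_years
    return {
        "capital": capital,
        "operation_and_maintenance": operation,
        "risk": risk,
        "total": capital + operation + risk,
    }


# Hypothetical asset with a 20 year life.
example = cost_of_ownership(
    capital=1_000_000,
    annual_operation_maintenance=50_000,
    annual_prevention=20_000,
    annual_expected_loss=10_000,
    life_years=20,
)

for item, dollars in example.items():
    print(f"{item:<26} ${dollars:>10,}")
```

Keeping the cost of risk as its own line makes expenditure on prevention visible as part of the cost of ownership rather than leaving it buried in general operating costs.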

In some cases the largest component of the cost of ownership will be the whole of life cost of risk. For public authorities especially, it is very common to have very large expenditures on risk control measures that are not identified specifically as part of the cost of ownership of the operating assets.

For example, signalling on railways is a risk control measure to prevent trains from colliding. If all trains ran exactly on time and the timetable was perfect then there would be no red signals ever occurring in a train network. This indeed was historically the case. Signalling systems were introduced because the train system eventually became sufficiently complicated that perfect achievement of the timetable was no longer possible. This meant that collisions would inevitably occur unless some interposing system was installed. The cost of the signalling system should be included as part of the cost of ownership for the railways but identified as part of the preventive aspects of the cost of risk.

This concept is reflected in market risk terms, especially in banks, as RAROC (Risk Adjusted Return On Capital).

2.10.3 Risk Management Process Model

The Risk Management Process Model is one of the most commonly used risk management models and dates from the mid-1970s.

Identification: historical data (past experience), surveys, workforce, scientific literature

Quantification: likelihood of occurrence and severity of consequence

Evaluation: balance of advantages / disadvantages of running the risk with the advantages / disadvantages of controlling it

Control: risk retention, risk reduction, insurance, risk transfer

A Risk Management Process Model


The model below is an overview from the Australian/New Zealand Standard, Risk Management, AS/NZS 4360:1999. This follows the process model.

[Figure: Establish the context, Identify risks, Analyse risks, Evaluate risks, Treat risks; with Assess risks, Monitor & Review, and Communicate & Consult]

Risk Management Overview

The main elements are in the form of an iterative process:

a) Establish the Context - This step establishes the strategic, organisational and risk management context in which the rest of the process will take place. Risk assessment criteria and structure to be used should also be defined.

b) Identify Risks - Identify what, why and how hazards arise.

c) Analyse Risks - Determine existing controls and establish the likelihood of the events and the severity of the consequence.

d) Evaluate Risks - Compare projected risk levels against criteria to determine acceptability or otherwise of each hazard and set risk priorities.

e) Treat Risks - Accept and monitor low-priority hazards. For all other hazards develop and implement a specific management plan, which includes consideration of funding.

f) Monitor and Review - Monitor and review the performance of the risk management system and changes which might affect it.

g) Communicate and Consult - Communicate and consult with both internal and external stakeholders at each stage of the risk management process and concerning the process as a whole.
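Steps b) to e) can be sketched as a small risk register. This is a minimal illustration under stated assumptions: the 1 to 5 scoring scales, the use of likelihood multiplied by consequence as the risk level, the acceptance threshold and the example hazards are all hypothetical and are not taken from the Standard.

```python
from dataclasses import dataclass


@dataclass
class Hazard:
    name: str
    likelihood: int    # assumed scale: 1 (rare) to 5 (almost certain)
    consequence: int   # assumed scale: 1 (minor) to 5 (catastrophic)

    def risk_level(self) -> int:
        # c) Analyse: combine likelihood and consequence (assumed product rule)
        return self.likelihood * self.consequence


ACCEPT_THRESHOLD = 4  # hypothetical criterion, set when establishing the context


def evaluate(hazards):
    """d) Evaluate: rank by risk level and compare each against the criterion."""
    ranked = sorted(hazards, key=lambda h: h.risk_level(), reverse=True)
    return [(h, h.risk_level() <= ACCEPT_THRESHOLD) for h in ranked]


def treat(evaluated):
    """e) Treat: accept and monitor low-priority hazards, plan for the rest."""
    return {h.name: ("accept and monitor" if acceptable else "management plan")
            for h, acceptable in evaluated}


register = [  # b) Identify: hypothetical hazards
    Hazard("signal passed at danger", likelihood=2, consequence=5),
    Hazard("platform slip", likelihood=4, consequence=1),
    Hazard("track buckle", likelihood=3, consequence=4),
]
actions = treat(evaluate(register))
for name, action in actions.items():
    print(f"{name}: {action}")
```

Because the process is iterative, a register of this kind would be re-scored whenever monitoring and review show that the context, the controls or the hazards have changed.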