Reliability Analysis of Phased Missions - Methods for the efficient measurement of phased missi

Due to the need to meet mission goals and avoid the consequences of failure, the ability of these real-world systems to carry out their missions with high reliability, where reliability is defined as the probability that the mission is completed successfully, is often critical. Creating systems that attain high reliability is nonetheless difficult due to the sophistication of modern engineering systems and the complexity of the phased missions that they undertake. Designers, manufacturers and operators therefore need to understand how to achieve high reliability in new systems, and require assurance about, and insight into improving, the reliability of those that already exist. One solution is to over-engineer such systems, perhaps by incorporating high levels of redundancy or extremely robust components, in the hope of gaining high reliability. However, this approach can be detrimental to other aspects of the design, for example by increasing cost, and is still not guaranteed to result in the desired reliability. A better approach is to apply methods from reliability engineering to create a model of the system from the known logical connections between the failed status of the system and its components, together with the probabilistic models for failure and repair of the latter, so that the reliability of the system design can be accurately predicted and optimised in a rigorous manner. Some of the fundamental definitions, theories and methods from reliability engineering are presented in chapter 2 of this thesis, many of which form the foundation for the methods presented in later chapters.

1.2.1 Reliability Measurement

The mission reliability and the phase failure probabilities of a system operating in a phased mission are important metrics. However, the introduction of statistical dependencies over time between components in different phases complicates the analysis significantly compared to the non-phased case. In the 35 years since Esary and Ziehms [3] first addressed this problem there have been two main approaches to the analysis of phased mission reliability:

 Combinatorial methods, such as the fault tree and Binary Decision Diagram techniques.

 State space methods, such as the Markov and Petri net techniques.

The combinatorial methods are generally far more efficient computationally but are unable to deal with the statistical dependencies introduced by repair and are therefore used in the analysis of non-repairable systems. State space methods are able to deal with these dependencies but suffer from what is known as the state-space explosion problem, meaning that the analysis of large systems or those that operate over many phases becomes intractable.
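The effect of the phase dependencies can be illustrated with a minimal Python sketch for a hypothetical two-component, non-repairable system. The failure rates, phase durations and function names below are illustrative assumptions, not values or methods taken from this thesis. Phase 1 is assumed to need A or B, phase 2 is assumed to need both A and B, and both components are assumed to age throughout the mission. Multiplying the phase reliabilities as if the phases were independent then overestimates the mission reliability, because phase 2 can only succeed if both components have already survived phase 1.

```python
import math


def survival(rate, t):
    """Probability that a non-repairable component with a constant
    failure rate is still working at time t (exponential model)."""
    return math.exp(-rate * t)


def naive_mission_reliability(rate_a, rate_b, t1, t2):
    """Treat the phases as independent, with components 'as new' at the
    start of each phase. This ignores the dependency between phases."""
    # Phase 1: parallel structure, fails only if both A and B fail.
    r_phase1 = 1 - (1 - survival(rate_a, t1)) * (1 - survival(rate_b, t1))
    # Phase 2: series structure, needs both A and B.
    r_phase2 = survival(rate_a, t2) * survival(rate_b, t2)
    return r_phase1 * r_phase2


def exact_mission_reliability(rate_a, rate_b, t1, t2):
    """Phase 2 succeeds only if A and B both survive to the end of the
    mission. Without repair that also guarantees success in phase 1."""
    mission_time = t1 + t2
    return survival(rate_a, mission_time) * survival(rate_b, mission_time)


# Illustrative (assumed) failure rates per hour and phase durations in hours.
RATE_A, RATE_B = 1e-3, 2e-3
T1, T2 = 100.0, 50.0

naive = naive_mission_reliability(RATE_A, RATE_B, T1, T2)
exact = exact_mission_reliability(RATE_A, RATE_B, T1, T2)
print(f"naive product of phase reliabilities: {naive:.4f}")
print(f"exact mission reliability:            {exact:.4f}")
```

The closed-form shortcut in `exact_mission_reliability` only works because this example is so small; for realistic systems the dependency has to be handled by one of the two families of methods listed above.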

Due to the complexity of phased mission reliability analysis and the large number of components contained within modern systems, any real analysis will need to be performed by computer. Despite the advancements made, the sizes of the phased mission problems that can be analysed on a modern commodity computer are still very limited, particularly for repairable systems. In addition, certain use cases require repeated reliability analyses to be performed in less time and with less memory, such as unmanned aerial vehicles (UAVs) optimising the phase configuration of their mission based on real-time reliability predictions. There is therefore a compelling need for the development of phased mission reliability measurement methods that have improved computational efficiency.
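A rough sense of the scale involved is given by the Python sketch below. It assumes that every component is either working or failed, so that a Markov model of n components has up to 2^n states, and that the transition-rate matrix is stored densely with 8-byte entries. Both assumptions are simplifications made for illustration (sparse storage needs far less memory), and the function names are hypothetical.

```python
def markov_state_count(n_components: int) -> int:
    """Upper bound on the number of system states when each component
    is modelled as either working or failed."""
    return 2 ** n_components


def dense_matrix_bytes(n_components: int, bytes_per_entry: int = 8) -> int:
    """Memory needed for a dense transition-rate matrix that holds one
    entry for every ordered pair of states (an assumed storage scheme)."""
    states = markov_state_count(n_components)
    return states * states * bytes_per_entry


# Print how quickly the model grows with the number of components.
for n in (5, 10, 20, 30):
    states = markov_state_count(n)
    gib = dense_matrix_bytes(n) / 2**30
    print(f"{n:>2} components: {states:>13,} states, dense matrix {gib:.3e} GiB")
```

Each additional component doubles the state count and quadruples the dense matrix size, and a phased mission may need a separate model of this kind for every phase.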

Chapter 3 of this thesis discusses the existing methods found in the literature for the analysis of the phased mission reliability of non-repairable systems. Many of these utilise some form of Boolean algebra to deal with the phase dependencies between component failures, and the methods presented include both the early fault tree based approaches and the more recent, and much improved, analysis methods that use the Binary Decision Diagram (BDD) technique. Chapter 4 highlights some of the problems and areas for improvement that were discovered during the study of these methods. These include the inaccuracies found in the method for analysing systems that contain multiple failure mode components, and the lack of a method optimised for the repeated reliability analysis of a system design that occurs when performing a reliability improvement exercise or real-time analysis. A newly developed method that addresses both of these problems is then presented and shown to be an improvement. Moving to repairable phased mission systems, chapter 5 presents the methods found in the literature for their reliability analysis, which include methods utilising the Markov, BDD and Petri net techniques. Each of these methods has its own weaknesses. For example, the Markov methods are unable to analyse the reliability of very large or complex phased mission systems due to the state space explosion problem, and the BDD technique is limited to the case where repaired components are only integrated back into the system at the end of the phase in which they are repaired. Five new methods for the analysis of repairable phased mission systems are presented in chapter 6, some of which are extensions to the existing methods and others of which are entirely new. These include an extension of an existing Markov method that allows it to analyse systems containing components with multiple failure modes, a new BDD method for identifying which system states represent system failure, and two new methods that are more efficient in the analysis of systems that contain both repairable and non-repairable components, a common case in real-world systems.

1.2.2 Reliability Improvement

There are two ways in which the reliability of a phased mission system can be improved. The first is to alter its phase reliability structure, for example by adding redundancy. The second is to alter the reliability of the components used, perhaps by using more robust alternatives or increasing preventive maintenance. In either case, assuming a limited budget, a choice must be made as to which part of the system and which components should be targeted. This is a demanding task if the system is complicated, contains many components or the phased mission is large. Importance measures, which give an index of the influence of a component or group of components on the system reliability, or of its contribution to it, can be used to help with this type of decision and also to predict the magnitude of improvement that can be obtained. For non-phased mission systems there are six commonly used importance measures available, namely the Birnbaum, Criticality, Fussell-Vesely, Risk Reduction Worth, Risk Achievement Worth and Differential importance measures. Each has a unique interpretation of component importance, and each is appropriate for use with certain reliability improvement goals. For example, the Birnbaum importance measure shows those components with the highest structural importance, and therefore those for which a fixed reliability improvement would lead to the largest increase in system reliability, whilst the Risk Reduction Worth importance measure highlights those that provide the scope for achieving the greatest improvement in system reliability. Importance measures can provide even more precise data for system reliability improvement for systems operating in phased missions. For example, they can show how the reliability of the system can be improved specifically in the phase of the mission with the highest consequence of failure. However, whilst the development of importance measures for non-phased mission systems has reached an advanced state, significantly less research into phased mission importance measures has been carried out. The further development of those suitable for phased mission systems is therefore required, since importance measures have an important role in facilitating reliability optimisation.
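As an illustration of how such measures can rank components differently, the following Python sketch evaluates commonly quoted textbook forms of four of the non-phased measures for a hypothetical system in which component A is in series with the parallel pair B and C. The structure, the component failure probabilities and the function names are assumptions made for this example, and the formulas shown are not necessarily the exact formulations adopted later in this thesis.

```python
def system_failure_probability(q):
    """Failure probability of the assumed system: A in series with the
    parallel pair (B, C). q maps component name to failure probability."""
    return 1 - (1 - q["A"]) * (1 - q["B"] * q["C"])


def importance_measures(q, name):
    """Textbook-style, non-phased importance measures for one component."""
    q_sys = system_failure_probability(q)
    # System failure probability with the component certainly failed.
    q_failed = system_failure_probability({**q, name: 1.0})
    # System failure probability with the component certainly working.
    q_working = system_failure_probability({**q, name: 0.0})
    birnbaum = q_failed - q_working
    return {
        "birnbaum": birnbaum,
        "criticality": birnbaum * q[name] / q_sys,
        "risk_reduction_worth": q_sys / q_working if q_working > 0 else float("inf"),
        "risk_achievement_worth": q_failed / q_sys,
    }


# Illustrative (assumed) component failure probabilities.
q_components = {"A": 0.01, "B": 0.1, "C": 0.2}

for component in q_components:
    measures = importance_measures(q_components, component)
    summary = ", ".join(f"{key}={value:.3f}" for key, value in measures.items())
    print(f"{component}: {summary}")
```

With these assumed values A has the largest Birnbaum importance, whereas B and C have a larger Risk Reduction Worth than A, which shows why the choice of measure should follow the improvement goal.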

Chapter 7 is a review of the existing importance measures and covers those that have been developed for both non-phased and phased mission systems. The importance measures range from the widely used and well known Birnbaum importance measure to the recently introduced Differential importance measure, and include those for measuring the importance of a single component as well as those for groups of components. Notably, significantly fewer importance measures have been developed for phased mission systems, and no method has been presented for dealing with the cost of system failure, which may vary depending on the phase in which the system has failed. Chapter 8 seeks to address this imbalance by presenting several new importance measures that have been developed to help with the optimisation of phased mission system reliability. These include interpretations of many of those previously only available for non-phased mission systems, those with the ability to find the importance of components with respect to precise time periods, and the first developed for the measurement of the importance of a group of components.

In document Methods for the efficient measurement of phased mission system reliability and component importance (Page 30-32)