Step Three: Identify basic failure events

3.5 Analysing Functional Aspects

3.5.3 Step Three: Identify basic failure events

Step three of the process involves identifying the ways in which the software might fail and so lead

to the HSFMs under consideration. As was discussed in section 3.2, it is through sequences

of interactions between objects that functionality is achieved by the OO system. This step

of the process is therefore concerned with identifying those failures in the interactions which

may lead to the HSFMs identified in the previous step. In order to do this it is necessary to

understand the interactions that occur. This requires a dynamic view of the system design. A

UML sequence diagram can provide the necessary information about what interactions occur

between different objects in the system to achieve the desired functionality.

The sequence diagram can be used to work back through the sequence of interactions that

occur, and identify which interactions are required to fail in order to bring about the HSFM. A

simple fault tree can be constructed to capture the combination of failures which contribute to

the HSFM. As was discussed in Chapter 2, there have been a number of attempts at applying a

fault tree analysis method to software. The most notable of these is probably Leveson’s SFTA

technique [38], [39]. In this method Leveson attempted to use fault trees as a way of identifying

the contributions of code-level behaviour to software-level hazards. The purpose of this step is

different in that fault trees are used in order to identify the failures at the software sub-system

level which may contribute to a system hazard. As such the code itself is not considered in

the fault tree. Additionally, unlike Leveson, the use of fault tree templates is not advocated for

this purpose. The fault tree is used to identify the basic events, which are those which relate

to failures of individual elements of the design (such as objects or interactions). These failure

events require further analysis. This process can again be illustrated using the SMS example.

For the SMS, the top level failure in the fault tree is taken as ‘Software system fails to prevent

release on the ground’ as shown in figure 3.7. In order to construct the fault tree below this

failure, it is necessary to consider the UML sequence diagram for release of store as shown in

figure 3.5. For this part of the analysis it is only necessary to consider release of a single store,

as the general case is being considered. In order to work back through the sequence, the starting

point will be the call of the RemoveStore() operation on the station object. This represents

the end of the store release sequence. For the top level failure to occur the aircraft must be on

the ground, and the remove store operation must be sent. If either of these events does not occur then

neither will the top level failure. The aircraft being on the ground is however considered to be

a normal event as it is not a failure (the aircraft is meant to be on the ground at the time). The

call being sent is a failure event, however, as this should not be allowed to happen when the

aircraft is on the ground. This event is therefore developed further.

[Fault tree diagram. Events shown: S/W fails to prevent release of store on the ground; Aircraft on ground; RemoveStore() sent; WOW detected incorrectly; WOW ignored; Failure of checkWOW(); WOW value held is incorrect; Failure of stores manager; Incorrect value obtained from sensor]

Figure 3.7: Fault tree for SMS HSFM
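
The structure of the fault tree in figure 3.7 can be illustrated with a minimal Python sketch. The event names are taken from the figure; the gate types (AND at the top level, OR elsewhere) are read from the description of the tree in the text, and the `Event` class and the `basic_events` and `cut_sets` functions are hypothetical helpers introduced only for this illustration. They are not part of the method itself.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Event:
    """A fault tree event (hypothetical helper, for illustration only)."""
    name: str
    gate: str = "BASIC"   # "AND", "OR", "BASIC" (leaf failure) or "NORMAL"
    children: tuple = ()


def basic_events(event):
    """Return the names of all basic (leaf failure) events below `event`."""
    if event.gate == "BASIC":
        return [event.name]
    names = []
    for child in event.children:
        names.extend(basic_events(child))
    return names


def cut_sets(event):
    """Return the sets of basic events that are sufficient to cause `event`.

    Normal events are assumed to hold, so they contribute no failures.
    No minimisation is performed; for a tree this small none is needed.
    """
    if event.gate == "BASIC":
        return [frozenset([event.name])]
    if event.gate == "NORMAL":
        return [frozenset()]
    child_sets = [cut_sets(child) for child in event.children]
    if event.gate == "OR":
        # Any cut set of any child is sufficient.
        return [s for sets in child_sets for s in sets]
    # AND: combine one cut set from each child.
    return [frozenset().union(*combo) for combo in product(*child_sets)]


# Assumed encoding of figure 3.7 (gate types inferred from the text).
tree = Event("S/W fails to prevent release of store on the ground", "AND", (
    Event("Aircraft on ground", "NORMAL"),
    Event("RemoveStore() sent", "OR", (
        Event("WOW detected incorrectly", "OR", (
            Event("Failure of checkWOW()"),
            Event("WOW value held is incorrect", "OR", (
                Event("Failure of stores manager"),
                Event("Incorrect value obtained from sensor"),
            )),
        )),
        Event("WOW ignored"),
    )),
))

for name in basic_events(tree):
    print(name)
```

Under these assumptions the sketch lists four basic events, and `cut_sets(tree)` returns four single-event sets: with the aircraft on the ground, any one of the basic failures alone is sufficient to bring about the top level failure.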

It can be seen from the sequence diagram (figure 3.5) that before the remove store operation

is called, an interaction occurs to check whether there is weight on the wheels (WOW) or not.

WOW is a boolean value indicating whether the aircraft is on the ground (true) or not (false). If this

interaction returns true then the remove store operation should not be called. If the remove

store operation is sent with the aircraft on the ground then this will be either because the value

of WOW is not detected correctly, or because the WOW is detected correctly but is ignored and

the store is released anyway. A failure to detect the WOW correctly could be due either to some

sort of failure in the interaction that checks the WOW, or to the WOW value being incorrect

in the first place. An incorrect value could in turn be due either to a failure of the stores manager, for which WOW

is an attribute, or a failure of the WOW sensor to provide the correct information. The fault

tree (figure 3.7) shows how these events relate to the top level failure. It can be seen that four

basic events are identified (indicated by a diamond symbol under the event). It is necessary to

derive requirements that can mitigate these failures. In order to do this the possible causes of

these failures must be investigated. This is done in step four. The fault tree generated in this

simple example is quite small. For more complex designs there may be many more levels of

decomposition required to identify all the basic events. A more complex design is analysed as

In document Using Safety Contracts in the Development of Safety Critical Object-Oriented Systems. Richard D. Hawkins (Page 89-91)