3.5 Analysing Functional Aspects
3.5.3 Step Three: Identify basic failure events
Step three of the process involves identifying the ways in which the software might fail and so lead
to the HSFMs under consideration. As was discussed in section 3.2, it is through sequences
of interactions between objects that functionality is achieved by the OO system. This step
of the process is therefore concerned with identifying those failures in the interactions which
may lead to the HSFMs identified in the previous step. In order to do this it is necessary to
understand the interactions that occur. This requires a dynamic view of the system design. A
UML sequence diagram can provide the necessary information about what interactions occur
between different objects in the system to achieve the desired functionality.
The sequence diagram can be used to work back through the sequence of interactions that
occur, and identify which interactions are required to fail in order to bring about the HSFM. A
simple fault tree can be constructed to capture the combination of failures which contribute to
the HSFM. As was discussed in Chapter 2, there have been a number of attempts at applying a
fault tree analysis method to software. The most notable of these is probably Leveson’s SFTA
technique [38], [39]. In this method Leveson attempted to use fault trees as a way of identifying
the contributions of code-level behaviour to software-level hazards. The purpose of this step is
different in that fault trees are used in order to identify the failures at the software sub-system
level which may contribute to a system hazard. As such the code itself is not considered in
the fault tree. Additionally, unlike Leveson, the use of fault tree templates is not advocated for
this purpose. The fault tree is used to identify the basic events, that is, those events which relate
to failures of individual elements of the design (such as objects or interactions). These failure
events require further analysis. This process can again be illustrated using the SMS example.
For the SMS, the top level failure in the fault tree is taken as ‘Software system fails to prevent
release on the ground’ as shown in figure 3.7. In order to construct the fault tree below this
failure, it is necessary to consider the UML sequence diagram for release of store as shown in
figure 3.5. For this part of the analysis it is only necessary to consider release of a single store,
as the general case is being considered. In order to work back through the sequence, the starting
point will be the call of the RemoveStore() operation on the station object. This represents
the end of the store release sequence. For the top level failure to occur the aircraft must be on
the ground, and the remove store operation must be sent. If either of these events does not occur then
the top level failure will not occur either. The aircraft being on the ground is, however, considered to be
a normal event as it is not a failure (the aircraft is meant to be on the ground at the time). The
call being sent is a failure event, however, as this should not be allowed to happen when the
aircraft is on the ground. This event is therefore developed further.
[Fault tree diagram. Events shown: S/W fails to prevent release of store on the ground (top event); Aircraft on ground; RemoveStore() sent; WOW detected incorrectly; WOW ignored; WOW value held is incorrect; Failure of checkWOW(); Failure of stores manager; Incorrect value obtained from sensor.]
Figure 3.7: Fault tree for SMS HSFM
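As an illustration only, the fault tree of figure 3.7 can be sketched as a small data structure from which the basic events are collected. This is a minimal sketch and not part of the method itself: the names `Event`, `basic_events` and `occurs` are hypothetical, and the gate types (AND at the top level, OR elsewhere) are assumptions inferred from the wording of the surrounding text rather than read from the figure.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class Event:
    """A node in a simple fault tree (hypothetical representation)."""
    name: str
    kind: str = "basic"   # "basic", "normal" or "intermediate"
    gate: str = ""        # "AND" or "OR"; only used by intermediate events
    children: List["Event"] = field(default_factory=list)


def basic_events(node: Event) -> List[str]:
    """Collect the names of all basic events at or below this node."""
    if node.kind == "basic":
        return [node.name]
    names: List[str] = []
    for child in node.children:
        names.extend(basic_events(child))
    return names


def occurs(node: Event, true_events: Set[str]) -> bool:
    """Evaluate whether this event occurs, given the leaf events that hold."""
    if not node.children:
        return node.name in true_events
    results = [occurs(child, true_events) for child in node.children]
    return all(results) if node.gate == "AND" else any(results)


# Event names are taken from figure 3.7; gate types are assumed.
sensor = Event("Incorrect value obtained from sensor")
manager = Event("Failure of stores manager")
check = Event("Failure of checkWOW()")
ignored = Event("WOW ignored")
held = Event("WOW value held is incorrect", "intermediate", "OR",
             [manager, sensor])
detected = Event("WOW detected incorrectly", "intermediate", "OR",
                 [check, held])
sent = Event("RemoveStore() sent", "intermediate", "OR", [detected, ignored])
on_ground = Event("Aircraft on ground", "normal")
top = Event("S/W fails to prevent release of store on the ground",
            "intermediate", "AND", [on_ground, sent])

for name in basic_events(top):
    print(name)
```

Under these assumptions, `basic_events(top)` yields the four events that require further analysis, and `occurs(top, {"Aircraft on ground", "WOW ignored"})` evaluates to true, whereas either event on its own does not trigger the top level failure.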
It can be seen from the sequence diagram (figure 3.5) that before the remove store operation
is called, an interaction occurs to check whether there is weight on the wheels (WOW) or not.
WOW is a boolean value used to determine if the aircraft is on the ground (true), or not. If this
interaction returns true then the remove store operation should not be called. If the remove
store operation is sent with the aircraft on the ground then this will be either because the value
of WOW is not detected correctly, or because the WOW is detected correctly but is ignored and
the store is released anyway. A failure to detect the WOW correctly could be due either to some
sort of failure in the interaction that checks the WOW, or to the WOW value being incorrect
in the first place. An incorrect WOW value could in turn be due either to a failure of the stores manager, for which WOW
is an attribute, or to a failure of the WOW sensor to provide the correct information. The fault
tree (figure 3.7) shows how these events relate to the top level failure. It can be seen that four
basic events are identified (indicated by a diamond symbol under the event). It is necessary to
derive requirements that can mitigate these failures. In order to do this the possible causes of
these failures must be investigated. This is done in step four. The fault tree generated in this
simple example is quite small. For more complex designs there may be many more levels of
decomposition required to identify all the basic events. A more complex design is analysed as