Fault injection in software development cycle

2 Background and Related Work

2.4 Fault injection

2.4.2 Fault injection in software development cycle

Different fault injection techniques can be applied depending on the phase of the software development cycle that the system is in. Table 2-1 summarizes the experimental techniques suited to each phase; the two fault injection techniques are (i) simulation-based fault injection and (ii) prototype-based fault injection [Hsueh et al. 1997].

The simulation-based fault injection technique is used to evaluate the dependability of a system that is represented by a series of high-level abstractions, allowing design faults to be detected early, before construction of the system begins. At this early stage of development no implementation details are available, so the simulation must rely on simplified assumptions, such as errors and failures occurring according to a predetermined distribution (e.g., the exponential distribution). With this technique, faults are injected by directly modifying the computational state of the simulation [Carreira et al. 1999]. Among the best-known simulation-based fault injectors are FOCUS [Choi et al. 1992], MEFISTO [Jenn et al. 1995] and DEPEND [Goswami et al. 1997]. The main advantage of this method is that it can evaluate the effectiveness of fault tolerance mechanisms and the dependability of a system in the early phases of its development (conception and design); however, it requires accurate input parameters that are difficult to supply [Hsueh et al. 1997]. Parameters from previous experiments may not be adequate because of design and technological changes. The technique is also highly appropriate for evaluating the dependability of critical systems in which injecting faults into the actual prototype or operational system would be dangerous, as in nuclear power systems and avionics. Despite these advantages, accurate results demand very detailed models, whose development can be very expensive. Moreover, manufacturers might not reveal the information needed, and the simulation can take a long time to complete.
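As a minimal sketch of this idea, consider a hypothetical model in which faults arrive with exponentially distributed inter-arrival times and a modelled recovery mechanism handles each fault with a fixed coverage probability. The names `run_mission`, `estimate_reliability`, `fault_rate` and `coverage` are illustrative assumptions and are not taken from FOCUS, MEFISTO or DEPEND.

```python
import math
import random


def run_mission(fault_rate, coverage, mission_time, rng):
    """Simulate one mission of the modelled system; return True if it survives."""
    state = {"clock": 0.0, "error": False}
    while True:
        # Assumed fault model: exponentially distributed time to the next fault.
        state["clock"] += rng.expovariate(fault_rate)
        if state["clock"] > mission_time:
            return True
        # Fault injection: the fault is applied directly to the simulation state.
        state["error"] = True
        if rng.random() < coverage:
            state["error"] = False  # modelled recovery mechanism handled the fault
        else:
            return False  # uncovered fault: the modelled system fails


def estimate_reliability(fault_rate, coverage, mission_time, runs=20000, seed=1):
    """Estimate the probability of surviving the mission over many simulated runs."""
    rng = random.Random(seed)
    survived = sum(
        run_mission(fault_rate, coverage, mission_time, rng) for _ in range(runs)
    )
    return survived / runs


if __name__ == "__main__":
    estimate = estimate_reliability(fault_rate=0.01, coverage=0.9, mission_time=100.0)
    # Under this model, uncovered faults form a Poisson process with rate
    # fault_rate * (1 - coverage), which gives a closed form to compare against.
    analytical = math.exp(-0.01 * (1 - 0.9) * 100.0)
    print(f"simulated={estimate:.3f} analytical={analytical:.3f}")
```

The sketch also shows why the input parameters matter: the estimate depends entirely on the assumed fault rate and coverage, which are exactly the values that are hard to supply at this stage.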

Phase in Software Development Cycle | Technique
Conceptual and Design | Simulation-based fault injection
Prototype and Operational System | Prototype-based fault injection
Operational System | Measurement-based analysis

Table 2-1 - Experimental techniques for dependability evaluation and their suitability for the different phases of software development cycles.

In contrast, prototype-based fault injection allows the system to be evaluated without any assumptions about its design and thus yields more accurate and realistic results than simulation-based analysis. The technique consists of injecting faults into the target system and observing the corresponding effects. Prototype-based fault injection is useful to:

• Identify system weaknesses, namely the components that cause dependability bottlenecks.

• Analyze the system behavior in the presence of faults: (i) determine the coverage of error detection and recovery mechanisms, and (ii) evaluate the effectiveness of the fault tolerance mechanisms and the corresponding performance loss.

In this context, most of the approaches fall into two main categories [Hsueh et al. 1997]:

• Hardware Implemented Fault Injection (HWIFI) – The faults are injected at the hardware level, as logical or electrical faults. This category can be further subdivided into HWIFI with contact, when there is physical contact with the circuit pins of the target system (e.g., methods that use pin-level active probes and socket insertion), and HWIFI without contact, when the injector has no direct contact with the target system (e.g., faults are injected through heavy-ion radiation and electromagnetic interference).

• Software Implemented Fault Injection (SWIFI) – The faults are injected at the software level (through the corruption of code or data), reproducing errors that would have been produced by faults occurring in hardware or software. SWIFI techniques can be further divided into two classes, depending on the time at which the faults are injected: (i) compile-time injection, when the faults are injected into the source code of the target program, and (ii) run-time injection, when the faults are injected while the system is running.
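The two SWIFI injection times described above can be illustrated with a small sketch. It assumes a hypothetical target function `scale`, a hypothetical range check for a 10-bit sensor as the error detection mechanism, and single-bit flips in 16-bit words as the fault model. None of these names or choices come from an existing SWIFI tool.

```python
import ast

# Hypothetical target program, kept as source text so it can be mutated.
SOURCE = """
def scale(sample):
    return sample * 2 + 1
"""


class AddToSub(ast.NodeTransformer):
    """Compile-time fault: replace every '+' operator with '-'."""

    def visit_BinOp(self, node):
        self.generic_visit(node)
        if isinstance(node.op, ast.Add):
            node.op = ast.Sub()
        return node


def build(source, mutate=False):
    """Compile `source`, optionally injecting the fault before compilation."""
    tree = ast.parse(source)
    if mutate:
        tree = ast.fix_missing_locations(AddToSub().visit(tree))
    namespace = {}
    exec(compile(tree, "<target>", "exec"), namespace)
    return namespace["scale"]


def flip_bit(value, bit):
    """Run-time fault: flip one bit of a 16-bit data word."""
    return (value ^ (1 << bit)) & 0xFFFF


def range_check(sample):
    """Hypothetical error detector: valid samples come from a 10-bit sensor."""
    return 0 <= sample <= 1023


def run_campaign(samples, word_bits=16):
    """Inject every single-bit flip into every sample; return detection coverage."""
    injected = detected = 0
    for index in range(len(samples)):
        for bit in range(word_bits):
            faulty = list(samples)  # fresh copy of the run-time data
            faulty[index] = flip_bit(faulty[index], bit)
            injected += 1
            if not all(range_check(s) for s in faulty):
                detected += 1
    return detected / injected


if __name__ == "__main__":
    print(build(SOURCE)(5), build(SOURCE, mutate=True)(5))
    print(run_campaign([0, 512, 1023, 300]))
```

In this sketch the range check catches only flips in the six high-order bits, so the measured coverage is 6/16 = 0.375. This is the kind of figure a coverage experiment is meant to expose.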

In contrast to SWIFI, HWIFI techniques require additional, specific hardware to introduce physical faults into the target system, which increases their cost. Moreover, the increasing complexity of hardware makes it harder both to inject physical faults and to define simulation models that effectively represent the systems. Thus, owing to their greater flexibility, portability, lower cost and ease of development, SWIFI tools have become a clear and popular choice in the last decades. Despite these advantages, SWIFI tools have some intrinsic drawbacks that should be mentioned:

• Inaccessibility of some locations, when compared to HWIFI tools (e.g., some processor and system resources cannot be reached) [Carreira et al. 1998b];

• Difficulty in injecting permanent faults, except in very particular circumstances;

• Disturbance of the execution and, consequently, of the performance of the system under test. This problem, known as intrusiveness, is a consequence of the instrumentation needed to inject faults and monitor their effects in the target system. Special care should be taken to minimize its effects;

• Poor time resolution, due to the possible inability to follow some error propagation, particularly for errors with very short latency such as CPU and bus faults.

In general, the major drawbacks of prototype-based fault injection are that the study is restricted to the set of faults that can actually be emulated and that measures such as availability and mean time between failures cannot be obtained.

Although all of the experimental techniques have their limitations, they should be used in their appropriate phases and, given their complementarity, their combination can result in a more complete study of the dependability of systems.

As stated in section 2.3, fault injectors are a crucial part of dependability benchmarks. The next section briefly presents the most relevant fault injection tools developed in the last decades. For the purpose of this thesis, only those belonging to the SWIFI family are mentioned.

In document Dependability Benchmarking for Large and Complex Systems (Page 64-68)