Fault Model - LnCm fault model : complexity and validation

• In a given block, whenever there is a data dependency between two variables, the block is split into two such that the dependency is across blocks. Thus, there is no data dependency within a block. This is done based on the assumption that error propagation occurs across blocks, rather than within a block.

Extended-CFG under Multiple Soft-Error

The extended-CFG used for multiple-bit errors has this additional property:

• Φ is a function Φ : V → B, where B = {0, 1} is the set of Boolean values, that assigns a Boolean value to each vertex v ∈ V. A vertex is assigned the value 1 if it is a potentially vulnerable block, and 0 otherwise (details are provided in Chapter 5.3).

Notation: The set of paths in $G$ is denoted by $\rho_G$, and the set of paths between two vertices $u, u' \in V$ by $\rho_G^{u,u'}$. Given a path $\rho = u \cdot v \ldots v' \cdot u'$, then $\hat{\rho} = \{a \mid a \in v \ldots v'\}$. The length of a path $\rho$ in $G$, denoted by $Length(\rho)$, is given by $\sum_{(u,v) \in \rho} W(u, v)$.
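The notation above can be illustrated with a minimal sketch. The class name `ExtendedCFG`, its method names, and the example block names and weights are all hypothetical and chosen only for illustration; the sketch assumes edge weights W are stored per edge and Φ is stored per vertex.

```python
from dataclasses import dataclass, field


@dataclass
class ExtendedCFG:
    """Hypothetical container: W as edge weights, Phi as a 0/1 label per vertex."""
    weights: dict = field(default_factory=dict)  # (u, v) -> W(u, v)
    phi: dict = field(default_factory=dict)      # v -> 0 or 1

    def length(self, path):
        """Length(rho): sum of W(u, v) over consecutive vertex pairs of the path."""
        return sum(self.weights[(u, v)] for u, v in zip(path, path[1:]))

    def interior(self, path):
        """rho-hat: the vertices strictly between the two endpoints of the path."""
        return set(path[1:-1])

    def vulnerable(self, path):
        """Vertices on the path that Phi marks as potentially vulnerable."""
        return [v for v in path if self.phi.get(v, 0) == 1]


# Illustrative graph; block names and weights are made up.
cfg = ExtendedCFG(
    weights={("entry", "b1"): 2, ("b1", "b2"): 3, ("b2", "exit"): 1},
    phi={"entry": 0, "b1": 1, "b2": 0, "exit": 1},
)
path = ["entry", "b1", "b2", "exit"]
print(cfg.length(path), cfg.interior(path), cfg.vulnerable(path))
```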

3.2 Fault Model

This thesis considers transient hardware faults that originate at the transistor level and ultimately impact software modules. As previously mentioned, these faults are aggravated by current hardware trends. They alter the program state by changing the content of CPU registers and memory and, through the process of error propagation, cause errors [7] to exist in the software system. These errors are usually mimicked by injecting bit-flip errors¹ in registers and memory. The general assumption is that any number of bit-flip errors may occur in any number of locations. This thesis considers both single and multiple bit-flip errors. As mentioned, this thesis focuses on soft-errors in register locations only.

¹ In this thesis, bit-flip errors are taken to mean the same as corruptions; from this point on, both terms are used interchangeably unless specified otherwise.

3.2.1 Single Fault

Traditionally, a single-fault assumption has been used for fault injection analysis. This means that, in a single execution of the program, a single bit is flipped in a given location. However, research has shown multiple-bit errors occurring in the field as single-cell, single-row, single-column, multiple-row, multiple-column or chip-wide errors [51, 97, 150]. This highlights the need to consider multiple-bit errors in dependable software validation. This thesis adopts the Single Bit-Flip, or L1C1, fault model as the baseline for evaluating the efficiency of the adopted multiple-bit error models.
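As a minimal sketch of the L1C1 model, the following flips one bit in one location of a simulated register file. The register file as a dictionary, the 32-bit width, and the function names `flip_bit` and `inject_l1c1` are assumptions made for illustration; they are not taken from the fault injection tools cited in this chapter.

```python
import random


def flip_bit(value, bit, width=32):
    """Return value with the given bit inverted, truncated to the assumed width."""
    if not 0 <= bit < width:
        raise ValueError("bit index outside register width")
    return (value ^ (1 << bit)) & ((1 << width) - 1)


def inject_l1c1(registers, rng, width=32):
    """Flip one random bit in one random register; the input is left untouched."""
    corrupted = dict(registers)
    name = rng.choice(sorted(corrupted))   # the single location
    bit = rng.randrange(width)             # the single corrupted bit
    corrupted[name] = flip_bit(corrupted[name], bit, width)
    return corrupted, (name, bit)


registers = {"r0": 7, "r1": 0, "r2": 255}
corrupted, (name, bit) = inject_l1c1(registers, random.Random(0))
print(f"flipped bit {bit} of {name}: {registers[name]} -> {corrupted[name]}")
```

Returning the chosen location and bit alongside the corrupted state makes each injection run reproducible and easy to log.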

3.2.2 Multiple Faults

In consideration of the potential for multiple soft-errors to affect running software, researchers have begun studying the impact of double-bit errors on software for dependability evaluation [9]. In [163], the double-bit fault model has also been shown to mimic the presence of software bugs. This research focused on a version of double-bit errors occurring within a single location. In this thesis, this fault model is modelled as two bits flipped within a single location, and is referred to as Double Bit-Flips or L1C2 (a specific case of the LnCm fault model). Lu et al. [103] adopted double-bit errors to show the applicability of their fault injection tool. This thesis adopts the L1C2 fault model to evaluate the viability of the proposed variant double-bit models in terms of their ability to induce programs to fail differently.

This thesis also adopts a second pattern of double hardware faults, occurring as single faults in a pair of locations. This fault model is modelled as two single bit-flip errors in two different locations, and is referred to as L2C1 (also a specific case of the LnCm fault model). The thesis proposes this model in order to ascertain the need to adopt it for the purposes of dependable software validation. The model assumes that only two errors can occur in any run of the program; as such, it selects two locations and flips one bit in each. It should be mentioned that the work presented in [103] tested the applicability of their fault injection tool with both the L1C2 and L2C1 fault models, and that their work postdates that presented in [2], which serves as the basis of some of the work presented in this thesis.
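The difference between the two double-fault patterns can be sketched as follows. This is an illustrative sketch only: the dictionary register file, the assumed 32-bit `WIDTH`, and the function names `inject_l1c2` and `inject_l2c1` are hypothetical, and locations and bits are drawn uniformly at random, which the text does not prescribe.

```python
import random

WIDTH = 32  # assumed register width


def inject_l1c2(registers, rng):
    """L1C2: two distinct bits flipped within a single location."""
    corrupted = dict(registers)
    name = rng.choice(sorted(corrupted))
    for bit in rng.sample(range(WIDTH), 2):   # two different bit positions
        corrupted[name] ^= 1 << bit
    return corrupted


def inject_l2c1(registers, rng):
    """L2C1: one bit flipped in each of two distinct locations."""
    corrupted = dict(registers)
    for name in rng.sample(sorted(corrupted), 2):   # two different locations
        corrupted[name] ^= 1 << rng.randrange(WIDTH)
    return corrupted


registers = {"r0": 0, "r1": 0, "r2": 0}
print(inject_l1c2(registers, random.Random(1)))
print(inject_l2c1(registers, random.Random(1)))
```

Both functions inject exactly two bit-flips per run; they differ only in whether the flips share a location.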

This thesis generalises the double-fault model to allow multiple faults, rather than only two, to be introduced in a single run. This model assumes that any number of errors can occur in a single execution of a program; as such, several locations are selected and at least one bit is flipped in each location. This new multiple-fault model is referred to as the Multiple Locations Multiple Corruptions (LnCm) fault model, where n is the number of injection locations and m is the maximum number of faults to inject in each location.
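A hedged sketch of one LnCm injection run is given below. It assumes a dictionary register file with 32-bit locations, and it assumes that the number of flips in each selected location is drawn uniformly between 1 and m; the function name `inject_lncm` and the returned `plan` structure are hypothetical.

```python
import random


def inject_lncm(registers, n, m, rng, width=32):
    """Corrupt n distinct locations, flipping between 1 and m distinct bits in each."""
    if not 1 <= n <= len(registers):
        raise ValueError("n must be between 1 and the number of locations")
    if not 1 <= m <= width:
        raise ValueError("m must be between 1 and the register width")
    corrupted = dict(registers)
    plan = {}  # location -> sorted list of flipped bit positions
    for name in rng.sample(sorted(corrupted), n):
        count = rng.randint(1, m)              # at least one flip, at most m
        bits = rng.sample(range(width), count)
        for bit in bits:
            corrupted[name] ^= 1 << bit
        plan[name] = sorted(bits)
    return corrupted, plan


registers = {f"r{i}": 0 for i in range(8)}
corrupted, plan = inject_lncm(registers, n=3, m=4, rng=random.Random(7))
print(plan)
```

With n = 1 and m = 1 the sketch reduces to the single bit-flip (L1C1) case.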

From the extended-CFG perspective of a program, this means that (i) several variables in a given block can be corrupted, (ii) several blocks can be corrupted, with a single variable being corrupted in each block, or (iii) several variables can be corrupted in several blocks of the program. As this thesis focuses on capturing the impact of multiple hardware faults on a program, it is important to (i) determine the location (block) where the fault will be injected and (ii) determine the variables into which the faults will be injected. At one extreme, the entry block (the root of the control flow graph) can be chosen, with all variables in that block selected as target variables. At the other extreme, every variable within every block can be a target variable. However, the computational cost of such exhaustive validation will be prohibitive.
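To give a rough sense of why exhaustive validation is costly, the sketch below counts the distinct injection configurations under simplifying assumptions that are not taken from the text: a configuration is identified only by the n chosen locations and the set of between 1 and m flipped bits in each, all locations have the same width, and injection time is ignored. The function name `lncm_space` and the example sizes are hypothetical.

```python
from math import comb


def lncm_space(num_locations, n, m, width=32):
    """Count configurations: choose n locations, then 1..m distinct bits in each."""
    per_location = sum(comb(width, k) for k in range(1, m + 1))
    return comb(num_locations, n) * per_location ** n


# 16 candidate locations of 32 bits each (illustrative sizes).
print(lncm_space(16, n=1, m=1))   # 512
print(lncm_space(16, n=1, m=2))   # 8448
print(lncm_space(16, n=2, m=1))   # 122880
print(lncm_space(16, n=4, m=4))
```

Because m is treated as a maximum, the n = 1, m = 2 count includes single-bit configurations as well as those with exactly two flips. Under these assumptions the count grows exponentially in n, since the per-location term is raised to the power n.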
