Test Evaluation Techniques - Using Tests for Verification

2.3 Using Tests for Verification

2.3.3 Test Evaluation Techniques

In general, a test's result is determined by comparing the outputs with expected values that arise from the requirements of the test objective. This comparison can be done either manually or automatically. The advantage of automatic evaluation is that an overall result can be created, i.e. a statement about the success of the test execution: pass or fail. With xUnit, for example, this is displayed as a red or green bar, see Chapter 2.2.3.

Typically, the automatic evaluation is done by two techniques, assertions and regression testing, which are described in the following sections.

Assertions

The concept of the assertion, which is represented by a Boolean expression, was first defined by Robert Floyd in 1967 to express the intended behaviour of a program [160]. Based on this concept, Hoare introduced the following notation in 1969 (today known as the Hoare triple):

$$P\,\{Q\}\,R \tag{2.1}$$

This means that if the assertion P is true before the execution of the program Q, then the assertion R is true after its execution [161]. The intention of this notation is the formal verification of the program. In practice, assertions were often used for design-by-contract [162] and for the run-time checking of program states [163, 164, 165]. With the techniques of extreme programming and test-driven development they first became important for the evaluation of test results [54].
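The run-time use of assertions mentioned above can be illustrated with a minimal sketch. It checks a triple for one concrete program state only, which is a test and not a formal verification. The function name `check_triple` and the example program are hypothetical and chosen for illustration.

```python
from typing import Any, Callable, Optional

def check_triple(pre: Callable[[Any], bool],
                 program: Callable[[Any], Any],
                 post: Callable[[Any], bool],
                 state: Any) -> Optional[bool]:
    """Check P {Q} R at run time for a single initial state.

    Returns None if the precondition P does not hold, because the
    triple makes no claim in that case; otherwise returns whether
    the postcondition R holds after executing the program Q.
    """
    if not pre(state):
        return None
    result = program(state)
    return post(result)

# Hypothetical example: P is x >= 0, Q increments x, R is x > 0.
holds = check_triple(
    pre=lambda s: s["x"] >= 0,
    program=lambda s: {"x": s["x"] + 1},
    post=lambda s: s["x"] > 0,
    state={"x": 3},
)
```

In a test case, the returned value would feed the pass or fail verdict; a `None` result indicates that the chosen stimulus did not satisfy the precondition.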

Embedded systems typically interact with the real world through sensors and actuators. The signals processed or created by them are time-dependent, i.e. the value of a signal changes over time. This change can be either discrete or continuous in time as well as in signal magnitude. Therefore the signals are divided into four groups: analogue, time-discrete, value-discrete, and digital [166]. For the evaluation of such signals the Boolean expression of the assertion has to be extended by a time-dependent behaviour. Different techniques can be realised based on the Assert block from Chapter 2.2.3 [167, 168]:

Boundary Checks: Boundary checks evaluate whether the values of a signal remain inside or outside of a specified range. It is possible to define either static or dynamic borders. In addition, one-sided boundaries can be implemented by setting the opposite border to infinity. A special case of boundary checks is the comparison of a signal with a constant expected value (both borders are set to the same value), which can also be extended by symmetric and asymmetric tolerances (the upper border then equals the expected value plus a tolerance, the lower border equals the expected value minus a tolerance).

Gradient Checks: The gradient check is actually not a separate technique, because the gradient of a signal can be considered as an independent signal whose evaluation is carried out by boundary checks. Therefore the gradient check is only listed as an example of a group of checks that combine a signal processing unit with a boundary check.

Type Checks: Type checks involve two tasks: the verification of the signal's data type and the verification of its range. This is particularly useful when the co-domain is a set of discrete values.
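The three kinds of checks listed above can be sketched for sampled signals as follows. This is a minimal illustration under stated assumptions, not the Assert block itself: signals are assumed to be given as lists of sample times and values, a dynamic border is modelled as a function of time, and all function names are hypothetical.

```python
import math
from typing import Callable, Sequence, Union

# A border is either a static value or a function of time (dynamic border).
Border = Union[float, Callable[[float], float]]

def _at(border: Border, t: float) -> float:
    return border(t) if callable(border) else border

def boundary_check(times: Sequence[float], values: Sequence[float],
                   lower: Border = -math.inf, upper: Border = math.inf,
                   inside: bool = True) -> bool:
    """True if every sample stays inside (or, with inside=False, outside)
    the range [lower, upper]. One-sided checks keep the default infinity."""
    for t, v in zip(times, values):
        within = _at(lower, t) <= v <= _at(upper, t)
        if within != inside:
            return False
    return True

def expected_value_check(times, values, expected: float,
                         tol_up: float = 0.0, tol_down: float = 0.0) -> bool:
    """Special case: constant expected value with (a)symmetric tolerances."""
    return boundary_check(times, values, expected - tol_down, expected + tol_up)

def gradient(times, values):
    """Finite-difference gradient, treated as a signal of its own."""
    return [(values[i + 1] - values[i]) / (times[i + 1] - times[i])
            for i in range(len(values) - 1)]

def gradient_check(times, values, lower: Border, upper: Border) -> bool:
    """Signal processing unit (gradient) followed by a boundary check."""
    return boundary_check(times[1:], gradient(times, values), lower, upper)

def type_check(values, expected_type: type, allowed) -> bool:
    """Verify the data type and that each value lies in the allowed set."""
    return all(type(v) is expected_type and v in allowed for v in values)

# Hypothetical sample data for illustration.
times = [0.0, 1.0, 2.0, 3.0]
speed = [0.0, 1.0, 2.0, 2.5]
gear = [0, 1, 3, 2]
```

With this sample data, `boundary_check(times, speed, lower=0.0, upper=3.0)` passes, while `gradient_check(times, speed, 0.0, 0.8)` fails because the first two gradient samples equal 1.0. The exact type comparison in `type_check` is a design choice that rejects, for example, a floating-point 1.0 in an integer-valued signal.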

Regression Testing

Regression testing generally refers to re-running all or some of the test cases to identify the impact of modifications on systems that have already been tested successfully [137, 138]. Here the test's outputs are evaluated by comparing them with the outputs of a previous test execution. A distinction is made between actual regression tests, which test different versions of the same representation of the system, e.g. different versions of the model, and the so-called back-to-back tests, which check the equivalence of different representations of the system, e.g. between the model and the source code [169], cf. Figure 2.14. The outputs of the first test execution have to be evaluated initially, because no reference values exist at this point of the testing cycle. This initial evaluation can be done manually, i.e. by “looking” at the signals and then setting them as the new reference, or automatically with the help of assertions.

[Figure: two panels, "Regression Testing" (Model V and Model V+x) and "Back-to-back Testing" (Model and Code); further labels: Test Execution, Test Outputs, Initial Evaluation, Comparator, Automatic Evaluation]

Figure 2.14: Difference between regression and back-to-back testing

The test's outputs, i.e. the system's reaction to specific input stimuli, are evaluated against the expected values, i.e. the reaction of a different version or representation to the same stimuli, by comparing two signals. The two signals need not be exactly equal; instead, their similarity is checked against different criteria. These criteria can account for phase shifts, different computational accuracies, and the impossibility of exactly reproducing a physical signal in reality. Several methods exist for classifying signals as similar, and they are selected depending on the context:

Difference Calculations: The difference between the amplitudes of the output and reference values is calculated for each simulation step. It is possible to generate either absolute or relative differences. The signals are classified as similar if the minimum and maximum differences are within a specified tolerance. In addition, the tolerance can be modified depending on the characteristics of the signal, e.g. orthogonal to the signal, depending on the gradient, or through circles around the sample point creating a tolerance tube. In this way a phase shift between two signals can be compensated for. More complex techniques can also pre-process the signals, e.g. the difference matrix method first rearranges the output signal to sufficiently fit the reference signal for identified periods of time and then applies the difference calculation [170].

Statistical Methods: These techniques are based on the computation of statistical values, for example the cross-correlation coefficient, which is calculated with the following formula [34]:

$$c(T) = \frac{1}{T} \int_0^T y(t)\, y'(t - \tau)\, dt \tag{2.2}$$

Here y is the reference signal, y' is the output signal, and c(T) is the cross-correlation coefficient depending on the time T, which denotes the analysed time span of the signals. The value of τ at which c reaches its maximum marks the most probable time lag between the two signals, and the maximum gives an indication of the similarity of the signals [171]. Other such values are the signal-to-noise ratio and the total harmonic distortion [50].

The fundamental characteristic of these statistical methods, namely the filtering of disturbance values, is simultaneously their biggest disadvantage as, for example, amplitude peaks are disregarded, although they might be important in the context of the test's evaluation.
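Both comparison methods can be sketched for sampled signals as follows. This is a minimal illustration, assuming equally spaced samples, a symmetric tolerance, and a discrete sum in place of the integral in Equation 2.2; the function names are hypothetical, and the more elaborate tolerance tubes and the difference matrix method are not covered.

```python
import math
from typing import Sequence

def difference_check(reference: Sequence[float], output: Sequence[float],
                     tolerance: float, relative: bool = False) -> bool:
    """Per-sample difference; similar if min and max lie within +/- tolerance."""
    diffs = []
    for r, o in zip(reference, output):
        d = o - r
        if relative:
            # Assumption: a zero reference only matches an identical output.
            d = d / abs(r) if r != 0 else (0.0 if d == 0 else math.inf)
        diffs.append(d)
    return -tolerance <= min(diffs) and max(diffs) <= tolerance

def cross_correlation(reference: Sequence[float], output: Sequence[float],
                      lag: int) -> float:
    """Discrete form of Eq. 2.2: (1/N) * sum_k reference[k] * output[k - lag]."""
    n = len(reference)
    total = 0.0
    for k in range(n):
        j = k - lag
        if 0 <= j < len(output):
            total += reference[k] * output[j]
    return total / n

def best_lag(reference, output, max_lag: int) -> int:
    """Lag in [-max_lag, max_lag] at which the cross-correlation peaks."""
    return max(range(-max_lag, max_lag + 1),
               key=lambda lag: cross_correlation(reference, output, lag))

# Hypothetical data: the output is the reference delayed by two samples.
reference = [0.0, 0.0, 1.0, 2.0, 1.0, 0.0, 0.0, 0.0]
output = [0.0, 0.0, 0.0, 0.0, 1.0, 2.0, 1.0, 0.0]
lag = best_lag(reference, output, max_lag=3)
```

With the sign convention of Equation 2.2, an output that trails the reference by two samples peaks at a lag of -2. The plain difference check rejects this pair for a tolerance of 0.5, which illustrates why a phase shift has to be compensated for before amplitudes are compared.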

In document Test-driven development of embedded control systems: application in an automotive collision prevention system (Page 61-64)