5.3 Integration of Static and Dynamic Techniques
5.3.2 Problems and Challenges
The limited number of trigger conditions available in current versions of the MCDS im- poses some restrictions on the granularity with which execution contexts can be distin- guished. Additionally, the combination of events (i.e. preconditions to start a trace) are also limited. As a result of these limitations, only relatively small call strings can be used to describe execution contexts. To overcome some of these limitations, the translation of trace automata to TQL was done in such a way that different parts of an automaton are implemented in different components of the MCDS. This was achieved by moving the logic for starting a trace (the tracing state so to speak) to the processor observation block, while the management of the other states remained on the central MCX unit. Besides, some of the transition computations were moved to the otherwise unused bus observation blocks.
Due to the nature of the MCDS, which in some sense can be seen as a distributed system, there were some delay problems when triggering traces. This has the effect that a trace is missing a few instructions in the beginning. These problems got even worse when the combination of event signals was moved to unused components of the MCDS. Nevertheless, this was necessary in order to allow a more fine-grained representation of execution contexts. In most cases the delay problems could be resolved by adapting the TQL generation, but it could not be solved completely. Furthermore, the delay problems were not always easily reproducible. Especially trigger conditions which are only active for a single cycle seemed to be problematic, e.g. when a trace is meant to be triggered after
5.3 Integration of Static and Dynamic Techniques
a single call instruction. Events of this kind were observed to work correctly sometimes, but got lost or were delayed for some of the measurement runs. These problems made the automatic generation of TQL scripts from program segments and call strings quite tedious.
Besides the delay problems when starting a trace, instructions were sometimes left out during traces as well. Though this was not a severe restriction, it required an adaption of the annotation algorithm. Thus the implementation of the execution time annotation does not rely on the assumption that only complete program traces are generated by the measurement equipment. Instead, it suffices if the traces contain a timestamp for at least one instruction of every basic block. As this problem occurred only rarely, the Universal Debug Engine could be used to precisely determine basic block execution times nonetheless.
6.1
Methodology
6.1.1 Examined Properties
To evaluate the concepts presented in this work, the method was applied to a set of test programs which are based on real-world embedded software. Firstly, this was done to find out how increasing the number of measurements per program segment influences the WCET estimates and what effect the different cache behavior metrics have on them. It was also part of the experiments to check whether the technique is scalable and can be applied to programs of realistic size. Finally, the resulting WCET estimates were compared to the WCET bounds reported by a static timing analysis, to the execution times which were observed during end-to-end measurements and to context-insensitive WCET estimates.
In order to test the robustness and the efficiency of the partitioning and annotation algorithms, the maximal size of the program segments was restricted for most of the test cases. This means that the number of program segments was artificially increased because otherwise many of the test programs could have been traced by a single program segment. By doing this, it was possible to simulate the behavior of the algorithm for very large programs, but the time and space requirements for taking the measurements were minimized. Hence the results presented in the following should also be applicable to programs which are significantly larger than the test cases which were used for the experiments.
6.1.2 Test Setup
The experiments were conducted on an Infineon TriCore TC1797ED evaluation board. For generating program traces version 2.04.09 of the Universal Debug Engine from pls Development Tools was used. WCET bounds for the example programs were determined using the aiT module of the AbsInt Advanced Analyzer a3 for TriCore, version 9.08i.
To evaluate the effect of the cache on the execution time as well as on the predicted WCET, the code of the test programs was linked to uncached and cached portions of
6.2 Test Programs
instruction memory. For the first case, programs were stored in the TriCore scratchpad RAM (SPRAM). The same memory is used for the instruction cache, so in effect fetching an instruction from the SPRAM is as fast as fetching an instruction from the cache. Because of the limited size of the SPRAM, this could only be done for small programs. All other examples were stored in a cached memory area of the TriCore program flash. In the following, for all test cases which use the program flash (and hence the instruction cache), this is stated explicitly. If this is not the case, the test program was stored in the SPRAM which results in the same execution time as if every instruction fetch from a cached memory area would produce a cache hit.
6.2
Test Programs
6.2.1 DEBIE-1 Benchmark
The DEBIE-1 (DEBris In orbit Evaluator) satellite instrument is a sensor unit for detect- ing impacts of small space debris and micro-meteoroids. Its on-board control software was adapted to serve as a benchmark for WCET analysis tools. The software is written in C and was originally developed by Space Systems Finland Ltd (SSF). To serve as a benchmark, the software was adapted by Tidorum Ltd to make it more portable. An internal simulation of I/O devices and an internal test driver were added as well. As a result of these modifications, the benchmark can be compiled for new target architectures with very little effort and the resulting binary already contains code to generate meaning- ful input data for the original control software. Hence the benchmark is especially useful for measurement-based WCET analysis methods as the system can be observed under realistic conditions without difficulties. SSF provides the DEBIE-1 software for the use as a WCET benchmark under specific terms. A copy of the software can be requested from Tidorum4.
The control software comprises six tasks. In the original system, these tasks are activated by interrupts and all of them have real-time deadlines. For the benchmark, this behavior is only simulated, i.e. the benchmark is single-threaded. As part of the experiments, the WCET of every task was estimated. This is along the lines of what was done during the WCET Tool Challenge 2008 [HGB+08], for which the DEBIE-1 benchmark was also
used. Further information about the purpose of the tasks can be found in the wiki [WCC] of the WCET Tool Challenge 2008.