3.1 State Saving
3.1.3 Incremental State Saving (ISS)
The techniques discussed above address the problem of state log/restore by
tuning the checkpointing interval to minimize the WCT required for a rollback operation. They do not take into account, however, that a checkpointing
operation, even if seldom executed, can be onerous because of the size of the state S_i being saved. Additionally, if S_i is large but only a small portion of it was modified since the
last checkpoint, a large amount of time is wasted copying redundant information.
The goal of Incremental State Saving (ISS) is to limit checkpointing overhead by reducing its execution time, and by limiting at the same time the amount of
memory used for a single log.
The first ISS approach was published in [10]. It augments
each event e ∈ E with the following information:
• The value of modified state variables after the event is processed;
• The value of modified state variables before the event is processed;
• The ST T_gen when the event is generated;
• The ST T_exe when the event must be executed, with T_exe > T_gen.
queue, where (for each LP) a list of modifications of the states is stored. The
nodes of this queue are linked to the events that caused the updates. Upon the receipt of a straggler message with timestamp T_straggler, the rollback
operation is carried out differently from the one presented previously. In particular, all the events of the rolling-back LP i such that T_straggler ≤ T_exe ≤
LVT_i are scanned, and the state variables they modified are put back in place, taking care that the same variable is updated only once (see footnote 4).
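The rollback just described can be illustrated with a minimal C++ sketch. It is not the algorithm of [10]: the names `Change`, `UndoLog`, `write` and `rollback` are hypothetical, the state is assumed to be a plain vector of integers, and the log is assumed to be filled in non-decreasing T_exe order. Under these assumptions, scanning the log oldest-first and restoring each variable only the first time it is met yields the value it had before the rolled-back interval.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_set>
#include <vector>

// Hypothetical record: one state variable modified by one event.
struct Change {
    double t_exe;     // timestamp of the event that made the change
    std::size_t var;  // index of the modified state variable
    int before;       // value before the event was processed
    int after;        // value after the event was processed
};

// Hypothetical per-LP log, assumed to be kept in non-decreasing t_exe order.
struct UndoLog {
    std::vector<Change> changes;

    // Apply an update to the state and remember the old value.
    void write(std::vector<int>& state, double t_exe, std::size_t var, int value) {
        assert(var < state.size());
        changes.push_back({t_exe, var, state[var], value});
        state[var] = value;
    }

    // Undo every change made by events with t_exe >= t_straggler.
    // Scanning oldest-first, the first entry met for a variable holds the
    // value it had before the rolled-back interval, so each variable is
    // restored only once.
    void rollback(std::vector<int>& state, double t_straggler) {
        std::unordered_set<std::size_t> restored;
        std::size_t keep = changes.size();
        for (std::size_t k = 0; k < changes.size(); ++k) {
            const Change& c = changes[k];
            if (c.t_exe < t_straggler) continue;
            if (k < keep) keep = k;  // first entry to discard
            if (restored.insert(c.var).second) state[c.var] = c.before;
        }
        changes.resize(keep);  // drop the undone entries
    }
};
```

With a two-variable state, logging writes at timestamps 1, 2, 3 and 4 and then calling `rollback(state, 2.0)` leaves only the effect of the first write in place.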
Of course, this technique is not transparent at all: the model programmer must be aware of the concepts of rollback and state saving, and has to access the
kernel’s data structures to make copies of the state variables before modifying them. This is necessary because the simulation kernel knows neither where the
simulation state is stored nor which parts of it are updated by an event.
The work in [141] takes an additional step towards transparency by relieving
the application-model writer from directly modifying the simulation kernel’s data structures. In particular, it proposes to implement incremental checkpointing
in a transparent fashion by relying on the overloading mechanism of Object Oriented languages. This choice narrows its applicability to OO programming languages
(e.g., C++ or Java). The idea rests on two essential points:
• all (user-defined) functions which process events (i.e., the event handlers) must be redefined by means of overloading;
• state variables must be encapsulated within classes defined by the simulation kernel.
[Footnote 4] Although this might seem an incorrect algorithm for state restore, we emphasize that the original proposal in [10] was targeted at a specific scenario, namely digital logic simulation, where additional properties guarantee that the final simulation state is restored correctly. We refer to the original paper [10] for a complete discussion of the algorithm, and to Chapter 4 for a more general approach to incremental restore.
Encapsulation allows the simulation kernel to discriminate state variables
from other variables that only store temporary results of the
computation. The application must therefore tell the kernel which variables belong to the state by using the special signature State<class T>, which wraps the
class T that the system will handle as part of the state. The programmer manipulates wrapped objects through overloaded methods, which
make a copy of the accessed data buffers before the actual update is performed.
In this scheme, the user must still be aware of the notion of rollback and of
the fact that the simulation kernel performs incremental checkpoints. Unlike [10], however, the operation is carried out through a service exposed
by the simulation kernel, which gives a certain degree of freedom in the definition of the simulation state.
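The wrapping idea can be sketched in C++ as follows. This is only an illustration under stated assumptions, not the interface of [141]: the kernel-side `IncrementalLog` and the members of `State<T>` shown here are hypothetical, and the wrapped object is assumed to stay at a fixed address while undo entries refer to it.

```cpp
#include <cassert>
#include <functional>
#include <vector>

// Hypothetical kernel-side log of undo actions, newest last.
struct IncrementalLog {
    std::vector<std::function<void()>> undo;

    // Revert all logged updates, newest first.
    void rollback() {
        for (auto it = undo.rbegin(); it != undo.rend(); ++it) (*it)();
        undo.clear();
    }
};

// Sketch of a State<T>-style wrapper: reads are not logged, while every
// assignment first saves the old value into the kernel's log.
// Assumption: a State<T> object is not moved while the log refers to it.
template <class T>
class State {
public:
    State(IncrementalLog& log, T initial) : log_(&log), value_(initial) {}

    // Overloaded assignment: copy the old value, then update.
    State& operator=(const T& v) {
        T old = value_;
        log_->undo.push_back([this, old] { value_ = old; });
        value_ = v;
        return *this;
    }

    // Compound assignment reuses the logging assignment above.
    State& operator+=(const T& v) { return *this = value_ + v; }

    // Read access without logging.
    operator const T&() const { return value_; }

private:
    IncrementalLog* log_;
    T value_;
};
```

An event handler would then declare, for instance, a `State<int> counter(log, 1)` and update it with ordinary syntax (`counter = 5; counter += 2;`); calling `log.rollback()` restores the initial value without the handler having copied anything explicitly.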
The proposal in [179], targeted at x86 computing architectures, supports transparent incremental state saving by relying on software instrumentation.
The simulation model’s assembly code is parsed and, whenever an instruction that updates a memory region is found, a call to an ad-hoc module is
prepended, which saves a copy of the old memory value before it is updated. Whenever a rollback operation is performed, the chain of memory updates
is scanned backwards in order to realign the content of the simulation state to time T_rollback. This technique favors the programmer, who
does not have to alter the original simulation code, because the instrumentation phase automatically detects the assembly instructions that
may alter memory content. Nevertheless, the approach used in [179] suffers from a performance penalty. Let us consider the following code snippet:
    for (i = 0; i < MAX; i++) {
        state->array[i]++;
    }
A loop of this kind might be used, for example, to increment some statistics in the model’s state. Yet each iteration of the loop
entails two memory updates, one to variable i and one to state variable
state->array[i]. The proposed solution would call the ad-hoc module twice per iteration, and would create a node in the state chain for each modification of state->array[i], given that its granularity is word-based.
Therefore, the execution delay of a single event grows with the memory-access pattern of the simulation model, and the increase can be non-negligible
even in simple examples like this one.
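The cost just described can be made concrete with a small C++ sketch that mimics what the instrumented loop does. It does not reproduce the module of [179]: `WriteTracker`, `before_write` and `instrumented_loop` are hypothetical names, the hook is called by hand where the instrumentation tool would insert a call, and the initial store to `i` is left out for brevity.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical word-granularity undo chain.
struct WriteTracker {
    struct Node { std::uint32_t* addr; std::uint32_t old_value; };

    std::uintptr_t lo, hi;       // address range of the simulation state
    std::vector<Node> chain;     // one node per logged word update
    std::size_t hook_calls = 0;  // how many times the hook was invoked

    WriteTracker(std::uint32_t* state, std::size_t words)
        : lo(reinterpret_cast<std::uintptr_t>(state)),
          hi(reinterpret_cast<std::uintptr_t>(state + words)) {}

    // Stand-in for the call prepended to every memory-writing instruction.
    // It always runs, but only logs writes that fall inside the state.
    void before_write(void* addr) {
        ++hook_calls;
        auto a = reinterpret_cast<std::uintptr_t>(addr);
        if (a >= lo && a < hi) {
            auto* word = static_cast<std::uint32_t*>(addr);
            chain.push_back({word, *word});
        }
    }

    // Scan the chain backwards and put the old values back.
    void rollback() {
        for (auto it = chain.rbegin(); it != chain.rend(); ++it)
            *it->addr = it->old_value;
        chain.clear();
    }
};

// The loop from the text, written as the instrumented code would behave.
inline void instrumented_loop(WriteTracker& t, std::uint32_t* array, std::size_t max) {
    for (std::size_t i = 0; i < max;) {
        t.before_write(&array[i]);  // store to state->array[i]
        array[i]++;
        t.before_write(&i);         // store to the loop counter
        i++;
    }
}
```

For an array of MAX words, this sketch invokes the hook 2 * MAX times and appends MAX nodes to the chain, even though the whole loop could have been covered by saving one contiguous buffer.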
The work in [125], which will be thoroughly discussed in Chapter 4, tackles this issue in the case of dynamically-scattered simulation states by relying on a dirty bitmap. It relies on assembly-code instrumentation as well,
but an ad-hoc memory manager is (transparently) interposed between the simulation model and the underlying system memory manager, so each memory
update is recorded by simply setting to 1 the bit that corresponds to the touched memory area. The
checkpointing operation is then carried out periodically, saving only the areas that were actually updated since the last checkpointing operation. This allows the
simulation kernel to rely on any of the aforementioned optimizations regarding the checkpointing interval χ, as shown in [168]. The latter approach
relies on an integral function for fine-tuning the checkpointing parameters, which also enforces stability of the decision against fluctuations in the execution
dynamics.
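A dirty bitmap of this kind can be sketched in C++ as follows. The sketch is illustrative only and does not reproduce the memory manager of [125]: `DirtyRegion`, its fixed chunk size and its `write`/`checkpoint` members are assumptions, and the explicit `write` call stands in for what instrumentation would do transparently.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical tracked memory region split into fixed-size chunks.
class DirtyRegion {
public:
    struct SavedChunk {
        std::size_t index;                 // which chunk was saved
        std::vector<unsigned char> bytes;  // its content at checkpoint time
    };

    DirtyRegion(std::size_t chunks, std::size_t chunk_size)
        : chunk_size_(chunk_size),
          memory_(chunks * chunk_size, 0),
          dirty_(chunks, false) {}

    // Tracked write: the only bookkeeping cost is setting one bit.
    void write(std::size_t offset, unsigned char value) {
        assert(offset < memory_.size());
        dirty_[offset / chunk_size_] = true;
        memory_[offset] = value;
    }

    // Incremental checkpoint: copy only the dirty chunks, then clear the bitmap.
    std::vector<SavedChunk> checkpoint() {
        std::vector<SavedChunk> log;
        for (std::size_t c = 0; c < dirty_.size(); ++c) {
            if (!dirty_[c]) continue;
            auto first = memory_.begin() + static_cast<std::ptrdiff_t>(c * chunk_size_);
            auto last = first + static_cast<std::ptrdiff_t>(chunk_size_);
            log.push_back({c, std::vector<unsigned char>(first, last)});
            dirty_[c] = false;
        }
        return log;
    }

private:
    std::size_t chunk_size_;
    std::vector<unsigned char> memory_;
    std::vector<bool> dirty_;  // one bit per chunk
};
```

However many writes hit a chunk between two checkpoints, the chunk is copied once, and a checkpoint taken when nothing was touched copies nothing.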
In Table 3.1 we show a comparative summary of all the state saving techniques presented so far, highlighting which properties are provided by each technique.
Table 3.1
Columns: SS Technique | PSS | ASS | Scattered Simulation State | Incrementality | Arbitrary Granularity | Transparency | Instrumentation-Based | Analytic Model | Temporal Measurements | Heuristics | Autonomicity | Stability to Fluctuations
Rows (the X marks of each technique are listed in sequence; their column positions are not recoverable):
[76]
[77]
[13] X X
[121] X X X X X
[140] X X X X
[152] X X X X
[45] X X X X X
[133] X X X X
[10] X X X
[145] X X X
[141] X X
[179] X X X X X X X
[125] X X X X X X X X
[134] X X X X
[168] X X X X X X X X X X X X
Advancements by the thesis
As will be discussed in Chapter 4, we will explicitly allow the programmer
to rely on dynamically-allocated memory to scatter the simulation state, and to make it grow or shrink depending on the model’s actual execution dynamics.
This will be done transparently at linking time by relying on specific malloc wrappers, which redirect calls to the standard memory-allocation library
towards our own memory manager. The memory manager will work at chunk granularity, i.e. upon simulation startup a large buffer will be
pre-allocated for each LP, to serve that LP’s subsequent memory requests.
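The pre-allocation scheme can be sketched in C++ as below. This is a simplified illustration, not the memory manager presented in Chapter 4: `LpArena` and its members are hypothetical, all chunks are assumed to have the same size (a multiple of the required alignment), and a request larger than one chunk is simply refused. One possible way to route calls at link time, given here only as an assumption, is the GNU ld option `--wrap=malloc`, which redirects calls to `malloc` towards a user-supplied `__wrap_malloc` that could forward to `allocate` on the arena of the running LP.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical per-LP arena: one buffer pre-allocated at startup and
// handed out in fixed-size chunks.
class LpArena {
public:
    LpArena(std::size_t chunk_size, std::size_t chunks)
        : chunk_size_(chunk_size),
          buffer_(chunk_size * chunks),
          used_(chunks, false) {}

    // Serve a request from the pre-allocated buffer. Returns nullptr when
    // the request does not fit in one chunk or no chunk is free.
    void* allocate(std::size_t size) {
        if (size == 0 || size > chunk_size_) return nullptr;
        for (std::size_t c = 0; c < used_.size(); ++c) {
            if (!used_[c]) {
                used_[c] = true;
                return buffer_.data() + c * chunk_size_;
            }
        }
        return nullptr;
    }

    // Give a chunk back. The pointer must come from allocate() on this arena.
    void release(void* p) {
        if (p == nullptr) return;
        auto offset = static_cast<std::size_t>(
            static_cast<unsigned char*>(p) - buffer_.data());
        used_[offset / chunk_size_] = false;
    }

    std::size_t chunks_in_use() const {
        return static_cast<std::size_t>(std::count(used_.begin(), used_.end(), true));
    }

private:
    std::size_t chunk_size_;
    std::vector<unsigned char> buffer_;  // the LP's pre-allocated memory
    std::vector<bool> used_;             // one flag per chunk
};
```

Because every chunk of an LP lives inside one known buffer, the kernel can locate the whole state of that LP without help from the model, which is what makes chunk-level tracking possible.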
Furthermore, we will transparently integrate (via static software instrumentation) a memory-update tracking subsystem, which will allow the simulation
kernel to know precisely which portions of the simulation state have been updated since the last log operation. To limit memory usage, this information will be kept as small as
possible by relying on a dirty bitmap, i.e. a compressed data structure where each bit tells whether a specific chunk has been
modified.
By relying on this information, the simulation kernel will be able to transparently
execute Incremental State Saving, thus reducing the overhead required for taking a simulation snapshot and additionally minimizing the memory footprint
related to state saving.