7.4. The Parallel System Simulation Design
7.4.3. Functional Design of the Simulation
7.4.3.4. A Better Representation of Concurrency
The main simplification in the first process scheduling algorithm was the assumption that processing in the pattern "execute earliest process, then allocate spawned processes" would simulate the behaviour of the concurrent system. In fact it provides only an approximation of it, and the reason for this lies in the allocation strategy.
To achieve the optimum sharing out of work in the parallel machine, the work load in every processing element must be known at the moment a process is allocated, so that the process can be sent to the least busy element. Because process execution times are variable and unpredictable in length, the first scheduling algorithm cannot update the state of all the processing elements accurately at the end of executing one process.
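The allocation decision described here can be sketched in a few lines of Python. This is an illustration only, not the simulation's own code: the names `least_busy` and `allocate` are hypothetical, and work load is assumed to be measured as the number of processes queued on each processing element.

```python
def least_busy(work_load):
    """Return the index of the processing element with the smallest work load.

    work_load[i] is assumed to be the number of processes queued on element i.
    Ties go to the lowest-numbered element.
    """
    return min(range(len(work_load)), key=lambda pe: work_load[pe])


def allocate(work_load):
    """Send one process to the least busy element and record the new load."""
    pe = least_busy(work_load)
    work_load[pe] += 1
    return pe


# Illustrative figures only: queue lengths for four processing elements.
work_load = [3, 1, 4, 2]
chosen = allocate(work_load)
```

The choice is only as good as the figures in `work_load`: if an entry is out of date at the moment of allocation, the process may be sent to an element that is not in fact the least busy.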
Consider the situation shown in Fig.7.6: at time T1 the process on PE no.1 will be chosen for execution by the simulation software, finishing at time T2. A review of processes queued up in the different processing elements performed at T1 will show correctly that PE nos.2, 3 and 4 have
[Figure: PE No. (vertical axis) plotted against Time (horizontal axis), with T1 and T2 marked]
Fig. 7.6 - Timing of Process Execution
execution of the process on PE no.1 would not be able to predict that PE nos.2 and 3 had completed the execution of their processes because they were substantially shorter than the process on PE no.1. It would however be able to judge correctly that PE no.4 was now involved in executing its process by inspection of the Time field in the process. Thus the information held in the controller about the work load of each processing element is likely to include some inaccuracies if this method of simulating the scheduling of processes is maintained. The degree to which these inaccuracies may affect performance figures is difficult to quantify and will vary from query to query. However, as one of the important aims of the simulation was to test different load balancing strategies, it was important to try to avoid inaccuracies in this area. For this reason a move to a more realistic approach towards the modelling of concurrent process execution and allocation was attempted.
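The kind of inaccuracy described can be illustrated with a small Python sketch. It is one reading of the situation in Fig.7.6, not a reconstruction of the simulation: the `Process` record, the two helper functions and all the times used are assumptions made for the example. The controller is assumed to see only the Time (start time) field of each pending process, while the durations are unknown until the processes have been executed.

```python
from dataclasses import dataclass


@dataclass
class Process:
    pe: int          # processing element holding the process
    start: float     # the Time field, visible to the controller
    duration: float  # assumed unknown to the controller until execution


def believed_busy(processes, now):
    """Elements the controller takes to be busy, judging by start times alone."""
    return {p.pe for p in processes if p.start <= now}


def actually_busy(processes, now):
    """Elements that really are executing a process at time `now`."""
    return {p.pe for p in processes if p.start <= now < p.start + p.duration}


# Illustrative times only: PE no.1 runs from T1 to T2, the others are pending.
T1, T2 = 0.0, 10.0
pending = [
    Process(pe=2, start=0.0, duration=3.0),  # short, finished well before T2
    Process(pe=3, start=0.0, duration=4.0),  # short, finished well before T2
    Process(pe=4, start=6.0, duration=8.0),  # still executing at T2
]

believed = believed_busy(pending, T2)
actual = actually_busy(pending, T2)
stale = believed - actual  # elements wrongly recorded as busy
```

Under these assumed figures the controller is right about PE no.4 but still records PE nos.2 and 3 as busy, so a process spawned at T2 could be kept away from two elements that are in fact idle.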
The first approach to the representation of concurrency in the simulation can be described as the event orientated or variable time method. In this method the state of the system was checked at time intervals set by the start and finish of a particular executing process. As shown above, the use of this approach with the single global queue of processes ready to execute led to inaccuracies in the information about the work load in
individual processing elements. Two corrections were possible for this situation: the first involved halting at the end of a process execution, and at that stage performing a trial execution of all processes that could interfere with the outcome of it (ie affect the work load of other processing elements in the case of the first process having produced spawned processes). The processes which were executed to obtain information would then either have to be subject to cancellation or roll-back, or their results stored on a "future" results list. This system was rejected on the grounds that it could involve considerable extra storage.
The second method of time representation is that of interval orientated simulation, which involves stepping through the system at fixed time intervals and operating the system at each point. A variant of this was used for the full version of the simulation. Fixed-interval updating of the system data was used, but in the system model as developed the execution of an individual process was an atomic, ie indivisible, operation, and no attempt was made to alter this or to halt execution of a process midstream. These considerations led to the development of the following algorithm for the "time step" method:
set System_time to Time_step,
while there is still processing to do
{
    while (process_records on ready_to_run_queue with Time less than System_time)
       or (allocation_records on ready_to_allocate_queue with Time less than System_time)
    {
        while (process_records with Time less than System_time)
        {
            identify corresponding process,
            execute process,
            update relevant queues
        }
        while (allocation_records with Time less than System_time)
        {
            identify corresponding processes,
            distribute processes to suitable processing elements,
            update relevant queues
        }
    }
    increment System_time by Time_step
}
This method meant that, for each time step in the processing of a query, all processes whose Time (ie start time) value fell within it were executed before any allocation of spawned processes took place. Data on the duration of each process was stored so that, when the software came to allocate all the resulting processes, the full information about the work state of each processing element was available. Of course, allocation of processes during the time step could result in further processes becoming executable within the time step, so the loop of "execute all processes then allocate all spawned processes" was repeated until no further action was possible within that time step. The system time was then incremented to the next time step and the entire loop restarted. Fig.7.7 shows the top level functioning of the system under this modified algorithm.
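The time step algorithm above can be sketched in runnable form as follows. This is a minimal illustration under stated assumptions, not the simulation software itself: the `Proc` record, the `simulate` function, the `busy_until` and `queued` tables and the tie-breaking rule for the "least busy" element are all hypothetical, and each process is assumed to spawn its children at the moment it finishes.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)  # compare records by identity, not by field values
class Proc:
    name: str
    duration: float   # execution is atomic, so only the total length matters
    children: list = field(default_factory=list)  # spawned on completion (assumed)
    time: float = 0.0  # the Time field: earliest start time
    pe: int = 0        # processing element the process is queued on


def simulate(initial, n_pes, time_step):
    ready_to_run = list(initial)  # process records
    ready_to_allocate = []        # allocation records: (time, spawned process)
    busy_until = [0.0] * n_pes    # finish time of the last process run on each PE
    queued = [0] * n_pes          # processes allocated to a PE but not yet run
    for proc in ready_to_run:
        queued[proc.pe] += 1
    trace = []                    # (name, pe, start, finish) in execution order
    system_time = time_step

    while ready_to_run or ready_to_allocate:
        while True:
            # Execute every process whose start time falls within this step.
            due = sorted((p for p in ready_to_run if p.time < system_time),
                         key=lambda p: p.time)
            for proc in due:
                ready_to_run.remove(proc)
                queued[proc.pe] -= 1
                start = max(proc.time, busy_until[proc.pe])
                finish = start + proc.duration
                busy_until[proc.pe] = finish
                trace.append((proc.name, proc.pe, start, finish))
                ready_to_allocate.extend((finish, c) for c in proc.children)
            # Only now allocate, with the durations of executed processes known.
            allocs = sorted((a for a in ready_to_allocate if a[0] < system_time),
                            key=lambda a: a[0])
            for record in allocs:
                ready_to_allocate.remove(record)
                when, child = record
                # Assumed rule: fewest queued processes, then earliest free.
                pe = min(range(n_pes), key=lambda i: (queued[i], busy_until[i]))
                child.pe = pe
                child.time = max(when, busy_until[pe])
                queued[pe] += 1
                ready_to_run.append(child)
            if not due and not allocs:
                break  # no further action possible within this time step
        system_time += time_step
    return trace


# Illustrative query: one root process spawning two children.
root = Proc("R", 3.0, children=[Proc("A", 2.0), Proc("B", 4.0)])
trace = simulate([root], n_pes=2, time_step=5.0)
```

In this small example the two spawned processes are sent to different processing elements, because the load tables are up to date when allocation takes place. The inner loop repeats the "execute all, then allocate all" pass until nothing further falls within the current step, as in the algorithm above. A very small `time_step` gives the same result here but takes more passes of the outer loop.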
One criticism of this method is that an injudicious choice of time step leads to large computational overheads. However, these take the form of processing time rather than memory usage, and as far as this simulation was concerned total times for query answering were not excessive, the limiting factor proving to be storage space. This is discussed in more detail in Chapter 8 in the section on benchmark tests, but a typical run time for one of the larger queries supported by the system was under three minutes.