VEX uses the POSIX thread library to control the execution of threads in virtual time, wrapping the operating system scheduler. The VEX scheduler is itself a thread that is spawned before entering the main method of the program. The scheduler allows an application thread to run for one timeslice by invoking a timed wait for the duration of the timeslice, suitably scaled by the Time Scaling Factor (TSF) of the thread’s currently executed method. For example, if the default scheduler timeslice is 100ms and the current method has a TSF equal to 3 then the thread executing it gets a timeslice of 300ms in real time. The method’s estimated virtual time is then its measured execution time divided by 3, thus adding 100ms of virtual time for 300ms of execution. This simulates the effect of the method executing 3 times faster within a single timeslice.
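The arithmetic of this example can be written out as a minimal sketch. The function names are illustrative and are not part of VEX. The sketch assumes only what is stated above: the real-time wait is the virtual timeslice multiplied by the TSF, and the virtual time credited is the measured real time divided by the TSF.

```python
def real_timeslice_ms(virtual_timeslice_ms: float, tsf: float) -> float:
    """Real-time duration of the timed wait granted for one virtual timeslice."""
    if tsf <= 0:
        raise ValueError("TSF must be positive")
    return virtual_timeslice_ms * tsf


def virtual_time_ms(measured_real_ms: float, tsf: float) -> float:
    """Virtual time credited for a measured real execution time."""
    if tsf <= 0:
        raise ValueError("TSF must be positive")
    return measured_real_ms / tsf


if __name__ == "__main__":
    # Default timeslice of 100ms, current method with TSF = 3.
    granted = real_timeslice_ms(100.0, 3.0)
    print(granted)                        # 300.0
    print(virtual_time_ms(granted, 3.0))  # 100.0
```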
What happens if the scaling factor changes during the timeslice, for example when a scaled method returns to a calling method that has scaling factor 1? In this case the scheduler is notified and updates the duration of the remaining timeslice accordingly. The scheduler is similarly notified if a method with scaling factor 1 calls a method with TSF ≠ 1. The VEX scheduler suspends a thread by sending a POSIX signal to its registered thread-id. A thread T receiving such a signal updates its thread-local CPU time and compares its new virtual timestamp with that of the thread at the head of the queue of runnable threads. The following scenarios are possible:
• T’s virtual time increase is less than a certain percentage p of its assigned timeslice. Either the OS scheduler has been scheduling other processes (system background processes or the VEX scheduler thread) instead of the running thread selected by VEX, or the running thread has spent most of its time within VEX (whose execution time is excluded from virtual time measurements, as explained in Section 3.2). As our default policy enforces a constant virtual timeslice for all threads, T notifies the VEX scheduler that it will continue execution, and the scheduler sleeps for the remainder of T’s virtual timeslice.
• T’s virtual time increase is more than p of its assigned timeslice s. In this case, there are two possible courses of action:
– If the virtual timestamp of the head HQ of the runnable threads queue Q is lower than T’s virtual timestamp, then T is suspended, i.e. it is set to runnable and remains blocked within the signal handler. HQ is then resumed.
– If the virtual timestamp of HQ is higher than T’s virtual timestamp, for instance if HQ is Timed-Waiting, then T exits the signal handler and resumes.
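The scenarios above can be summarised as a small decision function. This is a sketch of the described policy, not the VEX signal handler itself: the names `Action` and `on_suspend_signal` are hypothetical. The text does not specify the handling of an empty queue or of equal timestamps, so both cases are assumptions here and both resume T.

```python
from enum import Enum
from typing import Optional


class Action(Enum):
    CONTINUE_SLICE = "continue"  # too little virtual progress: T keeps its slice
    SUSPEND = "suspend"          # head of queue is behind T: T blocks, head resumes
    RESUME = "resume"            # head of queue is ahead of T: T leaves the handler


def on_suspend_signal(progress_ms: float,
                      timeslice_ms: float,
                      p: float,
                      own_timestamp_ms: float,
                      head_timestamp_ms: Optional[float]) -> Action:
    """Decide what thread T does on receiving the scheduler's signal.

    progress_ms       -- virtual time T gained during this timeslice
    timeslice_ms      -- the assigned virtual timeslice s
    p                 -- progress threshold as a fraction of s
    head_timestamp_ms -- virtual timestamp of the queue head, None if empty
    """
    if progress_ms < p * timeslice_ms:
        return Action.CONTINUE_SLICE
    # Assumption: an empty queue or a tie lets T resume.
    if head_timestamp_ms is not None and head_timestamp_ms < own_timestamp_ms:
        return Action.SUSPEND
    return Action.RESUME


if __name__ == "__main__":
    print(on_suspend_signal(50.0, 100.0, 0.75, 500.0, 400.0))  # Action.CONTINUE_SLICE
    print(on_suspend_signal(80.0, 100.0, 0.75, 500.0, 400.0))  # Action.SUSPEND
    print(on_suspend_signal(80.0, 100.0, 0.75, 500.0, 900.0))  # Action.RESUME
```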
In our prototype implementation we have set p = 75%. In principle we could set p = 100% and require each thread to execute in virtual time for no less than its assigned timeslice. We decided against this to avoid requesting small residual timeslices when a thread has already progressed through a large part of s. For example, a thread found to have executed for 90% of s would be asked to run for another 10%. Asking for such small timeslices presents two issues:
• There exist system-related limitations on the lowest duration that can be assigned by VEX as a timeslice to a thread (see following paragraph).
• Assigning larger timeslices allows us to distinguish a thread that has been blocked without VEX knowing it (we elaborate on such “Native Waiting” threads in Chapter 5) from one that has not made enough progress due to OS scheduler decisions.
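The effect of the threshold on the size of these residual requests can be illustrated with a short sketch. The helper names are hypothetical. Treating progress exactly equal to p·s as sufficient is an assumption, since the text only distinguishes "less than" from "more than".

```python
from typing import Optional


def residual_request_ms(timeslice_ms: float, progress_ms: float,
                        p: float) -> Optional[float]:
    """Remaining virtual time a thread is asked to run, or None if its
    progress already meets the threshold p (a fraction of the timeslice)."""
    if not 0.0 < p <= 1.0:
        raise ValueError("p must lie in (0, 1]")
    if progress_ms >= p * timeslice_ms:  # assumption: equality counts as enough
        return None
    return timeslice_ms - progress_ms


def residual_lower_bound_ms(timeslice_ms: float, p: float) -> float:
    """Every residual request is strictly larger than this bound."""
    return (1.0 - p) * timeslice_ms


if __name__ == "__main__":
    # A thread that has executed 90% of a 100ms timeslice.
    print(residual_request_ms(100.0, 90.0, 1.0))    # 10.0
    print(residual_request_ms(100.0, 90.0, 0.75))   # None
    print(residual_lower_bound_ms(100.0, 0.75))     # 25.0
```

With p = 100% the bound drops to zero, so arbitrarily small residual timeslices could be requested; with p = 75% no request falls below a quarter of s.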
As the VEX scheduler is itself subject to the OS scheduler, it competes for CPU access with other system processes, which might have an equal or higher priority. Even the VEX-controlled threads have the same priority as the scheduler thread. As a result, when the timeslice timeout expires, the VEX scheduler thread might not be scheduled immediately, thus increasing the average timeslice assigned to VEX-simulated threads. This limits the lowest possible timeslices that can be assigned to threads in virtual time and, in turn, the values that can be used as TSFs.
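The overshoot of a timed wait can be observed with a small experiment. The sketch below is not the VEX measurement harness: it uses Python's `threading.Event.wait` as a stand-in for the scheduler's timed wait, and the function name is hypothetical. Interpreter overhead adds to the delay, so absolute numbers will differ from those of a native implementation.

```python
import threading
import time
from typing import List


def measure_timed_waits(requested_us: float, samples: int) -> List[float]:
    """Return the observed duration, in microseconds, of `samples`
    timed waits that each request `requested_us` microseconds."""
    never_set = threading.Event()          # never signalled, so every wait times out
    timeout_s = requested_us / 1_000_000
    observed = []
    for _ in range(samples):
        start = time.perf_counter()
        never_set.wait(timeout_s)
        observed.append((time.perf_counter() - start) * 1_000_000)
    return observed


if __name__ == "__main__":
    for requested in (100, 1_000, 10_000):
        waits = measure_timed_waits(requested, samples=50)
        print(f"requested {requested} us, mean observed {sum(waits) / len(waits):.1f} us")
```

On a loaded machine the mean observed wait is expected to exceed the requested one, and the relative gap is expected to grow as the requested wait shrinks.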
Based on our experimental results presented in Table 3.1, adapted timeslices of less than 1ms in real time, for decelerations by a factor TSF > 100 (assuming the default scheduler timeslice of 100ms), might not be accurately simulated. The second and third columns of Table 3.1 show the mean and coefficient of variation (COV) of the scheduler timeslices acquired at normal priority. The high COVs demonstrate the dependence of each timeslice on the other processes requesting to run at each point. Increasing the scheduler thread’s priority in Linux to the highest Real Time (RT) priority decreases this limit to approximately 20µs (see Table 3.1), thus allowing for decelerations by a factor of 5000. We expect that such values provide an adequately wide range of available TSFs to cover meaningful hypothetical virtual specifications. We note here that integration of the VEX and OS scheduler at the kernel level would further relax these limitations.

Requested         Normal priority timeslice [µs]    RT priority timeslice [µs]
timeslice [µs]    Mean          C.O.V. (%)          Mean          C.O.V. (%)
20                77.05         31.52               24.28         12.63
40                98.48         187.08              44.39         27.02
80                131.14        30.09               84.52         23.18
160               228.11        109.87              164.63        16.43
320               388.56        16.27               324.89        11.18
640               739.88        53.02               644.24        5.33
1000              1079.03       42.46               1003.14       2.29
10000             10604.90      23.00               9953.69       1.19

Table 3.1: Actual VEX scheduler timeslices for each requested timeslice: lower requested timeslices are inexact, because the OS schedules other processes after the expiry time. Reported values are for Host-1 (see Appendix A). The means and COVs refer to all samples gathered from a single run with the requested timeslice. Lower timeslices from asynchronous scheduler notification are not taken into account.
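The statistics reported in Table 3.1 and the deceleration limits quoted above can be reproduced with a short sketch. The function names are hypothetical. The use of the population standard deviation for the COV is an assumption, as the text does not say which estimator was used.

```python
import statistics
from typing import Sequence, Tuple


def mean_and_cov_percent(samples_us: Sequence[float]) -> Tuple[float, float]:
    """Mean and coefficient of variation (in %) of observed timeslices.
    Assumption: COV = population standard deviation / mean."""
    mean = statistics.fmean(samples_us)
    cov = 100.0 * statistics.pstdev(samples_us) / mean
    return mean, cov


def max_deceleration(default_timeslice_us: float,
                     min_reliable_timeslice_us: float) -> float:
    """Largest deceleration factor whose scaled timeslice is still no
    shorter than the smallest timeslice that can be granted reliably."""
    return default_timeslice_us / min_reliable_timeslice_us


if __name__ == "__main__":
    # Default scheduler timeslice of 100ms = 100000 µs.
    print(max_deceleration(100_000, 1_000))  # 100.0  (normal priority, 1ms limit)
    print(max_deceleration(100_000, 20))     # 5000.0 (RT priority, 20µs limit)
```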