A Survey of Fitting Device-Driver Implementations into Real-Time Theoretical Schedulability Analysis


Mark Stanovich Florida State University, USA

Abstract

General purpose operating systems (GPOSs) are commonly used in embedded applications. These applications include cellphones, navigation systems (e.g., TomTom), routers, etc. While these systems may not be considered “hard” real-time, they have timing constraints. That is, missing a deadline may not be catastrophic, but at the same time missed deadlines should not occur frequently.

Recently there have been many enhancements in GPOS’s real-time characteristics. These include standardized interfaces (e.g. POSIX), more precise processor scheduling, and increased control over system activities. While these enhancements have improved the ability for real-time applications to meet timing constraints, there is still much left to be desired. In particular, improper management of device driver activities can cause extreme difficulties for a system to meet timing constraints.

Device drivers consume processor time in competition with other real-time activities. Additionally, device drivers act as I/O schedulers, meaning that the timeliness of activities such as network and storage device I/O is directly affected by the operation of the device drivers. In order to guarantee deadlines, these factors (processor time and I/O scheduling) must be properly managed. This requires understanding device driver characteristics and performing abstract analysis (on their operations) to ensure timing constraints will be met.

This survey will first provide a brief introduction to real-time scheduling theory, which provides the foundation for ensuring timeliness of a system. We will then cover some basic attributes of an operating system that affect the ability of a system to adhere to the theoretical schedulability models. Finally, this paper will survey the approaches that have been developed to deal with the impact device drivers have on the timeliness of an implemented system.

1 Introduction

A real-time system has constraints on when activities must be performed. Some examples of such systems include audio/video devices (e.g., mp3 players, mobile phones), control systems (e.g., anti-lock braking systems), and special devices (e.g., navigation systems, pacemakers).

The correctness of the system not only depends on the correct output for a given input, but also depends on the time at which the output is provided. If the output arrives too early or too late, the system may fail. This failure could be as simple as a visual glitch when watching a movie, or could be as severe as an explosion at a chemical plant. These timing constraints are an integral part of the system, and guaranteeing that they will always be met is one of the main challenges of building such a system.

Traditional real-time systems have typically been embedded devices. These devices have limited resources, use specialized hardware and software, and provide only a few functions. Traditional real-time systems have the advantage of simplicity, which makes the challenge of validating the system’s timing correctness tractable. Timing constraints of the system are guaranteed to be met without great difficulty because mapping the theoretical models to the implementation is generally straight-forward and the slight variations from the theoretical models introduced by the implementation can be compensated by minor adjustments to the theoretical model.

Over time embedded computing systems have become more prevalent and more integrated into the world around us. Embedded systems are now expected to perform numerous, complex functions. One illustration of this evolution toward greater complexity is the mobile phone. When these devices first emerged on the market, the only functionality expected was to allow voice communication. Now, if we look at any of the new “smartphones” we will see a much different device, not only capable of voice communication, but resembling a desktop system with services including gaming, email, playing music, taking pictures, web browsing, and navigational support. These additional functionalities require much more complex hardware and software, compared to traditional embedded systems.

While many real-time applications still run on specialized hardware and software, it is becoming much more common to see real-time systems utilizing off-the-shelf hardware and software. In particular, general purpose operating systems (GPOSs) have found their way into the real-time domain. One recent example of this is the development of the Android platform [21]. The Android platform utilizes a modified version of the popular Linux kernel and other readily available applications.

Using a GPOS has numerous advantages, including wide-spread familiarity, lower cost, reduction in maintenance, and availability of many software components. However, GPOSs are much more complicated, making them very difficult to analyze and, in turn, making it difficult to guarantee timing constraints.

These GPOSs were never designed for real-time environments. In fact, a common goal of a GPOS is to improve average-case performance and maximize throughput, many times at the cost of increasing the range of execution times between the best-case and worst-case behavior. In contrast to explicitly designed real-time systems, most GPOSs were not designed to consistently provide low-latency response times, predictable scheduling, or explicit allocation of resources. The lack of these attributes can significantly hinder the ability of a system to meet deadlines.


Fortunately, significant progress has been made toward providing real-time capabilities in GPOSs. For example, Linux kernel improvements have reduced non-preemptible sections [51], added high-resolution timer support [19, 20], and included application development support through POSIX real-time extensions [25].

On the other hand, the device-driver components of GPOSs have been generally overlooked as a concern for systems that must meet deadlines. Device drivers are an integral part of the overall system, allowing interaction between applications and hardware through a common interface. These device drivers allow an OS to support a wide range of devices without requiring a substantial rewrite of the OS code. Instead, if a new piece of hardware is added, one can just write a new device driver that provides the needed abstraction, without ever touching any of the application software that uses the device.

Device drivers can have a considerable effect on the timing of a system. They typically have complete access to the raw hardware functionalities, including the entire processor instruction set, physical memory, and hardware scheduling mechanisms. Worse, device drivers are often developed by third parties whose primary concern is ensuring their devices meet timing constraints, without regard for other real-time activities [34].

Numerous theoretical techniques exist to guarantee given activities will be scheduled to complete by their associated timing constraints [67]. These theoretical techniques generally rely on the real-world activities adhering to some abstract workload models and the system scheduling this work according to a specified scheduling algorithm.

The difficulty is that many device-driver workloads do not adhere to known workload models. Further, the scheduling of this work on GPOSs tends to be ad hoc and deviates significantly from the theoretical scheduling algorithms that have been analyzed, thereby making many analysis techniques unusable for such systems.

Another difficulty with device drivers is that the scheduling of their CPU time is commonly performed through hardware schedulers. These hardware schedulers are typically not configurable and the scheduling policy is predetermined. This inflexibility in allocating CPU time means that workload models that provide better system schedulability may not be usable. Therefore, activities may miss their timing constraints due to inappropriate allocation of the CPU, even though enough CPU time is available in total, just not at the right time.

This paper is organized as follows: Section 2 will cover some of the basic aspects of scheduling theory. Section 3 will provide an overview of how scheduling of the CPU is commonly performed on computer systems. Section 4 will provide an idea of what is required in order to implement a real-time system. Finally, Section 5 will provide some of the more important problems and developments with fitting device drivers into a real-time system.

2 Scheduling Theory

Scheduling theory provides techniques to abstractly verify that real-world activities will complete within their associated timing constraints. That is, scheduling theory provides the ability to guarantee that a given abstract workload will be scheduled on a given abstract set of processing resources by a given scheduling algorithm in a way that will satisfy a given set of timing constraints.

There exists a substantial amount of theoretical research on analyzing real-time systems, much of which traces back to a seminal paper published by Liu and Layland in 1973 [41]. In this section we will review a small portion of this theoretical research.

2.1 Workload Models

One aspect of a system that must be modeled is the work to be completed. An example is some calculation to be performed based on sensor inputs. If one were to think in terms of a gasoline engine in an automobile,¹ a calculation may be used to determine the amount of fuel to inject into a cylinder. This calculation would use sensor readings such as air temperature, altitude, throttle position, and others as inputs. Given the inputs, a sequence of processor instructions would be used to determine the output of the calculation. Then, the appropriate signal(s) would be sent to the fuel injection mechanism. Execution of these processor instructions is the work for the processor resource. The term typically used for one instance of processor work (e.g., one calculation) is a job.

Typically, to provide an ongoing system function such as fuel injection, a job must be performed over and over again.

We can think of performing some system functionality as a potentially endless sequence of jobs. This sequence of jobs performing a particular system function is known as a task.

Figure 1. Gantt chart representing execution of work over time (time axis from 0 to 65 in steps of 5).

One way to visualize the work being performed on a given resource over time is through a Gantt chart. In Figure 1, each shaded block represents a job using a given resource for 5 units of time. So, one job executes over the time interval between 0 and 5, another executes over the time interval between 25 and 30, etc. The amount of work performed by a given job will be referred to as the job’s execution time. This is the amount of time the given job uses a resource. Note that the jobs of a given task may not all have the same execution time. For instance, different code paths may be taken for different sensor input values. One input may require fewer processor instructions, while another may require more, thereby varying the execution time from job to job.

¹ The automobile engine used throughout this section is imaginary and used only as an illustrative analogy. The actual design of an automobile engine is at best more complicated than this and most likely much different.

Each job has a release time, the earliest time when the job is allowed to begin execution. This release time may depend on data being available from sensors, another job being completed, or other reasons. This is not necessarily the point in time when the job begins execution, since it may be delayed if another job is already using a required resource and the newly released job cannot acquire the resource immediately.

Another term used in the abstract workload model is deadline, which is the point in time when a job’s work must be completed. In an automobile engine, a deadline may be set so that the fuel must be injected before the piston reaches a certain position in its cylinder. At that position a spark will be produced and if the fuel is not present, no combustion will take place. This point in time when the job must be completed may be represented in several different ways.

One is known as a relative deadline, which is specified as some number of time units from the release of the job. The other is an absolute deadline, which is in relation to time zero of the time keeping clock. For example, a job with a release time of t = 12 and relative deadline of 5 would have an absolute deadline of 17.

One commonly used abstract workload model for describing a recurring arrival of work is the periodic task model. In the periodic task model, the next job is separated from the previous job by a constant amount of time. An example of the use of the periodic task model can be thought of in terms of an engine running at a constant number of revolutions per minute (RPM). The calculation for the amount of fuel to inject must be performed some constant amount of time from the previous calculation. While this model does not work if the RPMs of the engine change (this will be taken into account later), this does characterize many applications that are used to monitor and respond to events.

A periodic task is denoted $\tau_i$, where $i$ identifies a unique task on a system that contains multiple tasks. A task has a period, $T_i$, defining the inter-arrival time of jobs of the task. The $k$-th job of task $\tau_i$ is denoted $j_{i,k}$. Therefore, a task $\tau_i$ can be represented as a sequence of jobs $j_{i,0}, j_{i,1}, \ldots, j_{i,k}, j_{i,k+1}, \ldots$, where the arrival or release of any job $j_{i,k+1}$ is $T_i$ time units after that of job $j_{i,k}$. A task also has an execution or computation time, $C_i$, that is the maximum execution time of all the jobs of the task. This is referred to as the task’s worst-case execution time (WCET). Each job also has a relative deadline from its release time. This relative deadline is the same for every job of a given task and is denoted $D_i$. At this point we can describe a periodic workload $\tau_i$ as comprising three parameters: $C_i$, $T_i$, and $D_i$.
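As a minimal sketch of this notation, the periodic task model can be written down as a small data structure. The class name, the field names, and the assumption that the first job is released at time 0 are ours and serve only as illustration; the parameter values are chosen to resemble the jobs drawn in Figure 1, with the relative deadline assumed equal to the period.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PeriodicTask:
    """Periodic task tau_i described by the three parameters (C_i, T_i, D_i)."""
    wcet: int      # C_i: worst-case execution time
    period: int    # T_i: time between consecutive releases
    deadline: int  # D_i: relative deadline shared by every job of the task

    def release(self, k: int) -> int:
        """Release time r_{i,k} of job k, assuming job 0 is released at time 0."""
        return k * self.period

    def absolute_deadline(self, k: int) -> int:
        """Absolute deadline d_{i,k} = r_{i,k} + D_i."""
        return self.release(k) + self.deadline


# Hypothetical task resembling Figure 1: 5 units of work every 25 time units.
task = PeriodicTask(wcet=5, period=25, deadline=25)
releases = [task.release(k) for k in range(3)]
deadlines = [task.absolute_deadline(k) for k in range(3)]
print(releases)
print(deadlines)
```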

Figure 2. Representation of sporadic task $\tau_i$, showing the release $r_{i,k}$ and deadline $d_{i,k}$ of job $j_{i,k}$, the point at which $j_{i,k}$ completes, the execution time ($\leq C_i$), the relative deadline $D_i$, and the separation ($\geq T_i$) between $r_{i,k}$ and the next release $r_{i,k+1}$.

The periodic task model has the constraint that the inter-arrival times between jobs of a task must be equal to the period ($T_i$). A task that treats the period only as a lower bound on inter-arrival times is known as a sporadic task. So, if we have a sporadic task $\tau_i$ and we denote the release time of job $k$ of task $\tau_i$ as $r_{i,k}$, then $r_{i,k+1} - r_{i,k} \geq T_i$. Figure 2 illustrates the sporadic task model. In this figure, $d_{i,k}$ represents the deadline of job $j_{i,k}$. With this relaxed model, an engine that runs at varying RPMs can now be represented. The period using the sporadic task model would be the minimum time between executions of the fuel amount calculation, which would occur at the maximum RPM at which the engine could possibly run. This time between arrivals at the maximum RPM would give us the period for our sporadic task.
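The difference between the two models can be sketched as a check on a sequence of release times. The function names and the sample release times below are hypothetical; the engine helper additionally assumes one fuel calculation per revolution, which is our simplification of the running example rather than a property of any real engine.

```python
from typing import Sequence


def is_periodic(releases: Sequence[float], period: float) -> bool:
    """True if consecutive releases are separated by exactly the period."""
    return all(b - a == period for a, b in zip(releases, releases[1:]))


def is_sporadic(releases: Sequence[float], period: float) -> bool:
    """True if consecutive releases are separated by at least the period."""
    return all(b - a >= period for a, b in zip(releases, releases[1:]))


def min_interarrival_ms(max_rpm: float) -> float:
    """Sporadic period in milliseconds, assuming one calculation per revolution."""
    return 60_000.0 / max_rpm


steady = [0, 5, 10, 15]   # constant RPM: fits both models
varying = [0, 5, 12, 17]  # varying RPM: fits only the sporadic model
print(is_periodic(steady, 5), is_sporadic(steady, 5))
print(is_periodic(varying, 5), is_sporadic(varying, 5))
print(min_interarrival_ms(6000))
```

Every periodic release sequence also satisfies the sporadic constraint, which is why the sporadic model is described as a relaxation.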

2.2 Scheduling

In order to allow multiple tasks on a system at one time, we must determine how to control access to the resource(s) to resolve contention. In the case where the processor is the resource, more than one job may want to use a single processor to perform computations. Similarly, one job may be using a resource when another job arrives and therefore contends for use of the processor. A scheduling algorithm is used to determine which job(s) can use what resource(s) at any given time.

2.2.1 Static Schedulers

One desirable characteristic of a scheduling algorithm is the ability to adapt as the arrival pattern changes. This characteristic distinguishes static (non-adaptive) from dynamic (adaptive) schedulers. Since dynamic schedulers are the common case, we will only briefly describe static schedulers.

Figure 3. Comparison of preemptive and non-preemptive scheduling of jobs $j_1$ and $j_2$, showing their arrivals and execution over time: (a) example of preemptive scheduling; (b) example of non-preemptive scheduling.

Static scheduling algorithms precompute when each job will execute. Therefore, applying these algorithms requires knowledge of all future jobs’ properties, including release times, execution times, and deadlines. The static scheduler will then compute the schedule to be used at runtime. During runtime, the exact schedule is known. So, once one job completes or a given point in time is reached the next job will begin execution.

One type of static scheduler is known as the cyclic executive, in which a sequence of jobs is executed one after the other in a recurring fashion. The jobs are not preemptible and are run to completion. A cyclic executive is typically implemented as an infinite loop that executes a set of jobs [42].
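A minimal sketch of such a loop is shown below. The three job functions are hypothetical placeholders, and the loop is bounded by a cycle count only so that the example terminates; a deployed cyclic executive would loop indefinitely and would normally wait for a timer at the start of each cycle, which is omitted here.

```python
from typing import Callable, List

Job = Callable[[int], str]


def read_sensors(cycle: int) -> str:
    return f"cycle {cycle}: read sensors"


def compute_fuel(cycle: int) -> str:
    return f"cycle {cycle}: compute fuel amount"


def drive_injector(cycle: int) -> str:
    return f"cycle {cycle}: drive injector"


def cyclic_executive(jobs: List[Job], cycles: int) -> List[str]:
    """Run every job to completion, in a fixed order, once per cycle."""
    log = []
    for cycle in range(cycles):  # stands in for the infinite loop
        for job in jobs:         # no preemption: each job returns before the next starts
            log.append(job(cycle))
    return log


log = cyclic_executive([read_sensors, compute_fuel, drive_injector], cycles=2)
for entry in log:
    print(entry)
```

Because the order of execution is fixed by the job list, no locking is needed between the jobs, but adding a job changes the timing of every job that follows it in the list.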

The cyclic executive model is simple to implement and validate, concurrency control is not necessary, and dependency constraints are taken into account by the scheduler.

However, this model does have the drawback of being very inflexible [10, 43]. For instance, if additional work is added to the loop, it is likely that the time boundaries for portions of the original work will be different. These changes will require additional, extensive testing and verification to ensure the original timing requirements are still guaranteed.

Ideally one would like the system to automatically adapt to changes in the workloads.

2.2.2 Priority Scheduling

A priority scheduler uses numeric priority values as the primary attribute to order access to a resource. In most priority scheduling policies, priority values are assigned at the job level. When multiple jobs contend to use a given resource (e.g. processor) this contention is resolved by allocating the resource to the job with the highest priority.

It is generally desired to provide the resource to the highest-priority job immediately. However, this is not always possible. This characteristic of being able to provide a resource immediately to a job is known as preemption. For instance, consider Figure 3a. Here we have two jobs, $j_1$ and $j_2$, sharing a resource. The subscripts indicate the job’s priority, and a lower number indicates a higher priority. So, $j_1$ has higher priority than $j_2$. On the arrival of $j_1$, $j_2$ is stopped and execution of $j_1$ is started. This interruption of one job to start another job is known as a preemption. In this case, the higher-priority job is able to preempt the lower-priority job. If $j_1$ is unable to preempt $j_2$ when it arrives, as in Figure 3b, then $j_2$ is said to be non-preemptible. A non-preemptible job or resource means that once a job begins executing with the resource it will run to completion without interruption.

Preemption may not always be desired if an operation (e.g., section of code) is required to be mutually exclusive.

To inhibit preemption some form of locking mechanism is typically used (e.g., monitors, mutexes, semaphores). However, preventing preemption can result in a violation of priority scheduling assumptions known as priority inversion.

Priority inversion is a condition where a lower-priority task is executing, but at the same time a higher-priority task is not suspended but is also not executing. Using an example from [39], consider three tasks $\tau_1$, $\tau_2$, and $\tau_3$, where the subscript indicates the priority of the task. The larger the numeric priority, the higher the task’s priority. Further, consider a monitor $M$ that $\tau_1$ and $\tau_3$ use for communication. Suppose $\tau_1$ enters $M$ and, before $\tau_1$ leaves $M$, $\tau_2$ preempts $\tau_1$. While $\tau_2$ is executing, $\tau_3$ preempts $\tau_2$ and attempts to enter $M$, but is forced to wait ($\tau_1$ is currently in $M$) and is therefore suspended. The next highest-priority task, $\tau_2$, will then be chosen to execute. Now $\tau_2$ will execute, effectively preventing $\tau_3$ from executing, resulting in priority inversion.
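The scenario above can be reproduced with a small tick-based simulation. This is only a sketch under our own assumptions: time advances in unit ticks, entering and leaving the monitor take no time, a task leaves the monitor at the end of the tick in which its last step inside the monitor runs, and the release times and amounts of work are hypothetical values chosen to trigger the inversion. As in the text, a larger number means a higher priority.

```python
def simulate(tasks, horizon):
    """Tick-based priority scheduling with one monitor; returns who ran in each tick."""
    owner = None  # name of the task currently inside the monitor, if any
    trace = []
    by_priority = sorted(tasks, key=lambda task: task["priority"], reverse=True)
    for t in range(horizon):
        ran = None
        for task in by_priority:
            steps = task["steps"]
            if task["release"] > t or not steps:
                continue  # not yet released, or already finished
            if steps[0] == "enter":
                if owner is not None:
                    continue  # monitor is occupied, so this task must wait
                owner = task["name"]
                steps.pop(0)
            if steps and steps[0] == "run":
                steps.pop(0)
                ran = task["name"]
                if steps and steps[0] == "leave":
                    steps.pop(0)  # leave the monitor at the end of this tick
                    owner = None
                break
        trace.append(ran)
    return trace


# Hypothetical workload: tau1 and tau3 share the monitor, tau2 does not use it.
tasks = [
    {"name": "tau1", "priority": 1, "release": 0,
     "steps": ["enter", "run", "run", "run", "leave", "run"]},
    {"name": "tau2", "priority": 2, "release": 1,
     "steps": ["run", "run", "run"]},
    {"name": "tau3", "priority": 3, "release": 2,
     "steps": ["enter", "run", "leave"]},
]
trace = simulate(tasks, horizon=8)
print(trace)
```

In the resulting trace, $\tau_2$ runs during ticks 2 and 3 even though the higher-priority $\tau_3$ was released at tick 2, and $\tau_3$ does not run until $\tau_1$ has left the monitor.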

2.2.2.1 Fixed-Priority Scheduling Fixed-task-priority scheduling assigns priority values to tasks, and all jobs of a given task are assigned the priority of their corresponding task. The assignment of priorities to tasks can be performed using a number of different policies. One widely known policy for assigning priorities for periodic tasks is what Liu and Layland termed rate-monotonic (RM) scheduling [41]. Using this scheduling policy, the shorter the task’s period the higher the task’s priority. One assumption of this policy is that the task’s period is equal to its deadline. In order to generalize for tasks where the deadline may be less than the period, Audsley et al. [3] introduced the deadline-monotonic scheduling policy. Rather than assigning priorities according to the period of the task, this approach assigns priorities according to the deadline of the task. Similar to RM scheduling, deadline-monotonic scheduling assigns higher priority to tasks with shorter relative deadlines.

Figure 4. Fixed vs. dynamic priority scheduling of tasks $\tau_1$ and $\tau_2$ over the time interval 0 to 10: (a) fixed priority (RM) scheduling; (b) dynamic priority (EDF) scheduling.

2.2.2.2 Dynamic-Priority Scheduling As in fixed-task-priority scheduling, the priority of a job does not change. However, with dynamic-priority scheduling, jobs of a given task may have different priority values.

One of the best known dynamic-priority scheduling algorithms is earliest deadline first (EDF), in which the highest-priority job is the job that has the earliest deadline.

To illustrate dynamic-priority vs. fixed-priority scheduling, consider Figure 4. $\tau_1$ and $\tau_2$ are periodic tasks assigned priority values using either EDF (dynamic) or RM (fixed) priorities. $\tau_1$ has an execution time of 2 and period/deadline of 5. $\tau_2$ has an execution time of 4 and period/deadline of 8. So, at time 0 the job of $\tau_1$ has a higher priority than the job of $\tau_2$ in both EDF and RM. At time 5, for RM, $\tau_1$’s job still has higher priority; however, for EDF, $\tau_2$’s job now has a higher priority than $\tau_1$’s, since its absolute deadline (8) is earlier than that of $\tau_1$’s newly released job (10). Hence the priority assignment for jobs of a single task can change dynamically.
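The two schedules can be reproduced with a short unit-step simulation. This is a sketch under our own assumptions: time advances in integer steps, both tasks release their first job at time 0, deadlines equal periods, and the function and field names are hypothetical.

```python
def simulate(tasks, horizon, policy):
    """Unit-step preemptive simulation; tasks are (name, wcet, period), deadline == period."""
    jobs = []
    trace = []
    for t in range(horizon):
        for name, wcet, period in tasks:
            if t % period == 0:  # every task releases its first job at time 0
                jobs.append({"name": name, "period": period,
                             "deadline": t + period, "remaining": wcet})
        ready = [job for job in jobs if job["remaining"] > 0]
        if not ready:
            trace.append(None)  # processor idle
            continue
        if policy == "RM":
            # Fixed priority: shorter period wins, whatever the job's deadline.
            chosen = min(ready, key=lambda job: (job["period"], job["deadline"]))
        elif policy == "EDF":
            # Dynamic priority: earliest absolute deadline wins.
            chosen = min(ready, key=lambda job: (job["deadline"], job["period"]))
        else:
            raise ValueError(f"unknown policy: {policy}")
        chosen["remaining"] -= 1
        trace.append(chosen["name"])
    return trace


tasks = [("tau1", 2, 5), ("tau2", 4, 8)]
rm_trace = simulate(tasks, 10, "RM")
edf_trace = simulate(tasks, 10, "EDF")
print(rm_trace)
print(edf_trace)
```

The two traces agree up to time 5 and then diverge: under RM the newly released job of $\tau_1$ runs in the interval starting at time 5, whereas under EDF the pending job of $\tau_2$ runs there instead.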

2.3 Schedulability Tests

Schedulability tests, or schedulability analyses, determine whether the timing constraints of given abstract workload models, scheduled on a given set of abstract resources using a given scheduling algorithm, can be guaranteed.

In a real-time system one expects to guarantee that the timing constraints for a given set of tasks are always met. In order to guarantee these timing constraints, the work to be performed on the system, the resources available to perform this work, and the schedule of access to the resources must all be considered. One possible conclusion from a schedulability analysis is that the set of tasks is schedulable, meaning that every job will complete by its associated deadline. In this case, the schedule produced is said to be feasible. Another possible conclusion from the schedulability test is that the schedule is not feasible, meaning that it is possible that at least one job will not meet its deadline.

Figure 5. Guarantees made by various schedulability tests over the space of schedulable and unschedulable task sets: necessary-and-sufficient (guaranteed unschedulable/schedulable), necessary-only (guaranteed unschedulable), and sufficient-only (guaranteed schedulable).

A schedulability test will typically report either a positive result, indicating that the task set is guaranteed to be schedulable, or a negative result, indicating that one or more jobs of the given task set may miss their deadlines. However, depending on the given schedulability test, the result may not be definite in either the positive or negative case.

The terms sufficient-only, necessary-only, and sufficient-and-necessary are commonly used to distinguish between the different types of tests as described below and illustrated in Figure 5. A schedulability test where a positive result means the task set is guaranteed to be schedulable, but a negative result means that there is still a possibility that the task set is schedulable, is termed a sufficient-only test. Similarly, a test where the negative result means that the task set is certainly unschedulable, but the positive result means there is still a possibility that the task set is unschedulable, is a necessary-only test. Ideally one would always strive for tests that are necessary-and-sufficient, or exact, where a positive result means that all jobs are guaranteed to meet their deadlines and a negative result means that there is at least one scenario where a job may miss its deadline.

Liu and Layland published one of the first works on fixed-priority scheduling [41]. In their work, the critical instant theorem was formulated. The critical instant is the worst-case scenario for a given periodic task, which Liu and Layland showed occurs when the task is released together with all tasks that have an equal or higher priority. This creates the most difficult scenario for the task to meet its deadline because the task will experience the largest amount of interference, thereby maximizing the job’s response time.

Liu and Layland used the critical instant to develop the Critical Zone Theorem, which states that for a given set of independent periodic tasks, if $\tau_i$ is released with all higher priority tasks and meets its first deadline, then $\tau_i$ will meet all future deadlines, regardless of varying the task release times [41]. Using this theorem, a necessary-and-sufficient test is developed by simulating all tasks at their critical instant to determine if they will meet their first deadline. If all tasks meet their first deadline, then the schedule is feasible.

A naive implementation of this approach must consider all deadline and release points between the critical instant and the deadline of the lowest priority task. Therefore, for each task $\tau_i$, one must consider $\lceil D_n / T_i \rceil$ such points, resulting in a complexity of $O\left(\sum_{i=0}^{n-1} \frac{D_n}{T_i}\right)$ [67].
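The simulation-based test described above can be sketched as follows. This is not the point-by-point procedure whose complexity is given in [67], but a simpler unit-step simulation under our own assumptions: integer time units, a worst-case execution time of at least one unit, relative deadlines no larger than periods, tasks listed from highest to lowest priority, and all tasks released together at time 0 (the critical instant). The function name is hypothetical.

```python
def meets_first_deadlines(tasks):
    """tasks: list of (wcet, period, deadline), highest priority first.

    Simulates preemptive fixed-priority scheduling from a synchronous release
    and reports whether the first job of every task finishes by its deadline.
    """
    n = len(tasks)
    horizon = max(deadline for _, _, deadline in tasks)
    pending = [0] * n      # released but unfinished work per task
    executed = [0] * n     # total work executed per task so far
    finish = [None] * n    # completion time of each task's first job
    for t in range(horizon):
        for i, (wcet, period, _) in enumerate(tasks):
            if t % period == 0:
                pending[i] += wcet  # a new job of task i is released
        for i in range(n):  # run the highest-priority task with pending work
            if pending[i] > 0:
                pending[i] -= 1
                executed[i] += 1
                if executed[i] == tasks[i][0]:
                    finish[i] = t + 1  # first job has received all its work
                break
    return all(done is not None and done <= deadline
               for done, (_, _, deadline) in zip(finish, tasks))


print(meets_first_deadlines([(2, 5, 5), (4, 8, 8)]))
print(meets_first_deadlines([(2, 5, 5), (5, 8, 8)]))
```

With the first task set, the lower-priority task completes its first job exactly at its deadline of 8; increasing its execution time by one unit causes the test to fail.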

While schedulability analyses like the one above are useful for determining whether a particular task set is schedulable, it is sometimes preferable to think of task sets in more general terms. For instance, we may want to think of task parameters in terms of ranges rather than exact values. One approach that is particularly useful is known as maximum schedulable utilization, where the test to determine the schedulability of a task set is based on its total processor utilization. Utilization of a periodic task is the fraction of processor time that the task can demand from a resource. Utilization is calculated by dividing the computation time by the period, $U_i = C_i / T_i$. The utilization of the task set, or total utilization, is then the sum of the utilizations of the individual tasks in the set, $U_{sum} = \sum_{i=0}^{n-1} U_i$, where $n$ is the number of tasks in the set. Now to determine whether a task set is schedulable, one need only compare the utilization of the task set with the maximum schedulable utilization. As long as the total utilization of the task set does not exceed the maximum schedulable utilization, the task set is schedulable.

The maximum schedulable utilization varies depending on the scheduling policy. Considering a uniprocessor system with preemptive scheduling and tasks assigned priorities according to the RM scheduling policy, the maximum schedulable utilization is $n(2^{1/n} - 1)$, referred to as the RM utilization bound ($U_{RM}$) [41]. As long as $U_{sum} \leq n(2^{1/n} - 1)$ the tasks are guaranteed to always meet their deadlines. This RM utilization bound test is sufficient, but not necessary (failure of the test does not mean the task set is necessarily unschedulable). Therefore, a task set satisfying the RM utilization test will always be schedulable, but the schedulability of task sets with higher utilization cannot be ensured by this test.
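The utilization-based test is straightforward to compute. The sketch below uses hypothetical function names and represents each task as a (computation time, period) pair; the first task set reuses the parameters of the Figure 4 example and the second is an arbitrary set of our own.

```python
def total_utilization(tasks):
    """U_sum: sum of C_i / T_i over all (C_i, T_i) pairs."""
    return sum(wcet / period for wcet, period in tasks)


def rm_bound(n):
    """RM utilization bound n * (2^(1/n) - 1) for n tasks."""
    return n * (2 ** (1.0 / n) - 1)


def passes_rm_bound(tasks):
    """Sufficient-only test: True guarantees schedulability under RM;
    False is inconclusive."""
    return total_utilization(tasks) <= rm_bound(len(tasks))


figure4_tasks = [(2, 5), (4, 8)]
light_tasks = [(1, 4), (2, 8)]
print(total_utilization(figure4_tasks), rm_bound(2), passes_rm_bound(figure4_tasks))
print(total_utilization(light_tasks), passes_rm_bound(light_tasks))
```

For two tasks the bound is roughly 0.828, so the first task set, with a total utilization of 0.9, does not pass. Because the test is sufficient-only, this outcome says nothing definite about whether that task set can be scheduled under RM.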

While the RM utilization bound test cannot guarantee any task set with utilization above $U_{RM}$, one particularly useful class of task sets that can be guaranteed at higher utilizations consists of those whose task periods are harmonic. These task sets can be guaranteed for utilizations up to 100% [67].

Preemptive EDF is another commonly used scheduling algorithm. The schedulability utilization bound provided for this scheduling policy is 100% on a uniprocessor [41].

This means that as long as the utilization of the task set does not exceed 100%, the task set is guaranteed to be schedulable. In fact, for a uniprocessor, the EDF scheduling algorithm is optimal, in the sense that if any feasible schedule exists then EDF can also produce a feasible schedule.

Many other scheduling algorithms and analyses exist to provide guarantees of meeting deadlines. This is especially true for the area of multiprocessor scheduling. However, the basic principles are essentially the same: given a workload model and scheduling algorithm, a schedulability test can determine whether the timing constraints of a given system will be met.

3 Basic Scheduling

In this section we will cover basic terminology and methods used to allow multiple applications to reside on a single processor system. The major concern is management of resources to allow work to be performed in a flexible yet relatively predictable and analyzable manner.

3.1 Threads

A sequence of instructions executing on a processor is referred to as a thread. On a typical system, many threads coexist; however, a processor may be allocated to only one thread at a time². Therefore, for many threads to use one processor, the allocation of a processor must be rotated among the available threads.

3.2 Scheduler

Granting access to the processor is performed by the OS’s scheduler, sometimes called the dispatcher. The scheduler decides which thread at a given time will execute on the CPU.

² This considers only a single-CPU system with one processing core and no hyper-threading.

3.2.1 Switching Among Threads

To control access to the processor, the scheduler must have a way to start, stop, and resume the execution of threads on a processor. This mechanism is known as a context switch. The thread that is removed from the processor will be known as the outgoing thread and the thread that is being given the processor will be called the incoming thread.

The first step in a context switch involves saving all the information or context that will be needed to later resume the outgoing thread. This information must be saved since the incoming thread is likely to overwrite much of the context of the outgoing thread. Next the incoming thread’s context will be restored to the original state when it was paused.

At this point, processor control will be turned over to the incoming thread.

3.2.2 Choosing Threads

Each time the scheduler is invoked, it decides which thread to run next based on a number of criteria.

Figure 6. State diagram of a thread, with the states ready (competing for execution time), running (executing), and blocked.

3.2.2.1 Thread States One scheduling consideration is whether a thread can use the processor. At any given time, a thread is in one of three states (Figure 6). To explain these states, we will start with a thread that is not running but is ready to execute. At this point, the thread waits in the “ready queue” for the processor; it is in the ready state and is considered runnable. Once the scheduler selects the thread to execute, a context switch will occur. The chosen incoming thread will transition from the ready state to the running state and will then execute on the processor. While the thread is executing, the processor may be taken away from it even though the thread has not completed all its work. In that case, the thread transitions back to the ready state. In a different scenario, a thread in the running state may request some service, such as reading from a file or sending a network packet. Some requests cannot be fulfilled immediately, and the system must wait for a subsystem to complete the request.

While the thread is waiting, the processor can be used by other threads. So, if the current thread cannot continue until the service is completed, the thread will transition to the blocked state. Once in the blocked state, the thread will not execute. It is the job of the OS to change the thread from the blocked state to the ready state when the event for which it is blocked occurs.
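The three states and the transitions between them can be written down as a small table. The following is only an illustrative sketch; the event names (dispatch, preempt, block, wakeup) are labels chosen here and do not come from any particular OS.

```python
from enum import Enum


class State(Enum):
    READY = "ready"
    RUNNING = "running"
    BLOCKED = "blocked"


# Legal transitions of the three-state thread model.
# The event names are illustrative assumptions, not OS terminology.
TRANSITIONS = {
    (State.READY, "dispatch"): State.RUNNING,   # scheduler selects the thread
    (State.RUNNING, "preempt"): State.READY,    # processor is taken away
    (State.RUNNING, "block"): State.BLOCKED,    # thread waits for a service
    (State.BLOCKED, "wakeup"): State.READY,     # awaited event has occurred
}


def step(state, event):
    """Return the state reached from `state` on `event`, or raise ValueError."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"illegal transition: {event} from {state.value}") from None
```

Note that a blocked thread cannot move directly to the running state in this sketch; it must first become ready and be selected by the scheduler.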

3.2.2.2 Fairness With multiple threads on a system, one reasonable policy is to expect each thread to make similar progress. The scheduler may attempt to provide fairness among the ready threads by choosing the one that received the least amount of execution time in the recent past.

3.2.2.3 Priorities Providing fairness between all threads is not appropriate when one thread is more important or urgent than others. So, priorities are generally utilized in real-time scheduling policies.

Under simple priority scheduling the highest priority thread will occupy the processor for as long as it desires.

This means that one thread can ‘lock up’ the system, causing the system to be unresponsive to other lower priority, ready threads. Therefore, threads scheduled with priorities must be programmed with caution.

When two threads have the same priority, the scheduler can choose a thread based on which thread arrived first. Under FIFO scheduling, earlier arriving threads have higher priority. Alternatively, with round-robin scheduling, each thread at a given priority is allotted a specific amount of time, known as a time slice. All threads at a given priority level will receive one time slice before any thread of that level receives additional time slices.
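The selection rules above can be sketched with one queue per priority level. This is a simplified illustration, not the scheduler of any specific OS; the class and method names are hypothetical, and a larger number is assumed to mean a higher priority.

```python
from collections import deque


class PriorityScheduler:
    """Fixed-priority ready queues; ties are broken in arrival (FIFO) order.

    Assumption: a larger number means a higher priority.
    """

    def __init__(self):
        self.queues = {}  # priority -> deque of thread names in arrival order

    def make_ready(self, name, priority):
        """Append a thread to the tail of its priority level."""
        self.queues.setdefault(priority, deque()).append(name)

    def pick(self):
        """Remove and return (priority, name) from the highest nonempty level."""
        for priority in sorted(self.queues, reverse=True):
            if self.queues[priority]:
                return priority, self.queues[priority].popleft()
        return None  # no ready thread

    def slice_expired(self, name, priority):
        """Round-robin: a thread whose time slice ran out rejoins at the tail."""
        self.make_ready(name, priority)
```

Under FIFO scheduling the picked thread simply keeps the processor until it blocks or a higher-priority thread arrives. Under round-robin scheduling the caller invokes `slice_expired` when the time slice ends, so every thread at that level runs once before any of them runs again.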

3.2.3 Regaining Control

The scheduler is the component that decides which threads will be allocated the CPU. However, a question may arise as to how the scheduler gets scheduled to obtain the CPU.

3.2.3.1 Voluntary Yield As mentioned earlier, a thread may call the OS and request services. These calls, among other things, allow a thread to become blocked and yield the processor to other threads. When the current thread becomes blocked, the scheduler code will execute and choose another thread to use the processor.

3.2.3.2 Forced Yield If a thread does not voluntarily yield the processor, we need to rely on other mechanisms for the scheduler to regain control of the processor.


The typical way is through the use of interrupts. Interrupts are used to communicate between devices and the processor. The interrupts signal the processor that some event has taken place. When an interrupt is raised by some device, the processor that handles the interrupt transfers execution to the corresponding interrupt handler, or interrupt service routine (ISR). An ISR can be thought of as similar to another thread on the system.

Figure 7. Periodic timer interrupt. (Timeline of timer interrupts, thread A, and thread B over time.)

To regain control of the processor at some time in the future, the OS can program a timer interrupt, which is sent by a hardware component on the system. The timers are typically capable of at least two modes of operation. The legacy mechanism is periodic mode, where the timer sends interrupts repetitively at a specified interval. Figure 7 shows an example of periodic mode. The periodic timer interrupt allows threads A and B to share the processor through the intervention of the scheduler. The other timer interrupt mode is sometimes referred to as one-shot mode. In one-shot mode, the timer is set to expire at some OS-specified time in the future. Once the timer expires and the interrupt is sent, the OS must reset the timer in order for another timer interrupt to be produced.

4 Basic Requirements for RT Implementation

Using appropriate real-time scheduling analysis, one can provide a guarantee that the timing constraints of a set of conceptual activities (e.g., tasks) will be met. The assumptions that the analysis relies upon must also hold for the implemented system. If they do not, the guarantees made by the scheduling analysis may no longer be valid. Whether a system can support a given theoretical model relies on the system’s ability to perform accurate accounting of time, to control the behavior of tasks, and to properly communicate timing parameters, as detailed in this section.

4.1 Time Accounting

The validity of schedulability analysis techniques depends on there being an accurate mapping of usage of the processor to the given workload in the theoretical model.

We will refer to this mapping as time accounting. During execution of the system, all execution time must stay within the bounds of the model. For example, in the periodic task model, any time used on the processor should correspond to some given task. Further, this time should not exceed the task’s WCET. Properly accounting for all the time on a system is difficult. This section will cover some of the more common problems that hinder a system from performing proper time accounting.

4.1.1 Variabilities in WCET

The task abstraction requires that one know the WCET of each task. To determine the WCET of a task, one approach would be to enumerate all possible code paths that a task may take and use the time associated with the longest execution time path as the WCET. In a simple system such as a cyclic executive, this approach may work, but on a GPOS this value is unlikely to reflect the true WCET, since tasks on such systems have additional complexities such as context switching, caching, blocking due to I/O operations, and so on. We will go over some common cases that cause variabilities in a task’s WCET.

4.1.1.1 Context Switching Context switch overhead is typically small compared to the intervals of time a thread executes on a processor. However, if context switches occur often enough, this overhead becomes significant and must be accounted for in the analysis. Consider a job-level fixed-priority system where jobs cannot self-suspend. If the time to perform a context switch is denoted as $CS$, then one needs to add $2 \cdot CS$ to the WCET of each job of a task [42].

The reason is that each job can preempt at most one other job, and each job can incur at most two context switches: one when starting and one at its completion. Similar reasoning can be used to allow for self-suspending jobs, where each self-suspension adds two additional context switches. Therefore, if $S_i$ is the maximum number of self-suspensions per job for task $i$, then the WCET should be increased by $2(S_i + 1) \cdot CS$ [42].
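As a small worked example of this adjustment, the sketch below inflates a task's WCET by the context-switch term. The function name and the choice of microseconds as the unit are assumptions made here for illustration.

```python
def inflated_wcet(wcet, cs, max_self_suspensions=0):
    """Return the WCET plus context-switch overhead, 2 * (S_i + 1) * CS.

    wcet                 -- task WCET without context-switch overhead
    cs                   -- time for one context switch (same unit as wcet)
    max_self_suspensions -- S_i, maximum self-suspensions per job (0 if none)
    """
    return wcet + 2 * (max_self_suspensions + 1) * cs


# A job with a 1000 us WCET and 5 us context switches:
no_suspension = inflated_wcet(1000, 5)        # adds 2 * CS
two_suspensions = inflated_wcet(1000, 5, 2)   # adds 2 * (2 + 1) * CS
```

With no self-suspension the term reduces to the $2 \cdot CS$ case for jobs that cannot self-suspend.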

To include context switches in the analysis, one must also determine the time to perform a context switch. Ousterhout’s empirical method [54] measures two processes communicating through a pipe. A process creates a pipe and forks off a child. The child and parent then alternate, each repeatedly performing a read and a write on the created pipe. Doing this some number of times provides an estimate of the cost of performing a context switch.

Ousterhout’s method not only includes the cost of a context switch but also the cost of a read and a write system call on the pipe, which in itself can contribute a significant amount of time. To factor out this time, McVoy and Staelin [48] measured the time of a single process performing the same number of write and read sequences on the pipe as performed by both processes previously. This measured time of only the system calls is subtracted from the time measured via Ousterhout’s method, thereby leaving only the cost of the context switches. This method is implemented in the benchmarking tool lmbench [49].
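The subtraction step can be illustrated with a short calculation. This sketch is not taken from lmbench; it assumes that each round trip of the two-process ping-pong incurs two context switches (parent to child and back), and the function name and units are hypothetical.

```python
def context_switch_cost(t_two_process, t_single_process, round_trips):
    """Estimate the cost of one context switch.

    t_two_process    -- total time of the two-process pipe ping-pong
    t_single_process -- total time of one process issuing the same reads/writes
    round_trips      -- number of ping-pong round trips performed

    Assumption: each round trip causes exactly two context switches.
    """
    switches = 2 * round_trips
    return (t_two_process - t_single_process) / switches


# 100,000 round trips: 1,000,000 us with switching, 600,000 us of system calls.
estimate_us = context_switch_cost(1_000_000, 600_000, 100_000)
```

In this made-up example, 400,000 us of the measured time is attributed to 200,000 context switches, which gives an estimate of 2 us per switch.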

4.1.1.2 Input and Output Operations Performing input and output operations in the time-critical path of a real-time activity can create large variations in its service time. For example, accessing a hard drive can take anywhere from a few hundred microseconds to more than one second.

Determining the blocking time for accessing the device is not only difficult, but can increase the worst-case completion time (WCCT) of a task to such a point that the system becomes unusable. Further, the analysis of combining I/O scheduling and processor scheduling becomes extremely complex and starts to reach the limits of real-time scheduling theory [5].

Since large timing variances cannot typically be tolerated for a real-time activity, it is common to ensure that these I/O operations do not occur in the time critical path. One way is to perform I/O in a separate server thread. This allows the actions that deal with the I/O devices to be scheduled with little interference on the real-time activities. Another approach is to perform I/Os as asynchronous operations, allowing the real-time threads to continue without blocking while the submitted I/O operations are performed.

One must also be aware of indirect causes of I/O operations. For example, the use of virtual memory allows a system to use more than the physical RAM on the system by storing, or swapping out, currently unused portions of memory on secondary storage (e.g., hard disk). However, this can cause large increases in the WCCT of a real-time activity by delaying access to data stored in the memory. If this WCCT is exceeded, the timing guarantees of the scheduling theory will be invalidated. Fortunately, many GPOSs recognize the consequences of these swapping effects on time-critical activities and therefore provide APIs that prevent memory from being relocated to secondary storage (e.g., POSIX’s mlock set of APIs [29]).

Even when memory pages are not swapped to secondary storage, virtual memory address translation still takes some amount of time. This concern is addressed by Bennett and Audsley [8] by providing time bounds for using virtual addressing.

While it is not common for real-time systems to allow swapping, Puaut and Hardy [57] have provided support to permit the use of swapping real-time pages. At compile time, they select page-in and page-out points that provide bounded delays for memory access. The drawback is that hardware and software support is required in order to provide the implementation, which may not be available.

4.1.1.3 Caching and Other Processor Optimizations The number of instructions, the speed to execute these instructions, caching, processor optimizations, etc. can introduce extremely large variabilities in the time to execute a piece of code. As processors become increasingly complicated, the difficulty in determining accurate WCETs also increases. Many difficulties arise from instruction-level parallelism in the processor pipeline, caching, branch prediction, etc. These developments make it difficult to discover what rare sequence of events induces the WCET.

Given code for an application, there are generally three methods used to determine the WCET [53, 79]: compiler techniques [2, 26], simulators [52], and measurement techniques [40]. These methods can be effectively used together to take advantage of the strengths of each. For example, RapiTime [44, 45], a commercially available WCET analysis tool, combines measurement and static analysis. Static analysis is used to determine the overall structure of the code, and measurement techniques are used to establish the WCETs for sections of code on an actual processor.

4.1.1.4 Memory Theoretical analysis generally relies on the WCET of one task not being affected by another task. In practice, this assumption typically does not hold due to contention for memory bus access. With a uniprocessor, the caching effects between one application and another may affect the execution time when context switching; however, this is typically taken into account in the WCET. Now, as the trend is toward more processors per system, not only is caching an issue, but also the contention for access to the memory bus. Which processes are concurrently accessing which regions of memory can greatly affect the time to complete an activity. When one process accesses a region of memory, this can effectively lock out another process, forcing that process to idle its processor until the particular region of memory becomes available. Further, processes are not the only entities competing for memory accesses. Peripheral devices also access memory, increasing the memory interference and making WCETs even more uncertain [56, 66].


4.1.2 System Workloads

When implementing tasks on top of a GPOS, system workloads may be created in order to support applications. These workloads contribute to the proper operation of the system, but do not directly correspond to work being performed.

Further, since they may not be the result of any particular task, they do not fit into any of the task’s execution times and can be easily overlooked. The problem is that the processor time used by the system competes with the time used by the tasks. Without properly accounting for this time in the abstract model, these system workloads can ‘steal’ execution time from other activities on the system, thereby causing missed deadlines.

4.1.2.1 Scheduling Overhead The scheduler determines the mapping of tasks to processors. In order to perform this function, it uses processor time. In a GPOS, the assignment of tasks to CPUs changes when the scheduler is invoked from an interrupt or when a task self-suspends or blocks. The timer hardware provides interrupts to perform time slicing between tasks as well as other timed events. Katcher et al. [37] describe two types of scheduling interrupts, timer-driven and event-driven.

Tick scheduling [11] occurs when the timer periodically sends an interrupt to the processor. The interrupt handler then invokes the scheduler. From here, the scheduler will update the run queue by determining which tasks are available to execute. Any task whose release time is at or before the current time is put in the run queue and can compete for the CPU at its given priority level. Performing these scheduling functions consumes CPU time, which should be considered in the schedulability analysis. Overlooking system code called from a timer can be detrimental to the schedulability of a system because timer handlers can preempt any thread, regardless of the thread’s priority or deadline.

4.2 Temporal Control

Temporal control ensures that the enforcement mechanisms in the implementation correctly adhere to the real-time models used in the analysis. For the processor, this includes the system’s ability to allocate the processor to a given activity in a timely manner. For example, when a job with a higher priority than that of the one currently executing on the processor is released (arrives), the preemptive scheduling model says the system should provide the processor to the higher priority job immediately. However, in practice this is not always possible.

4.2.1 Scheduling Jitter

Scheduling points are events where the scheduler evaluates which tasks should be assigned to which CPUs. In an ideal scheduling algorithm, scheduling actions take place at the exact points in time when some state in the system changes, causing the mapping of threads to CPUs to change. In a GPOS, the scheduling points are the points in time when the CPU scheduler is invoked, such as when a task completes one of its jobs and therefore self-suspends, or when the system receives an interrupt.

The difference between the ideal scheduling points in an abstract scheduling algorithm and that of the CPU scheduler is commonly called scheduling jitter. If a job is set to arrive or become runnable at time $\tau_1$, but is not recognized by the system until time $\tau_2$, the scheduling jitter is $\tau_2 - \tau_1$.

Minimizing scheduling jitter is important in real-time systems. Generally, the smaller the scheduling jitter, the better the theoretical results can be trusted to hold on the implemented system.
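To make the definition concrete, the sketch below computes the jitter $\tau_2 - \tau_1$ under tick scheduling. It assumes, for illustration only, that the scheduler runs solely at multiples of the tick period, that handler overhead is ignored, and that times are integers in a common unit; the function name is hypothetical.

```python
def tick_jitter(release_time, tick_period):
    """Return tau_2 - tau_1 for a job released at tau_1 = release_time.

    Assumption: releases are recognized only at tick boundaries, i.e. at
    integer multiples of tick_period (integer time units, e.g. microseconds).
    """
    # Ceiling division: the first tick at or after the release time.
    recognized = -(-release_time // tick_period) * tick_period
    return recognized - release_time


# With a 1000 us tick, a job released at 2500 us is seen at the 3000 us tick.
late = tick_jitter(2500, 1000)
on_tick = tick_jitter(3000, 1000)
```

Under these assumptions the jitter is always below one tick period, which is one reason shorter tick periods or one-shot timers bring the implementation closer to the ideal scheduling points, at the price of more scheduler invocations.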

4.2.2 Nonpreemptible Sections

Another common problem in real-world systems is that of nonpreemptible sections. A nonpreemptible section is a fragment of code that must complete execution before the processor may be given to another thread. Clearly, a long enough nonpreemptible section can cause a real-time task to miss its execution time window. While accounting for nonpreemptible sections in the schedulability analysis is necessary for providing timing guarantees, it is generally preferable to design the system so that nonpreemptible sections are avoided as much as possible. The reason is that nonpreemptible sections increase the amount of interference a given task may encounter, potentially making the system unschedulable.

4.2.3 Non-unified Priority Spaces

When a device wishes to inform the CPU of some event, the device will interrupt the CPU, causing the execution of an interrupt handler. The interrupt handler is executed immediately without consulting the system’s scheduler, creating two separate priority spaces: the hardware interrupt priority space and the OS scheduler’s priority space, of which the hardware interrupt priority space always has the higher priority. Therefore, any interrupt handler, regardless of its priority, may preempt an OS schedulable thread. The fact that all interrupts have higher priority than all OS schedulable threads must be reflected in the theoretical analysis. The more code that runs at interrupt priority, the greater the amount of interference an OS schedulable thread may experience, potentially causing OS threads to become unschedulable.


4.2.4 Temporal Isolation

Exact WCETs can be extremely difficult to determine in many cases; therefore, only estimated WCETs may be specified. If a given task overruns its allotted time budget because its actual WCET is longer than the specified WCET, one or more other tasks may also miss their deadlines. Rather than all tasks missing their deadlines, it is generally preferable to isolate the failure of one task from other tasks on the system. This property is known as temporal isolation.

4.3 Conveyance of Task/Scheduling Policy Semantics

For an implemented system to adhere to a given theoretical model, one must be able to convey the characteristics of this model to the implemented system. To perform this in a GPOS, it is common to provide a set of system interfaces that inform the system of a task’s parameters.

For example, consider the periodic task model scheduled with fixed-priority preemptive scheduling. Each task is released periodically and competes at its specified priority level until its activity is completed. If a periodic task abstraction were available directly in a given OS, then the theoretical model could easily be implemented. However, in GPOSs such interfaces typically do not exist.

Many systems do, however, adhere to the POSIX operating system standards, which support real-time primitives that allow a periodic task model scheduled using fixed-priority preemptive scheduling to be implemented. These interfaces include functions for assigning a fixed priority to a thread and for allowing a thread to self-suspend when a job is completed, which map from the task model to the implementation.

These types of interfaces are critical for applications to convey their intentions and constraints to the system. They inform the OS of the abstract model parameters in order for the OS scheduler to make decisions that match the ideal scheduler. Lacking this information, the OS may make improper decisions, resulting in tasks missing their deadlines.
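A periodic task built from such primitives might look like the following sketch. It uses Python's `os.sched_setscheduler`, which wraps the POSIX call of the same name where the platform provides it, and a relative `time.sleep` to self-suspend until the next release. The function name `run_periodic` and its parameters are hypothetical, and a production implementation would more likely use an absolute-time sleep in C.

```python
import os
import time


def run_periodic(job, period_s, n_jobs, priority=None):
    """Run `job` once per period for n_jobs periods; return missed releases.

    If `priority` is given, try to switch the calling process to fixed-priority
    (SCHED_FIFO) scheduling. This usually needs elevated privileges, so a
    failure is ignored and the default policy is kept.
    """
    if priority is not None and hasattr(os, "sched_setscheduler"):
        try:
            os.sched_setscheduler(0, os.SCHED_FIFO, os.sched_param(priority))
        except OSError:
            pass  # not permitted or unsupported: keep the current policy

    next_release = time.monotonic()
    missed = 0
    for _ in range(n_jobs):
        job()                              # one job of the periodic task
        next_release += period_s           # releases are tied to absolute times
        delay = next_release - time.monotonic()
        if delay > 0:
            time.sleep(delay)              # self-suspend until the next release
        else:
            missed += 1                    # the job ran past its next release
    return missed
```

Computing each release from the previous release time, rather than from the time the job finished, keeps the period from drifting when a job's execution time varies.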

5 Device Drivers

Devices are used to provide system services such as sending and receiving network packets, managing storage devices, displaying video, etc. The number of these services is small compared to the variety of hardware components, which are produced by a multitude of vendors, each with many distinct operating characteristics. For instance, sound cards provide a means to produce audio signals. The user will typically provide the audio signal in a digital format to the sound card and the sound card will output an analog audio signal that can be converted to sound waves through a speaker. However, to produce sound from an application, interaction between the system and the sound card must occur. Due to the many different features, components, and designs of the different cards, the specifics (e.g., timings, buffers, commands) by which communication with these cards occurs typically differ depending on the manufacturer or even the model. To ease the use of devices such as sound cards, OSs abstract much of the hardware component complexity into software components known as device drivers, which are typically provided by the device manufacturer. Therefore, instead of having to know the particulars of a given device, the application or OS can communicate generically with the device driver, and the device driver, having knowledge of the device specifics, can communicate with the actual device.

Using device drivers in a real-time system complicates the process of guaranteeing deadlines. These devices share many of the same resources used by the real-time tasks and can cause interference when contending for these resources.

Further, many device drivers are used in the critical path of meeting deadlines. Therefore, device driver activity must be included in the schedulability analysis. The difficulty is that the device driver workloads generally do not conform to well understood real-time workload models.

5.1 CPU Time

The CPU usage of device drivers tends to be different from other real-time application tasks, and therefore fitting the usage of CPU time into known, analyzable real-time workload models can be awkward. Trying to force usage into these models tends to be either invalid due to lack of OS control over scheduling, inefficient due to the limited number of implementable scheduling algorithms, or impractical due to large WCETs being used for the analysis even though average case execution times may be much smaller.

Further, many of the scheduling mechanisms created for user-space applications do not extend to the device drivers. That is, the explicit real-time constructs such as preemptive priority-driven resource scheduling, real-time synchronization mechanisms, etc. are not typically available to or used by device drivers.

This section will enumerate some of the temporal effects associated with device drivers and show why these can hinder the proper functioning of a real-time system. We will see how using I/O devices in a system increases the time accounting errors, reduces the amount of control over system resources, and leads to incompatibility with existing workload models.

5.1.1 Execution

Device drivers consume system resources and, therefore, compete with other activities on the system, including real-time tasks. The contended-for system resources include CPU time, memory, and other core components of the system. For example, consider a network device driver. The end user expects a reliable, in-order network communication channel. The sending and receiving of basic data packets is handled by the card. However, execution on the processor is required to process the packets, which includes communicating with the network card, handling packet headers, and dealing with lost packets.

Since device driver CPU usage competes with real-time tasks, the CPU time consumed must be considered in the schedulability analysis. The CPU usage due to device drivers may seem negligible for relatively slow devices such as hard disks. The speed differences between the processor and the hard disk should mean that only small slivers of time will be taken from the system. Unfortunately, the competition from some other device drivers for CPU time significantly impacts the timeliness of other activities on the system. The device driver overhead can be especially large for high bandwidth devices such as network cards. According to [40], the CPU usage for a Gigabit network device driver can be as high as 70%, which is large enough to interfere with a real-time task receiving enough CPU time before its deadline.

The problem of device drivers interfering with real-time tasks is not likely to diminish over time. Devices are becoming faster and utilizing more system resources. One example is the replacement of hard disk drives with solid-state storage. The solid-state devices are much faster and can create significantly more CPU interference for other activities on the system.

To better understand the problems with device drivers in the context of real-time scheduling, we will first look at the manner in which these components consume CPU time and how this can affect the ability of a system to meet timing constraints.

Stewart [75] lists improper accounting for the use of interrupt handlers as a common pitfall when developing embedded real-time software. Interrupt handlers allow device drivers to obtain CPU time regardless of the OS’s scheduling policy. While scheduling of application threads is carried out using the OS scheduler, the scheduling of interrupt handlers is accomplished through interrupt controllers typically implemented in hardware. Interrupts effectively create a hierarchy of schedulers, or two priority spaces, where all interrupts have a priority higher than other OS schedulable threads on the system.

Interrupts prevent other activities from running on the system until they have completed. While an interrupt is being handled, other interrupts are commonly disabled. This produces a blocking effect for other activities that may arrive on the system. Until interrupts are re-enabled, no other threads can preempt the currently executing interrupt handler. Therefore, if a high-priority job arrives while interrupts are disabled, this job will have to wait until the interrupt completes, effectively reducing the time window the job has to complete its activities.

Since device drivers typically use interrupts, some, if not all, of the device driver processor time is out of the control of the OS scheduler. [62] pointed out that device drivers can in effect “steal” processor time from real-time tasks. This time stolen by device drivers can cause real-time tasks to miss their deadlines. In order to illustrate and quantify this stolen time, Regehr [60] describes how an application-level thread can monitor its own execution time without special OS support, in the implementation of a benchmark application program called Hourglass. In Hourglass, a synthetic real-time thread, which we call an hourglass thread, monitors the amount of processor time it consumes over a given time interval. The thread needs to measure the amount of processor time it receives, without the help of any OS internal instrumentation. This is difficult because processor allocation is typically broken up due to time slicing, interrupt processing, awakened threads, etc., and the endpoints of these intervals of execution are not directly visible to a thread. An hourglass thread infers the times of its transitions between executing and not executing by reading the clock in a tight loop. If the time between two successive clock values is small, no preemption occurred. However, if the difference is large, then the thread was likely preempted. Using this technique to determine preemption points, an hourglass thread can find the start and stop times of each execution interval, and calculate the amount of processor time it receives in that interval. Knowing the amount of execution time allows hourglass threads to emulate various real-time workload models. For example, periodic workloads can be emulated by having the hourglass threads alternate between states of contention for the processor and self-suspension.

More specifically, a periodic hourglass thread contends for the processor until it receives its nominal WCET, and then suspends itself until the beginning of its next period. The thread can also observe whether its deadlines are met or missed.
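The gap-detection idea can be sketched as follows. This is not code from Hourglass; the function names and the `gap_threshold` parameter are hypothetical, and the input is assumed to be a list of clock readings taken in a tight loop.

```python
def execution_intervals(timestamps, gap_threshold):
    """Split successive clock readings into intervals of uninterrupted execution.

    A difference larger than gap_threshold between two successive readings is
    taken to mean the thread was preempted in between (an assumption; the
    threshold has to be chosen for the platform at hand).
    """
    intervals = []
    if not timestamps:
        return intervals
    start = prev = timestamps[0]
    for t in timestamps[1:]:
        if t - prev > gap_threshold:
            intervals.append((start, prev))  # close the interval before the gap
            start = t                        # a new interval starts after it
        prev = t
    intervals.append((start, prev))
    return intervals


def cpu_time(intervals):
    """Total processor time received across all execution intervals."""
    return sum(end - start for start, end in intervals)


readings = [0, 1, 2, 3, 50, 51, 52, 100, 101]
intervals = execution_intervals(readings, gap_threshold=5)
received = cpu_time(intervals)
```

In this made-up trace the two large gaps (3 to 50 and 52 to 100) are treated as preemptions, so the thread is credited only with the time inside the three short intervals.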

Given that interrupts interfere with real-time applications, interrupt service time must be included in the analysis of schedulability. [65] considered the problem of including interrupt executions whose arrival times are not known in advance with other tasks scheduled by a static schedule constructed offline. The naive approach pointed out by [65] is to include the interrupt WCET in the execution times of all tasks on the system. However, this approach is typically pessimistic and can reduce the ability to prove that a system is schedulable. Instead of adding the WCET to each task, [65] considers adding the WCET only to a task chain, which is a number of tasks that are always executed sequentially. The WCET of the interrupts is considered as that of a higher-priority task which is considered to arrive at the start time of the chain. The point where the task chain is released is a critical instant and schedulability can then be calculated for all tasks in the chain.

Another way to include device driver CPU time in schedulability analysis is to consider interrupt execution as a task. To do this in a fixed-priority system, one could model the interrupt as a sporadic task, with the execution time being the interrupt handler’s WCET and the period being the smallest time between the arrivals of two subsequent interrupts. The priority of this interrupt task would need to be modeled as higher than any other task on the system, due to the nature of the built-in hardware scheduling mechanisms of interrupts. However, modeling the interrupts as a task with highest priority in the system may not be consistent with all scheduling algorithms. For instance, in an EDF scheduled system the work performed by the handler for the interrupt may have a logical deadline further in the future than other jobs. Therefore, according to the EDF scheduling policy, the interrupt should logically have a lower priority than jobs with earlier deadlines, but in fact will have a higher priority, violating the rules of the scheduling policy.
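For the fixed-priority case, the effect of such an interrupt task can be illustrated with the standard response-time iteration for fixed-priority preemptive scheduling, in which the interrupt appears as the highest-priority sporadic task. The iteration is a well-known technique brought in here for illustration, not a method attributed to any work cited in this section; the task parameters are made up, and times are integers in a common unit.

```python
def response_time(wcet, higher_priority, limit):
    """Worst-case response time under fixed-priority preemptive scheduling.

    wcet            -- WCET of the task under analysis
    higher_priority -- list of (wcet, min_interarrival) for all higher-priority
                       tasks, including interrupts modeled as sporadic tasks
    limit           -- give up once the response time exceeds this value
                       (typically the task's deadline)

    Returns the fixed point R = C + sum(ceil(R / T_j) * C_j), or None if the
    iteration exceeds `limit`.
    """
    r = wcet
    while r <= limit:
        # -(-r // t) is integer ceiling division.
        interference = sum(-(-r // t) * c for c, t in higher_priority)
        nxt = wcet + interference
        if nxt == r:
            return r
        r = nxt
    return None


irq = (1, 5)       # handler WCET 1, at least 5 time units between interrupts
task_a = (2, 10)   # (WCET, period), higher priority than task B

without_irq = response_time(3, [task_a], limit=20)
with_irq = response_time(3, [irq, task_a], limit=20)
```

For a lower-priority task with WCET 3, the response time grows from 5 to 7 once the interrupt is included, which shows how even a short handler adds interference to every task on the system.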

Further, executing interrupts at a real-time priority may not be required. If the interrupts are not needed by any real-time task on the system, it may make sense, if possible, to schedule the interrupt execution at the lowest priority on the system, or with other non-real-time tasks.

One possibility to gain some control over interrupts is through interrupt enabling and disabling. This can be accomplished by disabling interrupts whenever a logically higher-priority task begins execution and re-enabling interrupts when no higher-priority tasks exist. This provides one possibility for interrupt priorities to be interleaved with the real-time task priorities.

However, some interrupt handlers have hard real-time requirements of their own, which demand a high priority. For example, some devices require service from the CPU within a bounded amount of time; without acknowledgment from the CPU, the device may enter an unstable state or events may be lost. Other effects, such as idling the device, may also occur. This can greatly reduce the utilization of certain devices. Consider the hard disk device. Once a request to the hard disk is completed, new requests, if any, should be presented to the hard disk to prevent idling. Idling a hard disk is normally unacceptable due to the relatively long service times. If the hard disk is unable to query the processor about another request via interrupts, the disk may become idle, wasting time that could be used to service requests.

The OS scheduler is not always able to provide controlled access to shared resources (e.g., data structures) used inside interrupt handlers; therefore, other mechanisms are needed to ensure proper access to these shared resources.

One common protection mechanism is to disable the interrupt that may violate the access restrictions to shared resources, thereby preventing the interrupt handler from executing. Disabling a single interrupt rather than all interrupts is known as interrupt masking and is typically accomplished by setting a bit in the corresponding interrupt controller register [30]. This approach, if done correctly, can be used to provide correct mutually exclusive access to shared resources, but introduces the issue of priority inversion due to locking [39, 69].

Interrupt masking introduces CPU overhead, since manipulating the registers of the interrupt controller typically involves off-chip access and can cause effects such as pipeline flushes. Given that very few interrupts arrive during the periods in which interrupts are masked, optimistic interrupt protection was proposed in [76]; it avoids masking the interrupts in hardware in the common case. To maintain critical sections, a flag is set to indicate that a critical section is being entered, and it is cleared at the end of the critical section. If a hardware interrupt occurs during the critical section, an interrupt handler prologue notes that an interrupt has occurred, saves the necessary system state, and updates the hardware interrupt mask. The interrupted code then continues. At the end of the critical section, a check is performed for any deferred interrupts. If one exists, the corresponding interrupt routine is then executed.
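The steps of optimistic interrupt protection can be sketched as a simulation. This is an illustrative model only, not the implementation from [76]: the class `OptimisticProtection`, its method names, and the counter `hw_mask_writes` are hypothetical, the hardware mask register is represented by a Python set, and unmasking the interrupt before running the deferred routine is an assumption of this sketch.

```python
class OptimisticProtection:
    """Hypothetical model: a software flag replaces eager hardware masking."""

    def __init__(self, handlers):
        self.handlers = handlers    # irq number -> handler callable
        self.in_critical = False    # cheap software flag
        self.deferred = []          # irqs noted by the handler prologue
        self.hw_masked = set()      # stands in for the controller mask register
        self.hw_mask_writes = 0     # counts the expensive controller accesses

    def enter_critical(self):
        self.in_critical = True     # no controller access needed

    def interrupt(self, irq):
        if irq in self.hw_masked:
            return                  # held by the hardware, not delivered
        if self.in_critical:
            # Prologue: note the interrupt and mask it in hardware so it
            # cannot fire again before the critical section ends.
            self.deferred.append(irq)
            self.hw_masked.add(irq)
            self.hw_mask_writes += 1
            return
        self.handlers[irq]()        # outside a critical section: run now

    def leave_critical(self):
        self.in_critical = False
        while self.deferred:
            irq = self.deferred.pop(0)
            self.hw_masked.discard(irq)   # assumed: unmask before servicing
            self.hw_mask_writes += 1
            self.handlers[irq]()


log = []
prot = OptimisticProtection({7: lambda: log.append("irq7")})

prot.enter_critical()
prot.leave_critical()   # common case: no interrupt, no controller access

prot.enter_critical()
prot.interrupt(7)       # deferred by the prologue
prot.leave_critical()   # deferred routine runs here
```

The design choice is visible in `hw_mask_writes`: a critical section that is never interrupted costs no controller access at all, while an interrupted one pays for the mask and unmask operations only when they are actually needed.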

In addition to maskable interrupts, systems may also contain non-maskable interrupts (NMIs), which must be included in the schedulability analysis. To complicate matters, some NMIs are handled by the BIOS firmware and do not travel through the OS. The most common NMIs handled by the BIOS are known as System Management Interrupts (SMIs) and can add latency to activities on the system [82]. It is important in a real-time system that one be aware of, and account for if necessary, the time taken by SMI activities.

Discretion must be used when performing computations inside interrupt handlers. For instance, Jones and Saroiu [34] provided a study of a soft modem. This study shows that performing the signal processing required for the soft modem in interrupt context is unnecessary and can prevent other activities on the system from meeting their deadlines. Therefore, one should minimize the amount of processing time consumed by interrupts and consider other possibilities.

Interrupts are not the only way to synchronize a peripheral device and the CPU. Instead, the processor can poll the device to determine whether an event has occurred. Interrupts and polling each have their own merits, which are discussed below.

Interrupts allow the processor to detect events such as state changes of devices without constantly having to use the processor to poll the device. Further, with interrupt notification the time before detecting an event is generally
