
2.1 Hardware Foundations

2.1.4 Timers and Clocks

Precise time management is essential to the correct functioning of an RTOS. Most computing platforms therefore include one or several devices that can serve as a timer and/or clock.

A clock is a device that measures the progress of physical time. Physical time is sometimes referred to as “monotonic time,” as opposed to “wall-clock time,” which is subject to man-made concepts such as time zones, daylight saving time, and leap seconds. Clocks rarely work in units of seconds. Instead, they typically count the occurrence of some physical event that occurs regularly with a known frequency (such as the oscillation of a crystal); that is, a clock “ticks” at a given frequency and increments a counter on every “clock tick.” The OS can read the clock to obtain the number of events (or “ticks”) that have occurred since system startup (or since the last time the counter overflowed). This allows the OS to determine the length of an activity (such as a task

A timer is a device that can be programmed to generate an interrupt at a future point in time. A timer does not necessarily have a notion of current time (i.e., not every timer is also a clock). Instead, knowledge of the remaining time until the interrupt will be generated is sufficient for correct operation. Similar to clocks, frequency-driven timers can be implemented by decrementing a counter every time a physical event occurs until the counter reaches zero, which is when the interrupt fires. Timers generally operate in one of two modes. One-shot timers generate a single interrupt and then remain inactive until reprogrammed. In contrast, periodic timers automatically reset and generate interrupts in a regular fashion until explicitly stopped.
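The two timer modes can be illustrated with a minimal sketch of a frequency-driven countdown timer. The class `CountdownTimer` and the helper `firing_ticks` are hypothetical names introduced only for this illustration; they model the counter behavior described above, not any particular device.

```python
class CountdownTimer:
    """Toy model of a frequency-driven timer: a counter decremented per tick."""

    def __init__(self, reload_value, periodic):
        if reload_value < 1:
            raise ValueError("reload_value must be at least 1")
        self.reload_value = reload_value
        self.periodic = periodic
        self.counter = reload_value
        self.active = True

    def tick(self):
        """Advance by one physical event; return True if the interrupt fires."""
        if not self.active:
            return False
        self.counter -= 1
        if self.counter > 0:
            return False
        if self.periodic:
            # A periodic timer resets itself and keeps running.
            self.counter = self.reload_value
        else:
            # A one-shot timer stays inactive until it is reprogrammed.
            self.active = False
        return True


def firing_ticks(timer, total_ticks):
    """Return the 1-based tick indices at which the timer raised an interrupt."""
    return [i for i in range(1, total_ticks + 1) if timer.tick()]


one_shot = firing_ticks(CountdownTimer(3, periodic=False), 10)
periodic = firing_ticks(CountdownTimer(3, periodic=True), 10)
print(one_shot, periodic)
```

With a reload value of three, the one-shot timer fires once and then goes quiet, whereas the periodic timer fires on every third tick until the simulation ends.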

Periodic timers can be, and traditionally have been, used to implement clocks: whenever the periodic timer interrupt (called the system tick) occurs, the value of a “current time” variable is incremented. In Linux parlance, this notion of current time is referred to as “jiffies.” A physical clock device is thus not necessarily required. However, such emulation of a clock comes with a tradeoff in precision since the time appears to “stand still” between timer interrupts. In general, the precision of a clock depends on its resolution and accuracy.
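As a rough illustration of such a tick-driven clock, the following sketch increments a counter in a simulated interrupt handler. The class name `TickClock` and the 4 ms tick period are assumptions made for this example only; they are not taken from Linux or from the platform discussed here.

```python
class TickClock:
    """Toy clock emulated by counting periodic timer interrupts."""

    def __init__(self, tick_period_ms):
        self.tick_period_ms = tick_period_ms
        self.jiffies = 0  # number of system ticks observed so far

    def on_timer_interrupt(self):
        """Handler for the periodic timer interrupt (the system tick)."""
        self.jiffies += 1

    def now_ms(self):
        """Current time as seen by the OS; only changes on a system tick."""
        return self.jiffies * self.tick_period_ms


clock = TickClock(tick_period_ms=4)  # hypothetical 4 ms tick period
first = clock.now_ms()
second = clock.now_ms()      # no interrupt in between: time "stands still"
clock.on_timer_interrupt()
third = clock.now_ms()       # one tick later, time jumps by a full period
print(first, second, third)
```

Any two readings taken between the same pair of interrupts are identical, which is the loss of precision that the text attributes to this form of emulation.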

The resolution of a clock is the smallest difference that can be observed between consecutive readings, i.e., the granularity of time measured by the clock. For example, suppose a periodic timer with a period of one second is used to implement the system clock. The resulting resolution is one second, i.e., it is impossible to correctly tell apart intervals that differ by less than one second.
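The effect of a one-second resolution on interval measurements can be sketched as follows. The helpers `clock_reading` and `measured_interval` are hypothetical and assume a clock whose readings are simply truncated to a multiple of the resolution.

```python
def clock_reading(physical_ms, resolution_ms):
    """Reading of a clock that only advances in steps of resolution_ms."""
    return (physical_ms // resolution_ms) * resolution_ms


def measured_interval(start_ms, end_ms, resolution_ms):
    """Interval length as computed from two readings of the coarse clock."""
    return (clock_reading(end_ms, resolution_ms)
            - clock_reading(start_ms, resolution_ms))


RESOLUTION_MS = 1000  # one-second resolution, as in the example above

# An 800 ms activity that falls between two ticks appears to take no time.
long_activity = measured_interval(100, 900, RESOLUTION_MS)
# A 200 ms activity that straddles a tick appears to take a full second.
short_activity = measured_interval(900, 1100, RESOLUTION_MS)
print(long_activity, short_activity)
```

In this sketch the longer activity is reported as shorter than the brief one, which shows why intervals that differ by less than the resolution cannot be compared reliably.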

The accuracy of a clock determines its error, i.e., how close the reported time is to the actual physical time. For example, suppose the OS were to query the atomic clock of the National Institute of Standards and Technology (NIST) over the Internet. NIST guarantees the resolution to not exceed 200 picoseconds, i.e., NIST’s clock will report distinct readings to queries that are separated by at least 200 picoseconds. However, the accuracy as perceived by the OS would only be in the range of tenths to hundreds of milliseconds due to unpredictable latencies on the Internet, i.e., by the time that the OS has obtained a clock reading, the result is no longer accurate.² The accuracy of a practical clock, i.e., one that does not artificially delay observations until the next “tick,” is limited by its resolution.

² Multi-round clock synchronization protocols for real-time systems exist that can greatly reduce the impact of uncertain transmission times (Kopetz and Ochsenreiter, 1987; IEEE, 2008a), but such protocols do not apply here since we are concerned with a single observation.

Clock drift describes the effect that a clock’s accuracy may slowly decrease over time. For example, slight changes in tick frequency may cause a clock to drift. Dealing with clock drift in practice can be a non-trivial engineering challenge, and especially so for OSs such as Linux that must work on a wide range of commodity platforms with components and clocks of varying quality and correctness. However, such techniques are beyond the scope of this dissertation; we make the simplifying assumption that clock drift is negligible across short time intervals. Platforms for which this does not hold are of questionable utility in the context of real-time systems.

Resolution and accuracy similarly apply to timers, where the resolution determines the granularity of points in time that can be programmed, and accuracy is a measure of how closely the interrupt occurs to the requested point in time.

Another criterion is the overhead involved in reading a clock or programming a timer. For example, the programmable interval timer (PIT), which has traditionally been used in x86-based systems, is theoretically capable of operating as a one-shot timer, but is only used as a periodic timer in practice due to the high cost of re-programming the timer. The underlying reason is that writing to the PIT’s memory-mapped registers can cause a processor to stall for many cycles while it waits for the write transaction that was issued to the memory bus to complete. Ideally, each processor should have low-overhead access to a timer and clock with high resolution and high accuracy.

Xeon L7455. In our experimental platform, each processor has access to three clocks and two timers (as illustrated in Figure 2.8 and summarized in Table 2.2).

The system has a high-precision event timer (HPET) that is connected to the I/O APIC. The HPET is both a clock and a timer. It contains a circuit that oscillates with a frequency of approximately 14.3 MHz. On each oscillation, a 64-bit counter, which can be read by the OS, is incremented by one. The HPET also implements three independent one-shot timers with three comparator registers. The OS can program a value in each register; an interrupt is generated when the incrementing counter equals the comparator value. As discussed above, the I/O APIC can be configured to route the HPET interrupt to a specific set of processors. The frequency of the oscillator limits the HPET to a resolution of about 69.84 ns. (The HPET standard requires a frequency of at least 10 MHz, which corresponds to a minimum resolution of 100 ns.) The HPET is an off-chip device, i.e., it is not part of the processor die. The accuracy of the HPET is hence impacted by the time it takes the I/O APIC to deliver an interrupt to a local APIC.

[Figure 2.8 diagram not reproduced. Labels: Core 1 through Core 24 (2128 MHz each), each with a TSC and a local APIC containing an APIC timer (decrementing counter, clock divider) driven by the bus clock signal (266 MHz); APIC bus; I/O APIC; interrupt line; HPET (incrementing counter, comparators 1 to 3, 14.31818 MHz oscillator); ACPI PM “timer” (incrementing counter, 3.579545 MHz, 14.31818 MHz oscillator).]

Figure 2.8: Illustration of the timers and clocks available in the multiprocessor platform underlying the case studies discussed in Chapters 4 and 7. Each processor has access to three clocks (the HPET, the ACPI PM timer, and the per-processor TSC) and two timers (the HPET and the per-processor local APIC timer). The local APIC timer and the TSC are preferable to the other alternatives due to their superior resolution and low-overhead, on-chip access.
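The HPET's comparator mechanism can be sketched as a small calculation: to request an interrupt after a given delay, the OS converts the delay into oscillator ticks and adds it to the current counter value. The function `comparator_for_delay` is a hypothetical helper, the frequency is the 14.31818 MHz value shown in Figure 2.8, and rounding the delay up to a whole tick is an assumption of this sketch rather than something the text prescribes.

```python
HPET_FREQ_HZ = 14_318_180   # 14.31818 MHz oscillator (from Figure 2.8)
COUNTER_BITS = 64           # width of the HPET counter
NS_PER_S = 1_000_000_000


def comparator_for_delay(current_counter, delay_ns, freq_hz=HPET_FREQ_HZ):
    """Comparator value at which the counter matches after roughly delay_ns.

    The delay is rounded up to a whole number of ticks (an assumption of
    this sketch) so that the interrupt never fires early.
    """
    ticks = (delay_ns * freq_hz + NS_PER_S - 1) // NS_PER_S  # ceiling division
    # The counter wraps around after 2**64 ticks, so the target wraps too.
    return (current_counter + ticks) % (1 << COUNTER_BITS)


# A 1000 ns delay corresponds to about 14.3 ticks, rounded up to 15.
print(comparator_for_delay(0, 1000))
```

Because one tick lasts about 69.84 ns, any requested delay is effectively rounded to a multiple of that value, which is what the resolution limit means for the timer side of the HPET.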

Each core’s local APIC also contains a one-shot timer that is driven by the main bus clock signal. The local APIC timer consists of a counter (initialized by the OS) that decrements by one every $2^x$ bus clock cycles, where $x \in \{0, \ldots, 7\}$. The OS configures $x$ by means of a clock divider register. When the counter reaches zero, a local interrupt is generated. Since the local APIC is an on-chip device that is tightly integrated into the processor core, it incurs only little programming overhead. In our system, the system bus clock frequency is 266 MHz, which implies a minimum resolution of about 3.76 ns (for $x = 0$). It is also a fairly accurate timer since interrupt delivery is local. The local APIC has no clock functionality.
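The relationship between the clock divider, the resolution, and the initial counter value can be worked through with a short sketch. The function names are hypothetical, the actual encoding of the divider register is not modeled, and rounding the count up is an assumption of this example.

```python
BUS_FREQ_HZ = 266_000_000  # system bus clock of the platform described above
NS_PER_S = 1_000_000_000


def apic_resolution_ns(x, bus_hz=BUS_FREQ_HZ):
    """Time between two decrements when the counter steps every 2**x cycles."""
    if not 0 <= x <= 7:
        raise ValueError("x must be in {0, ..., 7}")
    return (1 << x) * NS_PER_S / bus_hz


def apic_initial_count(delay_ns, x, bus_hz=BUS_FREQ_HZ):
    """Initial counter value for a delay of at least delay_ns (rounded up)."""
    if not 0 <= x <= 7:
        raise ValueError("x must be in {0, ..., 7}")
    ns_scaled = (1 << x) * NS_PER_S
    return (delay_ns * bus_hz + ns_scaled - 1) // ns_scaled  # ceiling division


# Finest and coarsest resolution, and the count needed for a 1 ms delay.
print(round(apic_resolution_ns(0), 2), round(apic_resolution_ns(7), 2))
print(apic_initial_count(1_000_000, 0), apic_initial_count(1_000_000, 7))
```

A larger divider trades resolution for range: with $x = 7$ each decrement covers roughly 481 ns, so the same counter width spans delays that are 128 times longer.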

Besides the HPET, the system has two additional clocks without timer functionality. The Advanced Configuration and Power Interface (ACPI) standard (Hewlett-Packard et al., 2010) requires each system to have an ACPI power management timer (PM timer), which, despite its name, is a clock without timer functionality. Similarly to the HPET, it is an off-chip device and consists of an oscillator with a fixed frequency and a 32-bit counter that is incremented on every oscillation. The ACPI PM timer generates an interrupt whenever the counter overflows. The ACPI PM oscillator’s frequency of about 3.58 MHz implies a minimum resolution of approximately 279.37 ns.
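Since the ACPI PM counter is only 32 bits wide, it overflows fairly often, and differences between two readings have to be computed modulo the counter width. The following sketch uses hypothetical helper names and the 3.579545 MHz frequency from Figure 2.8; it assumes that at most one overflow occurs between the two readings.

```python
PM_FREQ_HZ = 3_579_545   # ACPI PM oscillator frequency (from Figure 2.8)
PM_COUNTER_BITS = 32     # counter width stated in the text


def elapsed_ticks(earlier, later, bits=PM_COUNTER_BITS):
    """Ticks between two readings, assuming at most one counter overflow."""
    return (later - earlier) % (1 << bits)


def overflow_period_s(freq_hz=PM_FREQ_HZ, bits=PM_COUNTER_BITS):
    """Time in seconds until a counter starting at zero wraps around."""
    return (1 << bits) / freq_hz


# A reading taken just before the wrap and one taken just after it.
print(elapsed_ticks(2**32 - 10, 5))
print(round(overflow_period_s()))
```

Under these assumptions the counter wraps roughly every 1200 seconds, so the overflow interrupt gives the OS a chance to account for each wrap.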

Device          Type             Location   Frequency (MHz)   Resolution (ns)
HPET            one-shot timer   off chip             14.32             69.84
Local APIC      one-shot timer   on chip             266.00              3.76
HPET            clock            off chip             14.32             69.84
ACPI PM timer   clock            off chip              3.58            279.37
TSC             clock            on chip            2128.00              0.47

Table 2.2: Summary of the timers and clocks available in the multiprocessor platform underlying the case studies discussed in Chapters 4 and 7. Linux uses on-chip devices when available.
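The resolution column of Table 2.2 follows directly from the frequency column, since one tick lasts the reciprocal of the frequency. The sketch below recomputes it using the unrounded frequencies from Figure 2.8 and the text. The dictionary and function names are hypothetical, and the local APIC entry assumes a clock divider of one (i.e., $x = 0$).

```python
# Frequencies in Hz; HPET and ACPI PM values are the unrounded ones
# shown in Figure 2.8, the others are taken from the text.
FREQUENCIES_HZ = {
    "HPET": 14_318_180,
    "Local APIC": 266_000_000,   # bus clock, assuming divider 2**0
    "ACPI PM timer": 3_579_545,
    "TSC": 2_128_000_000,
}


def resolution_ns(freq_hz):
    """Duration of one tick in nanoseconds for the given frequency."""
    return 1e9 / freq_hz


def resolution_table():
    """Resolution of each device in ns, rounded to two decimals."""
    return {name: round(resolution_ns(f), 2)
            for name, f in FREQUENCIES_HZ.items()}


for name, res in resolution_table().items():
    print(f"{name:<14} {res:>8.2f} ns")
```

Note that using the rounded table value of 14.32 MHz instead of 14.31818 MHz would yield about 69.83 ns, so the table's 69.84 ns appears to be derived from the unrounded frequency.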

The highest-resolution clock in the system is the timestamp counter (TSC), which is a per-processor 64-bit register. The TSC is incremented with each processor cycle, which, at 2128 MHz (eight times the bus clock), implies a minimum resolution of only about 0.47 ns. Since the TSC is a register that is part of the processor, TSC accesses incur only negligible overhead. However, the TSC is subject to some accuracy limitations when conducting microbenchmarks (i.e., when measuring the execution time of short code segments) since instruction reordering could move the dependency-less TSC read ahead of other operations. This can be avoided by issuing serializing instructions prior to the TSC read (Paoloni, 2010). The TSC can also be affected by processor frequency changes, and TSC values may not be comparable across processors (e.g., the processors may have been initialized at different times).

The available clocks and timers are summarized in Table 2.2. Accesses to on-chip device registers generally incur lower overheads than accesses to memory-mapped registers of off-chip devices because off-chip accesses are subject to memory bus arbitration. Similarly, interrupts from off-chip devices must be dispatched by the I/O APIC, whereas on-chip devices are typically directly connected to a local APIC.

Linux requires just one clock and one timer to function properly. It selects one device of each class during bootup and leaves the others unused. On our system, Linux chooses the local APIC timers and the TSC due to their high resolution and low overheads.

In document Scheduling and locking in multiprocessor real-time operating systems (Page 62-66)