1. What do we mean when we say that interrupts must be processed “transparently”? What does this involve and why is it necessary?
Since interrupts are asynchronous to CPU operations (that is, they can occur at any time, without warning), the complete run-time context of the program that was executing must be preserved across the servicing of the interrupt. That is to say, the interrupted program must not experience any changes to the state of the processor (or to its program or data memory) due to interrupt handling; it should not “see” any effects other than a time lag, and should compute the same results whether or not any interrupts occur. For this to happen it is necessary to save and restore not only the program counter (so the program can be resumed at the next instruction that would have been executed had the interrupt not occurred), but also the condition codes (a.k.a. status flags) and the contents of all the CPU registers. This is normally accomplished by pushing these values on the system stack when an interrupt occurs and popping them back off after it has been serviced. If this were not done, the interrupt would not be “transparent” and the interrupted program could operate incorrectly.
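The push-and-pop sequence described above can be sketched in a few lines of Python. This is only an illustrative model, not any real processor: the ToyCPU class, its four registers, the addresses, and the service_routine function are all assumptions made up for the example.

```python
from dataclasses import dataclass, field


@dataclass
class ToyCPU:
    """Hypothetical CPU model: program counter, flags, four registers, a stack."""
    pc: int = 0
    flags: int = 0
    regs: list = field(default_factory=lambda: [0, 0, 0, 0])
    stack: list = field(default_factory=list)

    def enter_interrupt(self, handler_address):
        # Push the full context, then jump to the service routine.
        self.stack.append(self.pc)
        self.stack.append(self.flags)
        for value in self.regs:
            self.stack.append(value)
        self.pc = handler_address

    def return_from_interrupt(self):
        # Pop in the reverse order of the pushes.
        for i in reversed(range(len(self.regs))):
            self.regs[i] = self.stack.pop()
        self.flags = self.stack.pop()
        self.pc = self.stack.pop()


def service_routine(cpu):
    # The handler is free to overwrite registers and flags while it runs.
    cpu.regs = [0xDEAD, 0xBEEF, 0, 0]
    cpu.flags = 0b1111


def snapshot(cpu):
    # Capture everything the interrupted program could observe.
    return (cpu.pc, cpu.flags, tuple(cpu.regs), tuple(cpu.stack))


cpu = ToyCPU(pc=0x100, flags=0b0010, regs=[1, 2, 3, 4])
before = snapshot(cpu)
cpu.enter_interrupt(handler_address=0x800)
service_routine(cpu)
cpu.return_from_interrupt()
after = snapshot(cpu)
print(before == after)
```

Because every value pushed on entry is popped on return, the two snapshots compare equal even though the handler overwrote the registers and flags in between; the only thing the interrupted program would notice is the elapsed time.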
2. Some processors, before servicing an interrupt, automatically save all register contents. Others automatically save only a limited amount of information. In the second case, how can we be sure that all critical data are saved and restored? What are the advantages and disadvantages of each of these approaches?
The advantage of automatically saving everything is that we are sure that it has been done and thus we know that the interrupt will be serviced in transparent fashion (as described in the answer to question 1 above). The disadvantage of this is that a given interrupt service routine may actually use only a small subset of the CPU registers, and the time spent saving and restoring all the other, unused registers is wasted.
The additional delay involved in saving a large number of registers can significantly increase the latency in responding to an interrupt; for some timing-sensitive I/O devices, it is important to keep this latency as small as possible. Some processors automatically save only the program counter and condition codes; other registers are left to be preserved with push and pop instructions inside the service routine. In that case, we save only the necessary registers and keep latency to a minimum; the potential disadvantage is that we depend on the vigilance of the programmer who writes the service routine to track its register usage and save every register it modifies. If the programmer fails to do this, the interrupted program could have its run-time context corrupted.
3. Explain the function of a watchdog timer. Why do embedded control processors usually need this type of mechanism?
A watchdog timer is a mechanism that can be used to reset a system if it “locks up”, without requiring the intervention of a human user (which is not always possible, especially in embedded systems). To implement this mechanism, a counter that runs off the system clock (or some derivative of it) is initialized and allowed to count up toward a maximum count (or down toward zero). Software running on the system is assigned the task of periodically resetting the counter to its initial value. If the counter ever “rolls over”, presumably due to a system software failure, that rollover event is detected by hardware and used to generate a reset signal to reinitialize and recover the system to a known state. Embedded control processors, unlike general-purpose CPUs, are not normally mounted in a case with a convenient reset button within reach of a user’s finger. Due to the embedded location, it may be difficult or impossible to perform a manual reset; a watchdog timer may be the only mechanism that allows the system to be restarted in the event of a crash.
4. How are vectored and autovectored interrupts similar and how are they different? Can they be used in the same system? Why or why not? What are their advantages and disadvantages vs. nonvectored interrupts?
Vectored and autovectored interrupts are similar in that both use an interrupt number to index into a table that contains the addresses of the various interrupt service routines. (The value read from the table is loaded into the program counter and execution of the service routine begins from that point.) The only difference between the two techniques is that hardware devices provide their own interrupt numbers (via the system bus during the interrupt acknowledge cycle) in a system with vectored interrupts, while an autovectoring scheme uses interrupt numbers internally generated by the CPU based on the priority level of an incoming interrupt.
Yes, both of these techniques can be used in the same system; the Motorola 680x0 family of CPUs is a prime example. The “smarter” devices can provide their own vectors, while a special hardware signal or timeout mechanism can alert the CPU to the need to generate autovectors for other devices. While nonvectored interrupts are slightly easier to implement in hardware as compared to vectored or autovectored interrupts, the additional complexity required of the software (to identify the source of each interrupt and execute the correct code to handle it), the corresponding increase in the latency to service an interrupt, and the limitations it places on the design of the memory system are significant drawbacks.
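The shared table lookup, and the one point where the two schemes differ, can be sketched as follows. The handler names, the vector numbers, and the AUTOVECTOR_BASE constant are illustrative assumptions, not the layout of any particular CPU.

```python
def handle_disk():
    return "disk serviced"


def handle_serial():
    return "serial serviced"


def handle_timer():
    return "timer serviced"


# Hypothetical layout: priority level n autovectors to table entry 24 + n.
AUTOVECTOR_BASE = 24

# Vector table: interrupt number -> service routine.
vector_table = {
    64: handle_disk,                    # number supplied by the device itself
    65: handle_serial,                  # number supplied by the device itself
    AUTOVECTOR_BASE + 6: handle_timer,  # number generated by the CPU for level 6
}


def interrupt_number(priority_level, device_vector=None):
    """Vectored: use the device's number. Autovectored: derive it from priority."""
    if device_vector is not None:
        return device_vector
    return AUTOVECTOR_BASE + priority_level


def dispatch(priority_level, device_vector=None):
    # Both schemes end with the same table lookup and jump to the handler.
    handler = vector_table[interrupt_number(priority_level, device_vector)]
    return handler()


print(dispatch(priority_level=3, device_vector=64))
print(dispatch(priority_level=6))
```

In this sketch the disk device supplies its own number, while the timer supplies none and is reached through the number the CPU derives from its priority level. Everything after the number is obtained is identical, which is why both schemes can share one table in the same system.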
5. Given the need for user programs to access operating system services, why are traps a better solution than conventional subprogram call instructions?
The main “problem” with a typical subprogram call instruction is that it generally requires a target address that is explicitly specified using one of the machine’s addressing modes; that is to say, we must know where a given routine resides in memory in order to call it. While this is normally not a problem for user-written code or procedures in a link library, we often do not know the location of routines that are part of the operating system. Their location may vary from one specific system or OS version to another. Also, code that performs system functions such as I/O usually needs to be executed at a system privilege level, while called procedures normally execute with the same privilege level as the code that called them. Traps, since they make use of the same vectoring mechanism as interrupts or other exceptions, allow OS routines to be accessed implicitly, without the programmer having to know the exact location of the code he or she wishes to execute. By executing a specific trapping instruction, the desired routine can be executed at a system privilege level with control returning (at the proper privilege level) to the user program that called it.
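A rough model of the two properties discussed above (access by trap number rather than by address, and a temporary change of privilege level) might look like the following. The Machine class, the TRAP_WRITE number, and the os_write routine are hypothetical names introduced only for illustration.

```python
USER, SYSTEM = "user", "system"


class Machine:
    """Hypothetical machine with a privilege mode and a trap vector table."""

    def __init__(self):
        self.mode = USER
        self.trap_table = {}   # trap number -> OS routine, filled in by the OS
        self.output = []

    def install(self, trap_number, routine):
        self.trap_table[trap_number] = routine

    def trap(self, trap_number, *args):
        # Raise the privilege level, vector to the routine, then restore the mode.
        saved_mode = self.mode
        self.mode = SYSTEM
        try:
            return self.trap_table[trap_number](self, *args)
        finally:
            self.mode = saved_mode


def os_write(machine, text):
    # An OS service that refuses to run without system privilege.
    if machine.mode != SYSTEM:
        raise PermissionError("I/O requires system privilege")
    machine.output.append(text)
    return len(text)


TRAP_WRITE = 1  # the user program knows only this number, not an address

machine = Machine()
machine.install(TRAP_WRITE, os_write)
written = machine.trap(TRAP_WRITE, "hello")
print(written, machine.mode)
```

The user program refers only to the trap number; where os_write actually resides is recorded in the table by the operating system and can change from one OS version to another without affecting the caller. An ordinary call to os_write from user mode fails in this model because the privilege level is never raised.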
6. Compare and contrast program-controlled I/O, interrupt-driven I/O, and DMA-based I/O. What are the advantages and disadvantages of each? Describe scenarios that would favor each particular approach over the others.
In a system with program-controlled I/O, the CPU executes code to poll the various hardware devices to see when they require service, then executes more code to carry out the data transfers. This is the simplest way to handle I/O, requiring no extra hardware support; but the need for the CPU to spend time polling devices complicates the software and detracts from system performance. This approach would only be favored in a very low-cost, embedded system where the CPU is not doing much other than I/O and the goal is to keep the hardware as simple and inexpensive as possible.
In a system with interrupt-driven I/O, the devices use hardware interrupt request lines to notify the CPU when they need servicing. The CPU then executes instructions to transfer the data (as it would in a system using program-controlled I/O). This approach doesn’t eliminate the need for CPU involvement in moving data and also involves a bit more hardware complexity, but support for interrupt processing is already built into virtually every microprocessor so the additional cost is minimal. The upside of this technique is that the CPU never has to waste time polling devices. System software is simplified by having separate interrupt service routines for each I/O device, and devices are typically serviced with less latency than if interrupts were not used. This approach is good for many systems, especially general-purpose machines that have a wide variety of I/O devices with different speeds, data transfer volumes, and other characteristics.
DMA-based I/O is carried out by a hardware DMA controller that is separate from the system CPU. When the CPU determines (often by receiving an interrupt) that a transfer of data to or from an I/O device needs to take place, it initializes the DMA controller with the particulars of the transfer; the DMA controller then carries out the operation, transferring data directly between the chosen device and memory, without further intervention by the CPU. This approach has the highest hardware cost of the three, since it requires an extra system component; it also requires the overhead of the CPU having to set up the DMA controller for each transfer. However, DMA is very efficient, especially when large blocks of data are frequently transferred. Its use would be favored in a general-purpose or (especially) a high-performance system with high-speed devices that can benefit significantly from large block I/O operations.
7. Systems with “separate I/O” have a second address space for I/O devices as opposed to memory and also a separate category of instructions for doing I/O operations as opposed to memory data transfers. What are the advantages and disadvantages of this method of handling I/O? Name and describe an alternative strategy and discuss how it exhibits a different set of pros and cons.
Separate I/O has the advantage of a unique address space for I/O devices; because of this, there are no “holes” in the main memory address space where I/O device interface registers have been decoded. The full physical memory address space is available for use by memory devices. Also, I/O operations are easily distinguished from memory operations by their use of different machine language instructions. On the other hand, hardware complexity (and possibly cost) is increased slightly and the additional instructions required for I/O make the instruction set architecture a bit more complex.
The alternative, memory-mapped I/O, shares a single physical address space between memory and I/O devices. This keeps the hardware and instruction set simpler while sacrificing the distinct functionality of I/O instructions as well as the complete, contiguous address space that would otherwise be available to memory. Given the widespread use of virtual memory in all but the simplest of systems, the pros and cons of either approach are not as noteworthy as they once were and either approach can be made to work well.
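The difference between the two strategies can be made concrete with a small address-decoding model. Both classes, the 256-byte address space, and the register placement at 0xF0 are assumptions chosen for illustration; the inp and out method names merely stand in for a machine's dedicated I/O instructions.

```python
class MemoryMappedBus:
    """One address space: a range of addresses decodes to device registers."""

    def __init__(self, size, io_base, io_size):
        self.ram = bytearray(size)
        self.io_base = io_base
        self.io_regs = bytearray(io_size)

    def _is_io(self, address):
        return self.io_base <= address < self.io_base + len(self.io_regs)

    def store(self, address, value):
        # The same store operation reaches RAM or a device register.
        if self._is_io(address):
            self.io_regs[address - self.io_base] = value
        else:
            self.ram[address] = value

    def usable_ram(self):
        # The decoded I/O range leaves a "hole" in the memory address space.
        return len(self.ram) - len(self.io_regs)


class SeparateIOBus:
    """Two address spaces: memory operations and I/O operations never overlap."""

    def __init__(self, size, io_size):
        self.ram = bytearray(size)
        self.io_regs = bytearray(io_size)

    def store(self, address, value):
        self.ram[address] = value

    def out(self, port, value):
        # Stands in for a dedicated output instruction.
        self.io_regs[port] = value

    def inp(self, port):
        # Stands in for a dedicated input instruction.
        return self.io_regs[port]

    def usable_ram(self):
        return len(self.ram)


mm = MemoryMappedBus(size=256, io_base=0xF0, io_size=16)
mm.store(0xF0, 7)          # lands in device register 0, not in RAM

sep = SeparateIOBus(size=256, io_size=16)
sep.store(0xF0, 7)         # an ordinary memory write
sep.out(0, 9)              # a distinct I/O operation

print(mm.usable_ram(), sep.usable_ram())
```

In the memory-mapped model a single store operation serves both purposes, at the price of 16 addresses that can no longer hold memory; in the separate model all 256 addresses remain available to memory, but two extra operations are needed.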
8. Given that many systems have a single bus which can be controlled by only one bus master at a time (and thus the CPU cannot use the bus for other activities during I/O transfers), explain how a system that uses DMA for I/O can outperform one in which all I/O is done by the CPU.
On the face of it, it would seem that DMA I/O would provide little or no advantage in such a system, since only one data transfer can occur at a time regardless of whether the CPU or DMAC is initiating it. However, DMA still has a considerable advantage for two important reasons. One of these is that, due to the widespread use of on-chip instruction and data caches, it is likely that the CPU can continue to execute code for some time (in parallel with I/O activities) even without any use of the system bus. The second reason is that even if the CPU “stalls out” for lack of ability to access code or data in main memory, the I/O operation itself is done more efficiently than it would be if the CPU performed it. Instead of reading a value from a buffer in memory and then writing it to an I/O device interface (or vice versa), the CPU (which would be the middleman in the transaction) gets out of the way and the two transactions are replaced with one direct data transfer between memory and the device in question.
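The second reason can be illustrated by simply counting bus transactions for the same block transfer. This is a simplified sketch: the function names are hypothetical, and the model assumes the CPU's copy loop runs entirely from cache (no instruction fetches on the bus) and ignores the one-time cost of setting up the DMA controller.

```python
def cpu_copy(memory, device, start, count):
    """CPU as middleman: each word crosses the bus twice."""
    bus_transactions = 0
    for address in range(start, start + count):
        value = memory[address]   # transaction 1: memory -> CPU register
        bus_transactions += 1
        device.append(value)      # transaction 2: CPU register -> device
        bus_transactions += 1
    return bus_transactions


def dma_copy(memory, device, start, count):
    """DMA controller: each word moves from memory to the device in one transaction."""
    bus_transactions = 0
    for address in range(start, start + count):
        device.append(memory[address])
        bus_transactions += 1
    return bus_transactions


memory = list(range(100))
device_a, device_b = [], []
cpu_cost = cpu_copy(memory, device_a, start=10, count=64)
dma_cost = dma_copy(memory, device_b, start=10, count=64)
print(cpu_cost, dma_cost)
```

Under these assumptions the same 64 words reach the device either way, but the DMA transfer occupies the bus for half as many transactions, so the bus is released sooner even when nothing else can use it in the meantime.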
9. Compare and contrast the channel processors used in IBM mainframes with the PPUs used in CDC systems.
The channel processors used in IBM mainframes were simple von Neumann machines with their own program counters, register sets, (simpler) instruction set architecture, etc. They communicated with the main system processor(s) by reading and writing a shared area of main memory. CDC’s Peripheral Processing Units were complete computers dedicated to I/O operations. The PPUs had their own separate memory and were architecturally similar to the main system processor (although they lacked certain capabilities, such as hardware support for floating-point arithmetic, that were not useful for I/O). In addition to controlling I/O devices they performed other operations such as buffering, checking, formatting, and translating data.
10. Fill in the blanks below with the most appropriate term or concept discussed in this chapter:
Exception - a synchronous or asynchronous event that occurs, requiring the attention of the CPU to take some action
Service routine (handler) - a special program that is run in order to service a device, take care of some error condition, or respond to an unusual event
Stack - when an interrupt is accepted by a typical CPU, critical processor status information is usually saved here
Non-maskable interrupt - the highest priority interrupt in a system; one that will never be ignored by the CPU
Reset - a signal that causes the CPU to reinitialize itself and/or its peripherals so that the system starts from a known state
Vectoring - the process of identifying the source of an interrupt and locating the service routine associated with it
which is read by the processor in order to determine which handler should be executed
Trap - another name for a software interrupt, this is a synchronous event occurring inside the CPU because of program activity
Abort - on some systems, the “Blue Screen Of Death” can result from this type of software-related exception
Device interface registers - these are mapped in a system’s I/O address space; they allow data and/or control information to be transferred between the system bus and an I/O device
Memory-mapped I/O - a technique that features a single, common address space for both I/O devices and main memory
Bus master - any device that is capable of initiating transfers of data over the system bus by providing the necessary address, control, and/or timing signals
Direct Memory Access Controller (DMAC) - a hardware device that is capable of carrying out I/O activities after being initialized with certain parameters by the CPU
Burst mode DMA - a method of handling I/O where the DMAC takes over exclusive control of the system bus and performs an entire block transfer in one operation
Input/Output Processor (IOP) (also known as Peripheral Processor or Front-End Processor) - an independent, programmable processor that is used in some systems to