HUMBOLDT-UNIVERSITÄT ZU BERLIN INSTITUT FÜR INFORMATIK
COMPUTER ARCHITECTURE
Lecture 17
Input/Output
Summer Semester 2002
Course lead: Prof. Dr. Miroslaw Malek
www.informatik.hu-berlin.de/rok/ca
INPUT/OUTPUT
• Input/Output problem
• Secondary memory technology (magnetic, optical)
• I/O device selection (addressing)
• I/O protocols
• Data transfer mechanism
• Synchronization mechanism
INPUT/OUTPUT
• Historically a neglected subject within computer architecture
– Many benchmarks ignore I/O
• Getting attention in the last few years
– Increasing gap between CPU, Memory and I/O speeds
– A bottleneck in high-end machines
– Relative cost of "peripherals" is increasing (Very Large Scale Insanity)
[Figure: Cost distribution]
INPUT/OUTPUT PROBLEM
Factors that make interfacing difficult
• The encoding of the transmitted word must be that which is employed by the I/O device.
• Operating Rates
– The CPU and Main Memory operate at many times the speed of I/O devices
• Timing and Control
– Exchange of status signals between CPU and device.
– Rate of transmission from device to CPU or vice versa.
• Communication Link (Word Length)
– There are at least 25 different word lengths used in computers. The word lengths vary from 4 to 128 bits. The separation (or combination) of words (because of word length) into characters, bytes or other units presents a "word assembly" problem.
WORD-LENGTH DIFFERENCES
The output "word" must be the correct word length for the output device.
• Transmission by:
– Serial-by-Bit
– Serial-by-Character (Byte) (quasiparallel)
– Serial-by-Word (parallel)
[Figure: transmission of data units 1 to 4 between Device 1 (computer or input device) and Device 2 (output device or computer) at times T1 to T4]
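The "word assembly" problem can be illustrated with a minimal Python sketch: a byte-serial device delivers 8-bit characters and the interface packs them into wider machine words. The function names, the 32-bit word length, the big-endian packing order and the zero padding of the last word are assumptions made for illustration, not details from the lecture.

```python
def assemble_words(data, word_bits=32):
    """Pack a bytes object into words of word_bits bits (big-endian order)."""
    if word_bits % 8 != 0:
        raise ValueError("this sketch only handles word lengths that are multiples of 8")
    bytes_per_word = word_bits // 8
    words = []
    for i in range(0, len(data), bytes_per_word):
        chunk = data[i:i + bytes_per_word]
        # Pad a short final chunk with zero bytes so every word is complete.
        chunk = chunk + bytes(bytes_per_word - len(chunk))
        words.append(int.from_bytes(chunk, "big"))
    return words


def disassemble_word(word, word_bits=32):
    """Split one word back into the bytes a byte-serial device expects."""
    return word.to_bytes(word_bits // 8, "big")


words = assemble_words(b"ABCDEF")
print([hex(w) for w in words])
```

Six characters fill one and a half 32-bit words here, so the interface has to decide how to pad the last word. That decision is part of the word assembly problem.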
DESIGN
Design depends on
• Device Operating Speed
• Device Proximity to Processor
– Local
– Remote
• Link Cost
– Remote (Communications Cost/Speed)
• Control is embedded in message train
– Parallel: Prevention of Skewing
– Serial: Clocking and Synchronization
• Errors
– Factors
– Prevention
IN SUMMARY
AN INPUT/OUTPUT PROBLEM INVOLVES
• ENCODING/CODES
• OPERATING RATES
• TIMING AND CONTROL
• COMMUNICATION LINK STRUCTURES (WORD LENGTH)
TIMING AND CONTROL
• The central processor may simultaneously communicate with one or more of its external devices.
– seldom with all
– usually in no set pattern (randomness)
TYPICAL SEQUENCE OF CPU
• Select desired external device
• Determine the device's status
• Signal device to connect itself to the processor
• Receive acknowledgment from device that it is connected
• Request device to initiate input (or output) and begin data exchange
• Recognize "ready signal". Each data unit is read (or output) by the device and the device signals completion of the step by giving a "ready to exchange next data unit" signal
• Repeat previous operation until an end of message condition is detected
FACTORS CONTRIBUTING TO ERRORS (1)
• ENVIRONMENT
– DIRT, MOISTURE (OPTICAL, MAGNETIC, MECHANICAL)
– TEMPERATURE/HUMIDITY
– ELECTROMAGNETIC RADIATION
– ELECTRIC POWER SURGES
• COMPONENT AGING
– CIRCUIT PARAMETERS DRIFT
– MECHANICAL WEAR
FACTORS CONTRIBUTING TO ERRORS (2)
• SYSTEM "BUGS"
– UNANTICIPATED SEQUENCE OF INSTRUCTIONS AND CODE COMBINATIONS
– INCORRECT MEMORY ALLOCATION, I/O BUFFER SIZE EXCEEDED
– INCOMPLETELY PLANNED AND TESTED SYSTEM MODULE COMBINATIONS
• USER MISTAKES
– MISSEQUENCING OF PROGRAMS
– INCORRECT PROCEDURES
INPUT/OUTPUT ORGANIZATION
ADDRESSING
Input/Output Function
• Select I/O Device
• Exchange Data "units" with Device (Data Transfer)
• Synchronize (Coordinate) timing of I/O operations
Addressing of I/O Devices
• Each Device assigned
– Identification code, or
– Address
• Within address capability of the machine
• Usually block assignment
• Logical circuits (memory not enabled)
• CPU sends address on address line, Device responds
• Data Transfer
DATA TRANSFER
1 Program controlled I/O addressable buffer register
• Use normal op-codes
• Interface
– Address Decoding
– Control Circuits
– Data Register, Status Register
2 Block transfer of data to Main Memory space. Direct Memory Access (DMA)
• Concept is to provide circuitry to transfer data, a word at a time, consistent with the device speed and automatically sequence the transfer using registers in the DMA controller.
3 I/O Using DMA requires a program
– Load registers
– Load function (Read/Write)
– Issue GO command
DATA TRANSFER (continued)
4 Connection of DMA (connecting the DMA control to memory means memory has to be shared between CPU and I/O devices)
– A memory bus controller must be provided to coordinate memory usage
– Cycle stealing is the process of interleaving I/O memory cycles between the micro-operations in the execution of an instruction
SYNCHRONIZATION
POLLING
• The CPU must have some means to coordinate its external devices
• The CPU has to know the status of devices and when events occur
• Basically two methods are used: polling (status checking) and interrupts
Polling (status checking)
• Data Lines
• Control Lines
• Check Status
– (1) Holding Input
– (2) Ready for Output
INTERRUPTS
• General term used in a loose sense for any infrequent or exceptional event that causes a CPU to make a temporary transfer of control from its current program to another program that services the event
• I/O interrupts are used to:
– request CPU to initiate a new I/O operation
– signal completion of I/O operation
– signal occurrence of hardware/software errors
INTERRUPT
A typical interrupt sequence is as follows:
• The CPU executes a program sequence
• A special signal, Interrupt Request, is received by the CPU
• The CPU acknowledges the interrupt, stops execution of its current program (usually after an instruction cycle) and stores registers in memory (at a minimum the Program Counter (PC) and the Program Status Word (PSW))
• The CPU's program counter (PC) is set to a new address where an Interrupt Service Routine resides
• The CPU performs execution as normal
– Note that another interrupt sequence could be initiated while the CPU is performing the service routine
INTERRUPT (continued)
• The CPU may return to the original program sequence when a special return from interrupt (RTI) instruction is executed. In that case the registers saved when the interrupt was acknowledged are restored, so the program counter (PC) again holds the address it held at that point
• The concept of an interrupt is general. Interrupts may be initiated from
– Internal operation codes
– Arithmetic or logical errors
– External events
• Interrupts greatly facilitate operating systems, where control needs to be transferred back to the operating system when various events occur
• Several interrupts may happen within a short period of time, so methods of handling them through priority systems are needed
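The interrupt sequence (request, acknowledge after the current instruction, save PC and PSW, jump to the service routine, return with RTI) can be traced with a toy model. The sketch below is only an illustration: the class name `TinyCPU`, the service routine address 100 and the one-address-per-instruction model are assumptions, not part of the lecture.

```python
class TinyCPU:
    """Toy CPU that models only the PC, the PSW and a save area in memory."""

    def __init__(self, isr_address):
        self.pc = 0
        self.psw = 0
        self.saved = []                  # (PC, PSW) pairs stored "in memory"
        self.isr_address = isr_address   # hypothetical service routine address
        self.interrupt_request = False
        self.trace = []                  # addresses of executed instructions

    def step(self):
        """Execute one instruction, then check for a pending interrupt."""
        self.trace.append(self.pc)
        self.pc += 1
        if self.interrupt_request:
            # Acknowledge only after the instruction cycle has completed.
            self.interrupt_request = False
            self.saved.append((self.pc, self.psw))   # save PC and PSW
            self.pc = self.isr_address               # enter the service routine

    def rti(self):
        """Return from interrupt: restore the saved PC and PSW."""
        self.pc, self.psw = self.saved.pop()


cpu = TinyCPU(isr_address=100)
cpu.step()                     # instruction at address 0
cpu.interrupt_request = True   # a device raises Interrupt Request
cpu.step()                     # instruction at 1 completes, interrupt is taken
cpu.step()                     # service routine instruction at address 100
cpu.rti()                      # service routine ends
cpu.step()                     # original program resumes at address 2
print(cpu.trace)
```

Because the save area is a stack, a second interrupt arriving during the service routine would simply push another (PC, PSW) pair. This is the situation the note about a further interrupt sequence describes.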
SIMPLE INTERRUPT STRUCTURE
[Figure: simple interrupt structure, showing the CPU (control logic for the automatic subroutine jump), the channel (enable interrupt flip-flop, channel interrupt flip-flop, device number encoder) and the device control units (device interrupt request flip-flop, device selector decoder, device command decoder), combined through AND and OR gates. ENI = enable interrupt]
INTERRUPT HANDLING
• There are several types and sources of interrupts
• They have different priorities
• Need to screen interrupts
– use INTEnable; INTDisable commands
• Need to service acknowledged interrupt
– first identify interrupting device
• Vectored interrupt: I/O device provides address of interrupt service routine (and other information)
– Automatically disable other interrupts before starting interrupt service routine
• What if interrupting device hangs?
• Nested interrupts: a higher priority interrupt can be acknowledged and serviced from within the routine of a lower priority interrupt
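Vectoring, priorities and nesting can be combined in one small model. The following is a hedged sketch rather than a description of any particular machine: the class `InterruptController`, the rule that a larger number means a more urgent request, and the routine addresses in the usage example are all assumptions chosen for illustration.

```python
class InterruptController:
    """Toy model of vectored, prioritized, nested interrupts."""

    def __init__(self, vector_table, priority):
        self.vector_table = vector_table   # port -> service routine address
        self.priority = priority           # port -> level (larger = more urgent)
        self.active = []                   # stack of ports being serviced
        self.pending = []                  # requests that had to wait
        self.log = []

    def request(self, port):
        current = self.priority[self.active[-1]] if self.active else -1
        if self.priority[port] > current:
            # Higher priority: nest inside the routine that is running now.
            self.active.append(port)
            self.log.append(("enter", self.vector_table[port]))
        else:
            self.pending.append(port)

    def return_from_interrupt(self):
        port = self.active.pop()
        self.log.append(("exit", self.vector_table[port]))
        if self.pending:
            # Offer the most urgent waiting request; it may have to wait again.
            nxt = max(self.pending, key=lambda p: self.priority[p])
            self.pending.remove(nxt)
            self.request(nxt)


ic = InterruptController({0: 4, 1: 20, 2: 170, 3: 240}, {0: 0, 1: 1, 2: 2, 3: 3})
ic.request(1)                # routine at 20 starts
ic.request(3)                # more urgent: nests inside the routine at 20
ic.request(0)                # less urgent: has to wait
ic.return_from_interrupt()   # routine at 240 ends, routine at 20 resumes
ic.return_from_interrupt()   # routine at 20 ends, port 0 is finally served
ic.return_from_interrupt()
print(ic.log)
```

The sketch does not model a hung device. A real design would also need a timeout or watchdog for the case raised in the list above.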
A SYSTEM WITH VECTORED I/O INTERRUPTS
[Figure: CPU with interrupt request lines INT REQ 0 to INT REQ 3, connected over the data bus to IO ports 0 to 3, which serve IO device A, output device B and input device C]
LOCATION OF THE INTERRUPT SERVICING PROGRAM IN MAIN MEMORY
Address | Contents
0 | Go to 4
1 | Go to 20
2 | Go to 170
3 | Go to 240
4 | Device A input routine
20 | Device A output routine
170 | Device B service routine
240 | Device C service routine
DEVICE CONTROL UNIT
TYPICAL I/O INTERFACE
FUNCTIONS (I/O)
• SELECT I/O DEVICE
• EXCHANGE "DATA UNITS" WITH DEVICE (DATA TRANSFER)
• SYNCHRONIZE (COORDINATE) TIMING OF I/O OPERATIONS.
[Figure: I/O interface with address decoder, data and status register, and control circuit, placed between the input devices and the I/O bus. Bus lines: address lines (device selector bus), data lines (data and status bus), control lines (I/O control bus)]
DEVICE CONTROL UNIT
There are four types of I/O buses to all I/O control devices
• I/O Data Bus
– Data I/O Data Register (IODR)
• I/O Device Selector Bus
– Device Code Selector Circuit
• I/O Command Bus
– Command Decoder
• Available Status Bus
– Line through which timing signals are sent to CPU
– Usually, I/O Data Bus is combined with the Available Status Bus
• Device Selector Decoder (AND gate with inverters)
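A device selector decoder built from an AND gate with inverters recognizes exactly one device code on the selector bus. The sketch below models that logic in Python. The six-line bus width, the example device code and the function name `make_selector_decoder` are assumptions made for illustration.

```python
def make_selector_decoder(device_code, width):
    """Return a function that is True only when the bus carries device_code."""
    def detect(selector_bus):
        result = True
        for bit in range(width):
            line = bool((selector_bus >> bit) & 1)
            wanted = bool((device_code >> bit) & 1)
            # A line that must be 0 passes through an inverter before the AND gate.
            result = result and (line if wanted else not line)
        return result
    return detect


# Hypothetical device code on a six-line selector bus.
printer_selected = make_selector_decoder(0b010110, width=6)
print(printer_selected(0b010110), printer_selected(0b010111))
```

Every control unit sees the same selector bus, but only the unit whose inverter pattern matches the transmitted code raises its detect signal.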
DEVICE CONTROL UNIT (cont.)
• Device Available Flip Flop
– Sets the Available Status bus to "0" while device operation logic is on
– Sets the Status bus to "1" when a sensor on the device indicates the action is completed
• Device status codes
– Ready
– Busy
– Disconnected
– Power Not On
– Operation Completed
– Parity Error Detected during transmission
– Tape not mounted etc.
• I/O Channels
– Selector (High Speed)
– Multiplexer (Byte)
– Block Multiplexers
• Channel Programs
I/O PROCESSING METHODS
1. Program controlled
2. Direct Memory Access (DMA)
3. Selector
4. Character Mux (Byte)
5. Block mux (Burst)
[Figure: MEMORY, CPU, CHANNEL and CONTROL UNIT]
PROGRAMMED I/O
BEGIN   LDA  OSEL 1     DEVICE CODE
        STA  SELECT     SELECT DEVICE REGISTER
        LDA  #-10
        STA  CNT        SET COUNT = -10
WAIT    TST  OSTATUS    CHECK OUTPUT STATUS REGISTER (STATUS WORD)
        BPL  WAIT       IF STATUS PLUS, WAIT
        LDA  CHAR+      PICK UP CHARACTER, INCREMENT ADDRESS
        STA  OBUFF      STORE CHARACTER IN OUTPUT BUFFER
        INC  CNT        CNT = CNT + 1 (-10 + 1 = -9, ETC.)
        BNZ  WAIT       BRANCH IF NON-ZERO: OUTPUT NEXT CHARACTER
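The same busy-wait loop can be written in a high-level language to make the cost of status checking visible. This is a rough simulation, not a translation of a real instruction set: the `OutputDevice` class, its rule of staying busy for two polls after each character, and the sign convention for the status value are assumptions that mirror the comments in the assembly listing.

```python
class OutputDevice:
    """Toy output device that stays busy for a few polls after each character."""

    def __init__(self, busy_polls=2):
        self.busy_polls = busy_polls
        self.remaining = 0
        self.received = []

    def status(self):
        """Return a positive value while busy and a negative value when ready."""
        if self.remaining > 0:
            self.remaining -= 1
            return 1
        return -1

    def write(self, char):
        self.received.append(char)
        self.remaining = self.busy_polls


def programmed_output(device, text):
    """Send text one character at a time, polling the status before each one."""
    cnt = -len(text)                    # SET COUNT = -(number of characters)
    index = 0
    wasted_polls = 0
    while cnt != 0:                     # BNZ WAIT
        while device.status() >= 0:     # TST OSTATUS / BPL WAIT
            wasted_polls += 1
        device.write(text[index])       # LDA CHAR+ / STA OBUFF
        index += 1
        cnt += 1                        # INC CNT
    return wasted_polls


device = OutputDevice(busy_polls=2)
wasted = programmed_output(device, "HELLOWORLD")
print("".join(device.received), wasted)
```

With these assumed timings the CPU reads the status register 18 times without making progress while sending ten characters. This overhead is what interrupts and DMA are meant to remove.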
TYPICAL I/O INTERFACE
[Figure: CPU, MEMORY and I/O units 1 to n attached to the I/O BUS]
DMA ACCESS
• I/O operation initiated by CPU
• Example: to transfer a block of data, need four instructions:
– Load MAR (memory address register)
– Load word count
– Read/Write
– GO
• On task completion DMA informs CPU through an interrupt
• I/O operation initiated by I/O device
– DMA request sent to CPU
– Request granted at the next DMA breakpoint
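The four set-up steps (load MAR, load word count, select read or write, GO) map directly onto the registers of a DMA controller. The sketch below is a simplified model under stated assumptions: the class `DMAController`, the list used as main memory, and the convention that "read" means device to memory are illustrative choices, and the transfer runs to completion in one call instead of being spread over bus cycles.

```python
class DMAController:
    """Toy DMA controller holding the registers the CPU has to load."""

    def __init__(self, memory):
        self.memory = memory
        self.mar = 0             # memory address register
        self.word_count = 0
        self.function = None     # "read": device -> memory, "write": memory -> device
        self.interrupt = False   # raised when the block has been transferred

    def go(self, device_words):
        """Transfer word_count words between memory and the device buffer."""
        if self.function not in ("read", "write"):
            raise ValueError("function must be 'read' or 'write'")
        while self.word_count > 0:
            if self.function == "read":
                self.memory[self.mar] = device_words.pop(0)
            else:
                device_words.append(self.memory[self.mar])
            self.mar += 1          # registers sequence the transfer automatically
            self.word_count -= 1
        self.interrupt = True      # tell the CPU that the task is complete


memory = [0] * 16
dma = DMAController(memory)
dma.mar = 4                  # Load MAR
dma.word_count = 3           # Load word count
dma.function = "read"        # Read/Write
dma.go([7, 8, 9])            # GO
print(memory[4:7], dma.interrupt)
```

After GO the CPU is not involved until the interrupt flag is raised. This is the difference from the programmed I/O loop, where the CPU handles every character itself.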
CYCLE STEALING
• Both CPU and DMA controller need the system bus to access memory. Who gets priority?
• DMA block transfer: an entire block is transferred in a single continuous burst
– needed for magnetic-disk drives etc., where data transmission cannot be stopped or slowed down without loss of data
– supports maximum I/O data-transmission rate
– may starve CPU for relatively long periods
• Cycle stealing
– DMA steals memory cycles from CPU, transferring one or a few words at a time before returning control
– Thus DMA and CPU bus transactions are interwoven
– Reduces interference in CPU's activities
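The trade-off between burst transfer and cycle stealing shows up when the bus is drawn as a timeline of memory cycles. The following sketch is an illustration under simple assumptions: one word per memory cycle, a CPU that would otherwise use every cycle, and a hypothetical `steal_interval` of four cycles between stolen cycles.

```python
def bus_schedule(dma_words, total_cycles, mode, steal_interval=4):
    """Return one letter per memory cycle: 'D' for DMA, 'C' for CPU."""
    schedule = []
    remaining = dma_words
    for cycle in range(total_cycles):
        if mode == "burst":
            dma_turn = remaining > 0
        elif mode == "steal":
            dma_turn = remaining > 0 and cycle % steal_interval == 0
        else:
            raise ValueError("mode must be 'burst' or 'steal'")
        if dma_turn:
            schedule.append("D")
            remaining -= 1
        else:
            schedule.append("C")
    return "".join(schedule)


def longest_cpu_stall(schedule):
    """Length of the longest run of consecutive DMA cycles."""
    return max(len(run) for run in schedule.split("C"))


burst = bus_schedule(4, 16, "burst")
steal = bus_schedule(4, 16, "steal")
print(burst, longest_cpu_stall(burst))
print(steal, longest_cpu_stall(steal))
```

Both schedules move the same four words. Burst mode finishes the block after four cycles but keeps the CPU off the bus for all four in a row, while cycle stealing never delays the CPU by more than one cycle at a time and finishes the block later.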
BUS ORGANIZATION FOR DIRECT MEMORY ACCESS
A) Single-bus structure
[Figure: CPU, main memory and a DMA controller (control register and circuits, memory address counter register, memory data buffer, word counter) with its device, all attached to one bus]
B) Two-bus structure with a "floating" DMA controller
[Figure: CPU, main memory, DMA controller and I/O devices attached to a memory bus and an I/O bus]
CIRCUITRY REQUIRED FOR DIRECT MEMORY ACCESS (DMA)
[Figure: CPU (AR, AC, IR, control unit) and I/O device (DC, IOAR, IODR, control unit) connected to main memory through address and data lines, with DMA request and DMA acknowledge signals between them]
DMA AND INTERRUPT BREAKPOINTS DURING AN INSTRUCTION CYCLE
[Figure: an instruction cycle divided into CPU cycles (fetch instruction, decode instruction, fetch operand, execute instruction, store result), with the DMA breakpoints and the interrupt breakpoint marked]
I/O BUSSES
[Figure: I/O control unit attached to the Data Bus, Command Bus, Selector Bus and Available Status Bus; its selector decoder forms the detect signal D from selector lines S0 to S5]
D = S0 • S1 • S2 • S3 • S4 • S5
I/O SUBSYSTEMS IN MAINFRAMES
[Figure: main memory banks connected through the memory control unit to Channels 1 to 4; the channels connect to two magnetic disk control units (each with disk units 1 to N), a printer and reader control unit (Printer 1, Printer 2), a tape storage control unit (to tape units), etc.]
ORGANIZATION OF A SELECTOR CHANNEL
[Figure: selector channel with device address register, byte count register, memory data address register, memory data buffer and channel control; a byte-serial interface (parallel to byte-serial conversion) leads to the I/O control units, and a 16-, 32- or 64-bit parallel interface leads to main memory]
ORGANIZATION OF A MULTIPLEXOR CHANNEL
[Figure: multiplexor channel with subchannels 1 to n (character buffer, status, hardwired memory address), channel control and a memory data buffer to main memory; the subchannels connect to the I/O control units]
CONVENTIONAL COMPUTER ARCHITECTURE
[Figure: central processor (CPU, console, human operator) and main memory connected through a multiplexor channel and selector channels, with a multi channel switch (MCS), to interface controllers (IFC) serving low-speed (L), medium-speed (M) and high-speed (H) devices]
IFC - Interface Controller
MCS - Multi Channel Switch
L - Low-speed device
M - Medium-speed device
H - High-speed device
TYPES OF CHANNELS
Selector
- exclusive I/O path for a single High-speed (H) program selected device
Character Multiplexor
- momentary I/O path for a single Medium-speed (M) program selected device (Burst Mode) or
- time shared character interleaved path for several Low-speed (L) devices (Byte Mode).
Block Multiplexor
- momentary exclusive path to a single High-speed (H) device (Selector Mode) or
- provides a time shared, block interleave path for several High-speed (H) or
CHARACTER MULTIPLEXOR
[Figure: Byte Multiplex Mode. Bytes from I/O devices A, B and C are interleaved one byte at a time on the path to main memory]
[Figure: Byte Burst Mode. One I/O device sends a run of consecutive bytes to main memory]
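The two modes of a character multiplexor differ only in how the shared path to main memory is divided among the devices. A minimal sketch follows, assuming each device's data is a string and that byte multiplex mode serves the subchannels round robin. Both function names are made up for illustration.

```python
from itertools import zip_longest


def byte_multiplex(streams):
    """Interleave the streams one byte at a time, round robin over subchannels."""
    out = []
    for group in zip_longest(*streams):
        # A device that has finished contributes nothing to later rounds.
        out.extend(b for b in group if b is not None)
    return "".join(out)


def burst(streams):
    """Give each device the channel until its whole block has been sent."""
    return "".join(streams)


devices = ["AAAA", "BBBB", "CCCC"]
print(byte_multiplex(devices))
print(burst(devices))
```

Byte multiplex mode yields ABCABCABCABC and burst mode yields AAAABBBBCCCC. The first suits several slow devices that each produce a byte only occasionally. The second suits a faster device that cannot wait between bytes.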
INPUT/OUTPUT INSTRUCTIONS
Functions
1. Select a particular device.
2. Specify the first address in memory to or from which data are to be transferred.
3. Specify the number of words which are to be transferred.
4. Select the read or write (R/W) function.
Instructions
EXTERNAL FUNCTION X
Address X contains the code for one of the I/O devices connected to the I/O register, together with a code for the operation to be performed.
READ X
Transfer the contents of the I/O register to memory location X.
CONNECT X
This is another form of the external function instruction. If a machine has several channels, this instruction may also specify which channel is used as well as the device to be connected and the function.
DISCONNECT X
This disconnects the device from the computer and terminates the I/O operation.
COPY STATUS X
Transfer the contents of the I/O control register, IOCR, to memory location X. This contains the status of the I/O device.
STATUS REQUEST X
This requests the status of a device to be placed into the IOCR. The COPY STATUS X instruction then transfers the status to a location in memory where it can be examined by a program.
READ X,Y
Transfer X words from a device which has been connected to the CPU by a previous CONNECT X instruction into consecutive memory locations starting with location Y. This instruction is usually used with a computer which has a direct memory
USE PRIORITY ARBITRATION CIRCUIT
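[Figure: devices 1 to p each have an interrupt request (INTR) and an interrupt acknowledge (INTA) line to the priority arbitration circuit, which exchanges a single INTR/INTA pair with the CPU]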
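The behaviour of such an arbitration circuit can be stated in a few lines: the CPU sees a request if any device requests, and the acknowledge is routed to one requesting device only. The sketch below assumes a fixed priority in which the lowest index wins. That ordering and the name `arbitrate` are illustrative assumptions.

```python
def arbitrate(intr, cpu_inta=True):
    """Model a fixed-priority arbiter.

    intr[i] is True when device i requests service; index 0 is assumed to
    have the highest priority. Returns (INTR seen by the CPU, INTA lines),
    with at most one INTA line asserted.
    """
    cpu_intr = any(intr)
    inta = [False] * len(intr)
    if cpu_inta:
        for i, requesting in enumerate(intr):
            if requesting:
                inta[i] = True   # only the most urgent requester is acknowledged
                break
    return cpu_intr, inta


print(arbitrate([False, True, True]))
```

Devices 1 and 2 both request here, but only device 1 is acknowledged. Device 2 keeps its request raised and is served in a later round.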
THE INTEL 8255 PROGRAMMABLE PERIPHERAL INTERFACE CIRCUIT
[Figure: Intel 8255 connected to the CPU through the 8080 data bus (8 bits), address lines A0 and A1, and READ and WRITE lines; internally a data buffer, control logic and a control register on an 8-bit internal bus; data buffers for port A (8 bits), port B (8 bits) and port C (two 4-bit halves) lead to the IO devices]
TYPICAL BUS STANDARDS
Characteristic | VME bus | Future Bus | Multibus II | IPI | SCSI
Bus width (signals) | 128 | 96 | 96 | 16 | 8
Address/data multiplexed | Not multiplexed | Multiplexed | Multiplexed | N/A | N/A
Data width (primary) | 16 to 32 bits | 32 bits | 32 bits | 16 bits | 8 bits
Transfer size | Single or multiple | Single or multiple | Single or multiple | Single or multiple | Single or multiple
Number of bus masters | Multiple | Multiple | Multiple | Single | Multiple
Split transaction | No | Optional | Optional | Optional | Optional
Clocking | Asynchronous | Asynchronous | Synchronous | Asynchronous | Either
Bandwidth, 0-ns access memory, single word | 25.0 MB/sec | 37.0 MB/sec | (not given) | 25.0 MB/sec | 5.0 MB/sec or 1.5 MB/sec
Bandwidth, 150-ns access memory, single word | 12.9 MB/sec | 15.5 MB/sec | 10.0 MB/sec | 25.0 MB/sec | 5.0 MB/sec or 1.5 MB/sec
Bandwidth, 0-ns access memory, multiple words (infinite block length) | 27.9 MB/sec | 95.2 MB/sec | 40.0 MB/sec | 25.0 MB/sec | 5.0 MB/sec or 1.5 MB/sec
Bandwidth, 150-ns access memory, multiple words (infinite block length) | 13.6 MB/sec | 20.8 MB/sec | 13.5 MB/sec | 25.0 MB/sec | 5.0 MB/sec or 1.5 MB/sec
Maximum number of devices | 21 | 20 | 21 | 8 | 7
Maximum bus length | 0.5 meter | 0.5 meter | 0.5 meter | 50 meters | 25 meters
Standard | IEEE 1014 | IEEE | ANSI/IEEE 1296 | ANSI X3.129 | ANSI X3.131
TYPICAL BUS STANDARDS (cont.)
• The first three were defined originally as memory buses and the last two as I/O buses. For the CPU-memory buses the bandwidth calculations assume a fully loaded bus and are given for both single-word transfers and block transfers of unlimited length; measurements are shown both ignoring memory latency and assuming 150-ns access time. Bandwidth assumes the average distance of a transfer is one-third of the backplane length. (Data in the first three columns is from Borrill [1986].) The bandwidth for the I/O buses is given as their maximum data transfer rate.
SUMMARY
- matching CPU and I/O speeds remains a problem
- adding separate CPUs on channels and devices
- remarkable technology advances (e.g., flat screen displays, but the 100-year-old keyboard remains a popular input device)
- I/O remains the most expensive part of computer systems
Current challenges
- further miniaturization
- access to information at any place and time at high speeds (e.g., wearable computers)