Randal E. Bryant
Carnegie Mellon University
CS:APP Chapter 4
Computer Architecture
Instruction Set
Architecture
http://csapp.cs.cmu.edu
Instruction Set Architecture
ISA Compiler OS CPU Design Circuit Design Chip Layout Application ProgramJeu d’instructions du processeur, ou plutôt du modèle de processeur.
X86 Evolution: Programmer’s View
Name
Date
Transistors
8086
1978
29K
16-bit processor. Basis for IBM PC & DOS
Limited to 1MB address space. DOS only gives you 640K
386
1985
275K
Extended to 32 bits. Added “flat addressing”
Capable of running Unix
Linux/gcc uses no instructions introduced in later models
X86 Evolution: Programmer’s View
Name
Date
Transistors
Pentium III
1999
8.2M
Added “streaming SIMD” instructions for operating on 128-bit vectors of 1, 2, or 4 byte integer or floating point data
Our fish machines
Pentium 4
2001
42M
Added 8-byte formats and 144 new instructions for streaming SIMD mode
X86 Evolution: Clones
Advanced Micro Devices (AMD)
Historically
zAMD has followed just behind Intel zA little bit slower, a lot cheaper
Recently
zRecruited top circuit designers from Digital Equipment Corp. zExploited fact that Intel distracted by IA64
zNow are close competitors to Intel
New Species: IA64
Name
Date
Transistors
Itanium
2001
10M
Extends to IA64, a 64-bit architecture
Radically new instruction set designed for high performance
Will be able to run existing IA32 programs zOn-board “x86 engine”
Joint project with Hewlett-Packard
Itanium 2
2002
221M
%eax %ecx %edx %ebx %esi %edi %esp %ebp
Y86 Processor State
Program Registers
z Same 8 as with IA32. Each 32 bits
Program Counter
z Indicates address of instruction
Program registers Condition codes PC Memory OF ZF SF
%eax %ecx %edx %ebx %esi %edi %esp %ebp
Condition codes
z Single-bit flags set by arithmetic or logical instructions
» OF: Overflow ZF: Zero SF:Negative
z Il manque CF : Carry (retenue). On travaille donc sur des entiers relatifs (“avec signe”).
z Correspondance avec la notation NZVC :
» N = SF Z = ZF V = OF Program registers Condition codes PC Memory OF ZF SF
%eax %ecx %edx %ebx %esi %edi %esp %ebp
Memory
Program registers Condition codes PC Memory OF ZF SFByte-addressable storage array
Machine Words
Machine Has “Word Size”
Nominal size of integer-valued data z Including addresses
Most current machines are 32 bits (4 bytes) z Limits addresses to 4GB
z Becoming too small for memory-intensive applications
High-end systems are 64 bits (8 bytes) z Potentially address ≈ 1.8 X 1019 bytes Machines support multiple data formats
z Fractions or multiples of word size z Always integral number of bytes
Word-Oriented Memory
Organization
Addresses Specify Byte
Locations
Address of first byte in word
Addresses of successive words differ by 4 (32-bit) or 8 (64-bit) 0000 0001 0002 0003 0004 0005 0006 0007 0008 0009 000a 000b 32-bit
Words Bytes Addr.
000c 000d 000e 000f 64-bit Words Addr = ?? Addr = ?? Addr = ?? Addr = ?? Addr = ?? Addr = ?? 0000 0004 0008 000c 0000 0008
Byte Ordering Example
Big Endian
Least significant byte has highest address
Little Endian
Least significant byte has lowest address
Example
Variable n has 4-byte representation 0x01234567
Address given by &n is 0x100
0x100 0x101 0x102 0x103 01 23 45 67 0x100 0x101 0x102 0x103 67 45 23 01 Big Endian Little Endian 01 23 45 67 67 45 23 01
Y86 Instructions
Format
1--6 bytes of information read from memory z Can determine instruction length from first byte
z Not as many instruction types, and simpler encoding than with IA32
Each accesses and modifies some part(s) of the program state
PC incrémenté selon la longueur de l’instruction, pour pointer sur l’instruction suivante
Encoding Registers
Each register has 4-bit ID
Same encoding as in IA32
IA32 utilise seulement 3 bits pour coder un registre
Register ID 8 indicates “no register”
Will use this in our hardware design in multiple places %eax %ecx %edx %ebx %esi %edi %esp %ebp 0 1 2 3 6 7 4 5
Instruction Example
Addition Instruction
Add value in register rA to that in register rB z Store result in register rB
z Note that Y86 only allows addition to be applied to register data
Set condition codes based on result
e.g., addl %eax,%esi Encoding: 60 06 Two-byte encoding
z First indicates instruction type
z Second gives source and destination registers
addl rA, rB 6 0 rA rB
Encoded Representation Generic Form
Arithmetic and Logical Operations
Refer to generically as“OPl”
Encodings differ only by “function code”
Set condition codes as side effect
Manquent (entre autres) : inc (incrémentation), dec, sh (shift = décalage) … addl rA, rB 6 0 rA rB subl rA, rB 6 1 rA rB andl rA, rB 6 2 rA rB xorl rA, rB 6 3 rA rB Add
Subtract (rA from rB)
And
Exclusive-Or
Arithmetic and Logical Operations (2)
Dans les architectures RISC (Reduced Instruction Set Computer) les opérations portent sur 3 registres :
• deux registres sources pour les opérandes
• un registre destination pour le résultat
Attention : ici (x86 ou Y86) le second registre sert aussi de destination, et le second opérande est donc écrasé.
Move Operations
Like the IA32 movl instruction
Simpler format for memory addresses
rrmovl rA, rB 2 0 rA rB Register --> Register
Immediate --> Register irmovl V, rB 3 0 8 rB V Register --> Memory rmmovl rA, D(rB) 4 0 rA rB D Memory --> Register mrmovl D(rB), rA 5 0 rA rB D
Move Instruction Examples
irmovl $0xabcd, %edx
movl $0xabcd, %edx 30 82 cd ab 00 00
IA32 Y86 Encoding
rrmovl %esp, %ebx
movl %esp, %ebx 20 43
mrmovl -12(%ebp),%ecx
movl -12(%ebp),%ecx 50 15 f4 ff ff ff rmmovl %esi,0x41c(%esp)
movl %esi,0x41c(%esp)
—
movl $0xabcd, (%eax)
—
movl %eax, 12(%eax,%edx)
—
movl (%ebp,%eax,4),%ecx
Load
mrmovl 12(%ebp),%ecx mrmovl (%ebp),%ecx mrmovl 0x1b8,%ecx
le registre ebp sert de pointeur adresse absolue de lecture
cas général (déplacement)
Store
rmmovl %esi,0x41c(%esp) rmmovl %esi,(%esp)
rmmovl %esi,0x41c
le registre esp sert de pointeur adresse absolue d’écriture
cas général (déplacement)
Adressage par déplacement
mrmovl 12(%ebp),%ecx cas général (déplacement)
rmmovl %esi,0x41c(%esp) cas général (déplacement) On charge le mot d’adresse ebp + 12 dans le registre ecx.
Load = charger = lire .
On sauvegarde le registre esi à l’adresse esp + 0x41c . Store = sauve(garde)r = écrire .
Jump Instructions
Refer to generically as “jXX”
Encodings differ only by “function code”
Based on values of condition codes
Same as IA32 counterparts
Encode full destination address
z Unlike PC-relative
addressing seen in IA32
jmp Dest 7 0
Jump Unconditionally
Dest
jle Dest 7 1
Jump When Less or Equal
Dest
jl Dest 7 2
Jump When Less
Dest
je Dest 7 3
Jump When Equal
Dest
jne Dest 7 4
Jump When Not Equal
Dest
jge Dest 7 5
Jump When Greater or Equal
Dest
jg Dest 7 6
Jump When Greater
Sauts conditionnels
Jump When Less : saut effectué si le résultat de la dernière
opération (addl, subl, andl, xorl) est négatif; ne pas oublier
l’overflow !
• ( SF = 1 et OF = 0 ) ou ( SF = 0 et OF = 1 )
• en résumé SF ≠ OF
Jump When Equal : saut effectué si le résultat de la dernière
opération est nul, soit ZF = 1
Jump When Less or Equal : SF ≠ OF or ZF = 1
Jump When Greater or Equal : SF = OF
Jump When Not Equal : ZF = 0
Jump When Greater : SF = OF and ZF = 0
Sauts (suite et fin)
Un saut est effectué en modifiant PC .
Les instructions rrmovl, irmovl, rmmovl, mrmovl ne modifient
jamais les codes de condition.
Opérations logiques andl, xorl : ZF et SF ajustés selon résultat de l’opération, OF = 0 (clear).
¾ andl %eax, %eax # ne modifie pas le registre
je ... # saut effectué si eax = 0 jl ... # saut effectué si eax < 0
¾ xorl %eax, %ebx
je ... # saut effectué si eax = ebx (mais ce test écrase ebx)
Miscellaneous Instructions
Don’t do anything
Stop executing instructions
IA32 has comparable instruction, but can’t execute it in user mode
We will use it to stop the simulator
nop 0 0
Code for
sum
0x401040 <sum>: 0x55 0x89 0xe5 0x8b 0x45 0x0c 0x03 0x45 0x08 0x89 0xec 0x5d 0xc3Object Code
Assembler
Translates .s into .o Binary encoding of each instruction
Nearly-complete image of executable code
Missing linkages between code in different files
Linker
Resolves references between files
Combines with static run-time libraries
z E.g., code for malloc, printf
Some libraries are dynamically linked z Linking occurs when program begins
• Total of 13 bytes • Each instruction 1, 2, or 3 bytes • Starts at address 0x401040
text text binary binary Compiler (gcc -S) Assembler (gcc or as) Linker (gcc or ld) C program (p1.c p2.c) Asm program (p1.s p2.s)
Object program (p1.o p2.o)
Executable program (p)
Static libraries (.a)
Turning C into Object Code
Code in files p1.c p2.c Compile with command: gcc -O p1.c p2.c -o p
zUse optimizations (-O)
Disassembled
00401040 <_sum>: 0: 55 push %ebp 1: 89 e5 mov %esp,%ebp 3: 8b 45 0c mov 0xc(%ebp),%eax 6: 03 45 08 add 0x8(%ebp),%eax 9: 89 ec mov %ebp,%esp b: 5d pop %ebp c: c3 ret d: 8d 76 00 lea 0x0(%esi),%esiDisassembling Object Code
Disassembler
objdump -d p
Useful tool for examining object code
Analyzes bit pattern of series of instructions
Pseudo code objet
Simulateur Y86
Loop: mrmovl (%ecx),%esi # get *Start addl %esi,%eax # add to sum irmovl $4,%ebx
addl %ebx,%ecx # Start++ irmovl $-1,%ebx
addl %ebx,%edx #
Count--jne Loop # Stop when 0
Fichier source .ys
0x057: 506100000000 |Loop: mrmovl (%ecx),%esi 0x05d: 6060 | addl %esi,%eax 0x05f: 308304000000 | irmovl $4,%ebx 0x065: 6031 | addl %ebx,%ecx 0x067: 3083ffffffff | irmovl $-1,%ebx 0x06d: 6032 | addl %ebx,%edx 0x06f: 7457000000 | jne Loop Fichier .yo Assembleur yas
Fichier .yo
0x057: 506100000000 |Loop: mrmovl (%ecx),%esi 0x05d: 6060 | addl %esi,%eax 0x05f: 308304000000 | irmovl $4,%ebx 0x065: 6031 | addl %ebx,%ecx 0x067: 3083ffffffff | irmovl $-1,%ebx 0x06d: 6032 | addl %ebx,%edx 0x06f: 7457000000 | jne Loop
1. Etiquette
=
adresse
calculée par l’assembleur
(vrai pour tout assembleur)
CISC Instruction Sets
Complex Instruction Set Computer Dominant style through mid-80’s
Stack-oriented instruction set
Use stack to pass arguments, save program counter
Explicit push and pop instructions
Arithmetic instructions can access memory
addl %eax, 12(%ebx,%ecx,4)
z requires memory read and write z Complex address calculation
Condition codes
Set as side effect of arithmetic and logical instructions
Philosophy
RISC Instruction Sets
Reduced Instruction Set Computer Internal project at IBM, later popularized by Hennessy (Stanford) and Patterson (Berkeley)
Fewer, simpler instructions
Might take more to get given task done
Can execute them with small and fast hardware
Register-oriented instruction set
Many more (typically 32) registers
Use for arguments, return pointer, temporaries
Only load and store instructions can access memory
Similar to Y86 mrmovl and rmmovl
MIPS Instruction Examples
Op Ra Rb Offset Op Ra Rb Rd 00000 Fn R-R Op Ra Rb Immediate R-I Load/Storeaddu $3,$2,$1 # Register add: $3 = $2+$1
addu $3,$2, 3145 # Immediate add: $3 = $2+3145 sll $3,$2,2 # Shift left: $3 = $2 << 2
lw $3,16($2) # Load Word: $3 = M[$2+16]
Op Ra Rb Offset
Branch
CISC vs. RISC
Original Debate
Strong opinions!
CISC proponents---easy for compiler, fewer code bytes
RISC proponents---better for optimizing compilers, can make run fast with simple chip design
Current Status
For desktop processors, choice of ISA not a technical issue z With enough hardware, can make anything run fast
z Code compatibility more important
For embedded processors, RISC makes sense z Smaller, cheaper, less power
Summary
Y86 Instruction Set Architecture
Similar state and instructions as IA32
Simpler encodings
Somewhere between CISC and RISC
How Important is ISA Design?
Less now than before
z With enough hardware, can make almost anything go fast
Intel is moving away from IA32
z Does not allow enough parallel execution z Introduced IA64
» 64-bit word sizes (overcome address space limitations)
» Radically different style of instruction set with explicit parallelism » Requires sophisticated compilers