CS:APP Chapter 4 Computer Architecture Instruction Set Architecture

(1)

Randal E. Bryant

Carnegie Mellon University


http://csapp.cs.cmu.edu

(2)

Instruction Set Architecture

[Figure: layers of abstraction. Application Program, Compiler, and OS sit above the ISA; CPU Design, Circuit Design, and Chip Layout sit below it.]

The instruction set of the processor, or rather of the processor model.

(3)

X86 Evolution: Programmer’s View

Name      Date    Transistors

8086      1978    29K
  • 16-bit processor. Basis for IBM PC & DOS
  • Limited to 1MB address space. DOS only gives you 640K

386       1985    275K
  • Extended to 32 bits. Added “flat addressing”
  • Capable of running Unix
  • Linux/gcc uses no instructions introduced in later models

(4)

X86 Evolution: Programmer’s View

Name          Date    Transistors

Pentium III   1999    8.2M
  • Added “streaming SIMD” instructions for operating on 128-bit vectors of 1-, 2-, or 4-byte integer or floating-point data
  • Our fish machines

Pentium 4     2001    42M
  • Added 8-byte formats and 144 new instructions for streaming SIMD mode

(5)

X86 Evolution: Clones

Advanced Micro Devices (AMD)

Historically
• AMD has followed just behind Intel
• A little bit slower, a lot cheaper

Recently
• Recruited top circuit designers from Digital Equipment Corp.
• Exploited the fact that Intel was distracted by IA64
• Now a close competitor to Intel

(6)

New Species: IA64

Name        Date    Transistors

Itanium     2001    10M
  • Extends to IA64, a 64-bit architecture
  • Radically new instruction set designed for high performance
  • Will be able to run existing IA32 programs
      • On-board “x86 engine”
  • Joint project with Hewlett-Packard

Itanium 2   2002    221M

(7)

Y86 Processor State

[Figure: processor state. Program registers (%eax, %ecx, %edx, %ebx, %esi, %edi, %esp, %ebp), condition codes (OF, ZF, SF), PC, and memory.]

Program Registers
• Same 8 as with IA32. Each 32 bits

Program Counter
• Indicates address of instruction

(8)

Condition codes

• Single-bit flags set by arithmetic or logical instructions
    • OF: Overflow   ZF: Zero   SF: Negative
• CF (carry) is missing. We therefore work with signed integers.
• Correspondence with the NZVC notation:
    • N = SF   Z = ZF   V = OF

(9)

Memory


Byte-addressable storage array

(10)

Machine Words

Machine Has “Word Size”

Nominal size of integer-valued data
• Including addresses

Most current machines are 32 bits (4 bytes)
• Limits addresses to 4GB
• Becoming too small for memory-intensive applications

High-end systems are 64 bits (8 bytes)
• Potentially address ≈ 1.8 × 10^19 bytes

Machines support multiple data formats
• Fractions or multiples of word size
• Always integral number of bytes
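The address-space limits quoted above follow directly from the word size. A minimal sketch of the arithmetic; the function name `address_space_bytes` and the binary definition of GB are illustrative choices, not taken from the slides:

```python
def address_space_bytes(word_bits):
    """Number of distinct byte addresses that a word of `word_bits` bits can name."""
    return 2 ** word_bits

GB = 2 ** 30  # binary gigabyte (assumed convention)

print(address_space_bytes(32) // GB)     # 4, i.e. the 4GB limit
print(f"{address_space_bytes(64):.2e}")  # 1.84e+19
```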

(11)

Word-Oriented Memory Organization

Addresses Specify Byte Locations
• Address of first byte in word
• Addresses of successive words differ by 4 (32-bit) or 8 (64-bit)

[Figure: bytes at addresses 0000 through 000f, grouped into 32-bit words at addresses 0000, 0004, 0008, 000c and into 64-bit words at addresses 0000, 0008.]

(12)

Byte Ordering Example

Big Endian

Least significant byte has highest address

Little Endian

Least significant byte has lowest address

Example

Variable n has 4-byte representation 0x01234567

Address given by &n is 0x100

Address          0x100   0x101   0x102   0x103
Big Endian        01      23      45      67
Little Endian     67      45      23      01
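The two layouts can be reproduced with Python's built-in `int.to_bytes`. This is a small sketch; the helper name `memory_layout` is hypothetical and only serves the illustration:

```python
def memory_layout(value, base_address, byteorder, size=4):
    """Return (address, byte) pairs for `value` stored starting at `base_address`."""
    raw = value.to_bytes(size, byteorder)  # byteorder is "big" or "little"
    return [(base_address + i, b) for i, b in enumerate(raw)]

for order in ("big", "little"):
    cells = memory_layout(0x01234567, 0x100, order)
    print(order, " ".join(f"{addr:#x}:{byte:02x}" for addr, byte in cells))
```

In the little-endian layout the least significant byte (0x67) lands at the lowest address, 0x100.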

(13)

Y86 Instructions

Format

1 to 6 bytes of information read from memory
• Can determine instruction length from first byte
• Not as many instruction types, and simpler encoding than with IA32

Each accesses and modifies some part(s) of the program state

The PC is incremented by the length of the instruction so that it points to the next instruction.
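As a sketch of how the instruction length can be read off the first byte: the table below covers only the instruction classes whose encodings appear in these slides, and the names `LENGTH_BY_CODE`, `instruction_length`, and `next_pc` are illustrative.

```python
# Instruction length in bytes, keyed by the high nibble of the first byte.
# Only the instruction classes shown in these slides are listed.
LENGTH_BY_CODE = {
    0x0: 1,  # nop
    0x2: 2,  # rrmovl: opcode byte + register byte
    0x3: 6,  # irmovl: opcode byte + register byte + 4-byte immediate
    0x4: 6,  # rmmovl: opcode byte + register byte + 4-byte displacement
    0x5: 6,  # mrmovl: same layout as rmmovl
    0x6: 2,  # OPl (addl, subl, andl, xorl)
    0x7: 5,  # jXX: opcode byte + 4-byte destination
}

def instruction_length(first_byte):
    """Length of the instruction whose first byte is `first_byte`."""
    return LENGTH_BY_CODE[first_byte >> 4]

def next_pc(pc, first_byte):
    """PC value after a non-jumping instruction that starts at `pc`."""
    return pc + instruction_length(first_byte)
```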

(14)

Encoding Registers

Each register has 4-bit ID

Same encoding as in IA32

IA32 uses only 3 bits to encode a register

Register ID 8 indicates “no register”

Will use this in our hardware design in multiple places

%eax 0    %esi 6
%ecx 1    %edi 7
%edx 2    %esp 4
%ebx 3    %ebp 5
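A minimal sketch of how two 4-bit register IDs share one byte, with rA in the high nibble and rB in the low nibble. The helper `register_byte` is a hypothetical name; passing `None` stands for the “no register” ID 8.

```python
REG_ID = {"%eax": 0, "%ecx": 1, "%edx": 2, "%ebx": 3,
          "%esp": 4, "%ebp": 5, "%esi": 6, "%edi": 7}
NO_REG = 8  # "no register", as stated on the slide

def register_byte(ra, rb):
    """Pack two register IDs into one byte: rA in the high nibble, rB in the low."""
    id_a = NO_REG if ra is None else REG_ID[ra]
    id_b = NO_REG if rb is None else REG_ID[rb]
    return (id_a << 4) | id_b

print(f"{register_byte('%eax', '%esi'):02x}")  # 06
print(f"{register_byte(None, '%edx'):02x}")    # 82
```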

(15)

Instruction Example

Addition Instruction

Add value in register rA to that in register rB
• Store result in register rB
• Note that Y86 only allows addition to be applied to register data

Set condition codes based on result

Generic form: addl rA, rB
Encoded representation: 6 0 rA rB

e.g., addl %eax,%esi   Encoding: 60 06

Two-byte encoding
• First indicates instruction type
• Second gives source and destination registers

(16)

Arithmetic and Logical Operations

Refer to generically as “OPl”

Encodings differ only by “function code”

Set condition codes as side effect

Missing (among others): inc (increment), dec (decrement), sh (shift) …

Instruction     Encoding       Operation
addl rA, rB     6 0 rA rB      Add
subl rA, rB     6 1 rA rB      Subtract (rA from rB)
andl rA, rB     6 2 rA rB      And
xorl rA, rB     6 3 rA rB      Exclusive-Or
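The two-byte OPl encoding can be sketched as follows. The function name `encode_opl` is illustrative; the function codes and register IDs are the ones listed in these slides.

```python
REG_ID = {"%eax": 0, "%ecx": 1, "%edx": 2, "%ebx": 3,
          "%esp": 4, "%ebp": 5, "%esi": 6, "%edi": 7}
FUNCTION_CODE = {"addl": 0, "subl": 1, "andl": 2, "xorl": 3}

def encode_opl(op, ra, rb):
    """Encode `op rA, rB`: first byte is 6|fn, second byte is rA|rB."""
    first = 0x60 | FUNCTION_CODE[op]
    second = (REG_ID[ra] << 4) | REG_ID[rb]
    return bytes([first, second])

print(encode_opl("addl", "%eax", "%esi").hex(" "))  # 60 06
```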

(17)

Arithmetic and Logical Operations (2)

In RISC (Reduced Instruction Set Computer) architectures, operations work on 3 registers:

• two source registers for the operands

• one destination register for the result

Caution: here (x86 or Y86) the second register also serves as the destination, so the second operand is overwritten.
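The difference between the two styles can be sketched on a toy register file (a plain dictionary). The helpers `addl`, `add3`, and `rrmovl` below model only the register update, not the condition codes, and `add3` stands for a generic three-register add rather than any specific RISC instruction.

```python
MASK32 = 0xFFFFFFFF

def addl(regs, ra, rb):
    """Y86-style two-operand add: rB <- rB + rA (the old value of rB is lost)."""
    regs[rb] = (regs[rb] + regs[ra]) & MASK32

def add3(regs, rd, rs1, rs2):
    """Three-operand add: rd <- rs1 + rs2 (both sources are preserved)."""
    regs[rd] = (regs[rs1] + regs[rs2]) & MASK32

def rrmovl(regs, ra, rb):
    """Register-to-register move: rB <- rA."""
    regs[rb] = regs[ra]

# To keep both operands on Y86, copy one of them first and add into the copy.
regs = {"eax": 5, "ebx": 7, "ecx": 0}
rrmovl(regs, "ebx", "ecx")  # ecx <- ebx
addl(regs, "eax", "ecx")    # ecx <- ecx + eax; eax and ebx are untouched
```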

(18)

Move Operations

Like the IA32 movl instruction

Simpler format for memory addresses

Instruction          Encoding        Operation
rrmovl rA, rB        2 0 rA rB       Register --> Register
irmovl V, rB         3 0 8 rB V      Immediate --> Register
rmmovl rA, D(rB)     4 0 rA rB D     Register --> Memory
mrmovl D(rB), rA     5 0 rA rB D     Memory --> Register

(19)

Move Instruction Examples

IA32                          Y86                           Encoding
movl $0xabcd, %edx            irmovl $0xabcd, %edx          30 82 cd ab 00 00
movl %esp, %ebx               rrmovl %esp, %ebx             20 43
movl -12(%ebp),%ecx           mrmovl -12(%ebp),%ecx         50 15 f4 ff ff ff
movl %esi,0x41c(%esp)         rmmovl %esi,0x41c(%esp)       40 64 1c 04 00 00
movl $0xabcd, (%eax)          (no Y86 equivalent)
movl %eax, 12(%eax,%edx)      (no Y86 equivalent)
movl (%ebp,%eax,4),%ecx       (no Y86 equivalent)
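The encodings in this table can be reproduced from the instruction formats. The sketch below assumes the byte layouts given for the move instructions (register byte with rA in the high nibble, followed by a 4-byte little-endian value); the Python function names and argument order are illustrative only.

```python
REG_ID = {"%eax": 0, "%ecx": 1, "%edx": 2, "%ebx": 3,
          "%esp": 4, "%ebp": 5, "%esi": 6, "%edi": 7}
NO_REG = 8

def word(value):
    """4-byte little-endian encoding; negative values wrap to two's complement."""
    return (value & 0xFFFFFFFF).to_bytes(4, "little")

def rrmovl(ra, rb):
    return bytes([0x20, (REG_ID[ra] << 4) | REG_ID[rb]])

def irmovl(value, rb):
    return bytes([0x30, (NO_REG << 4) | REG_ID[rb]]) + word(value)

def rmmovl(ra, disp, rb):
    return bytes([0x40, (REG_ID[ra] << 4) | REG_ID[rb]]) + word(disp)

def mrmovl(disp, rb, ra):
    return bytes([0x50, (REG_ID[ra] << 4) | REG_ID[rb]]) + word(disp)

print(irmovl(0xabcd, "%edx").hex(" "))       # 30 82 cd ab 00 00
print(mrmovl(-12, "%ebp", "%ecx").hex(" "))  # 50 15 f4 ff ff ff
```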

(20)

Load

mrmovl 12(%ebp),%ecx      general case (displacement)
mrmovl (%ebp),%ecx        register ebp serves as the read pointer
mrmovl 0x1b8,%ecx         absolute address

Store

rmmovl %esi,0x41c(%esp)   general case (displacement)
rmmovl %esi,(%esp)        register esp serves as the write pointer
rmmovl %esi,0x41c         absolute address

(21)

Displacement Addressing

mrmovl 12(%ebp),%ecx      general case (displacement)

The word at address ebp + 12 is loaded into register ecx. Load = read.

rmmovl %esi,0x41c(%esp)   general case (displacement)

Register esi is stored at address esp + 0x41c. Store = save = write.
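A minimal sketch of displacement addressing on a toy machine model: registers are a dictionary, and memory is a dictionary from byte address to byte value (unwritten bytes read as 0, an assumption made only for this illustration). Words are stored little-endian, as in the encodings shown in these slides.

```python
MASK32 = 0xFFFFFFFF

def effective_address(regs, base, displacement):
    """Address = contents of the base register + displacement (mod 2**32)."""
    return (regs[base] + displacement) & MASK32

def rmmovl(regs, mem, src, displacement, base):
    """Store: write the 4 bytes of register `src` to memory."""
    addr = effective_address(regs, base, displacement)
    for i, b in enumerate(regs[src].to_bytes(4, "little")):
        mem[addr + i] = b

def mrmovl(regs, mem, displacement, base, dest):
    """Load: read 4 bytes from memory into register `dest`."""
    addr = effective_address(regs, base, displacement)
    raw = bytes(mem.get(addr + i, 0) for i in range(4))
    regs[dest] = int.from_bytes(raw, "little")

regs = {"esp": 0x1000, "ebp": 0x1410, "esi": 0x01234567, "ecx": 0}
mem = {}
rmmovl(regs, mem, "esi", 0x41c, "esp")  # writes to 0x1000 + 0x41c = 0x141c
mrmovl(regs, mem, 12, "ebp", "ecx")     # reads from 0x1410 + 12 = 0x141c
```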

(22)

Jump Instructions

Refer to generically as “jXX”

Encodings differ only by “function code”

Based on values of condition codes

Same as IA32 counterparts

Encode full destination address
• Unlike PC-relative addressing seen in IA32

Instruction   Encoding     Operation
jmp Dest      7 0 Dest     Jump Unconditionally
jle Dest      7 1 Dest     Jump When Less or Equal
jl Dest       7 2 Dest     Jump When Less
je Dest       7 3 Dest     Jump When Equal
jne Dest      7 4 Dest     Jump When Not Equal
jge Dest      7 5 Dest     Jump When Greater or Equal
jg Dest       7 6 Dest     Jump When Greater

(23)

Conditional Jumps

Jump When Less: the jump is taken if the result of the last operation (addl, subl, andl, xorl) is negative; do not forget overflow!

• (SF = 1 and OF = 0) or (SF = 0 and OF = 1)

• in short, SF ≠ OF

Jump When Equal: the jump is taken if the result of the last operation is zero, i.e. ZF = 1

Jump When Less or Equal: SF ≠ OF or ZF = 1

Jump When Greater or Equal: SF = OF

Jump When Not Equal: ZF = 0

Jump When Greater: SF = OF and ZF = 0
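The role of OF in these conditions can be checked with a small sketch. It models the flags after `subl rA, rB` (which computes rB - rA) on signed 32-bit values; the names `wrap32`, `subl_flags`, and `JUMP_TAKEN` are illustrative.

```python
def wrap32(x):
    """Interpret x modulo 2**32 as a signed 32-bit integer."""
    return (x + 2**31) % 2**32 - 2**31

def subl_flags(ra, rb):
    """Flags after `subl rA, rB`, i.e. after computing rB - rA."""
    exact = rb - ra          # mathematically exact difference
    result = wrap32(exact)   # what actually fits in 32 bits
    return {"ZF": result == 0, "SF": result < 0, "OF": result != exact}

JUMP_TAKEN = {
    "jmp": lambda f: True,
    "jle": lambda f: f["SF"] != f["OF"] or f["ZF"],
    "jl":  lambda f: f["SF"] != f["OF"],
    "je":  lambda f: f["ZF"],
    "jne": lambda f: not f["ZF"],
    "jge": lambda f: f["SF"] == f["OF"],
    "jg":  lambda f: f["SF"] == f["OF"] and not f["ZF"],
}

# rB = -2**31, rA = 1: the difference overflows to a positive value (SF = 0),
# but OF = 1, so SF != OF and jl is still taken, as it should be.
flags = subl_flags(1, -2**31)
print(flags, JUMP_TAKEN["jl"](flags))
```

Testing SF alone would give the wrong answer in the overflow case, which is why every signed comparison combines SF with OF.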

(24)

Jumps (continued)

A jump is performed by modifying the PC.

The instructions rrmovl, irmovl, rmmovl, mrmovl never modify the condition codes.

Logical operations andl, xorl: ZF and SF are set according to the result of the operation, OF = 0 (cleared).

• andl %eax, %eax   # does not modify the register
  je ...            # jump taken if eax = 0
  jl ...            # jump taken if eax < 0

• xorl %eax, %ebx
  je ...            # jump taken if eax = ebx (but this test overwrites ebx)
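These two test idioms can be sketched as follows. The helpers model only the condition codes produced by andl and xorl (OF always cleared); their names are hypothetical.

```python
MASK32 = 0xFFFFFFFF

def logic_flags(result):
    """Condition codes after andl or xorl produces `result`."""
    result &= MASK32  # keep 32 bits; also handles negative Python ints
    return {"ZF": result == 0, "SF": bool(result >> 31), "OF": False}

def flags_andl_self(eax):
    """Flags after `andl %eax,%eax`; the register value itself is unchanged."""
    return logic_flags(eax & eax)

def flags_xorl(eax, ebx):
    """Flags after `xorl %eax,%ebx`; ZF is set exactly when the two were equal."""
    return logic_flags(eax ^ ebx)
```

Because OF is 0 after a logical operation, the jl condition SF ≠ OF reduces to SF = 1, which is why `jl` after `andl %eax,%eax` tests eax < 0.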

(25)

Miscellaneous Instructions

nop   (encoding: 0 0)
• Don’t do anything

halt
• Stop executing instructions
• IA32 has comparable instruction, but can’t execute it in user mode
• We will use it to stop the simulator

(26)

Object Code

Code for sum

0x401040 <sum>: 0x55 0x89 0xe5 0x8b 0x45 0x0c 0x03 0x45 0x08 0x89 0xec 0x5d 0xc3

• Total of 13 bytes
• Each instruction 1, 2, or 3 bytes
• Starts at address 0x401040

Assembler
• Translates .s into .o
• Binary encoding of each instruction
• Nearly-complete image of executable code
• Missing linkages between code in different files

Linker
• Resolves references between files
• Combines with static run-time libraries
    • E.g., code for malloc, printf
• Some libraries are dynamically linked
    • Linking occurs when program begins

(27)

Turning C into Object Code

Code in files p1.c p2.c

Compile with command: gcc -O p1.c p2.c -o p
• Use optimizations (-O)

[Figure: compilation pipeline. C program (p1.c p2.c, text) → Compiler (gcc -S) → Asm program (p1.s p2.s, text) → Assembler (gcc or as) → Object program (p1.o p2.o, binary) → Linker (gcc or ld), together with static libraries (.a) → Executable program (p, binary).]

(28)

Disassembling Object Code

Disassembled

00401040 <_sum>:
   0: 55          push %ebp
   1: 89 e5       mov %esp,%ebp
   3: 8b 45 0c    mov 0xc(%ebp),%eax
   6: 03 45 08    add 0x8(%ebp),%eax
   9: 89 ec       mov %ebp,%esp
   b: 5d          pop %ebp
   c: c3          ret
   d: 8d 76 00    lea 0x0(%esi),%esi

Disassembler

objdump -d p
• Useful tool for examining object code
• Analyzes bit pattern of series of instructions

(29)

Pseudo Object Code

Y86 Simulator

Source file (.ys)

Loop: mrmovl (%ecx),%esi   # get *Start
      addl %esi,%eax       # add to sum
      irmovl $4,%ebx
      addl %ebx,%ecx       # Start++
      irmovl $-1,%ebx
      addl %ebx,%edx       # Count--
      jne Loop             # Stop when 0
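To follow what the loop computes, here is a line-by-line sketch of its behavior. It assumes, based on the comments in the listing, that %ecx holds Start, %edx holds Count, and %eax starts at 0 as the running sum; the function name `run_sum_loop` and the word-addressed `memory` dictionary are illustrative simplifications, not part of the Y86 simulator.

```python
MASK32 = 0xFFFFFFFF

def run_sum_loop(memory, start, count):
    """Mirror the Loop body; `memory` maps word addresses to 32-bit values.

    As in the assembly, the body runs before the test, so count must be >= 1.
    """
    eax, ecx, edx = 0, start, count  # assumed initial register contents
    while True:
        esi = memory[ecx]            # mrmovl (%ecx),%esi   get *Start
        eax = (eax + esi) & MASK32   # addl %esi,%eax       add to sum
        ebx = 4                      # irmovl $4,%ebx
        ecx = (ecx + ebx) & MASK32   # addl %ebx,%ecx       Start++
        ebx = -1                     # irmovl $-1,%ebx
        edx = (edx + ebx) & MASK32   # addl %ebx,%edx       Count-- (sets ZF)
        zf = edx == 0
        if zf:                       # jne Loop falls through when ZF = 1
            break
    return eax

array = {0x100: 0xd, 0x104: 0xc0, 0x108: 0xb00, 0x10c: 0xa000}
print(hex(run_sum_loop(array, 0x100, 4)))  # 0xabcd
```

Note that the constants 4 and -1 must first be loaded into %ebx with irmovl, since Y86 addition only works on register data.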

(30)

.yo file

0x057: 506100000000 | Loop: mrmovl (%ecx),%esi
0x05d: 6060         |       addl %esi,%eax
0x05f: 308304000000 |       irmovl $4,%ebx
0x065: 6031         |       addl %ebx,%ecx
0x067: 3083ffffffff |       irmovl $-1,%ebx
0x06d: 6032         |       addl %ebx,%edx
0x06f: 7457000000   |       jne Loop

1. Label = address computed by the assembler (true of any assembler)
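How an assembler arrives at the label's address can be sketched with a single pass over the encoded bytes from this listing. The starting address 0x057 is taken from the listing; the names `LISTING` and `assign_addresses` are hypothetical.

```python
# (label, encoded bytes as hex) for each line of the listing above.
LISTING = [
    ("Loop", "506100000000"),  # mrmovl (%ecx),%esi
    (None,   "6060"),          # addl %esi,%eax
    (None,   "308304000000"),  # irmovl $4,%ebx
    (None,   "6031"),          # addl %ebx,%ecx
    (None,   "3083ffffffff"),  # irmovl $-1,%ebx
    (None,   "6032"),          # addl %ebx,%edx
    (None,   "7457000000"),    # jne Loop
]

def assign_addresses(listing, origin):
    """Give each instruction an address and record the address of each label."""
    symbols, addresses, pc = {}, [], origin
    for label, hex_bytes in listing:
        if label is not None:
            symbols[label] = pc
        addresses.append(pc)
        pc += len(hex_bytes) // 2  # two hex digits per byte
    return symbols, addresses

symbols, addresses = assign_addresses(LISTING, 0x057)

# The 4 bytes after the jne opcode hold the destination, little-endian.
jne_target = int.from_bytes(bytes.fromhex(LISTING[-1][1])[1:], "little")
print(hex(symbols["Loop"]), hex(jne_target))  # 0x57 0x57
```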

(31)

CISC Instruction Sets

Complex Instruction Set Computer

Dominant style through mid-80’s

Stack-oriented instruction set

Use stack to pass arguments, save program counter

Explicit push and pop instructions

Arithmetic instructions can access memory

addl %eax, 12(%ebx,%ecx,4)

• requires memory read and write
• Complex address calculation

Condition codes

Set as side effect of arithmetic and logical instructions

Philosophy

(32)

RISC Instruction Sets

Reduced Instruction Set Computer

Internal project at IBM, later popularized by Hennessy (Stanford) and Patterson (Berkeley)

Fewer, simpler instructions

Might take more to get given task done

Can execute them with small and fast hardware

Register-oriented instruction set

Many more (typically 32) registers

Use for arguments, return pointer, temporaries

Only load and store instructions can access memory

Similar to Y86 mrmovl and rmmovl

(33)

(34)

MIPS Instruction Examples

R-R          Op | Ra | Rb | Rd | 00000 | Fn

addu $3,$2,$1       # Register add: $3 = $2+$1

R-I          Op | Ra | Rb | Immediate

addu $3,$2, 3145    # Immediate add: $3 = $2+3145
sll $3,$2,2         # Shift left: $3 = $2 << 2

Branch       Op | Ra | Rb | Offset

Load/Store   Op | Ra | Rb | Offset

lw $3,16($2)        # Load Word: $3 = M[$2+16]

(35)

CISC vs. RISC

Original Debate

Strong opinions!

• CISC proponents: easy for compiler, fewer code bytes
• RISC proponents: better for optimizing compilers, can make run fast with simple chip design

Current Status

For desktop processors, choice of ISA not a technical issue
• With enough hardware, can make anything run fast
• Code compatibility more important

For embedded processors, RISC makes sense
• Smaller, cheaper, less power

(36)

Summary

Y86 Instruction Set Architecture

Similar state and instructions as IA32

Simpler encodings

Somewhere between CISC and RISC

How Important is ISA Design?

Less now than before

• With enough hardware, can make almost anything go fast

Intel is moving away from IA32
• Does not allow enough parallel execution
• Introduced IA64
    • 64-bit word sizes (overcome address space limitations)
    • Radically different style of instruction set with explicit parallelism
    • Requires sophisticated compilers