• No results found

CS:APP Chapter 4 Computer Architecture Instruction Set Architecture

N/A
N/A
Protected

Academic year: 2021

Share "CS:APP Chapter 4 Computer Architecture Instruction Set Architecture"

Copied!
36
0
0

Loading.... (view fulltext now)

Full text

(1)

Randal E. Bryant

Carnegie Mellon University

CS:APP Chapter 4

Computer Architecture

Instruction Set

Architecture

http://csapp.cs.cmu.edu

(2)

Instruction Set Architecture

ISA Compiler OS CPU Design Circuit Design Chip Layout Application Program

Jeu d’instructions du processeur, ou plutôt du modèle de processeur.

(3)

X86 Evolution: Programmer’s View

Name

Date

Transistors

8086

1978

29K

„ 16-bit processor. Basis for IBM PC & DOS

„ Limited to 1MB address space. DOS only gives you 640K

386

1985

275K

„ Extended to 32 bits. Added “flat addressing”

„ Capable of running Unix

„ Linux/gcc uses no instructions introduced in later models

(4)

X86 Evolution: Programmer’s View

Name

Date

Transistors

Pentium III

1999

8.2M

„ Added “streaming SIMD” instructions for operating on 128-bit vectors of 1, 2, or 4 byte integer or floating point data

„ Our fish machines

Pentium 4

2001

42M

„ Added 8-byte formats and 144 new instructions for streaming SIMD mode

(5)

X86 Evolution: Clones

Advanced Micro Devices (AMD)

„ Historically

zAMD has followed just behind Intel zA little bit slower, a lot cheaper

„ Recently

zRecruited top circuit designers from Digital Equipment Corp. zExploited fact that Intel distracted by IA64

zNow are close competitors to Intel

(6)

New Species: IA64

Name

Date

Transistors

Itanium

2001

10M

„ Extends to IA64, a 64-bit architecture

„ Radically new instruction set designed for high performance

„ Will be able to run existing IA32 programs zOn-board “x86 engine”

„ Joint project with Hewlett-Packard

Itanium 2

2002

221M

(7)

%eax %ecx %edx %ebx %esi %edi %esp %ebp

Y86 Processor State

„ Program Registers

z Same 8 as with IA32. Each 32 bits

„ Program Counter

z Indicates address of instruction

Program registers Condition codes PC Memory OF ZF SF

(8)

%eax %ecx %edx %ebx %esi %edi %esp %ebp

Condition codes

z Single-bit flags set by arithmetic or logical instructions

» OF: Overflow ZF: Zero SF:Negative

z Il manque CF : Carry (retenue). On travaille donc sur des entiers relatifs (“avec signe”).

z Correspondance avec la notation NZVC :

» N = SF Z = ZF V = OF Program registers Condition codes PC Memory OF ZF SF

(9)

%eax %ecx %edx %ebx %esi %edi %esp %ebp

Memory

Program registers Condition codes PC Memory OF ZF SF

„Byte-addressable storage array

(10)

Machine Words

Machine Has “Word Size”

„ Nominal size of integer-valued data z Including addresses

„ Most current machines are 32 bits (4 bytes) z Limits addresses to 4GB

z Becoming too small for memory-intensive applications

„ High-end systems are 64 bits (8 bytes) z Potentially address 1.8 X 1019 bytes „ Machines support multiple data formats

z Fractions or multiples of word size z Always integral number of bytes

(11)

Word-Oriented Memory

Organization

Addresses Specify Byte

Locations

„ Address of first byte in word

„ Addresses of successive words differ by 4 (32-bit) or 8 (64-bit) 0000 0001 0002 0003 0004 0005 0006 0007 0008 0009 000a 000b 32-bit

Words Bytes Addr.

000c 000d 000e 000f 64-bit Words Addr = ?? Addr = ?? Addr = ?? Addr = ?? Addr = ?? Addr = ?? 0000 0004 0008 000c 0000 0008

(12)

Byte Ordering Example

Big Endian

„ Least significant byte has highest address

Little Endian

„ Least significant byte has lowest address

Example

„ Variable n has 4-byte representation 0x01234567

„ Address given by &n is 0x100

0x100 0x101 0x102 0x103 01 23 45 67 0x100 0x101 0x102 0x103 67 45 23 01 Big Endian Little Endian 01 23 45 67 67 45 23 01

(13)

Y86 Instructions

Format

„ 1--6 bytes of information read from memory z Can determine instruction length from first byte

z Not as many instruction types, and simpler encoding than with IA32

„ Each accesses and modifies some part(s) of the program state

„ PC incrémenté selon la longueur de l’instruction, pour pointer sur l’instruction suivante

(14)

Encoding Registers

Each register has 4-bit ID

„ Same encoding as in IA32

„ IA32 utilise seulement 3 bits pour coder un registre

Register ID 8 indicates “no register”

„ Will use this in our hardware design in multiple places %eax %ecx %edx %ebx %esi %edi %esp %ebp 0 1 2 3 6 7 4 5

(15)

Instruction Example

Addition Instruction

„ Add value in register rA to that in register rB z Store result in register rB

z Note that Y86 only allows addition to be applied to register data

„ Set condition codes based on result

„ e.g., addl %eax,%esi Encoding: 60 06 „ Two-byte encoding

z First indicates instruction type

z Second gives source and destination registers

addl rA, rB 6 0 rA rB

Encoded Representation Generic Form

(16)

Arithmetic and Logical Operations

„ Refer to generically as

“OPl”

„ Encodings differ only by “function code”

„ Set condition codes as side effect

„ Manquent (entre autres) : inc (incrémentation), dec, sh (shift = décalage) … addl rA, rB 6 0 rA rB subl rA, rB 6 1 rA rB andl rA, rB 6 2 rA rB xorl rA, rB 6 3 rA rB Add

Subtract (rA from rB)

And

Exclusive-Or

(17)

Arithmetic and Logical Operations (2)

Dans les architectures RISC (Reduced Instruction Set Computer) les opérations portent sur 3 registres :

deux registres sources pour les opérandes

un registre destination pour le résultat

Attention : ici (x86 ou Y86) le second registre sert aussi de destination, et le second opérande est donc écrasé.

(18)

Move Operations

„ Like the IA32 movl instruction

„ Simpler format for memory addresses

rrmovl rA, rB 2 0 rA rB Register --> Register

Immediate --> Register irmovl V, rB 3 0 8 rB V Register --> Memory rmmovl rA, D(rB) 4 0 rA rB D Memory --> Register mrmovl D(rB), rA 5 0 rA rB D

(19)

Move Instruction Examples

irmovl $0xabcd, %edx

movl $0xabcd, %edx 30 82 cd ab 00 00

IA32 Y86 Encoding

rrmovl %esp, %ebx

movl %esp, %ebx 20 43

mrmovl -12(%ebp),%ecx

movl -12(%ebp),%ecx 50 15 f4 ff ff ff rmmovl %esi,0x41c(%esp)

movl %esi,0x41c(%esp)

movl $0xabcd, (%eax)

movl %eax, 12(%eax,%edx)

movl (%ebp,%eax,4),%ecx

(20)

Load

mrmovl 12(%ebp),%ecx mrmovl (%ebp),%ecx mrmovl 0x1b8,%ecx

le registre ebp sert de pointeur adresse absolue de lecture

cas général (déplacement)

Store

rmmovl %esi,0x41c(%esp) rmmovl %esi,(%esp)

rmmovl %esi,0x41c

le registre esp sert de pointeur adresse absolue d’écriture

cas général (déplacement)

(21)

Adressage par déplacement

mrmovl 12(%ebp),%ecx cas général (déplacement)

rmmovl %esi,0x41c(%esp) cas général (déplacement) On charge le mot d’adresse ebp + 12 dans le registre ecx.

Load = charger = lire .

On sauvegarde le registre esi à l’adresse esp + 0x41c . Store = sauve(garde)r = écrire .

(22)

Jump Instructions

„ Refer to generically as “jXX”

„ Encodings differ only by “function code”

„ Based on values of condition codes

„ Same as IA32 counterparts

„ Encode full destination address

z Unlike PC-relative

addressing seen in IA32

jmp Dest 7 0

Jump Unconditionally

Dest

jle Dest 7 1

Jump When Less or Equal

Dest

jl Dest 7 2

Jump When Less

Dest

je Dest 7 3

Jump When Equal

Dest

jne Dest 7 4

Jump When Not Equal

Dest

jge Dest 7 5

Jump When Greater or Equal

Dest

jg Dest 7 6

Jump When Greater

(23)

Sauts conditionnels

‰Jump When Less : saut effectué si le résultat de la dernière

opération (addl, subl, andl, xorl) est négatif; ne pas oublier

l’overflow !

( SF = 1 et OF = 0 ) ou ( SF = 0 et OF = 1 )

en résumé SF OF

‰Jump When Equal : saut effectué si le résultat de la dernière

opération est nul, soit ZF = 1

‰Jump When Less or Equal : SF OF or ZF = 1

‰Jump When Greater or Equal : SF = OF

‰Jump When Not Equal : ZF = 0

‰Jump When Greater : SF = OF and ZF = 0

(24)

Sauts (suite et fin)

‰Un saut est effectué en modifiant PC .

‰Les instructions rrmovl, irmovl, rmmovl, mrmovl ne modifient

jamais les codes de condition.

‰Opérations logiques andl, xorl : ZF et SF ajustés selon résultat de l’opération, OF = 0 (clear).

¾ andl %eax, %eax # ne modifie pas le registre

je ... # saut effectué si eax = 0 jl ... # saut effectué si eax < 0

¾ xorl %eax, %ebx

je ... # saut effectué si eax = ebx (mais ce test écrase ebx)

(25)

Miscellaneous Instructions

„ Don’t do anything

„ Stop executing instructions

„ IA32 has comparable instruction, but can’t execute it in user mode

„ We will use it to stop the simulator

nop 0 0

(26)

Code for

sum

0x401040 <sum>: 0x55 0x89 0xe5 0x8b 0x45 0x0c 0x03 0x45 0x08 0x89 0xec 0x5d 0xc3

Object Code

Assembler

„ Translates .s into .o

„ Binary encoding of each instruction

„ Nearly-complete image of executable code

„ Missing linkages between code in different files

Linker

„ Resolves references between files

„ Combines with static run-time libraries

z E.g., code for malloc, printf

„ Some libraries are dynamically linked z Linking occurs when program begins

Total of 13 bytesEach instruction 1, 2, or 3 bytesStarts at address 0x401040

(27)

text text binary binary Compiler (gcc -S) Assembler (gcc or as) Linker (gcc or ld) C program (p1.c p2.c) Asm program (p1.s p2.s)

Object program (p1.o p2.o)

Executable program (p)

Static libraries (.a)

Turning C into Object Code

„ Code in files p1.c p2.c

„ Compile with command: gcc -O p1.c p2.c -o p

zUse optimizations (-O)

(28)

Disassembled

00401040 <_sum>: 0: 55 push %ebp 1: 89 e5 mov %esp,%ebp 3: 8b 45 0c mov 0xc(%ebp),%eax 6: 03 45 08 add 0x8(%ebp),%eax 9: 89 ec mov %ebp,%esp b: 5d pop %ebp c: c3 ret d: 8d 76 00 lea 0x0(%esi),%esi

Disassembling Object Code

Disassembler

objdump -d p

„ Useful tool for examining object code

„ Analyzes bit pattern of series of instructions

(29)

Pseudo code objet

Simulateur Y86

Loop: mrmovl (%ecx),%esi # get *Start addl %esi,%eax # add to sum irmovl $4,%ebx

addl %ebx,%ecx # Start++ irmovl $-1,%ebx

addl %ebx,%edx #

Count--jne Loop # Stop when 0

Fichier source .ys

0x057: 506100000000 |Loop: mrmovl (%ecx),%esi 0x05d: 6060 | addl %esi,%eax 0x05f: 308304000000 | irmovl $4,%ebx 0x065: 6031 | addl %ebx,%ecx 0x067: 3083ffffffff | irmovl $-1,%ebx 0x06d: 6032 | addl %ebx,%edx 0x06f: 7457000000 | jne Loop Fichier .yo Assembleur yas

(30)

Fichier .yo

0x057: 506100000000 |Loop: mrmovl (%ecx),%esi 0x05d: 6060 | addl %esi,%eax 0x05f: 308304000000 | irmovl $4,%ebx 0x065: 6031 | addl %ebx,%ecx 0x067: 3083ffffffff | irmovl $-1,%ebx 0x06d: 6032 | addl %ebx,%edx 0x06f: 7457000000 | jne Loop

1. Etiquette

=

adresse

calculée par l’assembleur

(vrai pour tout assembleur)

(31)

CISC Instruction Sets

„ Complex Instruction Set Computer

„ Dominant style through mid-80’s

Stack-oriented instruction set

„ Use stack to pass arguments, save program counter

„ Explicit push and pop instructions

Arithmetic instructions can access memory

„ addl %eax, 12(%ebx,%ecx,4)

z requires memory read and write z Complex address calculation

Condition codes

„ Set as side effect of arithmetic and logical instructions

Philosophy

(32)

RISC Instruction Sets

„ Reduced Instruction Set Computer

„ Internal project at IBM, later popularized by Hennessy (Stanford) and Patterson (Berkeley)

Fewer, simpler instructions

„ Might take more to get given task done

„ Can execute them with small and fast hardware

Register-oriented instruction set

„ Many more (typically 32) registers

„ Use for arguments, return pointer, temporaries

Only load and store instructions can access memory

„ Similar to Y86 mrmovl and rmmovl

(33)
(34)

MIPS Instruction Examples

Op Ra Rb Offset Op Ra Rb Rd 00000 Fn R-R Op Ra Rb Immediate R-I Load/Store

addu $3,$2,$1 # Register add: $3 = $2+$1

addu $3,$2, 3145 # Immediate add: $3 = $2+3145 sll $3,$2,2 # Shift left: $3 = $2 << 2

lw $3,16($2) # Load Word: $3 = M[$2+16]

Op Ra Rb Offset

Branch

(35)

CISC vs. RISC

Original Debate

„ Strong opinions!

„ CISC proponents---easy for compiler, fewer code bytes

„ RISC proponents---better for optimizing compilers, can make run fast with simple chip design

Current Status

„ For desktop processors, choice of ISA not a technical issue z With enough hardware, can make anything run fast

z Code compatibility more important

„ For embedded processors, RISC makes sense z Smaller, cheaper, less power

(36)

Summary

Y86 Instruction Set Architecture

„ Similar state and instructions as IA32

„ Simpler encodings

„ Somewhere between CISC and RISC

How Important is ISA Design?

„ Less now than before

z With enough hardware, can make almost anything go fast

„ Intel is moving away from IA32

z Does not allow enough parallel execution z Introduced IA64

» 64-bit word sizes (overcome address space limitations)

» Radically different style of instruction set with explicit parallelism » Requires sophisticated compilers

References

Related documents