
Systems Programming

02. C Programs in Space and Time

Alexander Holupirek

Database and Information Systems Group Department of Computer & Information Science

University of Konstanz

Summer Term 2008


C programs in (address) space and (run-)time

Where is my data and why do I have to know?

- C is closely related to the machine. Before we talk about pointers, storage allocation, etc., some background knowledge about address space, (virtual) memory, and its allocation during program execution comes in handy.
- Knowledge about the memory layout of a program is quite helpful when debugging.
- Knowing what happens inside the machine during program execution is fundamental both to debugging programs and, in the first place, to writing clean code.


Repetition Computer Architecture

Storage Classes

From Source Code To Executable Code

Construction of an Executable

Relocation Process


C, assembler, and machine code

C source code:

    int a, b;
    a = b * b;

Intel IA-32 assembler source code (machine instructions, i.e., processor instructions):

    mov  0x403030,%eax
    imul 0x403030,%eax
    mov  %eax,0x403020

Executable binary code (shown in hexadecimal; each address holds one byte, each row starts at the address of its first byte):

    Address   Content
    4012ee    a1 30 30 40 00
    4012f3    0f af 05 30 30 40 00
    4012fa    a3 20 30 40 00


C, assembler, and machine code

C source code:

    int a=4, b;

    int main(void) {
        if (a>5)
            b=1;
        else
            b=0;
    }

Executable binary code with assembler source code (memory address: memory content = machine instruction):

    8048344: 83 3d 94 94 04 08 05    cmpl $0x5,0x8049494
    804834b: 7e 0c                   jle  8048359
    804834d: c7 05 8c 95 04 08 01    movl $0x1,0x804958c
    8048354: 00 00 00
    8048357: eb 0a                   jmp  8048363
    8048359: c7 05 8c 95 04 08 00    movl $0x0,0x804958c
    8048360: 00 00 00
    8048363: c9 ...

a is located at address 0x8049494; b is located at address 0x804958c.

All numeric values in the binary and assembler code are hexadecimal.


Address Space

(Figure: the address space drawn as a column of memory addresses with their memory contents, from the lowest possible address 0 ("start of memory") to the highest possible address max. ("end of memory"). Every address designates a single byte; the example bytes at 0x50000000 and 0x50000001 are shown with the contents 0x56 and 0xfc.)

A data block of 16 bytes with start address 0x10000000 has 0x1000000f as its last byte address; 0x10000010 is the address of the first byte after the data block.


Byte Ordering

Data (4 bytes): d3 d2 d1 d0, where d3 is the MSB and d0 is the LSB. The program accesses the 4-byte data through address n.

    Address   Big-endian system   Little-endian system
    n         d3 (MSB)            d0 (LSB)
    n+1       d2                  d1
    n+2       d1                  d2
    n+3       d0 (LSB)            d3 (MSB)

MSB = most significant byte, LSB = least significant byte

Alignment Rules

Goal: Optimal Performance

- Determine the address locations for variables and instructions
- Great impact on compiler, assembler, and linker tools

(Figure: address space with hexadecimal addresses 0x34 to 0x3b and the data bus with byte address offsets +0 to +3. Long-word boundaries in the address space are the addresses divisible by 4 without remainder, here 0x34 and 0x38. A data long word stored at 0x35 to 0x38 is misaligned: it crosses the long-word boundary, so the data bus needs a first and a second access to transfer it.)


Alignment Rules (cont.)

For derived types [16] (constructed from the basic types), alignment rules apply to each individual component:

    struct artikel {
        char   name[5];
        int    anzahl;
        double preis;
    };

(Figure annotations: name has alignment(1), anzahl has alignment(4).)

Alignment rules may be influenced through compiler directives (for example, -malign-int aligns variables on 32-bit boundaries, producing code that runs somewhat faster on processors with 32-bit buses at the expense of memory).

[16] arrays, functions, pointers, structures, unions (we will discuss them later)


Storage Classes

Placement of data in memory depends on storage class

- An object, such as a variable, is a location in storage; its interpretation depends on two main attributes: its storage class and its type
- The storage class determines the lifetime of the storage associated with the identified object
- The type determines the meaning of the values found in the identified object
- In C we have two storage classes: automatic and static
- Storage class specifiers (auto, extern, register, static), together with the context of an object's declaration, specify its storage class


Automatic Storage Class

Automatic Objects

- auto and register give the declared objects automatic storage class and may be used only within functions
- Such objects are local to a block [17] and are discarded on exit from the block
- Declarations within a block create automatic objects if no storage class specifier is given or if auto is used
- Initialization of automatic objects is performed each time the block is entered at the top (if a jump into the block is executed, the initializations are not performed)
- Objects declared register are automatic and are (if possible) stored in fast registers of the machine
- The address operator '&' may not be applied to objects declared register

[17] also known as a "compound statement", such as the body of a function


Static Storage Class

Static Objects

- May be local to a block or external to all blocks
- In both cases, they retain their values across exit from and reentry to functions and blocks
- Within a block, static objects are declared with static
- Objects declared outside of all blocks (at the same level as function definitions) are always static
- At the outer level, the keyword static makes them local to a particular translation unit (internal linkage)
- They become global to the entire program when the explicit storage class is omitted or extern is used (external linkage)


Storage Class and Sections

Intermediate Summary

- An executing program uses storage not only for its instructions but also for data such as variables
- Variables may be temporary, dynamically allocated, or static (i.e., permanent in terms of storage allocation); initialized or uninitialized; and declared as constant (const) and thus read-only
- Placement of data in memory depends on its storage class
- During the translation process, the compiler uses sections to divide the address space into logical units
- Details vary with the operating system and compiler used


Typical Program Organisation

A typical program divides naturally into sections

Code: machine instructions; should be unmodifiable; size is known after compilation and does not change (.text)

Data:
- static data
  - initialized (.data) / uninitialized (.bss)
  - constant address in memory
  - permanent lifetime
- dynamic data
  - stack or heap
  - required storage space not known in advance
  - volatile lifetime

Program Sections

(Figure: placement of the sections in the address space.)

    Section   Memory        Note
    .text     PROM or RAM   write-protected
    .data     RAM
    .bss      RAM

PROM: programmable read-only memory (memory device that cannot be written during operation)

RAM: random access memory


Virtual Memory and Segments

Virtual Memory

- Whenever a process is created, the kernel provides a chunk of physical memory, which can be located anywhere
- Through the magic of virtual memory (VM), the process believes it has all the memory on the computer

Typically, the VM space is laid out in a similar manner:

- Text Segment (.text)
- Initialized Data Segment (.data)
- Uninitialized Data Segment (.bss)
- The Stack
- The Heap


A Program in Memory

(Figure: layout of a program in memory along the address axis, starting at address 0.)

    Region               Kind           Origin
    Code, constants      static data    loaded from the executable file
    Initialized data     static data    loaded from the executable file
    Uninitialized data   static data    provided at process start and initialized with 0 (cleared)
    Heap                 dynamic data   provided at process start for dynamic memory allocation; grows toward the stack
    Stack                dynamic data   provided at process start; grows toward lower addresses (or toward higher addresses, depending on the processor)


Different Memory Layouts

(Figure: two memory layouts drawn along the address axis starting at 0, each with the program start address marked and each containing code and constants, initialized data, uninitialized data, heap, and stack.)

(A) Layout on a PC (IA-32)

(B) Stack growing in the opposite direction


Memory Segments

Text Segment: The text segment contains the actual code (including constants) to be executed. It is usually sharable, so multiple instances of a program can share the text segment to lower memory requirements. This segment is usually marked read-only so that a program cannot modify its own instructions.

Initialized Data Segment: This segment contains global variables that are initialized by the programmer.

Uninitialized Data Segment: Also named .bss ("block started by symbol"), after an operator used by an old assembler. This segment contains uninitialized global variables. All variables in this segment are initialized to 0 or to NULL pointers before the program begins to execute.


Memory Segments (cont.)

The Stack: The stack is a collection of stack frames, which we will discuss later. When a new frame needs to be added (as a result of a newly called function), the stack grows downward.

The Heap: Dynamic memory, where storage can be allocated and deallocated via C's malloc(3) and free(3). The C library also gets dynamic memory for its own workspace from the heap. As more memory is requested "on the fly", the heap grows upward.


Variable Placement and Life Time (Code)

    #include <stdlib.h>    /* malloc(), free() */

    int a;
    static int b;

    void
    func(void)
    {
        char c;
        static int d;
    }

    int
    main(void)
    {
        int e;
        int *pi = (int *)malloc(sizeof(int));

        func();
        func();
        free(pi);
        return (0);
    }


Variable Placement and Life Time (Code)

    #include <stdlib.h>    /* malloc(), free() */

    int a;                 /* Permanent lifetime */
    static int b;          /* ditto, but reduced scope */

    void
    func(void)
    {
        char c;            /* only for the lifetime of func() */
                           /* but 2x; visible only in func() */
        static int d;      /* I'm unique, exist once at a stable */
                           /* address, visible only in func() */
    }

    int
    main(void)
    {
        int e;             /* lifetime of main() */
        int *pi = (int *)malloc(sizeof(int));    /* newborn */

        func();
        func();
        free(pi);          /* RIP, pi points to an invalid address */
        return (0);
    }

Variable Placement and Life Time (Diagram)

(Figure: the address space from 0 to max. at two points in time. t=0: program execution is started, i.e., the execution environment has already been initialized. t=x: an arbitrary point in time during program execution.)

    Region   Content
    Code     1st instruction, 2nd instruction, 3rd instruction, 4th instruction, ...; the program counter moves from PC(t=0) to PC(t=x)
    Data     a, b, int d
    Heap     the int that pi points to
    Stack    c, pi, e; the stack pointer moves from SP(t=0) to SP(t=x)


Variable Placement

Variables (outside a function): Globally declared variables go to the Uninitialized Data Segment if they are not initialized, and to the Initialized Data Segment otherwise. This distinction lets the OS decide whether storage has to be loaded with initialization data from the executable binary.

Variables (inside a function): Without a storage class specifier, auto is assumed and the variable goes to the stack. If declared static, the variable is placed like a variable outside a function (see above).

Constants (const): Text Segment

Function Parameters: Are pushed onto the stack or stored in registers. If pointers are passed, the data they point to resides elsewhere.


From source code to executable code

Translation Steps (multi-phase compilation)

Compilation: HLL source code to assembler source code

Assembly: assembler source code to object code

Linking: object code to executable code

Compilers and assemblers create object files containing the generated binary code and data for a source file. Linkers combine multiple object files into one; loaders take object files and load them into memory.

Goal: An executable binary file (a.out)

From high-level language (HLL) source code to executable code, i.e., concrete processor instructions in combination with data.


Translation steps using gcc(1)

    Stage          Input files                                       Output files
    Preprocessor   *.c/*.cc/*.cpp (C/C++ source code)                *.i/*.ii (preprocessed C/C++ source code)
    Compiler       *.i/*.ii (preprocessed C/C++ source code)         *.s (assembler source code)
    Assembler      *.s (assembler source code)                       *.o (object file, unlinked)
    Linker         *.o/*.a (unlinked object files, library files)    a.out (executable file = loadable object file)


File suffixes and their meaning

For any given input file, the file name suffix determines what kind of compilation is done (see gcc(1) for more details and suffixes):

    Suffix   Compilation step
    .c       C source code which must be preprocessed
    .i       C source code which should not be preprocessed
    .h       Header file to be turned into a precompiled header
    .s       Assembler code
    .o       An object file to be fed straight into linking


Creation of an executable file

(Figure: flow chart whose legend distinguishes operations, inputs or outputs, and commands.)

    Command   Operation   Input                                 Output
    gcc       compile     (Filename).c                          (Filename).s
    gas       assemble    (Filename).s                          (Filename).o
    ld        link        (Filename).o, object/library files    a.out


The C Preprocessor

The C preprocessor performs ...

- Inclusion of named files
- Macro substitution
- Conditional compilation

File Inclusion

A control line of the form

    #include <filename>

causes the replacement of that line by the entire contents of the file filename.

Note

The characters in the name filename must not include > or a newline, and the effect is undefined if it contains any of ", ', \, or /*.

Location

The named file is searched for in a sequence of implementation-dependent places (often starting in /usr/include).


Macro Substitution

A control line of the form

    #define identifier token-sequence

causes the preprocessor to replace subsequent instances of the identifier with the given sequence of tokens.

Example

    #define EXIT_FAILURE 1
    #define EXIT_SUCCESS 0

    #define S_IRWXU 0000700    /* RWX mask for owner */
    #define S_IRUSR 0000400    /* R for owner */
    #define S_IWUSR 0000200    /* W for owner */
    #define S_IXUSR 0000100    /* X for owner */


Macro Substitution (cont.)

A control line of the form

    #define identifier(identifier-list) token-sequence

where there is no space between the first identifier and the '(', is a macro definition with parameters given by the identifier list.

Example

    #define S_ISDIR(m)  ((m & 0170000) == 0040000)    /* directory */
    #define S_ISCHR(m)  ((m & 0170000) == 0020000)    /* char sp. */
    #define S_ISBLK(m)  ((m & 0170000) == 0060000)    /* block sp. */
    #define S_ISREG(m)  ((m & 0170000) == 0100000)    /* regular */
    #define S_ISFIFO(m) ((m & 0170000) == 0010000)    /* fifo */


Macro Substitution (cont.)

A control line of the form

    #undef identifier

causes the identifier's preprocessor definition to be forgotten. It is not erroneous to apply #undef to an unknown identifier.

Example

    /*
     * Some header files may define an abs macro.
     * If defined, undef it to prevent a syntax error
     * and issue a warning.
     * #warning is a pragma (implementation-dependent action)
     */
    #ifdef abs
    #undef abs
    #warning abs macro collides with abs() prototype, undefining
    #endif


Conditional Inclusion

Parts of a program may be compiled conditionally

Example

    #ifndef NULL
    #ifdef __GNUG__
    #define NULL __null
    #else
    #define NULL 0L
    #endif
    #endif


Predefined Names

Several identifiers are predefined and expand to produce special information. Neither they nor the preprocessor expression operator defined may be undefined or redefined.

    __LINE__   A decimal constant containing the current source line number
    __FILE__   A string literal containing the name of the file being compiled
    __DATE__   A string literal containing the date of compilation, in the form "Mmm dd yyyy"
    __TIME__   A string literal containing the time of compilation, in the form "hh:mm:ss"
    __STDC__   The constant 1. It is intended that this identifier be defined to be 1 only in standard-conforming implementations


Compilation

(Figure: compilation. The compiler takes HLL source code (text) as input and produces assembler source code (text) together with a compilation listing with error messages (text); it may use temporary files.)


Assembly

(Figure: assembly. The assembler takes assembler source code (text) as input and produces object format, i.e., machine code and additional information, together with a listing with error messages and the symbol table (text); it may use temporary files.)

Linking

(Figure: linking. The linker takes several inputs in object format (machine code and additional information) and, via library search, library object format (machine code and additional information). It produces absolute code or relocatable code with additional information (binary code or object format), together with a link map (address-space usage) and a symbol list (text); it may use temporary files.)


Program Section In Virtual Memory

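(Figure: program sections before and after linking.)

After compilation: each section starts at address 0; sections are "logical address spaces" of the compiler. The .text section (code) spans 0 to xx, and the .data section (initialized data) spans 0 to yy.

After linking: all sections are placed "absolutely" in the address space, which runs from 0 to 0xffffffff; the figure marks the addresses 0x08048244 and 0x08049370 as the places where the sections now start.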


Linking an Executable Binary

(Figure: linking turns unlinked object files into a linked, relocated executable file.)

    Input (unlinked object files):
        OBJ1:     .text1 .data1 .bss1
        OBJ2:     .text2 .bss2
        OBJ3:     .text3 .data3 .bss3

    Result of linking (executable file, linked and relocated):
        OBJtotal: .text1 .text2 .text3 .data1 .data3 .bss1 .bss2 .bss3

.text: code; .data: initialized variables; .bss: uninitialized variables

- Each object file (compiled separately) starts at address 0
- Linking them together involves
  - consolidation of sections of the same kind
  - relocation of addresses


Relocation Records

- Once the sections are placed one after another, relocation can start
- Executable code contains embedded addresses
  - Static data, function calls, jump targets
- On relocation, those addresses have to be changed inside the code
- Without a relocation table, this is not possible
- A relocation record holds the relative address (offset within the section) at which a symbol (name of a variable, a function, etc.) is referenced

    RELOCATION RECORDS FOR [.text]:
    OFFSET   TYPE      VALUE
    0000001a R_386_32  b
    00000023 R_386_32  a
    00000029 R_386_32  b


Source File: compile.c

    int a = 1;    /* Global variable, initialized -> .data */
    int b;        /* Global variable, uninitialized -> .bss */

    int
    main(void)
    {
        static int c;    /* Local, static variable -> .bss */

        b = 5;
        c = b + a + 16;
        return c;
    }

- Compile a relocatable object file

    cc -c compile.c (creates compile.o)

- Link an executable binary (one-step compilation)

    cc compile.c -o compile


Analysis of Object Files (compile.o)

    $ file compile.o
    ELF 32-bit LSB relocatable, Intel 80386, version 1, not stripped

    $ objdump -x compile.o

    compile.o:     file format elf32-i386
    compile.o
    architecture: i386, flags 0x00000011:
    HAS_RELOC, HAS_SYMS
    start address 0x00000000

    Sections:
    Idx Name      Size      VMA       LMA       File off  Algn
      0 .text     0000005a  00000000  00000000  00000034  2**2
                  CONTENTS, ALLOC, LOAD, RELOC, READONLY, CODE
      1 .data     00000004  00000000  00000000  00000090  2**2
                  CONTENTS, ALLOC, LOAD, DATA
      2 .bss      00000004  00000000  00000000  00000094  2**2
                  ALLOC
      3 .rodata   00000005  00000000  00000000  00000094  2**0
                  CONTENTS, ALLOC, LOAD, READONLY, DATA


Object File: compile.o (cont.)

    SYMBOL TABLE:
    00000000 l    df *ABS*   00000000 compile.c
    00000000 l    d  .text   00000000
    00000000 l    d  .data   00000000
    00000000 l    d  .bss    00000000
    00000000 l     O .bss    00000004 c.0
    00000000 l    d  .rodata 00000000
    00000000 g     O .data   00000004 a
    00000000 g     F .text   0000005a main
    00000004       O *COM*   00000004 b

    RELOCATION RECORDS FOR [.text]:
    OFFSET   TYPE      VALUE
    0000001a R_386_32  b
    00000023 R_386_32  a
    00000029 R_386_32  b
    00000031 R_386_32  .bss
    00000036 R_386_32  .bss
    0000004c R_386_32  .rodata

    compile.o:     file format elf32-i386

    Disassembly of section .text:

    00000000 <main>:
       0: 55                      push   %ebp
       1: 89 e5                   mov    %esp,%ebp
       3: 83 ec 18                sub    $0x18,%esp
       6: 83 e4 f0                and    $0xfffffff0,%esp
       9: b8 00 00 00 00          mov    $0x0,%eax
       e: 29 c4                   sub    %eax,%esp
      10: a1 00 00 00 00          mov    0x0,%eax
      15: 89 45 e8                mov    %eax,0xffffffe8(%ebp)
      18: c7 05 00 00 00 00 05    movl   $0x5,0x0
      1f: 00 00 00
      22: a1 00 00 00 00          mov    0x0,%eax
      27: 03 05 00 00 00 00       add    0x0,%eax
      2d: 83 c0 10                add    $0x10,%eax
      30: a3 00 00 00 00          mov    %eax,0x0
      35: a1 00 00 00 00          mov    0x0,%eax
      3a: 8b 55 e8                mov    0xffffffe8(%ebp),%edx
      3d: 3b 15 00 00 00 00       cmp    0x0,%edx
      43: 74 13                   je     58 <main+0x58>
      45: 83 ec 08                sub    $0x8,%esp
      48: ff 75 e8                pushl  0xffffffe8(%ebp)
      4b: 68 00 00 00 00          push   $0x0
      50: e8 fc ff ff ff          call   51 <main+0x51>
      55: 83 c4 10                add    $0x10,%esp
      58: c9                      leave


    compile.o:     file format elf32-i386

    Disassembly of section .text:

    00000000 <main>:
    int b;        /* Global variable, uninitialized -> .bss */

    int
    main(void)
    {
       0: 55                      push   %ebp
       ... 6 more lines ...
      15: 89 45 e8                mov    %eax,0xffffffe8(%ebp)
        static int c;    /* Local, static variable -> .bss */

        b = 5;
      18: c7 05 00 00 00 00 05    movl   $0x5,0x0
      1f: 00 00 00
        c = b + a + 16;
      22: a1 00 00 00 00          mov    0x0,%eax
      27: 03 05 00 00 00 00       add    0x0,%eax
      2d: 83 c0 10                add    $0x10,%eax
      30: a3 00 00 00 00          mov    %eax,0x0
        return c;
      35: a1 00 00 00 00          mov    0x0,%eax
    }
       ... 10 more lines ...


Executable Binary File: compile

    compile:     file format elf32-i386
    compile
    architecture: i386, flags 0x00000112:
    EXEC_P, HAS_SYMS, D_PAGED
    start address 0x1c000408

    Sections:
    Idx Name      Size      VMA       LMA       File off  Algn
    ...
      9 .text     00000214  1c000408  1c000408  00000408  2**2
                  CONTENTS, ALLOC, LOAD, READONLY, CODE
    ...
     12 .data     00000014  3c001008  3c001008  00001008  2**2
                  CONTENTS, ALLOC, LOAD, DATA
    ...
     20 .bss      00000184  3c003100  3c003100  00001100  2**5
                  ALLOC

    SYMBOL TABLE:
    3c003140 l     O .bss   00000004 c.0
    3c003280 g     O .bss   00000004 b
    1c0005c0 g     F .text  0000005a main
    3c001018 g     O .data  00000004 a


    1c0005c0 <main>:
    int b;        /* Global variable, uninitialized -> .bss */

    int
    main(void)
    {
    1c0005c0: 55                      push   %ebp
    1c0005c1: 89 e5                   mov    %esp,%ebp
    1c0005c3: 83 ec 18                sub    $0x18,%esp
    1c0005c6: 83 e4 f0                and    $0xfffffff0,%esp
    1c0005c9: b8 00 00 00 00          mov    $0x0,%eax
    1c0005ce: 29 c4                   sub    %eax,%esp
    1c0005d0: a1 00 31 00 3c          mov    0x3c003100,%eax
    1c0005d5: 89 45 e8                mov    %eax,0xffffffe8(%ebp)
        static int c;    /* Local, static variable -> .bss */

        b = 5;
    1c0005d8: c7 05 80 32 00 3c 05    movl   $0x5,0x3c003280
    1c0005df: 00 00 00
        c = b + a + 16;
    1c0005e2: a1 18 10 00 3c          mov    0x3c001018,%eax
    1c0005e7: 03 05 80 32 00 3c       add    0x3c003280,%eax
    1c0005ed: 83 c0 10                add    $0x10,%eax
    1c0005f0: a3 40 31 00 3c          mov    %eax,0x3c003140
        return c;
    1c0005f5: a1 40 31 00 3c          mov    0x3c003140,%eax
    }


Relocation Of An Assembler Instruction

During the linking process, relocated addresses are injected into the code, for example for the assignment b = 5;

Before relocation (relocatable 'compile.o'):

    18:       c7 05 00 00 00 00 05    movl $0x5,0x0

After relocation (executable 'compile'):

    1c0005d8: c7 05 80 32 00 3c 05    movl $0x5,0x3c003280

The proper address for b can be found in the symbol table.

    SYMBOL TABLE: (compile)
    3c003280 g     O .bss   00000004 b

- The symbol table for compile yields 3c003280 for variable b


Relocation Of An Assembler Instruction (cont.)

How do we find the right places in the machine code to perform the substitutions?

- The linker has the relocation record (relative address) of b

    RELOCATION RECORDS FOR [.text]: (compile.o)
    0000001a R_386_32  b

- The linker has the absolute address of main from the symbol table

    SYMBOL TABLE: (compile)
    3c003280 g     O .bss   00000004 b
    1c0005c0 g     F .text  0000005a main


Relocation Of An Assembler Instruction (cont.)

Putting it all together:

    RELOCATION RECORDS FOR [.text]: (compile.o)
    0000001a R_386_32  b                        (relative offset)

    SYMBOL TABLE: (compile)
    3c003280 g     O .bss   00000004 b          (abs. address of b)
    1c0005c0 g     F .text  0000005a main       (abs. address of main)

Computing the address where the substitution must be performed:

    1c0005c0 + 0000001a = 1c0005da

    18:       c7 05 00 00 00 00 05    movl $0x5,0x0
    1c0005d8: c7 05 80 32 00 3c 05    movl $0x5,0x3c003280

Address 1c0005da lies two bytes into the instruction at 1c0005d8. There the placeholder 00 00 00 00 is replaced by 80 32 00 3c, the address 3c003280 of b stored least significant byte first.
