Compiler Design
Lecture 13: Code generation : Memory management and
function call
(EaC Chapter 6&7)
Christophe Dubach 22 February 2021
Table of contents
Memory management
Static Allocation & Alignment Stack allocation
Address of expressions
Function calls
Static versus Dynamic
Static allocation: storage can be allocated directly by the compiler by simply looking at the program at compile-time. This implies that the compiler can infer storage size information.
Dynamic allocation: storage needs to be allocated at run-time due to unknown size or function calls.
Heap, Stack, Static storage
high address low address stack heap text data static allocation dynamic allocation Static storage: Text: instructions Data global variables string literals global arrays of fixed size . . .
Dynamic Storage: Heap: malloc Stack:
local variables
function arguments/return values saved registers
register spilling (register allocation)
Example
high address low address stack heap text data static allocation dynamic allocation c h a r c ; data i n t a r r [ 4 ] ; data v o i d f o o ( ) { i n t a r r 2 [ 3 ] ; stack i n t* p t r = heap ( i n t*) m a l l o c ( s i z e o f ( i n t ) * 2 ) ; . . . { i n t b ; stack . . . b a r ( ” h e l l o ” ) ; data } . . . } 5Memory management
Static Allocation & Alignment
Scalars
Typically
int and pointer types (e.g. char*, int *, void*) are 32 bits (4 byte).
charis 1 byte
However, it depends on thedata alignmentof the architecture.
A singlechar might occupy 4 bytes if data alignment is 4 bytes.
Example (static allocation):
. d a t a c : . s p a c e 1 # c h a r i : . s p a c e 4 # int . t e x t lb $t0, c lw $t1, i # e r r o r !
Miss-aligned load ( forbidden!)
. d a t a c : . s p a c e 4 # c h a r i : . s p a c e 4 # int . t e x t lb $t0, c lw $t1, i # all g o o d !
With padding addedé
Structures
In a C structure, all values are aligned to the data alignment of the
architecture (unlesspackeddirective is used).
Example C code (static allocation) s t r u c t m y S t r u c t t { c h a r c ; i n t x ; } ; s t r u c t m y S t r u c t t ms ; . . .
In this example, it is as if the value c uses 4 bytes of data.
. d a t a m s _ m y S t r u c t _ t _ c : . s p a c e 4 m s _ m y S t r u c t _ t _ x : . s p a c e 4 . t e x t ... 7
Arrays
In contrast to structs, arrays are always compact in C with extra padding added in the end if required.
Example C code (static allocation): c h a r a r r [ 7 ] ;
i n t i ; . . .
Corresponding assembly code:
. d a t a arr : . s p a c e 8 i : . s p a c e 4 . t e x t ... 8
Memory management
Stack allocation
Stack variable allocation
The compiler needs to keep track of local variables in functions. These cannot be allocated statically (i.e. text section).
i n t f o o ( i n t i ) { i n t a ; a = 1 ; i f ( i ==0) f o o ( 1 ) ; p r i n t ( a ) ; a = 2 ; } v o i d b a r ( ) { f o o ( 0 ) ; }
If a allocated statically, what is printed?
1 2
Local variables must be stored on the stack!
How to keep track of local variables on the stack? ⇒ Could use the stack pointer.
* Problem: stack pointer can move.
e.g. dynamic memory allocatin on the stack
Solution: use another pointer, theframe pointer
Frame pointer
The frame pointer must be initialised to the value of the stack
pointer, just when entering the function.
Access to variables allocated on the stack can then be determined as
a fixed offset from the frame pointer.
The frame pointer(FP) always points to the beginning of the local variables of the current function, just after the arguments (if any).
The stack pointer(SP) always points
at the top of the stack (points to the last element pushed).
FP
SP call stack foo stack frame bar stack frame return address local variables saved registers (return value) frame pointer (arguments) 11
Memory management
Address of expressions
Visitor for generating addresses
Sometimes, the compiler needs to know the address of an expression (e.g. assignment, address-of operator):
s t r u c t v e c t { i n t x ; i n t y ; } ; v o i d f o o ( ) { i n t i ; s t r u c t v e c t v ; i n t a r r [ 1 0 ] ; i = 2 ; v . x = 3 ; a r r [ 2 ] = 5 ; } 12
Visitor for generating addresses
We can write a specialized visitor to produce the address of an expression.
AddrGen Visitor R e g i s t e r v i s i t B i n O p ( BinOp bo ) { e r r o r ( ” C a n n o t r e q u e s t a d d r e s s f o r BinOP ” ) ; } R e g i s t e r v i s i t I n t L i t e r a l ( I n t L i t e r a l i t ) { e r r o r ( ” C a n n o t r e q u e s t a d d r e s s f o r I n t L i t e r a l ” ) ; } R e g i s t e r v i s i t V a r E x p r ( V a r E x p r v ) { R e g i s t e r r e s R e g = n e w V i r t u a l R e g i s t e r ( ) ; i f ( v . vd . i s S t a t i c A l l o c a t e d ( ) ) e m i t ( ” l a ” , r e s R e g , v . vd . l a b e l ) ; e l s e i f ( v . cd . i s S t a c k A l l o c a t e d ( ) ) . . . r e t u r n r e s R e g ; } . . . 13
The AddrGen visitor will be called from the “normal” code generator visitor when needed:
Code generator visitor
. . . R e g i s t e r v i s i t A s s i g n ( A s s i g n a ) { R e g i s t e r a d d r R e g = a . l h s . a c c e p t ( new AddrGen ( ) ) ; R e g i s t e r v a l R e g = a . r h s . a c c e p t ( t h i s ) ; e m i t ( ” sw ” , v a l R e g , a d d r R e g ) ; r e t u r n n u l l ; // d i f f e r e n t f r o m C } 14
Last words
Examples above have assumed variable stores an interger.
In case a variable represents an array or struct, you have to be careful: array are passed by reference
(treat them exactly like pointers, no big deal) struct are passed by values
Assigning between two structs means copying field by field. Hence your code generator must check the type of variables when encountering an assignment and handle structures correctly.
Similar problem for struct and function call. Arguments and return values that are struct are passed by values.
Code with struct assignment: s t r u c t v e c t { i n t x ; i n t y ; } ; v o i d f o o ( ) { s t r u c t v e c t v1 ; s t r u c t v e c t v2 ; v1 = v2 ; } Equivalent to: s t r u c t v e c t { i n t x ; i n t y ; } ; v o i d f o o ( ) { s t r u c t v e c t v1 ; s t r u c t v e c t v2 ; v1 . x = v2 . x ; v1 . y = v2 . y ; } ÿ Pro tip
Instead of handling this complexity in the code generator, write a pass
that runsbefore code generationto transform the AST to “inline” the
struct assignments field by field.
Function calls
i n t b a r ( i n t a ) { r e t u r n 3+a ; } v o i d f o o ( ) { . . . b a r ( 4 ) . . . } foois thecaller
baris thecallee
What happens during a function call? The caller needs to pass the arguments
to the callee
The callee needs to pass the return value to the caller
But also:
The values stored in registers needs to be saved somehow.
Need to remember where we came
from to return to thecall site.
Possible convention: precall:
pass the arguments via registers or push on the stack
reserve space on stack for return value (if needed)
push return address on the stack postreturn:
restore return address from the stack read the return value from dedicated
register or stack reset stack pointer
precall postreturn call return ... ... prologue epilogue ... function foo function bar FP
SP call stack foo stack frame bar stack frame return address local variables saved registers (return value) frame pointer (arguments) 18
prologue:
push frame pointer onto the stack initialise the frame pointer
save all the saved registers onto the stack reserve space on the stack for local
variables epilogue:
restore saved registers from the stack restore the stack pointer
restore the frame pointer from the stack
precall postreturn call return ... ... prologue epilogue ... function foo function bar FP
SP call stack foo stack frame bar stack frame return address local variables saved registers (return value) frame pointer (arguments) 19
ÿ Other conventions are possible.
To simplify (for your project), we suggest you:
save all the registers used by a function onto the stack; pass all arguments and return value via the stack
(this is needed anyway when there are more than four arguments).
Example (callee) i n t b a r ( i n t a ) { i n t b ; r e t u r n 3+a ; } b a r : a d d i $sp, $sp, −4 # sw $fp, ($sp) # p u s h f r a m e p o i n t e r on t h e s t a c k move $fp, $sp # i n i t i a l i s e t h e f r a m e p o i n t e r a d d i $sp, $sp, −4 # sw $t0, ($sp) # p u s h $t0 onto the s t a c k a d d i $sp, $sp, −4 # sw $t1, ($sp) # p u s h $t1 onto the s t a c k a d d i $sp, $sp, −4 # r e s e r v e s p a c e on s t a c k f o r b l i $t0, 3 # l o a d 3 i n t o $t0 l w $t1, 1 2 ($fp) # l o a d f i r s t a r g u m e n t f r o m s t a c k
add $t0, $t0, $t1 # add $t0 and f i r s t argument
sw $t0, 8 ($fp) # c o p y t h e r e t u r n v a l u e on s t a c k l w $t0, −4($fp) # r e s t o r e $t0 l w $t1, −8($fp) # r e s t o r e $t1 a d d i $sp, $sp, 16 # r e s t o r e s t a c k p o i n t e r l w $fp, ($fp) # r e s t o r e t h e f r a m e p o i n t e r j r $ra # j u m p s t o r e t u r n a d d r e s s 21
Example (caller) v o i d f o o ( ) { . . . b a r ( 4 ) . . . } f o o : . . . l i $t0, 4 # a d d i $sp, $sp, −4 # sw $t0, ($sp) # p u s h a r g u m e n t on s t a c k a d d i $sp, $sp, −4 # r e s e r v e s p a c e on s t a c k f o r r e t u r n v a l u e a d d i $sp, $sp, −4 # sw $ra, ($sp) # p u s h r e t u r n a d d r e s s on s t a c k j a l b a r # c a l l f u n c t i o n l w $ra, ($sp) # r e s t o r e r e t u r n a d d r e s s f r o m t h e s t a c k l w $t0, 4 ($sp) # r e a d r e t u r n v a l u e f r o m s t a c k a d d i $sp, $sp, 12 # r e s e t s t a c k p o i n t e r . . . 22
Final words
Beware of returning structs! Needs to be passed byvalue.
C code example: s t r u c t v e c t { i n t x ; i n t y ; } s t r u c t v e c t f o o ( ) { s t r u c t v e c t v ; v . x = 0 ; v . y = 1 ; r e t u r n v ; } v o i d b a r ( ) { s t r u c t v e c t t myvec ; myvec = f o o ( ) ; }
Transform into equivalent C code: s t r u c t v e c t { i n t x ; i n t y ; } v o i d f o o ( s t r u c t v e c t * r e s u l t ){ s t r u c t v e c t v ; v . x = 0 ; v . y = 1 ; * r e s u l t = v ; } v o i d b a r ( ) { s t r u c t v e c t t myvec ; s t r u c t v e c t t r e s u l t ; f o o (& r e s u l t ) ; myvec = r e s u l t ; } 23
ÿ Pro tip
Again, instead of handling this complexity in the code generator, write
a pass that runsbefore code generationto transform the AST to deal
with function returning struct.
Then run the pass that transforms the AST to deal with assignment between structs.
Next lecture
Very naive register allocator.