
Lecture 14

Project, Assembler and Exam

Emma Söderberg

Revised by Emma Söderberg on March 5, 2013.

Based on slides by Görel Hedin and Lennart Andersson.

EDA180: Compiler Construction F14-1

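Compiler phases and program representations

Phases, each with its input and output representation:

- Lexical analysis (scanning): source code → tokens
- Syntactic analysis (parsing): tokens → AST
- Semantic analysis: AST → attributed AST
- Intermediate code generation: attributed AST → intermediate code
- Optimization: intermediate code → intermediate code
- Machine code generation: intermediate code → machine code

The earlier phases form the frontend (analysis); the later phases form the backend (synthesis).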

Today

- Project
- Intel assembler
- Exam
- Repetition
- Beyond ..


Course Project

Build a compiler for your language

Standard project

In teams of 2 persons.

Prerequisites:
- Approved assignments
- Assignment supervisor may grant postponement

Design a small procedural language:
- integer and boolean types
- variables, constants, expressions, statements, ...
- block structure with nested procedures
- parameters, return values, recursion
- name analysis
- type analysis
- intermediate code generation
- assembly code generation


Non-standard project

Design a language of your choice.

Must be accepted by project supervisor in advance.

Should be approximately the same size as the standard project.

Typical requirements:
- non-trivial grammar
- non-trivial name analysis
- significant semantic computations
- translation to some intermediate code
- translation to native code


Project administration

Estimated work load: 40 hours (20-80)

Administration:
- Report your project group to your assignment supervisor.
- Your assignment supervisor will be your project supervisor.
- Book a meeting with your supervisor.
- Three tasks: Design, Front end, Back end.
- Three deadlines: March 24, April 22, and May 6 (also on the course webpage).

Project supervisors:
- Niklas Fors
- Jesper Öqvist


Project administration: Repository

Git (recommended):
- Private repository – don’t assist plagiarism. View the section on "Cooperation or Plagiarism" on the department web page. Note that this excludes GitHub.
- BitBucket:
  - Private Git repositories
  - Academic license
  - Used by your supervisor
- Set up your own and give access to your supervisor.

Subversion:
- We can set up a repository for you.

Intel Assembler

Generate assembler from ICode

Tools: as, ld, gcc


Intel 386/486/Pentium processor architecture

General-purpose registers:
- EAX, EBX, ECX, EDX, ESI, EDI
- ESP – stack pointer
- EBP – base pointer

Instruction pointer: EIP

Segment registers: CS, DS, ES, SS

Flags register:
- EFLAGS – 32 bits used to store results of comparisons.


Register structure

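Structure of the EAX register (bits):

 31      24 23      16 15       8 7        0
+----------+----------+----------+----------+
|          |          |    AH    |    AL    |
+----------+----------+----------+----------+
                      |<-------- AX ------->|
|<------------------ EAX ------------------>|

- AL, AH – 8-bit registers.
- AX – a 16-bit register.
- EAX – AX extended to 32 bits.
- EBX, ECX, and EDX have the same structure.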
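The sub-register layout can be mimicked with shifts and masks. The Java sketch below is only an illustration: the class and method names are made up here, and an ordinary int stands in for EAX.

```java
// Illustrative only: models the EAX sub-registers with plain int arithmetic.
final class RegisterViews {
    // AX is the low 16 bits of EAX.
    static int ax(int eax) { return eax & 0xFFFF; }

    // AH is bits 15..8 of EAX.
    static int ah(int eax) { return (eax >>> 8) & 0xFF; }

    // AL is bits 7..0 of EAX.
    static int al(int eax) { return eax & 0xFF; }

    public static void main(String[] args) {
        int eax = 0x12345678; // arbitrary example value
        // Prints: AX=5678 AH=56 AL=78
        System.out.printf("AX=%04X AH=%02X AL=%02X%n", ax(eax), ah(eax), al(eax));
    }
}
```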


Program example

        .data                   # allocating memory
n:      .long 234               # the number
length: .long 0                 # the result
ten:    .long 10                # the divisor
        .text                   # instructions
        .global _start          # make _start globally known
_start: movl $0, %ebx           # use ebx as counter
        movl n, %eax            # copy number to eax
nextdigit:
        movl $0, %edx           # prepare for long division
        idivl ten               # divide combined edx:eax by 10
                                # quotient to eax
        addl $1, %ebx           # add 1 to counter
        cmpl $0, %eax           # compare eax to 0
        jg nextdigit            # jump if eax>0
        movl %ebx, length       # copy counter to memory

Variables may have predetermined locations in memory and be referred to by name.
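For readers more used to high-level code, the loop in the program example can be read as the following Java sketch. The class and method names are made up for illustration; the comments map each statement back to the assembly instruction it mirrors.

```java
// Illustrative Java rendering of the digit-counting loop in the assembly example.
final class DigitCount {
    // Counts the decimal digits of a non-negative number.
    static int length(int n) {
        int counter = 0;               // movl $0, %ebx
        int quotient = n;              // movl n, %eax
        do {
            quotient = quotient / 10;  // idivl ten (quotient ends up in eax)
            counter = counter + 1;     // addl $1, %ebx
        } while (quotient > 0);        // cmpl $0, %eax ; jg nextdigit
        return counter;                // movl %ebx, length
    }

    public static void main(String[] args) {
        System.out.println(length(234)); // prints 3
    }
}
```

The body runs before the test, as in the assembly version, so the number 0 is reported as having one digit.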

Memory

Memory size:
- Every byte (b, 8 bits) has an address: 0, 1, ...
- word (w, 16 bits)
- long (l, 32 bits)

In the project:
- All variables reside on the stack.
- Memory for the stack is allocated by ld (default 2 MB).
- You will not need a .data segment!


Useful operand forms

Operand          Refers to
$1448            constant 1448 (base 10)
nextdigit        label address
%eax             value in eax
(%ebp)           value at address contained in ebp
4(%ebp)          value at 4 bytes after address in ebp
(%ebp,%eax,4)    value at address ebp+4*eax

The last three forms refer to values in main memory.
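The three memory forms differ only in how the address is computed. A small Java sketch of that arithmetic follows; the method name and the register values are assumptions chosen for illustration.

```java
// Illustrative model of AT&T memory operand forms; register values are made-up examples.
final class OperandForms {
    // Address denoted by disp(%base,%index,scale): base + index*scale + disp.
    static int effectiveAddress(int disp, int base, int index, int scale) {
        return base + index * scale + disp;
    }

    public static void main(String[] args) {
        int ebp = 1000; // assumed value of %ebp
        int eax = 3;    // assumed value of %eax
        System.out.println(effectiveAddress(0, ebp, 0, 1));   // (%ebp)        -> 1000
        System.out.println(effectiveAddress(4, ebp, 0, 1));   // 4(%ebp)       -> 1004
        System.out.println(effectiveAddress(0, ebp, eax, 4)); // (%ebp,%eax,4) -> 1012
    }
}
```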


Useful instructions

Instruction  Operands            Effect
movl         rmc32, rm32         rm32 ← rmc32
addl         rmc32, rm32         rm32 ← rm32 + rmc32
subl         rmc32, rm32         rm32 ← rm32 - rmc32
negl         rm32                rm32 ← -rm32
idivl        rm32                eax ← edx:eax / rm32
                                 edx ← remainder
notl         rm32                rm32 ← ~rm32, bitwise, false = 0
andl         rmc32, rm32         rm32 ← rm32 & rmc32, bitwise
orl          rmc32, rm32         rm32 ← rm32 | rmc32, bitwise
cmpl         rmc32_1, rmc32_2    compare by computing rmc32_2 - rmc32_1
leal         m32, r32            r32 ← address denoted by m32

Operand types: r – register, m – memory, c – constant.

An instruction can have at most one memory (m) operand.


Conditional and jump instructions

The results of comparisons (cmpl) end up in the EFLAGS register and may be used by succeeding instructions.

Condition codes (cc) set by the cmpl instruction:

cc        l    le   e    ne   g    ge
meaning   <    ≤    =    ≠    >    ≥

Jumps may be conditional:

jmp dest    jump unconditionally
je dest     jump if equal
jg dest     jump if greater
jcc dest    jump if cc (condition code)

Other conditional instructions:

setcc rm8           rm8 = cc ? 1 : 0
cmovcc rm32, r32    r32 = rm32 if cc
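The operand order of cmpl is easy to get backwards: in the AT&T syntax used here, "cmpl a, b" followed by "jg" jumps when b > a. The Java sketch below models that reading for signed values. It is an illustration only; the class and its methods are hypothetical and ignore how the individual EFLAGS bits are set.

```java
// Illustrative model of "cmpl a, b" followed by a test of condition code cc.
final class Compare {
    // True when the condition "b cc a" holds, e.g. cmpl $0, %eax ; jg L jumps when eax > 0.
    static boolean holds(String cc, int a, int b) {
        switch (cc) {
            case "l":  return b < a;
            case "le": return b <= a;
            case "e":  return b == a;
            case "ne": return b != a;
            case "g":  return b > a;
            case "ge": return b >= a;
            default:   throw new IllegalArgumentException("unknown condition code: " + cc);
        }
    }

    // Models setcc: 1 if the condition holds, otherwise 0.
    static int setcc(String cc, int a, int b) {
        return holds(cc, a, b) ? 1 : 0;
    }

    public static void main(String[] args) {
        int eax = 23;
        System.out.println(holds("g", 0, eax)); // cmpl $0, %eax ; jg ... -> true
        System.out.println(setcc("e", 0, eax)); // cmpl $0, %eax ; sete ... -> 0
    }
}
```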

Stack instructions

Instruction  Operand  Effect
pushl        rmc32    push value in rmc32
popl         rm32     pop to rm32

Example:

        pushl %ebx

Stack before (← towards address 0):

        | value |

Stack after:

        | ebx value | value |
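The push and pop behaviour can be sketched in Java as follows. This is a simplified model rather than a description of the hardware: the class is hypothetical, memory is an int array with one slot per 32-bit word, and the stack pointer is a word index, so it moves by 1 where the real ESP moves by 4 bytes.

```java
// Illustrative stack model: the stack grows towards lower indices (towards address 0).
final class StackModel {
    private final int[] memory = new int[16]; // each slot stands for one 32-bit word
    private int esp = memory.length;          // word index of the stack top; stack is empty

    // pushl: move the stack pointer down, then store the value.
    void pushl(int value) {
        esp -= 1;              // the real instruction subtracts 4 bytes from ESP
        memory[esp] = value;
    }

    // popl: load the value at the stack top, then move the stack pointer up.
    int popl() {
        int value = memory[esp];
        esp += 1;              // the real instruction adds 4 bytes to ESP
        return value;
    }

    int esp() { return esp; }

    public static void main(String[] args) {
        StackModel s = new StackModel();
        s.pushl(7);
        s.pushl(42);
        System.out.println(s.popl()); // prints 42
        System.out.println(s.popl()); // prints 7
    }
}
```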


Procedure calls

Instruction  Operands  Effect
call         c32       push return address and jump
ret                    pop return address and jump
int          c32       interrupt to kernel

Example:

        call p          # will push address of next instruction
        ...
p:
        ...
        ret             # will pop address and jump


C compiler conventions

- Arguments are pushed on the stack in reverse order in the caller’s activation record.
- Caller pops arguments after return.
- Callee must restore EBX, ESI, EDI, ESP, and EBP before returning.
- EAX is used for return values.


Debugging assembler

The ddd debugger (gdb):
- Inspect memory
- Inspect registers
- Step through program

The Exam


The exam

Regular exam: Wednesday March 13, 8-13, Sparta:D.

Next exam: Friday August 30, 8-13, Victoriastadion 1A.

One week advance registration is required for the August exam.

Allowed material at the exam:
- Manual page on JastAdd syntax.
- ICode reference.
- Dictionary between English and your native language.

Bonus points from the seminar exercises:
- Are counted at both the above examination dates, but not next year.

Prerequisites for writing the exam:
- Approved assignments.
- Assignment supervisor may grant postponement.


Old exams

See the course web site, but note that ...
- from 2008 a slightly different intermediate code is used.
- in 2003 and earlier, a slightly different JastAdd notation was used.

Now, walk-through of the exam from 2007-03-06 ...


Exam: Problem 1 – Lexical analysis

According to the Java Language Specification, an identifier is an unlimited-length sequence of Java letters and Java digits, the first of which must be a Java letter. Assume that a Java letter is one of a–z, A–Z, and that a Java digit is one of 0–9.

According to the Java code conventions a class identifier should start with a capital letter, a method name should start with a small letter, and all letters should be capital in a constant name.

a. Specify regular expressions for class and method identifiers according to the Java code conventions. You may use [a-z], but not more complex ranges like [a-zA-Z], as a regular expression denoting the language of all strings with one character from the specified range.

b. An identifier cannot have the same spelling as the null literal.

Construct a DFA recognizing class and method identifiers according to the Java code conventions and the literal null with distinct final states.
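One possible reading of the problem, sketched in Java with java.util.regex. This is not the official exam solution, and it uses regular expressions rather than a hand-built DFA; the class name, the pattern constants, and the classify helper are made up for illustration. The patterns only use single ranges such as [a-z], as the problem requires, and the null literal is given priority over method identifiers.

```java
import java.util.regex.Pattern;

// Illustrative classifier for class identifiers, method identifiers, and the null literal.
final class IdentifierKinds {
    // A class identifier starts with a capital letter.
    static final Pattern CLASS_ID = Pattern.compile("[A-Z]([a-z]|[A-Z]|[0-9])*");
    // A method identifier starts with a small letter.
    static final Pattern METHOD_ID = Pattern.compile("[a-z]([a-z]|[A-Z]|[0-9])*");

    static String classify(String s) {
        if (s.equals("null")) return "NULL";  // the literal wins over METHOD_ID
        if (CLASS_ID.matcher(s).matches()) return "CLASS_ID";
        if (METHOD_ID.matcher(s).matches()) return "METHOD_ID";
        return "ERROR";
    }

    public static void main(String[] args) {
        for (String s : new String[] {"Stmt", "getExpr", "null", "nullable", "9x"}) {
            System.out.println(s + " -> " + classify(s));
        }
    }
}
```

In a DFA the same priority rule shows up as distinct final states: the state reached after reading exactly n, u, l, l accepts the literal, and reading one more letter or digit moves on to the ordinary method identifier state.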

Exam: Problem 2 – Parsing

A qualified identifier in Java adheres to the grammar

        qualifiedID → qualifiedID "." qualifiedID
        qualifiedID → ID

where ID is an identifier token.

a. This grammar is ambiguous. Provide a string that has two different parse trees and draw the trees.

b. Construct an equivalent grammar in canonical form that is unambiguous.

c. Consider the language of all strings generated by the first grammar followed by a $ token. Construct a canonical LL(1) grammar for this language and present the LL(1) table.

d. Specify an equivalent EBNF grammar for the first grammar that is not recursive and requires just 1 token lookahead.
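As an illustration of part d, one non-recursive EBNF formulation is qualifiedID → ID ("." ID)*. The Java sketch below recognizes it with a single token of lookahead. It is a sketch under assumptions, not the official solution: tokens are represented as the strings "ID", "." and "$", end of input is treated as "$", and all names are hypothetical.

```java
import java.util.Arrays;
import java.util.List;

// Illustrative recognizer for: qualifiedID -> ID ( "." ID )* followed by "$".
final class QualifiedIdParser {
    private final List<String> tokens; // token kinds: "ID", ".", "$"
    private int pos = 0;

    QualifiedIdParser(List<String> tokens) { this.tokens = tokens; }

    // The single lookahead token; end of input counts as "$".
    private String peek() {
        return pos < tokens.size() ? tokens.get(pos) : "$";
    }

    private void accept(String expected) {
        if (!peek().equals(expected)) {
            throw new IllegalStateException("expected " + expected + " but found " + peek());
        }
        pos++;
    }

    // The EBNF repetition becomes a while loop instead of recursion.
    void qualifiedId() {
        accept("ID");
        while (peek().equals(".")) {
            accept(".");
            accept("ID");
        }
    }

    static boolean recognizes(List<String> tokens) {
        try {
            QualifiedIdParser p = new QualifiedIdParser(tokens);
            p.qualifiedId();
            p.accept("$");
            return true;
        } catch (IllegalStateException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(recognizes(Arrays.asList("ID", ".", "ID", "$"))); // true
        System.out.println(recognizes(Arrays.asList("ID", ".", "$")));       // false
    }
}
```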


Exam: Problem 3 – Semantic analysis

Consider the following fragment of an abstract grammar.

ProcedureDecl ::= Type <ID> Parameters Stmt;

abstract Stmt;

Assignment: Stmt ::= <ID> Expr;

IfStmt: Stmt ::= Expr Then:Stmt Else:Stmt;

Return: Stmt ::= Expr;

StmtList: Stmt ::= Stmt*;

a. Every execution path through the procedure block must terminate with a return statement. Construct a .jadd file with a method that checks this. Note that the following concrete program should not generate an error message.

        integer fac(integer n) {
          if (n==0) {
            return 1;
          } else {
            return n*fac(n-1);
          }
        }
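The idea behind part a can be sketched in plain Java. This is not a .jadd file and not the official solution: the node classes below are hand-written stand-ins for the classes JastAdd would generate, expressions are left out, and the method name alwaysReturns is made up. A statement list is taken to return on every path if any of its statements does.

```java
import java.util.Arrays;
import java.util.List;

// Illustrative check that every execution path through a statement ends in a return.
final class ReturnCheck {
    static abstract class Stmt {
        abstract boolean alwaysReturns();
    }

    static final class Assignment extends Stmt {
        boolean alwaysReturns() { return false; }
    }

    static final class Return extends Stmt {
        boolean alwaysReturns() { return true; }
    }

    static final class IfStmt extends Stmt {
        final Stmt thenPart, elsePart;
        IfStmt(Stmt thenPart, Stmt elsePart) { this.thenPart = thenPart; this.elsePart = elsePart; }
        // Both branches must return.
        boolean alwaysReturns() { return thenPart.alwaysReturns() && elsePart.alwaysReturns(); }
    }

    static final class StmtList extends Stmt {
        final List<Stmt> stmts;
        StmtList(Stmt... stmts) { this.stmts = Arrays.asList(stmts); }
        // Assumption: one statement that always returns is enough for the list.
        boolean alwaysReturns() {
            for (Stmt s : stmts) {
                if (s.alwaysReturns()) return true;
            }
            return false;
        }
    }

    public static void main(String[] args) {
        // Body of fac: if (...) { return ...; } else { return ...; }
        Stmt fac = new StmtList(new IfStmt(new StmtList(new Return()), new StmtList(new Return())));
        System.out.println(fac.alwaysReturns()); // true
    }
}
```

In JastAdd the same methods would be declared in an aspect as inter-type declarations on the generated node classes.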


Exam: Problem 3 – Semantic analysis

b. Assume that there is a traversing visitor:

        class TraversingVisitor implements Visitor {
          ...
          Object visit(IfStmt node, Object data) {
            node.getExpr().accept(this, data);
            node.getThen().accept(this, data);
            node.getElse().accept(this, data);
            return null;
          }
          ...
        }

Construct a subclass of this class that provides a method

static int numberOfReturns(ProcedureDecl node)

that will return the number of return statements in the node argument.
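A self-contained Java sketch of the counting visitor follows. Everything except the idea of subclassing TraversingVisitor is an assumption made so that the example compiles on its own: the node classes, the Visitor interface, and the field names are hand-written stand-ins for the generated JastAdd classes, and expressions are omitted.

```java
import java.util.Arrays;
import java.util.List;

// Illustrative visitor that counts Return nodes in a procedure body.
final class ReturnCounting {
    interface Visitor {
        Object visit(Assignment node, Object data);
        Object visit(IfStmt node, Object data);
        Object visit(Return node, Object data);
        Object visit(StmtList node, Object data);
    }

    static abstract class Stmt {
        abstract Object accept(Visitor v, Object data);
    }
    static final class Assignment extends Stmt {
        Object accept(Visitor v, Object data) { return v.visit(this, data); }
    }
    static final class Return extends Stmt {
        Object accept(Visitor v, Object data) { return v.visit(this, data); }
    }
    static final class IfStmt extends Stmt {
        final Stmt thenPart, elsePart;
        IfStmt(Stmt thenPart, Stmt elsePart) { this.thenPart = thenPart; this.elsePart = elsePart; }
        Object accept(Visitor v, Object data) { return v.visit(this, data); }
    }
    static final class StmtList extends Stmt {
        final List<Stmt> stmts;
        StmtList(Stmt... stmts) { this.stmts = Arrays.asList(stmts); }
        Object accept(Visitor v, Object data) { return v.visit(this, data); }
    }
    static final class ProcedureDecl {
        final Stmt body;
        ProcedureDecl(Stmt body) { this.body = body; }
    }

    // Default behaviour: traverse the children and do nothing else.
    static class TraversingVisitor implements Visitor {
        public Object visit(Assignment node, Object data) { return null; }
        public Object visit(Return node, Object data) { return null; }
        public Object visit(IfStmt node, Object data) {
            node.thenPart.accept(this, data);
            node.elsePart.accept(this, data);
            return null;
        }
        public Object visit(StmtList node, Object data) {
            for (Stmt s : node.stmts) s.accept(this, data);
            return null;
        }
    }

    // Only the Return case is overridden; the traversal is inherited.
    static final class ReturnCounter extends TraversingVisitor {
        private int count = 0;

        @Override
        public Object visit(Return node, Object data) {
            count++;
            return null;
        }

        static int numberOfReturns(ProcedureDecl node) {
            ReturnCounter counter = new ReturnCounter();
            node.body.accept(counter, null);
            return counter.count;
        }
    }

    public static void main(String[] args) {
        ProcedureDecl fac = new ProcedureDecl(
            new StmtList(new IfStmt(new StmtList(new Return()), new StmtList(new Return()))));
        System.out.println(ReturnCounter.numberOfReturns(fac)); // prints 2
    }
}
```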


Exam: Problem 4 – Code generation and run-time system

You are going to generate intermediate code for the printR procedure in

        void main() {
          int n;
          void printR(int k) {
            if (k >= 0) {
              printR(k-1);
              print(k);
            }
          }
          n = read();
          printR(n);
        }

Exam: Problem 4 – Code generation and run-time system

a. Introduce a Print instruction in ICode that can be used for the print statement in the example. You should specify the abstract and the context-free grammars.

b. What code should be generated for printR? Assume the same activation record layout as in the lectures, i.e. header, local variables, and temporaries, and that arguments are pushed on the stack by the caller.

You must not replace the recursive calls by iteration. You must use a labeling scheme that would avoid name clashes in more complex examples.

c. Draw a diagram showing the stack of activation records just before k=0 is printed for the case that n=2. You should indicate where the dynamic and static links point and the values of variables, parameters, and temporaries. The static links should be correct even if they are not used in this example.


Repetition

F14-F01


F14: Machine code generation

Overall knowledge about:

- Machine architecture with CPU, registers, and memory.


F13: Optimization

SSA form (Static Single Assignment)

- A powerful representation for optimization.

Typical optimizations at the intermediate code level:
- Dominance analysis.
- Copy propagation.
- Constant propagation.
- ...

Typical optimizations at the machine code level:
- Register allocation.
- Instruction scheduling (to take advantage of pipelining).

F12: Memory Management

Overall knowledge:

- The difference between manual and automatic memory management.
- Terminology: fragmentation, memory leak, dangling pointer, compaction, root pointer, ...
- Main ideas in the main algorithms: reference counting, mark-sweep, copying, generation-based, conservative, incremental, ...
- Main benefits and drawbacks of the different algorithms.

You don’t have to:
- Memorize the details of the algorithms.


F11: Intermediate Code

You should know:

- What different kinds of intermediate code there are.
- Why temporary variables are needed and how they are handled.
- Advantages of using intermediate code.
- Difference between intermediate code and machine code.
- Difference between a virtual machine and a real machine.
- How to translate a program to ICode.
- How to implement code generation based on the AST.

You don’t have to:
- Memorize the details of ICode; you may use the ICode reference on the exam.


F10: Run-time systems

You should know:

- Terminology: activation record, stack, stack pointer, frame pointer, static link, dynamic link, return address, object, heap, heap pointer, ...
- How procedure calls work, with parameter and return value transmission.
- How object creation works.
- How local and non-local variables in procedures are accessed.
- How different kinds of variables are accessed in an OO language.
- What v-tables are and how they are used in OO languages for method calls.
- How to draw the execution state at a given point in a given program.


F9: Attribute grammars

You should understand:

- General idea.
- What is the difference between inherited and synthesized attributes?

You should be able to:
- Compute values for synthesized and inherited attributes for a given attribute grammar.
- Perform name analysis using synthesized and inherited attributes.

F8: Name and type analysis

You should know:

- Terminology: name analysis, type analysis, scope, block, homogeneous blocks, declaration-before-use, bindings, symbol table, ...
- Different kinds of scope rules.
- The difference between IdDecls and IdUses.
- How to implement name analysis based on the AST.
- Typical kinds of errors that can occur during compilation, and what different compiler phases they are identified in.


F7: LR parsing

You should understand:

- The principles for how an LR parser works, LR items.
- Why LR is more powerful than LL.
- Typical kinds of unambiguous grammars that can be handled by an LR parser but not by an LL parser.
- Shift and reduce actions.
- What is meant by a Shift/Reduce or Reduce/Reduce conflict?


F6: AST computations, AOP, The visitor pattern

You should know:

- The Visitor pattern and how to use it.
- Intertype declarations (static Aspect-oriented programming) and how to use them.
- The benefits and drawbacks of these techniques, compared to each other and compared to writing tangled code.
- How to implement various computations using Visitors and Intertype declarations, e.g., unparsing, metrics, interpretation, name analysis, type checking, computation of information needed for code generation, ...


F5: Nullable, First and Follow, ... Abstract syntax trees

You should know:

- The principles for how an LL parser works.
- Intuitive definitions: nullable, FIRST, FOLLOW.
- How to construct the nullable, FIRST, and FOLLOW tables for any CFG.
- How to construct the LL(1) table for a CFG.
- How to decide if a grammar is LL(1) or not.
- The difference between a parse tree and an abstract syntax tree.
- The difference between a CFG and an abstract grammar.
- How to design an object-oriented abstract grammar with good names.
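The nullable and FIRST tables can be computed by iterating until nothing changes. The Java sketch below illustrates this for a small canonical grammar; FOLLOW follows the same fixed-point pattern and is left out to keep the example short. The grammar representation and all names are assumptions made for this example, and the sample grammar (S → Q $, Q → ID R, R → "." ID R, R → ε) is one possible LL(1) grammar for qualified identifiers, not necessarily the one used in the lectures.

```java
import java.util.*;

// Illustrative fixed-point computation of nullable and FIRST for a canonical CFG.
final class FirstSets {
    // A production lhs -> rhs; an empty rhs denotes the empty string.
    static final class Production {
        final String lhs;
        final List<String> rhs;
        Production(String lhs, String... rhs) { this.lhs = lhs; this.rhs = Arrays.asList(rhs); }
    }

    // Every symbol that occurs as a left-hand side is a nonterminal.
    static Set<String> nonterminals(List<Production> g) {
        Set<String> result = new HashSet<>();
        for (Production p : g) result.add(p.lhs);
        return result;
    }

    // X is nullable if some production X -> Y1 ... Yn has only nullable symbols (n may be 0).
    static Set<String> nullable(List<Production> g) {
        Set<String> nullable = new HashSet<>();
        boolean changed = true;
        while (changed) {
            changed = false;
            for (Production p : g) {
                if (!nullable.contains(p.lhs) && nullable.containsAll(p.rhs)) {
                    nullable.add(p.lhs);
                    changed = true;
                }
            }
        }
        return nullable;
    }

    // FIRST(X): the terminals that can start a string derived from X.
    static Map<String, Set<String>> first(List<Production> g) {
        Set<String> nts = nonterminals(g);
        Set<String> nullable = nullable(g);
        Map<String, Set<String>> first = new TreeMap<>();
        for (String n : nts) first.put(n, new TreeSet<>());
        boolean changed = true;
        while (changed) {
            changed = false;
            for (Production p : g) {
                for (String sym : p.rhs) {
                    if (nts.contains(sym)) {
                        // Copy first, in case sym and p.lhs are the same nonterminal.
                        changed |= first.get(p.lhs).addAll(new ArrayList<>(first.get(sym)));
                    } else {
                        changed |= first.get(p.lhs).add(sym);
                    }
                    if (!nullable.contains(sym)) break; // later symbols cannot come first
                }
            }
        }
        return first;
    }

    // Sample grammar: S -> Q $ ; Q -> ID R ; R -> . ID R ; R -> (empty)
    static List<Production> exampleGrammar() {
        return Arrays.asList(
            new Production("S", "Q", "$"),
            new Production("Q", "ID", "R"),
            new Production("R", ".", "ID", "R"),
            new Production("R"));
    }

    public static void main(String[] args) {
        List<Production> g = exampleGrammar();
        System.out.println("nullable: " + nullable(g));
        System.out.println("FIRST:    " + first(g));
    }
}
```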

F5: Nullable, First and Follow, ... Abstract syntax trees

You should know:

- How to write down an abstract grammar using the JastAdd notation.
- How to build ASTs using semantic actions.
- How to build the AST when an LL parser is used.

You don’t have to:
- Memorize the API for generated JastAdd classes; you may use the JastAdd manual page on the exam.
- Memorize the JJTree way for building ASTs.


F4: LL Parsing

You should know:

- The different names for LL parsing.
- How to implement an LL parser by hand using recursive procedures.
- Typical kinds of grammars that an LL(1) parser cannot accept.
- Given a CFG with some of these typical problems, construct an equivalent CFG that is LL(1).
- What is the difference between local lookahead and global lookahead?
- What the “dangling else” problem is and how to handle it in an LL parser generator.
- Why it is sometimes useful to extend a CFG by an EOF-rule, and how to do it.


F4: LL Parsing

You should know:

- What is meant by ambiguous and unambiguous grammars.
- Given an ambiguous grammar for expressions, construct an equivalent unambiguous grammar (given associativity and precedence rules).
- Typical kinds of unambiguous grammars that cannot be handled by an LL(1) parser.
- When could such grammars be LL(k)?
- Construct equivalent grammars that are LL(1).


F3: Context-free grammars and Parsing

You should know:

- How to design a clear and simple CFG for a language (disregarding ambiguities, non-LL-ness, etc.).
- Terminology: terminals, nonterminals, productions, start symbol.
- The formal definition of a CFG, G = (N, T, P, S), and what it means.
- The different notation forms for CFGs.
- Given a grammar in EBNF form, how to construct an equivalent grammar in canonical form, and vice versa.
- What is meant by (leftmost/rightmost) derivation.
- How to show that a string belongs to a given language.
- Typical notation for regular expressions.
- The difference between REs and CFGs.

F2: Regular expressions and Scanning

You should know:

- Typical kinds of tokens and non-tokens.
- How to define typical tokens and non-tokens using regular expressions.
- What typical ambiguities may occur for a set of token definitions?
- How can such ambiguities be resolved?
- What a finite automaton (FA) is.
- The difference between a deterministic and nondeterministic FA.
- How to translate an NFA to a DFA.
- How to implement a scanner based on FAs, including handling ambiguities between regular expressions.


F1: Introduction

You should know:

- The typical phases in a compiler.
- The typical representations of a program inside a compiler.
- The separation into analysis and synthesis.
- The separation into front end and back end.
- Typical applications of compiler construction techniques (in addition to the typical source-to-machine code compiler).


Beyond ..

Examples of compiler-related research:

- Development of programming editors – textual and graphical.
- Evaluation of reference attributes – incremental/parallel.
- Optimizing compilers for multiprocessors.
- ...

Examples of compiler-related Master’s thesis projects:
- Extend the Java language – Java 7, Lambda expressions ...
- Develop IDE for the Modelica Language (Modelon/Ideon)
- Optimize the JModelica compiler (Modelon/Ideon)
- ...

Let us know if you are interested in a Master’s thesis or PhD thesis project!
