Implementation - LR Parsers - Compiler Construction - Free Computer, Programming, Mathematics,

7.3 LR Parsers

7.3.5 Implementation

In order to carry out the parsing practically, a table of the left sides and lengths of the right sides of all productions (other than chain productions), as well as parser actions to be invoked at connection points, must be known to the transition function. The transition function is partitioned in this way to ease the storage management problems. Because of cost we store the transition function as a packed data structure and employ an access routine that locates the value $f(q, \nu)$ given $(q, \nu)$. Some systems work with a list representation of the (sparse) transition matrix; the access may be time consuming if such a scheme is used, because lists must be searched.

The access time is reduced if the matrix form of the transition function is retained, and the storage requirements are comparable to those of the list method if as many rows and columns as possible are combined. In performing this combination we take advantage of the fact that two rows can be combined not only when they agree, but also when they are compatible according to the following definition:

7.8 Definition

Consider a transition matrix $f(q, \nu)$. Two rows $q, q' \in Q$ are compatible if, for each column $\nu$, either $f(q, \nu) = f(q', \nu)$ or one of the two entries is a don't-care entry.

Compatibility is defined analogously for two columns $\nu, \nu' \in V$. We shall only discuss the combination of rows here.
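Definition 7.8 can be made concrete with a small sketch. The encoding below is an assumption made for illustration only: a row is a Python list with one entry per column, and `None` stands for a don't-care entry. The names `compatible` and `merge` and the sample rows are hypothetical and do not come from the text.

```python
DONT_CARE = None  # assumed encoding of a don't-care entry


def compatible(row_a, row_b):
    """Two rows are compatible if every column agrees or holds a don't-care."""
    return all(
        a == b or a is DONT_CARE or b is DONT_CARE
        for a, b in zip(row_a, row_b)
    )


def merge(row_a, row_b):
    """Combine two compatible rows, keeping the specified entry of each column."""
    if not compatible(row_a, row_b):
        raise ValueError("rows are not compatible")
    return [b if a is DONT_CARE else a for a, b in zip(row_a, row_b)]


# Illustrative rows (not taken from Figure 7.18).
row0 = ["-6", "4", None, None]
row1 = [None, "4", "5", None]
row2 = ["-6", "7", None, None]

merged = merge(row0, row1)          # ["-6", "4", "5", None]
clash = compatible(row0, row2)      # False: second column holds "4" and "7"
```

Note that compatibility is not transitive, so a group of rows can be combined only if every pair within the group is compatible.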

We inspect the terminal transition matrix, the submatrix of $f(q, \nu)$ with $\nu \in T$, separately from the nonterminal transition matrix. Often different combinations are possible for the two submatrices, and by exploiting them separately we can achieve a greater storage reduction. This can be seen in the case of Figure 7.18a, which is an implementation of the transition matrix of Figure 7.17. In the terminal transition matrix rows 0, 4, 5 and 6 are compatible, but none of these rows are compatible in the nonterminal transition matrix.

In order to increase the number of compatible rows, we introduce a Boolean failure matrix, $F[q, t]$, $q \in Q$, $t \in T$. This matrix is used to filter the access to the terminal transition matrix:

$$f(q, t) = \textbf{if } F[q, t] \textbf{ then } error \textbf{ else } \text{entry in the transition matrix}$$

For this purpose we define $F[q, t]$ as follows:

$$F[q, t] = \begin{cases} true & \text{if } f(q, t) = ERROR \\ false & \text{otherwise} \end{cases}$$

Figure 7.18b shows the failure matrix derived from the terminal transition matrix of Figure 7.18a. Note that the failure matrix may also contain don't-care entries, derived as discussed at the end of Section 7.3.2. Row and column combinations applied to Figure 7.18b reduce it from 9 × 6 to 4 × 4.

With the introduction of the failure matrix, all previous error entries become don't-care entries. Figure 7.18c shows the resulting compression of the terminal transition matrix. The nonterminal transition matrix is not affected by this process; in our example it can be compressed by combining both rows and columns as shown in Figure 7.18d. Each matrix requires an access map consisting of two additional arrays specifying the row (column) of the matrix to be used for a given state (symbol). For grammars of the size of the LAX grammar, the total storage requirements are generally reduced to 5-10% of their original values.
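The interplay of failure matrix, row combination and access map can be sketched as follows. The terminal entries are transcribed from Figure 7.18a as well as they can be read and should be treated as illustrative; `"."` is assumed to mark an error entry and `None` a don't-care. The first-fit merging in `compress` is only one simple heuristic, the names `compress`, `row_map` and `lookup` are hypothetical, and only rows (not columns) are combined here. For simplicity the failure matrix in this sketch holds `False` for don't-care entries.

```python
ERROR = "."   # assumed marker for an error entry
TERMINALS = ["i", "(", ")", "+", "*", "#"]

# Terminal transition matrix, one row per state (None = don't-care).
f = {
    0: ["-6", "4", ".", ".", ".", "."],
    1: [None, None, ".", "5", None, "*"],
    2: [".", ".", ".", "5", "6", "*"],
    4: ["-6", "4", ".", ".", ".", "."],
    5: ["-6", "4", ".", ".", ".", "."],
    6: ["-6", "4", ".", ".", ".", "."],
    7: [None, None, "-7", "5", None, "."],
    8: [".", ".", "-7", "5", "6", "."],
    9: [".", ".", "+2", "+2", "6", "+2"],
}

# Failure matrix: True exactly where the original entry is an error.
F = {q: [entry == ERROR for entry in row] for q, row in f.items()}


def compress(matrix):
    """First-fit merging of compatible rows; returns (rows, row_map)."""
    rows, row_map = [], {}
    for q, row in matrix.items():
        # Errors are caught by F, so they become don't-cares here.
        row = [None if e == ERROR else e for e in row]
        for idx, merged in enumerate(rows):
            if all(a == b or a is None or b is None for a, b in zip(merged, row)):
                rows[idx] = [b if a is None else a for a, b in zip(merged, row)]
                row_map[q] = idx
                break
        else:
            rows.append(row)
            row_map[q] = len(rows) - 1
    return rows, row_map


rows, row_map = compress(f)


def lookup(q, t):
    """Access routine: consult the failure matrix first, then the packed rows."""
    col = TERMINALS.index(t)
    if F[q][col]:
        return ERROR
    return rows[row_map[q]][col]
```

With these entries the nine rows collapse into two, which agrees with the grouping shown in Figure 7.18c. Every specified entry of the original matrix is still returned by `lookup`, because errors are intercepted by `F` before the packed rows are consulted.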

We have a certain freedom in combining the rows of the transition matrix. For example, in the terminal matrix of Figure 7.18a we could also have chosen the grouping $\{(0,4,5,6,9), (1,2,7,8)\}$. In general these groupings differ in the final state count; we must therefore examine a number of possible choices. The task of determining the minimum number of rows reduces to a problem in graph theory: We construct the (undirected) incompatibility graph $I = (Q, D)$ for our state set $Q$, in which two nodes $q$ and $q'$ are connected if the rows are incompatible. Minimization of the number of rows is then equivalent to the task of coloring the nodes with a minimum number of colors such that any pair of nodes connected by a branch are of different colors. (Graph coloring is discussed in Section B.3.3.) Further compression may be possible as indicated in Exercises 7.12 and 7.13.
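The reduction to graph coloring can be sketched as follows. The matrix holds the terminal entries read from Figure 7.18a with all error entries already turned into don't-cares (`None`), which is the situation after the failure matrix has been introduced; the exact entries should be treated as illustrative. The coloring shown is a plain greedy heuristic chosen for brevity. It always yields a valid grouping but is not guaranteed to use the minimum number of colors, since minimum graph coloring is NP-hard in general. All function names are hypothetical.

```python
from itertools import combinations

# Terminal rows after errors have become don't-cares (None).
matrix = {
    0: ["-6", "4", None, None, None, None],
    1: [None, None, None, "5", None, "*"],
    2: [None, None, None, "5", "6", "*"],
    4: ["-6", "4", None, None, None, None],
    5: ["-6", "4", None, None, None, None],
    6: ["-6", "4", None, None, None, None],
    7: [None, None, "-7", "5", None, None],
    8: [None, None, "-7", "5", "6", None],
    9: [None, None, "+2", "+2", "6", "+2"],
}


def incompatibility_graph(rows):
    """Connect two states if their rows disagree in a column where both are specified."""
    graph = {q: set() for q in rows}
    for p, q in combinations(rows, 2):
        if any(a is not None and b is not None and a != b
               for a, b in zip(rows[p], rows[q])):
            graph[p].add(q)
            graph[q].add(p)
    return graph


def greedy_coloring(graph):
    """Give each node the smallest color not used by an already-colored neighbor."""
    color = {}
    for node in graph:
        used = {color[n] for n in graph[node] if n in color}
        color[node] = next(c for c in range(len(graph)) if c not in used)
    return color


graph = incompatibility_graph(matrix)
color = greedy_coloring(graph)
groups = {}
for state, c in color.items():
    groups.setdefault(c, []).append(state)
```

For these entries the only incompatibilities involve state 9, so two colors suffice. The greedy order produces the groups (0,1,2,4,5,6,7,8) and (9); the alternative grouping (0,4,5,6,9), (1,2,7,8) mentioned in the text is another valid 2-coloring of the same graph.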

          i    (    )    +    *    #  |   E    T    F
    0    -6    4    .    .    .    .  |   1    2    2
    1              .    5         *  |
    2     .    .    .    5    6    *  |
    4    -6    4    .    .    .    .  |   7    8    8
    5    -6    4    .    .    .    .  |        9    9
    6    -6    4    .    .    .    .  |            -4
    7             -7    5         .  |
    8     .    .   -7    5    6    .  |
    9     .    .   +2   +2    6   +2  |

a) Transition matrix for Figure 7.17 with shift-reduce transitions

          i      (      )      +      *      #
    0   false  false  true   true   true   true
    1                 true   false         false
    2   true   true   true   false  false  false
    4   false  false  true   true   true   true
    5   false  false  true   true   true   true
    6   false  false  true   true   true   true
    7                 false  false         true
    8   true   true   false  false  false  true
    9   true   true   false  false  false  false

b) Uncompressed failure matrix for (a)

                        i    (    )    +    *    #
    0,1,2,4,5,6,7,8    -6    4   -7    5    6    *
    9                            +2   +2    6   +2

c) Compressed terminal transition matrix

                 E   T,F
    0,1,2        1    2
    4            7    8
    5                 9
    6,7,8,9          -4

d) Compressed nonterminal transition matrix

Figure 7.18: Table Compression

7.4 Notes and References

LL(1) parsing in the form of recursive descent was, according to McClure [1972], the most frequently-used technique in practice. Certainly its flexibility and the fact that it can be hand-coded contribute to this popularity.

LR languages form the largest class of languages that can be processed with deterministic pushdown automata. Other techniques (precedence grammars, $(m, n)$-bounded context grammars or Floyd-Evans Productions, for example) either apply to smaller language classes or do not attain the same computational efficiency or error recovery properties as the techniques treated here. Operator precedence grammars have also achieved significant usage because one can easily construct parsers by hand for expressions with infix operators. Aho and Ullman [1972] give quite a complete overview of the available parsing techniques and their optimal implementation.

Instead of obtaining the LALR(1) parser from the LR(1) parser by merging states, one could begin with the SLR(1) parser and determine the exact right context only for those states in which the transition function is ambiguous. This technique reduces the computation time, but unfortunately does not generalize to an algorithm that eliminates all chain productions.

Construction 7.7 requires a redundant effort that can be avoided in practice. For example, the closure of a situation $[X \rightarrow \mu \cdot B\gamma; \Omega]$ depends only upon the nonterminal $B$ if the lookahead set is ignored. The closure can thus be computed ahead of time for each $B \in N$, and only the lookahead sets must be supplied during parser construction. Also, the repeated construction of the follower state of an LALR(1) state that develops from the combination of two LR(1) states with distinct lookahead sets can be simplified. This repetition, which results from the marking of states as not yet examined, leaves the follower state (specified as a set of situations) unaltered. It can at most add lookahead symbols to single situations. This addition can also be accomplished without computing the entire state anew.

Our technique for chain production elimination is based upon an idea of Pager [1974]. Use of the failure matrix to increase the number of don't-care entries in the transition matrix was first proposed by Joliat [1973, 1974].

Exercises

7.1 Consider a grammar with embedded connection points. Explain why transformations of the grammar can be guaranteed to leave the invocation sequence of the associated parser actions invariant.

7.2 State the LL(1) condition in terms of the extended BNF notation of Section 5.1.3. Prove that your statement is equivalent to Theorem 7.2.

7.3 Give an example of a grammar in which the graph of $LAST$ contains a cycle. Prove that $FOLLOW(A) = FOLLOW(B)$ for arbitrary nodes $A$ and $B$ in the same strongly connected subgraph.

7.4 Design a suitable internal representation of a grammar and program the generation algorithm of Section 7.2.3 in terms of it.

7.5 Devise an LL(1) parser generation algorithm that accepts the extended BNF notation of Section 5.1.3. Will you be able to achieve a more efficient parser by operating upon this form directly, or by converting it to productions? Explain.

7.6 Consider the interpretive parser of Figure 7.9.

(a) Define additional operation codes to implement connection points, and add the appropriate alternatives to the case statement. Carefully explain the interface conventions for the parser actions. Would you prefer a different kind of parse table entry? Explain.

(b) Some authors provide special operations for the situations $[X \rightarrow \mu \cdot B]$ and $[X \rightarrow \mu \cdot tB]$. Explain how some recursion can be avoided in this manner, and write appropriate alternatives for the case statement.

(c) Once the special cases of (b) are recognized, it may be advantageous to provide extra operations identical to 4 and 5 of Figure 7.9, except that the conditions are reversed. Why? Explain.

(d) Recognize the situation $[X \rightarrow \mu \cdot t]$ and alter the code of case 4 to absorb the processing of the 2 operation following it.

(e) What is your opinion of the value of these optimizations? Test your predictions on some language with which you are familiar.

7.7 Show that the following grammar is LR(1) but not LALR(1):

    Z → A
    A → aBcB
    A → B
    A → D
    B → b
    B → Ff
    D → dE
    E → FcA
    E → FcE
    F → b

7.8 Repeat Exercise 7.5 for the LR case. Use the algorithm of Section 7.3.4.

7.9 Show that $FIRST(A)$ can be computed by any marking algorithm for directed graphs that obtains a `spanning tree', $B$, for the graph. $B$ has the same node set as the original graph, $G$, and its branch set is a subset of that of $G$.

7.10 Consider the grammar with the following productions:

    Z → AXd
    Z → BX
    Z → C
    A → B
    A → C
    B → CXb
    C → c
    X → ε

(a) Derive an LALR(1) parser for this grammar.

(b) Delete the reductions by the chain productions $A \rightarrow B$ and $A \rightarrow C$.

7.11 Use the techniques discussed in Section 7.3.5 to compress the transition matrix produced for Exercise 7.8.

7.12 [Anderson et al., 1973] Consider a transition matrix for an LR parser constructed by one of the algorithms of Section 7.3.2.

(a) Show that for every state $q$ there is exactly one symbol $z(q)$ such that $f(q', a) = q$ implies $a = z(q)$.

(b) Show that, in the case of shift-reduce transitions introduced by the algorithms of Sections 7.3.3 and 7.3.4, an unambiguous symbol $z(A \rightarrow \chi)$ exists such that $f(q, a) =$ `shift and reduce $A \rightarrow \chi$' implies $a = z(A \rightarrow \chi)$.

(c) The states (or productions) can therefore be numbered so that those appearing in column $c$ have sequential numbers $c_0 + i$, $i = 0, 1, \ldots$ Thus it suffices to store only the relative number $i$ in the transition matrix; the base $c_0$ is only given once for each column. In exactly the same manner, a list of the reductions in a row can be assigned to this row, and only the appropriate index to this list retained in the transition matrix.

(d) Make these alterations in the transition matrix produced for Exercise 7.8 before beginning the compression of Exercise 7.11, and compare the result with that obtained previously.

7.13 [Bell, 1974] Consider an $m \times n$ transition matrix, $t$, in which all unspecified entries are don't-cares. Show that the matrix can be compressed into a $p \times q$ matrix $c$, two length-$m$ arrays $f$ and $u$, and two length-$n$ arrays $g$ and $v$ by the following algorithm: Initially $f_i = g_j = 1$, $1 \le i \le m$, $1 \le j \le n$, and $k = 1$. If all occupied columns of the $i$th row of $t$ uniformly contain the value $r$, then set $f_i := k := k + 1$, $u_i := r$ and delete the $i$th row of $t$. If the $j$th column is uniformly occupied, delete it also and set $g_j := k := k + 1$, $v_j := r$. Repeat this process until no uniformly-occupied row or column remains. The remaining matrix is the matrix $c$. We then enter the row (column) number in $c$ of the former $i$th row ($j$th column) into $u_i$ ($v_j$). The following relation then holds:

$$t_{i,j} = \textbf{if } f_i < g_j \textbf{ then } u_i \textbf{ else if } f_i > g_j \textbf{ then } v_j \textbf{ else } (* f_i = g_j = 1 *)\ c_{u_i, v_j}$$

(Hint: Show that the size of $c$ is independent of the sequence in which the rows and columns are deleted.)

Attribute Grammars

Semantic analysis and code generation are based upon the structure tree. Each node of the tree is `decorated' with attributes describing properties of that node, and hence the tree is often called an attributed structure tree for emphasis. The information collected in the attributes of a node is derived from the environment of that node; it is the task of semantic analysis to compute these attributes and check their consistency. Optimization and code generation can also be described in similar terms, using attributes to guide the transformation of the tree and ultimately the selection of machine instructions.

Attribute grammars have proven to be a useful aid in representing the attribution of the structure tree because they constitute a formal definition of all context-free and context-sensitive language properties on the one hand, and a formal specification of the semantic analysis on the other. When deriving the specification, we need not be overly concerned with the sequence in which the attributes are computed because this can (with some restrictions) be derived mechanically. Storage for the attribute values is also not reflected in the specification. We begin by assuming that all attributes belonging to a node are stored within that node in the structure tree; optimization of the attribute storage is considered later.

Most examples in this chapter are included to show constraints and pathological cases; practical examples can be found in Chapter 9.

8.1 Basic Concepts of Attribute Grammars

An attribute grammar is based upon a context-free grammar $G = (N, T, P, Z)$. It associates a set $A(X)$ of attributes with each symbol, $X$, in the vocabulary of $G$. Each attribute represents a specific (context-sensitive) property of the symbol $X$, and can take on any of a specified set of values. We write $X.a$ to indicate that attribute $a$ is an element of $A(X)$.

Each node in the structure tree of a sentence in $L(G)$ is associated with a particular set of values for the attributes of some symbol $X$ in the vocabulary of $G$. These values are established by attribution rules $R(p) = \{X_i.a \leftarrow f(X_j.b, \ldots, X_k.c)\}$ for the productions $p: X_0 \rightarrow X_1 \ldots X_n$ used to construct the tree. Each rule defines an attribute $X_i.a$ in terms of attributes $X_j.b, \ldots, X_k.c$ of symbols in the same production. (Note that in this chapter we use upper-case letters to denote vocabulary symbols, rather than using case to distinguish terminals from nonterminals. The reason for this is that any symbol of the vocabulary may have attributes, and the distinction between terminals and nonterminals is generally irrelevant for attribute computation.)

In addition to the attribution rules, a condition $B(X_i.a, \ldots, X_j.b)$ involving attributes of symbols occurring in $p$ may be given. $B$ specifies the context condition that must be fulfilled if a syntactically correct sentence is correct according to the static semantics and therefore translatable. We could also regard this condition as the computation of a Boolean attribute consistent, which we associate with the left-hand side of the production.

    rule assignment ::= name ':=' expression .
    attribution
        name.environment ← assignment.environment;
        expression.environment ← assignment.environment;
        name.postmode ← name.primode;
        expression.postmode ←
            if name.primode = ref_int_type then int_type else real_type;

    rule expression ::= name addop name .
    attribution
        name[1].environment ← expression.environment;
        name[2].environment ← expression.environment;
        expression.primode ←
            if coercible (name[1].primode, int_type) and
               coercible (name[2].primode, int_type)
            then int_type else real_type;
        addop.mode ← expression.primode;
        name[1].postmode ← expression.primode;
        name[2].postmode ← expression.primode;
    condition
        coercible (expression.primode, expression.postmode);

    rule addop ::= '+' .
    attribution
        addop.operation ←
            if addop.mode = int_type then int_addition else real_addition;

    rule name ::= identifier .
    attribution
        name.primode ← defined_type (identifier.symbol, name.environment);
    condition
        coercible (name.primode, name.postmode);

Figure 8.1: Simplified LAX Assignment

As an example, Figure 8.1 gives a simplified attribute grammar for LAX assignments. Each $p \in P$ is marked by the keyword rule and written using EBNF notation (restricted to express only productions). The elements of $R(p)$ follow the keyword attribution. We use a conventional expression-oriented programming language notation for the functions $f$, and terminate each element with a semicolon. Particular instances of an attribute are distinguished by numbering multiple occurrences of symbols in the production (e.g. name[1], name[2]) from left to right. Any condition is also marked by a keyword and terminated by a semicolon.

In order to check the consistency of the assignment and to further identify the + operator, we must take the operand types into account. For this purpose we define two attributes, primode and postmode, for the symbols expression and name, and one attribute, mode, for the symbol addop. Primode describes the type determined directly from the node and its descendants; postmode describes the type expected when the result is used as an operand by other nodes. Any difference between primode and postmode must be resolved by coercions.
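The attribution rules of Figure 8.1 can be traced with a short sketch. Several things here are assumptions made only for illustration and are not part of the text: the coercion table `COERCIONS`, the type name `ref_real_type`, the representation of an environment as a dictionary from identifiers to types, and the function `analyze_assignment` itself, which hard-wires the tree shape `target := left + right` instead of walking a structure tree.

```python
# Hypothetical coercion table: each type maps to the types it can be coerced to.
COERCIONS = {
    "ref_int_type": {"ref_int_type", "int_type", "real_type"},
    "ref_real_type": {"ref_real_type", "real_type"},
    "int_type": {"int_type", "real_type"},
    "real_type": {"real_type"},
}


def coercible(source, target):
    return target in COERCIONS[source]


def defined_type(symbol, environment):
    return environment[symbol]


def analyze_assignment(target, left, right, environment):
    """Apply the rules of Figure 8.1 to `target := left + right`."""
    attrs = {}
    # rule name ::= identifier (applied to all three names)
    primode = {n: defined_type(n, environment) for n in (target, left, right)}
    # rule assignment ::= name ':=' expression
    attrs["name1.postmode"] = primode[target]
    attrs["expression.postmode"] = (
        "int_type" if primode[target] == "ref_int_type" else "real_type")
    # rule expression ::= name addop name
    attrs["expression.primode"] = (
        "int_type"
        if coercible(primode[left], "int_type")
        and coercible(primode[right], "int_type")
        else "real_type")
    attrs["addop.mode"] = attrs["expression.primode"]
    # rule addop ::= '+'
    attrs["addop.operation"] = (
        "int_addition" if attrs["addop.mode"] == "int_type" else "real_addition")
    # All conditions of the four rules, combined into one Boolean.
    attrs["consistent"] = (
        coercible(attrs["expression.primode"], attrs["expression.postmode"])
        and coercible(primode[left], attrs["expression.primode"])
        and coercible(primode[right], attrs["expression.primode"])
        and coercible(primode[target], attrs["name1.postmode"]))
    return attrs


ok = analyze_assignment(
    "x", "y", "z",
    {"x": "ref_real_type", "y": "ref_int_type", "z": "ref_int_type"})
bad = analyze_assignment(
    "x", "y", "z",
    {"x": "ref_int_type", "y": "ref_real_type", "z": "ref_int_type"})
```

In the first call the addition is identified as an integer addition whose result is coerced to real for the assignment. In the second the expression has primode `real_type` but postmode `int_type`, so under the assumed coercion table the condition of the expression rule fails.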

    assignment
        name1
            identifier1
        expression
            name2
                identifier2
            addop
                '+'
            name3
                identifier3

a) Syntactic structure tree

    assignment.environment
    identifier_i.symbol

b) Attribute values given initially (i = 1, ..., 3)

    name1.environment    expression.environment
    name_i.environment   name1.primode
    name1.postmode       expression.postmode    name_i.primode
    expression.primode   name1 condition
    addop.mode           name_i.postmode        expression condition
    addop.operation      name_i condition

c) Attribute values computed (i = 2, 3)

Figure 8.2: Analysis of x := y + z

Figure 8.2 shows the analysis of x := y + z according to the grammar of Figure 8.1. (Assignment.environment would be computed from the declarations of x, y and z, but here we show it as given in order to make the example self-contained.) Attributes on the same line of Figure 8.2c can be computed collaterally; every attribute is dependent upon at least one attribute from the previous line. These dependency relations can be expressed as a graph (Figure 8.3). Each large box represents the production whose application corresponds to the node of the structure tree contained within it. The small boxes making up the node itself represent the attributes of the symbol on the left-hand side of the production, and the arrows represent the dependency relations arising from the attribution rules of the production. The node set of the dependency graph is just the set of small boxes representing attributes; its edge set is the set of arrows representing dependencies.
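The line-by-line ordering of Figure 8.2c can be reproduced mechanically from the dependencies. The sketch below is one way to do so, under the assumption that the dependency graph is written as a dictionary mapping each attribute to the attributes it needs; the dictionary was derived by hand from the rules of Figure 8.1, the conditions are omitted for brevity, and the function name `layers` is hypothetical.

```python
# Attribute -> attributes it depends on, for the tree of x := y + z.
deps = {
    "name1.environment": ["assignment.environment"],
    "expression.environment": ["assignment.environment"],
    "name2.environment": ["expression.environment"],
    "name3.environment": ["expression.environment"],
    "name1.primode": ["identifier1.symbol", "name1.environment"],
    "name2.primode": ["identifier2.symbol", "name2.environment"],
    "name3.primode": ["identifier3.symbol", "name3.environment"],
    "name1.postmode": ["name1.primode"],
    "expression.postmode": ["name1.primode"],
    "expression.primode": ["name2.primode", "name3.primode"],
    "addop.mode": ["expression.primode"],
    "name2.postmode": ["expression.primode"],
    "name3.postmode": ["expression.primode"],
    "addop.operation": ["addop.mode"],
}

given = {
    "assignment.environment",
    "identifier1.symbol", "identifier2.symbol", "identifier3.symbol",
}


def layers(deps, given):
    """Group attributes so that each layer needs only given values or earlier layers."""
    known = set(given)
    remaining = dict(deps)
    result = []
    while remaining:
        ready = sorted(a for a, needs in remaining.items()
                       if known.issuperset(needs))
        if not ready:
            raise ValueError("cyclic dependency among: "
                             + ", ".join(sorted(remaining)))
        result.append(ready)
        known.update(ready)
        for a in ready:
            del remaining[a]
    return result


order = layers(deps, given)
```

The attributes within one layer are independent of each other, so they can be computed in any order or collaterally. For this graph the procedure yields six layers, corresponding to the six lines of Figure 8.2c.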

[Figure 8.3: dependency graph for the tree of Figure 8.2, with boxes for the nodes assignment, expression, name, addop and identifier and their attributes environment, primode, postmode, mode, operation and symbol]

We must know all of the values upon which an attribute depends before we can compute the value of that attribute. Clearly this is only possible if the dependency graph is acyclic. Figure 8.3 is acyclic, but consider the following LAX type definition, which we shall discuss in more detail in Sections 9.1.2 and 9.1.3:

    type t = record (real x, ref t p);

We must compute a type attribute for each of the identifiers t, x and p so that the associated type is known at each use of the identifier. The type attribute of t consists of the keyword record plus the types and identifiers of the fields. Now, however, the type of p contains an application of t, implying that the type identified by t depends upon which type a use of t identifies. Thus the type t depends cyclically upon itself. (We shall show how to eliminate the cycle from this example in Section 9.1.3.)
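Whether a given set of dependencies admits an evaluation order at all can be checked with any topological sort. The following sketch uses Python's standard `graphlib` module for this; the attribute names such as `t.type` and the function `evaluation_order` are hypothetical and serve only to mirror the record example above.

```python
from graphlib import CycleError, TopologicalSorter


def evaluation_order(deps):
    """Return an order in which the attributes can be computed, or None if cyclic.

    `deps` maps each attribute to the attributes it depends on.
    """
    try:
        return list(TopologicalSorter(deps).static_order())
    except CycleError:
        return None


# A chain of dependencies as in Figure 8.1: acyclic, so an order exists.
acyclic = {
    "name.postmode": ["name.primode"],
    "name.primode": ["name.environment"],
}

# The record type: t needs the types of its fields, and p needs t.
cyclic = {
    "t.type": ["x.type", "p.type"],
    "p.type": ["t.type"],
    "x.type": [],
}

chain_order = evaluation_order(acyclic)
record_order = evaluation_order(cyclic)   # None: t.type and p.type form a cycle
```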

Let us now make the intuition gained from these examples more precise. We begin with
