
5.1.3 Derivations and Parse Trees

Each production in a regular grammar can have at most one nonterminal on the right-hand side. This property guarantees, in contrast to the context-free grammars, that each sentence of the language has exactly one derivation when the grammar is unambiguous (Definition 5.11).

Figure 5.4a is a regular grammar that generates the non-negative integers and real numbers if $n$ represents an arbitrary sequence of digits. Three derivations according to this grammar are shown in Figure 5.4b. Each string except the last in a derivation contains exactly one nonterminal, from which a new string must be derived in the next step. The last string consists only of terminals. The sequence of steps in each derivation of this example is determined by the derived sentence.
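The observation that the derived sentence determines the derivation can be checked mechanically. The following is a minimal sketch, assuming the productions of Figure 5.4a and treating $n$ as a single token; the encoding of productions as (terminal, successor) pairs and the name `derivations` are illustrative choices, not part of the text.

```python
# Productions of Figure 5.4a. Each right-hand side is encoded as
# (terminal, successor nonterminal or None); this encoding is an assumption
# made for illustration.
PRODUCTIONS = {
    "C": [("n", None), ("n", "F"), (".", "I")],
    "F": [(".", "I"), ("E", "S")],
    "I": [("n", None), ("n", "X")],
    "X": [("E", "S")],
    "S": [("n", None), ("+", "U"), ("-", "U")],
    "U": [("n", None)],
}


def derivations(sentence, nonterminal="C", prefix=""):
    """Yield every derivation of `sentence` as a list of sentential forms."""
    for terminal, successor in PRODUCTIONS[nonterminal]:
        if not sentence.startswith(terminal):
            continue
        rest = sentence[len(terminal):]
        if successor is None:
            # A production without a nonterminal must finish the sentence.
            if rest == "":
                yield [prefix + nonterminal, prefix + terminal]
        else:
            # Exactly one nonterminal remains; continue deriving from it.
            for tail in derivations(rest, successor, prefix + terminal):
                yield [prefix + nonterminal] + tail


for steps in derivations("n.nE+n"):
    print(" => ".join(steps))
```

For each sentence of Figure 5.4b the generator yields a single derivation, and for a string outside the language it yields none.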

The situation is different for context-free grammars, which may have any number of nonterminals on the right-hand side of each production. Figure 5.5 shows that several derivations, differing only in the sequence of application of the productions, are possible for a given sentence. (These derivations are constructed according to the grammar of Figure 5.3a.)

In the left-hand column, a leftmost derivation was used: At each step a new string was derived from the leftmost nonterminal. Similarly, a rightmost derivation was used in the right-hand column. A nonterminal was chosen arbitrarily at each step to produce the center derivation.
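The difference between the two strategies can be sketched in a few lines. The example below assumes only the productions that are actually applied in Figure 5.5 (the grammar of Figure 5.3a may contain others); the names `derive`, `LEFTMOST_CHOICES` and `RIGHTMOST_CHOICES` are hypothetical.

```python
# Productions used in the derivations of Figure 5.5 (an assumption: the full
# grammar of Figure 5.3a is not reproduced here and may contain more rules).
PRODUCTIONS = {"E": {"T", "E+T"}, "T": {"F", "T*F"}, "F": {"i"}}


def derive(choices, leftmost=True):
    """Apply the right-hand sides in `choices`, always rewriting the
    leftmost (or rightmost) nonterminal of the current string."""
    form = "E"
    steps = [form]
    for rhs in choices:
        positions = [i for i, symbol in enumerate(form) if symbol in PRODUCTIONS]
        i = positions[0] if leftmost else positions[-1]
        if rhs not in PRODUCTIONS[form[i]]:
            raise ValueError(f"{form[i]} -> {rhs} is not a production")
        form = form[:i] + rhs + form[i + 1:]
        steps.append(form)
    return steps


LEFTMOST_CHOICES = ["E+T", "T", "F", "i", "T*F", "F", "i", "i"]
RIGHTMOST_CHOICES = ["E+T", "T*F", "i", "F", "i", "T", "F", "i"]

print(" => ".join(derive(LEFTMOST_CHOICES, leftmost=True)))
print(" => ".join(derive(RIGHTMOST_CHOICES, leftmost=False)))
```

Both calls apply the same productions and end in the same sentence; only the order of application differs.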

A grammar ascribes structure to a string not by giving a particular sequence of derivation steps but by showing that a particular substring is derived from a particular nonterminal.

5.1 Descriptive Tools 87

$T = \{n, ., +, -, E\}$
$N = \{C, F, I, X, S, U\}$
$P = \{C \to n,\ C \to nF,\ C \to .I,\ F \to .I,\ F \to ES,\ I \to n,\ I \to nX,\ X \to ES,\ S \to n,\ S \to +U,\ S \to -U,\ U \to n\}$
$Z = C$

a) A grammar for real constants

$C \Rightarrow n$
$C \Rightarrow .I \Rightarrow .n$
$C \Rightarrow nF \Rightarrow n.I \Rightarrow n.nX \Rightarrow n.nES \Rightarrow n.nE+U \Rightarrow n.nE+n$

b) Three derivations according to the grammar of (a)

Figure 5.4: Derivations According to a Regular Grammar

For example, in Figure 5.5 the substring $i * i$ is derived from the single nonterminal $T$. We interpret this property of the derivation to mean that $i * i$ forms a single semantic unit: an instance of the operator $*$ applied to the $i$'s as operands. It is important to realize that the grammar was constructed in a particular way specifically to ascribe a semantically relevant structure to each sentence in the language. We cannot be satisfied with any grammar that defines a particular language; we must choose one reflecting the semantic structure of each sentence. For example, suppose that the rules $E \to E + T$ and $T \to T * F$ of Figure 5.3a had been replaced by $E \to E * T$ and $T \to T + F$ respectively. The modified grammar would describe the same language, but would ascribe a different structure to its sentences: It would imply that additions should take precedence over multiplications.

E                E                E
E + T            E + T            E + T
T + T            E + T * F        E + T * F
F + T            T + T * F        E + T * i
i + T            T + F * F        E + F * i
i + T * F        T + F * i        E + i * i
i + F * F        F + F * i        T + i * i
i + i * F        i + F * i        F + i * i
i + i * i        i + i * i        i + i * i

Figure 5.5: Derivations According to a Context-Free Grammar

Substrings derived from single nonterminals are called phrases:

5.8 Definition

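Consider a grammar $G = (T, N, P, Z)$. The string $\eta \in V^+$ is a phrase (for $X$) of $\mu\eta\chi$ if and only if $Z \Rightarrow^* \mu X \chi \Rightarrow^+ \mu\eta\chi$ ($\mu, \chi \in V^*$, $X \in N$). It is a simple phrase of $\mu\eta\chi$ if and only if $Z \Rightarrow^* \mu X \chi \Rightarrow \mu\eta\chi$.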

Each of the three derivations of Figure 5.5 identifies the same set of simple phrases. They are therefore equivalent in the sense that they ascribe identical phrase structure to the string $i + i * i$. In order to have a single representation for the entire set of equivalent derivations, one that makes the structure of the sentence obvious, we introduce the notion of a parse tree (see Appendix B for the definition of an ordered tree):

5.9 Definition

Consider an ordered tree $(K, D)$ with root $k_0$ and label function $f : K \to M$. Let $k_1, \ldots, k_n$ ($n > 0$) be the immediate successors of $k_0$. $(K, D)$ is a parse tree according to the grammar $(T, N, P, Z)$ if the following conditions hold:

(a) $M \subseteq V \cup \{\varepsilon\}$
(b) $f(k_0) = Z$
(c) $Z \to f(k_1) \ldots f(k_n) \in P$
(d) if $f(k_i) \in T$, or if $n = 1$ and $f(k_i) = \varepsilon$, then $k_i$ is a leaf
(e) if $f(k_i) \in N$ then $k_i$ is the root of a parse tree according to the grammar $(T, N, P, f(k_i))$

Figure 5.6 is a tree for $i + i * i$ according to the grammar of Figure 5.3a, as can be shown by recursive application of Definition 5.9.

        E
      / | \
     E  +  T
     |   / | \
     T  T  *  F
     |  |     |
     F  F     i
     |  |
     i  i

Figure 5.6: The Parse Tree for $i + i * i$
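The recursive application of Definition 5.9 can be sketched as a short checking function. This is an illustration under stated assumptions: a node is encoded as a `(label, children)` tuple, the production set `P` lists only the rules used in Figure 5.5 (the full grammar of Figure 5.3a may contain others), and the names `leaf`, `TREE` and `is_parse_tree` are hypothetical.

```python
# Grammar fragment inferred from the derivations of Figure 5.5 (assumption).
T = {"+", "*", "i"}
N = {"E", "T", "F"}
P = {
    ("E", ("T",)), ("E", ("E", "+", "T")),
    ("T", ("F",)), ("T", ("T", "*", "F")),
    ("F", ("i",)),
}
EPSILON = ""  # label of the single child in a tree for Z -> epsilon


def leaf(label):
    """A node without successors."""
    return (label, ())


# The tree of Figure 5.6 for i + i * i.
TREE = ("E", (
    ("E", (("T", (("F", (leaf("i"),)),)),)),
    leaf("+"),
    ("T", (
        ("T", (("F", (leaf("i"),)),)),
        leaf("*"),
        ("F", (leaf("i"),)),
    )),
))


def is_parse_tree(node, start):
    """Check conditions (b) to (e) of Definition 5.9 for the grammar (T, N, P, start)."""
    label, children = node
    if label != start or not children:        # (b), and n > 0
        return False
    labels = tuple(child_label for child_label, _ in children)
    rhs = () if labels == (EPSILON,) else labels
    if (start, rhs) not in P:                 # (c); also forces labels into V or epsilon, i.e. (a)
        return False
    for child_label, grandchildren in children:
        if child_label in N:                  # (e): recursive application
            if not is_parse_tree((child_label, grandchildren), child_label):
                return False
        elif grandchildren:                   # (d): terminals and epsilon are leaves
            return False
    return True


print(is_parse_tree(TREE, "E"))
```

Condition (a) is not tested separately in this sketch, because membership of the production in `P` already restricts the labels of the successors to symbols of the grammar.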

We can obtain any string in any derivation of a sentence from the parse tree of that sentence by selecting a minimum set of nodes, removal of which will break all root-to-leaf paths. (Such a set of nodes is called a cut; see Definition B.8.) For example, in Figure 5.6 the set $\{T, +, T, *, F\}$ (the third row of nodes, plus `+' from the second row) has this property and $T + T * F$ is the fourth step in the center derivation of Figure 5.5.

5.10 Theorem

In a parse tree according to a grammar $G = (T, N, P, Z)$, a set of nodes $(k_1, \ldots, k_n)$ is a cut if and only if $Z \Rightarrow^* f(k_1) \ldots f(k_n)$.
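The correspondence between cuts and derived strings can be explored with a small enumeration. The sketch below assumes the tree of Figure 5.6 encoded as `(label, children)` tuples. It relies on the observation that a cut of a subtree is either its root alone or a concatenation of cuts of the root's successors. The names `leaf`, `TREE` and `cuts` are illustrative.

```python
from itertools import product


def leaf(label):
    """A node without successors."""
    return (label, ())


# The tree of Figure 5.6 for i + i * i (encoding is an assumption).
TREE = ("E", (
    ("E", (("T", (("F", (leaf("i"),)),)),)),
    leaf("+"),
    ("T", (
        ("T", (("F", (leaf("i"),)),)),
        leaf("*"),
        ("F", (leaf("i"),)),
    )),
))


def cuts(node):
    """Yield the label sequence of every cut of the subtree rooted at `node`."""
    label, children = node
    yield [label]  # the root alone breaks every root-to-leaf path
    if children:
        # Otherwise combine one cut from each successor, left to right.
        for parts in product(*(list(cuts(child)) for child in children)):
            yield [symbol for part in parts for symbol in part]


forms = {"".join(cut) for cut in cuts(TREE)}
print(len(forms))
print("T+T*F" in forms, "i+i*i" in forms)
```

Every string in the three columns of Figure 5.5 appears among the enumerated cuts, which is consistent with Theorem 5.10.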

A parse tree specifies the phrase structure of a sentence. With the grammars given so far, only one parse tree corresponds to each sentence. This may not always be true, however, as illustrated by Figure 5.7. The grammar of Figure 5.7a describes the same language as that of Figure 5.3a, but many sentences have several parse trees.

5.11 Definition

A sentence is ambiguous if its derivations may be described by at least two distinct parse trees (or leftmost derivations or rightmost derivations). A grammar is ambiguous if there is at least one ambiguous sentence in the language it defines; otherwise the grammar is unambiguous.


$T = \{+, *, i\}$
$N = \{E\}$
$P = \{E \to E + E,\ E \to E * E,\ E \to i\}$
$Z = E$

a) An ambiguous grammar

      E                    E
    / | \                / | \
   E  +  E              E  *  E
   |   / | \          / | \   |
   i  E  *  E        E  +  E  i
      |     |        |     |
      i     i        i     i

b) Two parse trees for $i + i * i$

Figure 5.7: Ambiguity

Figure 5.7b shows two parse trees for $i + i * i$ that are essentially different for our purposes because we associate two distinct sequences of operations with them. If we use an ambiguous grammar to describe the language (and this may be a useful thing to do), then either the ambiguity must involve only phrases with no semantic relevance or we must provide additional rules for removing the ambiguity.
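The ambiguity of a sentence can be made visible by enumerating its parse trees. The following is a minimal sketch for the grammar of Figure 5.7a only; each tree is written as a fully parenthesized string, and the function name `parse_trees` is a hypothetical choice.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def parse_trees(sentence):
    """Return all parse trees of `sentence` under E -> E + E | E * E | i,
    each written as a fully parenthesized string."""
    if sentence == "i":
        return ("i",)  # E -> i
    trees = []
    for k, symbol in enumerate(sentence):
        if symbol in ("+", "*"):
            # Try this operator as the one applied at the root:
            # E -> E + E or E -> E * E.
            for left in parse_trees(sentence[:k]):
                for right in parse_trees(sentence[k + 1:]):
                    trees.append(f"({left}{symbol}{right})")
    return tuple(trees)


for tree in parse_trees("i+i*i"):
    print(tree)
```

For $i + i * i$ the enumeration finds the two structures of Figure 5.7b, one that multiplies first and one that adds first. A sentence with a single operator has only one tree under this grammar.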