3.4 Mapping Specifications
5.1.3 Derivations and Parse Trees
Each production in a regular grammar can have at most one nonterminal on the right-hand side. This property guarantees, in contrast to the context-free grammars, that each sentence of the language has exactly one derivation when the grammar is unambiguous (Definition 5.11).
Figure 5.4a is a regular grammar that generates the non-negative integers and real numbers if $n$ represents an arbitrary sequence of digits. Three derivations according to this grammar are shown in Figure 5.4b. Each string except the last in a derivation contains exactly one nonterminal, from which a new string must be derived in the next step. The last string consists only of terminals. The sequence of steps in each derivation of this example is determined by the derived sentence.
The situation is different for context-free grammars, which may have any number of nonterminals on the right-hand side of each production. Figure 5.5 shows that several derivations, differing only in the sequence of application of the productions, are possible for a given sentence. (These derivations are constructed according to the grammar of Figure 5.3a.)
In the left-hand column, a leftmost derivation was used: At each step a new string was derived from the leftmost nonterminal. Similarly, a rightmost derivation was used in the right-hand column. A nonterminal was chosen arbitrarily at each step to produce the center derivation.
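The difference between the leftmost and rightmost strategies can be replayed mechanically. The following is a minimal sketch, not the book's own notation: the productions are read off the derivations of Figure 5.5 (Figure 5.3a itself is not reproduced here, so they are an assumption), and the helper name `derive` is hypothetical.

```python
# Nonterminals assumed from the derivations in Figure 5.5.
NONTERMINALS = {"E", "T", "F"}

def derive(start, productions, leftmost=True):
    """Apply each (lhs, rhs) production to the leftmost (or rightmost)
    nonterminal of the current string; return all strings of the derivation."""
    form = [start]
    steps = [" ".join(form)]
    for lhs, rhs in productions:
        positions = [i for i, s in enumerate(form) if s in NONTERMINALS]
        pos = positions[0] if leftmost else positions[-1]
        if form[pos] != lhs:
            raise ValueError(f"expected {lhs} at position {pos}, found {form[pos]}")
        form[pos:pos + 1] = rhs          # replace the nonterminal by the right-hand side
        steps.append(" ".join(form))
    return steps

# Production sequences for the sentence i + i * i.
LEFTMOST = [("E", ["E", "+", "T"]), ("E", ["T"]), ("T", ["F"]), ("F", ["i"]),
            ("T", ["T", "*", "F"]), ("T", ["F"]), ("F", ["i"]), ("F", ["i"])]
RIGHTMOST = [("E", ["E", "+", "T"]), ("T", ["T", "*", "F"]), ("F", ["i"]),
             ("T", ["F"]), ("F", ["i"]), ("E", ["T"]), ("T", ["F"]), ("F", ["i"])]

left_steps = derive("E", LEFTMOST, leftmost=True)
right_steps = derive("E", RIGHTMOST, leftmost=False)
```

Both runs use the same multiset of productions and end in the same sentence; only the order of application differs.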
A grammar ascribes structure to a string not by giving a particular sequence of derivation steps but by showing that a particular substring is derived from a particular nonterminal.
5.1 Descriptive Tools 87
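$T = \{n, ., +, -, E\}$
$N = \{C, F, I, X, S, U\}$
$P = \{C \to n,\ C \to nF,\ C \to .I,\ F \to .I,\ F \to ES,\ I \to n,\ I \to nX,\ X \to ES,\ S \to n,\ S \to +U,\ S \to -U,\ U \to n\}$
$Z = C$
a) A grammar for real constants
$C \Rightarrow n$
$C \Rightarrow .I \Rightarrow .n$
$C \Rightarrow nF \Rightarrow n.I \Rightarrow n.nX \Rightarrow n.nES \Rightarrow n.nE+U \Rightarrow n.nE+n$
b) Three derivations according to the grammar of (a)
Figure 5.4: Derivations According to a Regular Grammar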
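The claim that the derived sentence determines the derivation can be checked for the grammar of Figure 5.4a by enumerating every derivation of a given sentence. This is only a sketch under assumptions: the dictionary encoding of the productions and the function name `derivations` are illustrative choices, and the minus sign in the production for $S$ is taken from the reconstructed figure.

```python
# Each production is (terminal, next nonterminal or None), i.e. A -> t or A -> tB.
PRODUCTIONS = {
    "C": [("n", None), ("n", "F"), (".", "I")],
    "F": [(".", "I"), ("E", "S")],
    "I": [("n", None), ("n", "X")],
    "X": [("E", "S")],
    "S": [("n", None), ("+", "U"), ("-", "U")],
    "U": [("n", None)],
}

def derivations(tokens, nonterminal="C", prefix=""):
    """Yield every derivation of `tokens` from `nonterminal` as a list of strings.
    `prefix` holds the terminals already produced."""
    for terminal, nxt in PRODUCTIONS[nonterminal]:
        if not tokens or tokens[0] != terminal:
            continue
        head = prefix + terminal
        if nxt is None:
            # A -> t finishes the derivation, so t must be the last token.
            if len(tokens) == 1:
                yield [prefix + nonterminal, head]
        else:
            for rest in derivations(tokens[1:], nxt, head):
                yield [prefix + nonterminal] + rest

real = list(derivations("n.nE+n"))
```

For each of the three sentences of Figure 5.4b the generator produces exactly one derivation, and every string but the last contains a single nonterminal at its right end.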
For example, in Figure 5.5 the substring $i * i$ is derived from the single nonterminal $T$. We interpret this property of the derivation to mean that $i * i$ forms a single semantic unit: an instance of the operator $*$ applied to the $i$'s as operands. It is important to realize that the grammar was constructed in a particular way specifically to ascribe a semantically relevant structure to each sentence in the language. We cannot be satisfied with any grammar that defines a particular language; we must choose one reflecting the semantic structure of each sentence. For example, suppose that the rules $E \to E + T$ and $T \to T * F$ of Figure 5.3a had been replaced by $E \to E * T$ and $T \to T + F$ respectively. The modified grammar would describe the same language, but would ascribe a different structure to its sentences: It would imply that additions should take precedence over multiplications.
E                E                E
E + T            E + T            E + T
T + T            E + T * F        E + T * F
F + T            T + T * F        E + T * i
i + T            T + F * F        E + F * i
i + T * F        T + F * i        E + i * i
i + F * F        F + F * i        T + i * i
i + i * F        i + F * i        F + i * i
i + i * i        i + i * i        i + i * i
Figure 5.5: Derivations According to a Context-Free Grammar
Substrings derived from single nonterminals are called phrases:
5.8 Definition
Consider a grammar $G = (T, N, P, Z)$. The string $\chi \in V^+$ is a phrase (for $X$) of $\mu\chi\nu$ if and only if $Z \Rightarrow^* \mu X \nu \Rightarrow^+ \mu\chi\nu$ ($\mu, \nu \in V^*$, $X \in N$). It is a simple phrase of $\mu\chi\nu$ if and only if $Z \Rightarrow^* \mu X \nu \Rightarrow \mu\chi\nu$.
Each of the three derivations of Figure 5.5 identifies the same set of simple phrases. They are therefore equivalent in the sense that they ascribe identical phrase structure to the string $i + i * i$. In order to have a single representation for the entire set of equivalent derivations, one that makes the structure of the sentence obvious, we introduce the notion of a parse tree (see Appendix B for the definition of an ordered tree):
5.9 Definition
Consider an ordered tree $(K, D)$ with root $k_0$ and label function $f: K \to M$. Let $k_1, \ldots, k_n$ $(n > 0)$ be the immediate successors of $k_0$. $(K, D)$ is a parse tree according to the grammar $(T, N, P, Z)$ if the following conditions hold:
(a) $M \subseteq V \cup \{\epsilon\}$
(b) $f(k_0) = Z$
(c) $Z \to f(k_1) \ldots f(k_n) \in P$
(d) if $f(k_i) \in T$, or if $n = 1$ and $f(k_i) = \epsilon$, then $k_i$ is a leaf
(e) if $f(k_i) \in N$ then $k_i$ is the root of a parse tree according to the grammar $(T, N, P, f(k_i))$
Figure 5.6 is a tree for $i + i * i$ according to the grammar of Figure 5.3a, as can be shown by recursive application of Definition 5.9.
        E
      / | \
     E  +  T
     |   / | \
     T  T  *  F
     |  |     |
     F  F     i
     |  |
     i  i
Figure 5.6: The Parse Tree for $i + i * i$
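The recursive application of Definition 5.9, and the reading of phrases from a tree, can be sketched in a few lines. Everything below is an illustrative assumption rather than the book's formulation: the nested-tuple encoding of the tree, the names `is_parse_tree`, `frontier` and `phrases`, and the production set, which is inferred from Figure 5.5. The empty string case of condition (d) is ignored.

```python
NONTERMINALS = {"E", "T", "F"}
PRODUCTIONS = {("E", ("E", "+", "T")), ("E", ("T",)),
               ("T", ("T", "*", "F")), ("T", ("F",)),
               ("F", ("i",)), ("F", ("(", "E", ")"))}

# The tree of Figure 5.6 as (label, children) pairs; leaves have no children.
TREE = ("E", [
    ("E", [("T", [("F", [("i", [])])])]),
    ("+", []),
    ("T", [
        ("T", [("F", [("i", [])])]),
        ("*", []),
        ("F", [("i", [])]),
    ]),
])

def is_parse_tree(node, root_label):
    """Check conditions (b) to (e) of Definition 5.9 recursively."""
    label, children = node
    if label != root_label:                                  # (b)
        return False
    if (label, tuple(c[0] for c in children)) not in PRODUCTIONS:  # (c)
        return False
    for child_label, grandchildren in children:
        if child_label in NONTERMINALS:                      # (e)
            if not is_parse_tree((child_label, grandchildren), child_label):
                return False
        elif grandchildren:                                  # (d) terminals are leaves
            return False
    return True

def frontier(node):
    """Return the leaf labels of the subtree, left to right."""
    label, children = node
    if not children:
        return [label]
    return [s for child in children for s in frontier(child)]

def phrases(node):
    """Yield (nonterminal, phrase, is_simple) for every interior node."""
    label, children = node
    if not children:
        return
    is_simple = all(not grandchildren for _, grandchildren in children)
    yield label, " ".join(frontier(node)), is_simple
    for child in children:
        yield from phrases(child)

found = list(phrases(TREE))
```

Each interior node contributes one phrase of the sentence; the node labelled $T$ on the right contributes $i * i$, and the three nodes labelled $F$ contribute the simple phrases.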
We can obtain any string in any derivation of a sentence from the parse tree of that sentence by selecting a minimum set of nodes, removal of which will break all root-to-leaf paths. (Such a set of nodes is called a cut; see Definition B.8.) For example, in Figure 5.6 the set $\{T, +, T, *, F\}$ (the third row of nodes, plus `+' from the second row) has this property and $T + T * F$ is the fourth step in the center derivation of Figure 5.5.
5.10 Theorem
In a parse tree according to a grammar $G = (T, N, P, Z)$, a set of nodes $(k_1, \ldots, k_n)$ is a cut if and only if $Z \Rightarrow^* f(k_1) \ldots f(k_n)$.
A parse tree specifies the phrase structure of a sentence. With the grammars given so far, only one parse tree corresponds to each sentence. This may not always be true, however, as illustrated by Figure 5.7. The grammar of Figure 5.7a describes the same language as that of Figure 5.3a, but many sentences have several parse trees.
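Theorem 5.10 can be illustrated on the tree of Figure 5.6 by enumerating its cuts and comparing them with the strings of a derivation. The sketch below assumes a nested-tuple encoding of the tree; the names `cuts` and `combine` are hypothetical helpers, not part of the text.

```python
# Parse tree of Figure 5.6 as (label, children) pairs; leaves have no children.
TREE = ("E", [
    ("E", [("T", [("F", [("i", [])])])]),
    ("+", []),
    ("T", [
        ("T", [("F", [("i", [])])]),
        ("*", []),
        ("F", [("i", [])]),
    ]),
])

def cuts(node):
    """Yield the label sequence of every cut of the subtree rooted at node."""
    label, children = node
    yield [label]                       # cut at the node itself
    if children:
        yield from combine(children)    # or cut every child subtree

def combine(children):
    """Yield every concatenation of one cut per child, left to right."""
    if not children:
        yield []
        return
    first, rest = children[0], children[1:]
    for head in cuts(first):
        for tail in combine(rest):
            yield head + tail

ALL_CUTS = {" ".join(c) for c in cuts(TREE)}

# The center derivation of Figure 5.5.
CENTER = ["E", "E + T", "E + T * F", "T + T * F", "T + F * F",
          "T + F * i", "F + F * i", "i + F * i", "i + i * i"]
```

Every string of the center derivation appears among the cuts, as do those of the leftmost and rightmost derivations; the cuts also include strings that occur only in other derivations of the same sentence.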
5.11 Definition
A sentence is ambiguous if its derivations may be described by at least two distinct parse trees (or leftmost derivations or rightmost derivations). A grammar is ambiguous if there is at least one ambiguous sentence in the language it defines; otherwise the grammar is unambiguous.
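Definition 5.11 suggests a direct, if inefficient, test for a single sentence: enumerate its parse trees and count them. The following sketch does this for the grammar with productions $E \to E + E$, $E \to E * E$, $E \to i$ (the grammar of Figure 5.7a); the function name `parse_trees` and the tuple encoding of trees are assumptions made for illustration.

```python
def parse_trees(tokens):
    """Return all parse trees of `tokens` for E -> E + E | E * E | i.
    A tree is either the leaf "i" or a triple (left, operator, right)."""
    tokens = tuple(tokens)
    if tokens == ("i",):
        return ["i"]                     # E -> i
    trees = []
    for k, op in enumerate(tokens):
        if op in ("+", "*"):             # try this operator as the root production
            for left in parse_trees(tokens[:k]):
                for right in parse_trees(tokens[k + 1:]):
                    trees.append((left, op, right))
    return trees

trees = parse_trees("i+i*i")
```

The sentence $i + i * i$ yields two distinct trees, one grouping the multiplication first and one grouping the addition first, so both the sentence and the grammar are ambiguous in the sense of the definition.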
$T = \{+, *, i\}$
$N = \{E\}$
$P = \{E \to E + E,\ E \to E * E,\ E \to i\}$
$Z = E$
a) An ambiguous grammar
     E                    E
   / | \                / | \
  E  +  E              E  *  E
  |   / | \          / | \   |
  i  E  *  E        E  +  E  i
     |     |        |     |
     i     i        i     i
b) Two parse trees for $i + i * i$
Figure 5.7: Ambiguity
Figure 5.7b shows two parse trees for