Syntax-directed translation generating devices

3.3 Syntax-directed translations

3.3.1 Syntax-directed translation generating devices

First, we present a generating device defining a syntax-directed translation: a synchronous grammar called a syntax-directed translation schema (Irons, 1961, Aho and Ullman, 1972, Čulík, 1965), also known as a syntax-directed transduction (Lewis II and Stearns, 1968).

It consists of an input CFG and an output CFG with a common set of nonterminals. In every synchronous production, the number of nonterminals occurring in the right-hand side of the production of the input grammar is the same as the number of nonterminals occurring in the corresponding right-hand side of the production in the output grammar. Moreover, a pairing is made by associating occurrences of the same input/output nonterminals. In other words, a syntax-directed translation schema may be seen as a CFG with translation elements attached to each production. Whenever a production is used in a derivation of an input string, the associated translation element generates a part of an output string. The precise definition follows.

Definition 3.3.1. A syntax-directed translation schema (SDTS) is a device SD = (N, X, Y, P, S) specified as follows.

(1) N is the alphabet of nonterminal symbols such that N ∩ (X ∪ Y ) = ∅.

(2) X and Y are the input and the output alphabet.

(3) S ∈ N is the start symbol.

(4) P is a finite set of productions of the form

A; A → v₀ A₁ v₁ . . . v_{m−1} A_m v_m ; w₀ A_{σ(1)} w₁ . . . w_{m−1} A_{σ(m)} w_m (σ)    (3.3.1)

where m ≥ 0, A, A₁, . . . , A_m ∈ N, σ is a permutation on [m], and for every 0 ≤ i ≤ m, v_i ∈ X^∗ and w_i ∈ Y^∗. Writing α for the input right-hand side and β for the output right-hand side, σ(i) = j means that the i-th nonterminal in β corresponds to the j-th nonterminal in α.

When α ∈ X^∗ and β ∈ Y^∗ in a production A; A → α; β (σ) in P, we may omit the empty permutation σ and write the production simply as A; A → α; β. The input grammar of SD is the CFG SD^in = (N, X, P^in, S), where P^in := {A → α | A; A → α; β (σ) ∈ P for some β and σ}. Similarly, the CFG SD^out = (N, Y, P^out, S), where P^out := {A → β | A; A → α; β (σ) ∈ P for some α and σ}, is called the output grammar of SD.
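As a minimal sketch of Definition 3.3.1, a production can be stored as a record together with a check of the pairing condition and the two projections that yield the input and output grammars. The names `Production`, `is_valid`, `input_grammar` and `output_grammar` are illustrative assumptions and do not come from the cited literature; right-hand sides are assumed to be tuples of symbols, and the sample production is the VP rule of Example 3.3.2.

```python
from typing import NamedTuple, Tuple

class Production(NamedTuple):
    lhs: str                 # the nonterminal A (shared by both sides)
    alpha: Tuple[str, ...]   # input right-hand side, symbols over N ∪ X
    beta: Tuple[str, ...]    # output right-hand side, symbols over N ∪ Y
    sigma: Tuple[int, ...]   # sigma[i-1] = j: i-th nonterminal of beta pairs with j-th of alpha

def is_valid(p, N):
    """Check that sigma is a permutation on [m] pairing equal nonterminals."""
    a = [s for s in p.alpha if s in N]
    b = [s for s in p.beta if s in N]
    m = len(a)
    if len(b) != m or sorted(p.sigma) != list(range(1, m + 1)):
        return False
    return all(b[i] == a[p.sigma[i] - 1] for i in range(m))

def input_grammar(P):
    """Context-free productions A -> alpha of the input grammar."""
    return {(p.lhs, p.alpha) for p in P}

def output_grammar(P):
    """Context-free productions A -> beta of the output grammar."""
    return {(p.lhs, p.beta) for p in P}

N = {"VP", "VB", "TO"}
good = Production("VP", ("VB", "TO"), ("TO", "VB", "ga"), (2, 1))
bad = Production("VP", ("VB", "TO"), ("TO", "VB", "ga"), (1, 2))
print(is_valid(good, N), is_valid(bad, N))
```

The second production is rejected because the identity permutation would pair TO with VB, which Definition 3.3.1 does not allow.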

To present the semantics of an SDTS SD, we use the notion of associated nonterminals. Whenever we apply a production in a derivation, we have to apply it to two associated nonterminals. This notion will be formalized later in Section 5.3.1 for a more general case (cf. Aho and Ullman, 1969a, p. 321). The translation forms of SD, which are elements of (N ∪ X)^∗ × (N ∪ Y)^∗, are defined inductively as follows.

(1) (S, S) is a translation form, and the two Ss are said to be associated.

(2) If (γAδ, γ′Aδ′) is a translation form in which the two explicit instances of A are associated and A; A → α; β (σ) is a production in P, then (γαδ, γ′βδ′) is a translation form. The nonterminals of α and β are associated in (γαδ, γ′βδ′) the same way as they are associated in the production. The nonterminals of γ and δ are associated with those of γ′ and δ′ in the new translation form exactly as in the original one.

If (γAδ, γ′Aδ′) and (γαδ, γ′βδ′) are as above, then we write (γAδ, γ′Aδ′) ⇒_SD (γαδ, γ′βδ′).

This is a leftmost derivation step if the explicit instance of A is the leftmost occurrence of any nonterminal symbol in γAδ. Furthermore, for any translation forms (γ, δ) and (γ′, δ′), (γ, δ) ⇒^∗_SD (γ′, δ′) means that, for some n ∈ ℕ, there exists a derivation

(γ, δ) ⇒_SD (γ₁, δ₁) ⇒_SD . . . ⇒_SD (γ_{n−1}, δ_{n−1}) ⇒_SD (γ′, δ′)

of (γ′, δ′) from (γ, δ) in SD. A derivation is leftmost if every step in it is leftmost. The translation defined by SD is the relation

λ(SD) := {(v, w) | (S, S) ⇒^∗_SD (v, w)} (⊆ X^∗ × Y^∗).

Let λ[SDTS] denote the class of translations definable by SDTSs. A translation is syntax-directed (SDT) if it is definable by an SDTS. Moreover, two SDTSs are equivalent if they define the same translation. In what follows, we may also denote the class of all syntax-directed translations by SDT.

We illustrate the notions introduced so far by giving an example of an SDTS that translates a fragment of English into Japanese (Yamada and Knight, 2001, Figure 1). Tîrnăucă (2008, Section 5) gives as further examples SDTSs that model fragments of Romanian-English and English-Spanish translation.

Example 3.3.2. The system SD = (N, Roman, Roman, P, S), where the nonterminals S, VB, VB₁, VP, NN, TO, DET and PRP are in N and P consists of the productions

S; S → PRP VB₁ VP; PRP VP VB₁ (1 3 2)

VB₁; VB₁ → adores; daisuki desu

VP; VP → VB TO; TO VB ga (2 1)

VB; VB → listening; kiku no

TO; TO → TO NN; NN TO (2 1)

TO; TO → to; wo

PRP; PRP → he; kare ha

NN; NN → music; ongaku ,

is an SDTS. A derivation in SD is

(S, S) ⇒_SD (PRP VB₁ VP, PRP VP VB₁)

⇒_SD (he VB₁ VP, kare ha VP VB₁)

⇒_SD (he adores VP, kare ha VP daisuki desu)

⇒_SD (he adores VB TO, kare ha TO VB ga daisuki desu)

⇒_SD (he adores listening TO, kare ha TO kiku no ga daisuki desu)

⇒_SD (he adores listening TO NN, kare ha NN TO kiku no ga daisuki desu)

⇒²_SD (he adores listening to music, kare ha ongaku wo kiku no ga daisuki desu)

translating the English text he adores listening to music into its corresponding Japanese sentence kare ha ongaku wo kiku no ga daisuki desu. In this example, we may observe some of the features of SDTSs that may be appealing, to some extent, not only for the design of compilers but also for syntax-based translation of natural languages: they model simple reordering of parts of sentences required by languages with different grammatical structure (e.g., swapping of grammatical categories), they insert extra strings in a derivation step to specify different syntactic cases, and they perform rough word-for-word translations between strings of both languages, acting like a dictionary.
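The derivation of Example 3.3.2 can be replayed mechanically. The following is a minimal sketch, not an implementation taken from the cited works: it assumes that each nonterminal occurrence in a translation form carries an integer link, and that two occurrences are associated exactly when they carry the same link. The rule labels r1 to r8 and the helper names are hypothetical.

```python
from itertools import count

N = {"S", "VB", "VB1", "VP", "NN", "TO", "DET", "PRP"}
# (A, alpha, beta, sigma); right-hand sides are space-separated tokens.
P = {
    "r1": ("S", "PRP VB1 VP", "PRP VP VB1", (1, 3, 2)),
    "r2": ("VB1", "adores", "daisuki desu", ()),
    "r3": ("VP", "VB TO", "TO VB ga", (2, 1)),
    "r4": ("VB", "listening", "kiku no", ()),
    "r5": ("TO", "TO NN", "NN TO", (2, 1)),
    "r6": ("TO", "to", "wo", ()),
    "r7": ("PRP", "he", "kare ha", ()),
    "r8": ("NN", "music", "ongaku", ()),
}
fresh = count(1)  # supply of unused links

def leftmost_step(form, rule):
    """Rewrite the leftmost input nonterminal and its associated partner."""
    lhs, alpha, beta, sigma = rule
    left, right = form
    i = next(k for k, s in enumerate(left) if isinstance(s, tuple))
    assert left[i][0] == lhs, "rule does not match the leftmost nonterminal"
    j = right.index(left[i])  # the partner carries the same (name, link)
    links = [next(fresh) for _ in sigma]  # one new link per nonterminal of alpha

    def instantiate(rhs, order):
        out, k = [], 0
        for s in rhs.split():
            if s in N:
                out.append((s, links[order[k] - 1]))
                k += 1
            else:
                out.append(s)
        return out

    new_left = instantiate(alpha, range(1, len(sigma) + 1))
    new_right = instantiate(beta, sigma)  # sigma decides which link each slot gets
    return left[:i] + new_left + left[i + 1:], right[:j] + new_right + right[j + 1:]

def derive(rule_names):
    form = ([("S", 0)], [("S", 0)])  # (S, S) with the two S associated
    for name in rule_names:
        form = leftmost_step(form, P[name])
    return " ".join(form[0]), " ".join(form[1])

english, japanese = derive(["r1", "r7", "r2", "r3", "r4", "r5", "r6", "r8"])
print(english)
print(japanese)
```

The rule sequence follows the derivation above step by step; every step there happens to be leftmost, which is why a leftmost-only replay suffices for this example.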

By restricting the productions of SDTSs or by giving them more freedom (e.g., in the associating process), we may obtain other useful classes of translation-defining devices and translations (Aho and Ullman, 1972, 1973, Aho et al., 2006, Wu, 1997, Saers, 2011, Satta and Peserico, 2005, Satta, 2007). These we describe next. Firstly, we focus on the restricted versions of SDTSs.

Definition 3.3.3. An SDTS SD = (N, X, Y, P, S) is said to be

• of order k (k-SDTS, k ≥ 0) if in no production A; A → α; β (σ) in P the number of nonterminals in α exceeds k,

• simple (sSDTS) if in each production A; A → α; β (σ) in P, σ is the identity permutation,

• an inversion transduction grammar (ITG) if in each production A; A → α; β (σ) in P, σ is the identity permutation or the inverse permutation, i.e., the one that reverses the order of the nonterminals,

• right-linear (rSDTS) if each production in P is of the form A; A → vB; wB (1) or of the form A; A → v; w, where A, B ∈ N, v ∈ X^∗ and w ∈ Y^∗, or

• linear (lSDTS) if each production in P is of the form A; A → vBv′; wBw′ (1) or of the form A; A → v; w, where A, B ∈ N, v, v′ ∈ X^∗ and w, w′ ∈ Y^∗.

The classes of translations defined by SDTSs of order k, simple SDTSs, ITGs, right-linear SDTSs, and linear SDTSs are denoted by λ[k-SDTS], λ[sSDTS], λ[ITG], λ[rSDTS], and λ[lSDTS], respectively.
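The restrictions of Definition 3.3.3 are purely syntactic, so they can be checked production by production. The sketch below assumes productions are given as tuples (A, α, β, σ) whose right-hand sides are sequences of symbols; the function names are illustrative, `order` returns the least k for which the SDTS is of order k, and the ITG test reads "inverse permutation" as the order-reversing one.

```python
def nts(rhs, N):
    """Nonterminal occurrences of a right-hand side, left to right."""
    return [s for s in rhs if s in N]

def order(P, N):
    """Least k such that no input right-hand side has more than k nonterminals."""
    return max((len(nts(alpha, N)) for _, alpha, _, _ in P), default=0)

def is_simple(P):
    return all(sigma == tuple(range(1, len(sigma) + 1)) for *_, sigma in P)

def is_itg(P):
    def ok(sigma):
        m = len(sigma)
        return sigma in (tuple(range(1, m + 1)), tuple(range(m, 0, -1)))
    return all(ok(sigma) for *_, sigma in P)

def is_right_linear(P, N):
    # At most one nonterminal per side, and it must be the last symbol.
    for _, alpha, beta, _ in P:
        for rhs in (alpha, beta):
            k = nts(rhs, N)
            if len(k) > 1 or (k and rhs[-1] not in N):
                return False
    return True

def is_linear(P, N):
    # The linear production shapes allow at most one nonterminal per side.
    return order(P, N) <= 1

# Sample production sets; single characters are symbols, "#" is a separator.
swap = [("S", "A#A", "A#A", (2, 1)), ("A", "1A", "1A", (1,)),
        ("A", "0A", "0A", (1,)), ("A", "", "", ())]
reverse = [("S", "", "", ()), ("S", "0S", "S0", (1,)), ("S", "1S", "S1", (1,))]
complement = [("S", "0S", "1S", (1,)), ("S", "1S", "0S", (1,)), ("S", "", "", ())]
top_rule = [("S", ("PRP", "VB1", "VP"), ("PRP", "VP", "VB1"), (1, 3, 2))]

print(order(swap, {"S", "A"}), is_simple(swap), is_itg(swap))
print(is_linear(reverse, {"S"}), is_right_linear(reverse, {"S"}))
print(is_right_linear(complement, {"S"}))
print(order(top_rule, {"S", "PRP", "VB1", "VP"}), is_itg(top_rule))
```

Because right-linear productions are a special case of linear ones, any production set accepted by `is_right_linear` is also accepted by `is_linear`.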

To clarify the above definition, we give some examples.

Example 3.3.4. The SDTS SD₁ = ({A, S}, {0, 1, ♯}, {0, 1, ♯}, P, S), where P consists of

S; S → A♯A; A♯A (2 1)

A; A → 1A; 1A (1)

A; A → 0A; 0A (1)

A; A → ε; ε ,

is an ITG of order 2. Obviously, λ(SD₁) = {(v♯w, w♯v) | v, w ∈ {0, 1}^∗}. The SDTS

SD₂ = ({S}, {0, 1}, {0, 1}, {S; S → ε; ε, S; S → 0S; S0 (1), S; S → 1S; S1 (1)}, S)

is linear (and hence of order 1) and defines the translation λ(SD₂) = {(v, v^R) | v ∈ {0, 1}^∗}. The SDTS of Example 3.3.2 has order 3, and is neither simple nor an ITG. The SDTS

SD₃ = ({S}, X, X, {S; S → +SS; SS+ (1 2), S; S → ∗SS; SS∗ (1 2), S; S → x; x}, S),

where X = {+, ∗, x}, is simple (of order 2) and translates every prefix Polish arithmetic expression over X into its corresponding postfix Polish expression. The system

SD₄ = ({S}, {0, 1}, {0, 1}, {S; S → 0S; 1S (1), S; S → 1S; 0S (1), S; S → ε; ε}, S)

is a right-linear SDTS that translates every bit string into its bitwise complement, e.g., 1001 into 0110.
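The translations claimed in Example 3.3.4 can be spot-checked by enumerating λ(SD) up to a bound. The sketch below relies on an assumption that follows from the way associated nonterminals are rewritten together: the pairs derivable from (A, A) are obtained by choosing one derivable pair per nonterminal of α and plugging its two halves into α and β as prescribed by σ. The bound `height` on the nesting depth, the function name `pairs`, and the ASCII `*` standing for ∗ are choices made for this illustration only.

```python
from itertools import product

def pairs(P, N, A, height):
    """All (v, w) derivable from (A, A) with nesting depth at most `height`."""
    if height == 0:
        return set()
    found = set()
    for lhs, alpha, beta, sigma in P:
        if lhs != A:
            continue
        kids = [s for s in alpha if s in N]
        subs = [pairs(P, N, B, height - 1) for B in kids]
        for pick in product(*subs):  # one (v_j, w_j) per nonterminal of alpha
            k_in, k_out = iter(range(len(kids))), iter(sigma)
            v = "".join(pick[next(k_in)][0] if s in N else s for s in alpha)
            # The i-th nonterminal slot of beta receives the output half of the
            # pair chosen for the sigma(i)-th nonterminal of alpha.
            w = "".join(pick[next(k_out) - 1][1] if s in N else s for s in beta)
            found.add((v, w))
    return found

N = {"S"}
SD2 = [("S", "", "", ()), ("S", "0S", "S0", (1,)), ("S", "1S", "S1", (1,))]
SD3 = [("S", "+SS", "SS+", (1, 2)), ("S", "*SS", "SS*", (1, 2)), ("S", "x", "x", ())]
SD4 = [("S", "0S", "1S", (1,)), ("S", "1S", "0S", (1,)), ("S", "", "", ())]

print(all(w == v[::-1] for v, w in pairs(SD2, N, "S", 4)))
print(("+x*xx", "xxx*+") in pairs(SD3, N, "S", 3))
print(("1001", "0110") in pairs(SD4, N, "S", 5))
```

The enumeration is exponential in `height` and is meant only for checking small cases, not as a translation algorithm.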

Next we study synchronous context-free grammars, which are generalized SDTSs in which nonterminals associated in a production can be distinct. This decoupling of nonterminals may be essential to capture the syntactic divergences between languages (Huang et al., 2009, p. 565), and it allows more general parse tree transformations (Satta, 2007), as we will formally show in Section 6.2.2. Also, it has been claimed that this stronger expressivity may be very convenient when proving formal properties of the model (Satta, 2007). Let us note that in the NLP community, the term synchronous context-free grammar often refers to the syntax-directed translation schemata considered in Definition 3.3.1 (see Chiang, 2006, 2007, Zhang et al., 2006, Zhang and Gildea, 2007, for example). However, we prefer the traditional names and consider the two formalisms separately. The formal definition of a synchronous context-free grammar is as follows (cf. Satta and Peserico, 2005).

Definition 3.3.5. A synchronous context-free grammar (SCFG) is a construct SC = (N, X, Y, P, S, S′), where

(1) N, X and Y are as in Definition 3.3.1,

(2) S, S′ ∈ N are the start symbols, and

(3) P is a finite set of productions of the form

A; B → v₀ A₁ v₁ . . . v_{m−1} A_m v_m ; w₀ B₁ w₁ . . . w_{m−1} B_m w_m (σ)    (3.3.2)

where m ≥ 0, A, A₁, . . . , A_m, B, B₁, . . . , B_m ∈ N, σ is a permutation on [m], and for every 0 ≤ i ≤ m, v_i ∈ X^∗ and w_i ∈ Y^∗. Note that σ(i) = j has the same meaning as for SDTSs.

The input grammar, output grammar, translation forms, derivation and leftmost derivation are defined for SCFGs in an analogous way as for SDTSs (cf. Satta, 2007, Tîrnăucă, 2011, for details). Thus, the translation defined by SC is the relation

λ(SC) := {(v, w) | (S, S′) ⇒^∗_SC (v, w)} (⊆ X^∗ × Y^∗).

The class of translations definable by SCFGs is denoted by λ[SCFG].

An example of an SCFG is given next.

Example 3.3.6. The device

SC = ({S, A}, {x}, {0, 1}, {S; S → S; A (1), S; S → x; 0, S; A → x; 1}, S, S)

is an SCFG. There are exactly two successful derivations in SC:

(S, S) ⇒_SC (x, 0) and (S, S) ⇒_SC (S, A) ⇒_SC (x, 1),

and therefore, λ(SC) = {(x, 0), (x, 1)}.

Note that even though it is obvious how to construct an SDTS defining the same translation, there is no SDTS that can capture exactly the syntactic divergences represented by the two pairs of syntactic trees of SC. This is suggested by the fact that the same nonterminal may occur a different number of times in the two parse trees of a pair, which is not the case for an SDTS. As a linguistic illustration, the following SCFG rule (Huang et al., 2009)

VP; VP → VB NN; VBZ NNS (1 2)

reflects that Chinese does not have a plural noun (NNS) or third-person-singular verb (VBZ). The difference between the syntactic trees of SDTSs and SCFGs will be formally shown with the help of tree bimorphisms in Section 6.2.2.
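The divergence in Example 3.3.6 can be made concrete by building the pairs of parse trees explicitly. The sketch below is an illustration under stated assumptions rather than the construction used later with tree bimorphisms: trees are nested tuples (label, children), sub-derivations are indexed by the associated pair (A, B) instead of a single nonterminal, and `height` bounds the nesting depth. All function names are hypothetical.

```python
from itertools import product

N = {"S", "A"}
# Productions of Example 3.3.6 as (A, B, alpha, beta, sigma).
P = [("S", "S", "S", "A", (1,)),
     ("S", "S", "x", "0", ()),
     ("S", "A", "x", "1", ())]

def tree_pairs(A, B, height):
    """Pairs (input tree, output tree) rooted at the associated pair (A, B)."""
    if height == 0:
        return []
    found = []
    for a, b, alpha, beta, sigma in P:
        if (a, b) != (A, B):
            continue
        ins = [s for s in alpha if s in N]
        outs = [s for s in beta if s in N]
        # partner[j]: output nonterminal associated with the (j+1)-th input one
        partner = {sigma[i] - 1: outs[i] for i in range(len(outs))}
        subs = [tree_pairs(ins[j], partner[j], height - 1) for j in range(len(ins))]
        for pick in product(*subs):
            k_in, k_out = iter(range(len(ins))), iter(sigma)
            t_in = (A, tuple(pick[next(k_in)][0] if s in N else s for s in alpha))
            t_out = (B, tuple(pick[next(k_out) - 1][1] if s in N else s for s in beta))
            found.append((t_in, t_out))
    return found

def leaves(t):
    """Yield (frontier string) of a tree."""
    return t if isinstance(t, str) else "".join(leaves(c) for c in t[1])

def occurrences(t, X):
    """Number of nodes of tree t labelled with nonterminal X."""
    if isinstance(t, str):
        return 0
    return (t[0] == X) + sum(occurrences(c, X) for c in t[1])

result = tree_pairs("S", "S", 5)
translation = {(leaves(ti), leaves(to)) for ti, to in result}
s_counts = {(leaves(ti), leaves(to)): (occurrences(ti, "S"), occurrences(to, "S"))
            for ti, to in result}
print(translation)
print(s_counts)
```

For the pair translating x into 1, the input tree contains S twice while the output tree contains it once, which is the kind of mismatch the paragraph above refers to.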

In document Syntax-directed translations, tree transformations and bimorphisms (Page 41-45)