
Crossing-Sensitive Factorization

Factorizations for projective dependency parsing have often been designed to allow efficient parsing. For example, the algorithms in Eisner (2000) and McDonald and Pereira (2006) achieve their efficiency by assuming that children to the left of the parent and to the right of the parent are independent of each other. The algorithms of Carreras (2007) and Model 2 in Koo and Collins (2010) include grandparents for only the outermost grandchildren of each parent for efficiency reasons.

In a similar spirit, we avoid the blow-up in parent indices described in Section 5.2.2 by introducing a variant of the Grand-Sib factorization that scores crossed edges independently (as a CrossedEdge part) and uncrossed edges under either a grandparent-sibling, grandparent, sibling, or edge-factored model depending on whether relevant edges in its local neighborhood are crossed (see Table 5.2). Whether the part includes the sibling depends on whether the edge $\vec{e}_{hs}$ from the parent to the sibling is crossed. GProj($\vec{e}_{hm}$) (Definition 13) determines whether $\vec{e}_{hm}$’s local neighborhood is sufficiently projective to include the grandparent in the part.

Local Neighborhood Crossings    Crossed($\vec{e}_{hs}$)    ¬Crossed($\vec{e}_{hs}$)
¬GProj($\vec{e}_{hm}$)          Edge(h, m)                 Sib(h, m, s)
GProj($\vec{e}_{hm}$)           Grand(g, h, m)             GrandSib(g, h, m, s)

Table 5.2: Part type for an uncrossed edge $\vec{e}_{hm}$ for the crossing-sensitive third-order factorization (g is m’s grandparent; s is m’s inner sibling).

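Our parser will find the optimal 1-Endpoint-Crossing tree under this new factorization, solving the optimization problem below:

\[
\operatorname*{arg\,max}_{y \in \mathcal{Y}_{1\text{-EC}}} \;
\sum_{g,h,m,s \,\mid\, \vec{e}_{gh} \in y,\ \vec{e}_{hm} \in y,\ \mathrm{Sib}(m,s) \in y}
\mathrm{Score}(\mathrm{Part}(g, h, m, s))
\tag{5.1}
\]

\[
\mathrm{Part}(g, h, m, s) =
\begin{cases}
\mathrm{GrandSib}(g, h, m, s) & : \neg\mathrm{Crossed}(\vec{e}_{hm}) \wedge \mathrm{GProj}(\vec{e}_{hm}) \wedge \neg\mathrm{Crossed}(\vec{e}_{hs}) \\
\mathrm{Grand}(g, h, m) & : \neg\mathrm{Crossed}(\vec{e}_{hm}) \wedge \mathrm{GProj}(\vec{e}_{hm}) \wedge \mathrm{Crossed}(\vec{e}_{hs}) \\
\mathrm{Sib}(h, m, s) & : \neg\mathrm{Crossed}(\vec{e}_{hm}) \wedge \neg\mathrm{GProj}(\vec{e}_{hm}) \wedge \neg\mathrm{Crossed}(\vec{e}_{hs}) \\
\mathrm{Edge}(h, m) & : \neg\mathrm{Crossed}(\vec{e}_{hm}) \wedge \neg\mathrm{GProj}(\vec{e}_{hm}) \wedge \mathrm{Crossed}(\vec{e}_{hs}) \\
\mathrm{CrossedEdge}(h, m) & : \mathrm{Crossed}(\vec{e}_{hm})
\end{cases}
\]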
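The case analysis that defines Part(g, h, m, s) can be sketched as a small Python function. This is only an illustration: the function name `part_type` and its three boolean arguments are hypothetical, and the sketch assumes that the Crossed and GProj predicates have already been evaluated for the edges involved.

```python
def part_type(crossed_hm: bool, gproj_hm: bool, crossed_hs: bool) -> str:
    """Select the part type for the edge from h to m, following the cases of Part.

    crossed_hm: the edge from h to m is crossed
    gproj_hm:   the edge from h to m is GProj (only meaningful if uncrossed)
    crossed_hs: the edge from h to the inner sibling s is crossed
    """
    if crossed_hm:
        # Crossed edges are always scored on their own.
        return "CrossedEdge"
    if gproj_hm:
        # The grandparent is included; the sibling only if its edge is uncrossed.
        return "Grand" if crossed_hs else "GrandSib"
    # The grandparent is excluded; the sibling only if its edge is uncrossed.
    return "Edge" if crossed_hs else "Sib"


print(part_type(crossed_hm=False, gproj_hm=True, crossed_hs=False))
```

The four uncrossed cases correspond to the four cells of Table 5.2. A missing inner sibling would be passed as `crossed_hs=False` in this sketch, which is an assumption about how null siblings are treated.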

A fully projective tree would decompose into exclusively GrandSib parts (as all edges would be uncrossed and GProj). As all projective trees are within the 1-Endpoint-Crossing search space, the optimization problem above includes all projective trees scored with grand-sibling features everywhere. Projective parsing with grand-sibling scores can be seen as a special case, as the crossing-sensitive 1-Endpoint-Crossing parser can simulate a grand-sibling projective parser by setting all CrossedEdge(h, m) scores to −∞.
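The reduction to projective parsing can be illustrated with a toy scorer. The sketch below is not the parser itself: `projective_only`, `tree_score`, and the constant-valued scorer `toy` are hypothetical names, and a tree is represented simply as a list of (part type, arguments) pairs. It shows only that once every CrossedEdge part scores −∞, any tree containing a crossed edge scores −∞ and so can never be the arg max.

```python
NEG_INF = float("-inf")


def projective_only(score_part):
    """Wrap a part scorer so that every CrossedEdge part scores -inf."""
    def wrapped(part_type, args):
        if part_type == "CrossedEdge":
            return NEG_INF
        return score_part(part_type, args)
    return wrapped


def tree_score(parts, score_part):
    """Score a tree as the sum of the scores of its parts."""
    return sum(score_part(part_type, args) for part_type, args in parts)


def toy(part_type, args):
    """Hypothetical scorer: every part is worth 1.0."""
    return 1.0


restricted = projective_only(toy)

# A tree made only of GrandSib parts, and one containing a crossed edge.
projective_tree = [("GrandSib", ("g", "h", "m1", None)),
                   ("GrandSib", ("g", "h", "m2", "m1"))]
crossing_tree = [("CrossedEdge", ("h", "m1")),
                 ("Sib", ("h", "m2", None))]

print(tree_score(projective_tree, restricted))  # 2.0
print(tree_score(crossing_tree, restricted))    # -inf
```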

[Figure: a parent h and its grandparent g, with interior children i1, i2, i3 and exterior children e1, e2, e3, e4]

Figure 5.4: The exterior children are numbered first beginning on the side closest to the parent, then the side closest to the grandparent. There must be a path from the root to g, so the edges from h to its exterior children on the far side of g are guaranteed to be crossed.

linear programming formulation of dependency parsing has previously been found to improve a parser’s accuracy (Martins et al., 2009).

To define GProj, we first need a few auxiliary definitions:

Definition 11. Consider any edge $\vec{e}_{gh}$ in a given tree y. We can partition h’s children into two disjoint sets: $\mathrm{Interior}_g(h)$ and $\mathrm{Exterior}_g(h)$. $\mathrm{Interior}_g(h)$ consists of those children of h that lie between g and h (in the linear order of words in the sentence). $\mathrm{Exterior}_g(h)$ consists of the complementary set of children.

For each parent h, grandparent g, and subset $\mathrm{Interior}_g(h)$ and $\mathrm{Exterior}_g(h)$, we enumerate the children in each subset in the following order: for $\mathrm{Interior}_g(h)$ the vertices are numbered from closest to h through furthest from h; for $\mathrm{Exterior}_g(h)$, we first number the vertices on the side closest to h from closest to h through furthest, then wrap around to include the vertices on the side closest to g. Figure 5.4 shows a parent h, its grandparent g, and a possible sequence of three interior and four exterior children.

Note that for a projective tree, there would not be any children on the far side of g.

Definition 12. Outer(m) is the set of siblings to m that are in the same subset of children and are later in the enumeration than m is.

For example, in the tree in Figure 5.2, Outer(most) = {days, cars}.
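Definitions 11 and 12 can be made concrete with a short Python sketch. The function names are hypothetical, and two things are assumed rather than stated in the text: that the sentence in Figure 5.2 is "Which cars do Americans favor most these days ?" with the root * at position 0, and that exterior children on the far side of g are numbered from closest to g outward.

```python
def enumerate_children(g, h, children):
    """Return (interior, exterior) children of h in enumeration order.

    g, h and the children are word positions (integers).
    """
    lo, hi = min(g, h), max(g, h)
    # Interior: strictly between g and h, numbered from closest to h outward.
    interior = sorted((c for c in children if lo < c < hi),
                      key=lambda c: abs(c - h))
    if h > g:
        near = sorted(c for c in children if c > h)                # h's side
        far = sorted((c for c in children if c < g), reverse=True)  # beyond g
    else:
        near = sorted((c for c in children if c < h), reverse=True)
        far = sorted(c for c in children if c > g)
    # Exterior: h's side first, then wrap around to the far side of g.
    return interior, near + far


def outer(g, h, children, m):
    """Siblings of m in the same subset that come later in the enumeration."""
    interior, exterior = enumerate_children(g, h, children)
    group = interior if m in interior else exterior
    return set(group[group.index(m) + 1:])


# Assumed word order for Figure 5.2.
words = ["*", "Which", "cars", "do", "Americans",
         "favor", "most", "these", "days", "?"]
pos = {w: i for i, w in enumerate(words)}
children_of_favor = [pos["cars"], pos["most"], pos["days"]]

outer_most = {words[c] for c in
              outer(pos["do"], pos["favor"], children_of_favor, pos["most"])}
print(sorted(outer_most))  # ['cars', 'days']
```

Under these assumptions all three children of favor are exterior, and cars is reached only by wrapping around past do, which is why it is an Outer sibling of both most and days.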

Definition 13. An uncrossed edge $\vec{e}_{hm}$ is GProj if both of the following hold:

1. The edge $\vec{e}_{gh}$ from the parent of h to h is not crossed.

2. None of the edges from h to the siblings in Outer(m) are crossed.

CrossedEdge(*, do)        Sib(cars, Which, -)          CrossedEdge(favor, cars)
Sib(do, Americans, -)     Sib(do, favor, Americans)    CrossedEdge(do, ?)
Sib(favor, most, -)       Sib(favor, days, most)       GrandSib(favor, days, these, -)

Table 5.3: Decomposing Figure 5.2 according to the crossing-sensitive third-order factorization described in Section 5.3. Null inner siblings are indicated with -.

In Figure 5.2, the edge from do to Americans is not GProj because Condition (1) is violated, while the edge from favor to most is not GProj because Condition (2) is violated. Table 5.3 lists the parts in the tree in Figure 5.2 according to this crossing-sensitive third-order factorization.
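The decomposition in Table 5.3 can be reproduced with the following Python sketch. Several points are assumptions rather than statements from the text: the word order of the sentence in Figure 5.2, the usual interleaved-endpoints test for crossing edges, the reading of Condition (2) as "no edge from h to a sibling in Outer(m) is crossed", and the treatment of edges leaving the root as not GProj. All function names are hypothetical.

```python
words = ["*", "Which", "cars", "do", "Americans",
         "favor", "most", "these", "days", "?"]
# head[m] = h for each edge from h to m (word positions), read off Table 5.3.
head = {1: 2, 2: 5, 3: 0, 4: 3, 5: 3, 6: 5, 7: 8, 8: 5, 9: 3}


def crossed(h, m):
    """True if some other edge has endpoints interleaved with (h, m)."""
    a, b = sorted((h, m))
    for m2, h2 in head.items():
        c, d = sorted((h2, m2))
        if a < c < b < d or c < a < d < b:
            return True
    return False


def children(h):
    return [m for m, p in head.items() if p == h]


def outer(h, m):
    """Definition 12: later siblings of m within its Interior/Exterior subset."""
    g = head[h]
    lo, hi = sorted((g, h))
    kids = children(h)
    interior = sorted((c for c in kids if lo < c < hi), key=lambda c: abs(c - h))
    exterior = [c for c in kids if not (lo < c < hi)]
    # h's own side first, then the far side of g; nearest to h first in each.
    exterior.sort(key=lambda c: ((c > h) != (h > g), abs(c - h)))
    group = interior if m in interior else exterior
    return group[group.index(m) + 1:]


def gproj(h, m):
    """Definition 13 for an uncrossed edge; Condition (2) as assumed above."""
    if h not in head:  # h is the root, so there is no grandparent (assumption)
        return False
    return not crossed(head[h], h) and not any(crossed(h, o) for o in outer(h, m))


def inner_sibling(h, m):
    """The child of h strictly between h and m that is nearest to m, or None."""
    between = [c for c in children(h) if min(h, m) < c < max(h, m)]
    return min(between, key=lambda c: abs(c - m)) if between else None


def part(h, m):
    """Part for the edge from h to m, following the case analysis of Part."""
    if crossed(h, m):
        return ("CrossedEdge", words[h], words[m])
    s = inner_sibling(h, m)
    sib_ok = s is None or not crossed(h, s)
    s_name = "-" if s is None else words[s]
    if gproj(h, m):
        g = words[head[h]]
        if sib_ok:
            return ("GrandSib", g, words[h], words[m], s_name)
        return ("Grand", g, words[h], words[m])
    if sib_ok:
        return ("Sib", words[h], words[m], s_name)
    return ("Edge", words[h], words[m])


parts = [part(h, m) for m, h in sorted(head.items())]
for p in parts:
    print(p)
```

Under these assumptions the nine printed parts coincide with the nine entries of Table 5.3, including the single GrandSib part for the edge from days to these.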

This definition eliminates the problematic grandparent cases discussed in Section 5.2.2, assuming that (1) for any sub-problem with an exterior point, all edges incident to the exterior point are crossed and (2) there exists at least one such edge (Section 5.4.1 will discuss how these assumptions are enforced).

Consider again the problematic case of the vertex favor in Figure 5.2, with children in both [*, do] ∪ {favor} and [favor, ?] ∪ {do}. Under a standard grandparent-sibling factorization, all three sub-problems would have needed an additional index noting the parent of favor. Under the crossing-sensitive grandparent-sibling factorization, the sub-problem [*, do] ∪ {favor} now no longer needs this grandparent index, as all edges from favor to the interval are guaranteed to be crossed and thus scored independently.

Since the parent of favor is found in [do, favor], all children of favor in (favor, ?] are Exterior children. There must exist at least one child of favor in [*, do] ∪ {favor} (Assumption 2) and the edge to such a child must be crossed (Assumption 1). This child is on the opposite side of favor’s parent, and so would be an Outer sibling to all of the children of favor in (favor, ?] (by “wrapping around”). Therefore all of the edges from favor to children in (favor, ?] would violate Condition (2) and be ¬GProj, avoiding the need for a grandparent index.

Finally, since no children outside [do, favor] will need to know favor’s parent, the middle sub-problem does not need to add a grandparent index for consistency reasons.

This section described the crossing-sensitive third-order factorization. The next section describes the parsing algorithm that finds the optimal 1-Endpoint-Crossing tree according to this factorization.