
We have thus far discussed our motivations for producing minimum spanning trees, and the mechanisms through which we generate meta-graphs and the weights of edges within them. Before we can approach the question of producing the spanning trees themselves, we must address the duality of our trees.

Recall that the edge weights in a meta-graph are defined in terms of pairs of nodes which are shared between two dependency trees. Both dependency trees are thus essential for any meta-graph, although we generate a meta-graph from the set of aligned nodes in just one of the two. As mentioned in Section 5.4.1, the dependency tree on which the meta-graph is based is selected through the ordinality of the alignment relation $C$ between the trees. Specifically, we stipulate that no node in one dependency tree $D$ (the dependency tree whose node set $N_D$ is used to generate the meta-graph) can be aligned to more than one node in the other, $D'$.

However, it would not be meaningless to create two separate meta-graphs, one from $D$ and one from $D'$. Indeed, given our goal of considering every node in both sentences, it would appear at first glance to be necessary to produce two such graphs and to generate a separate spanning tree in each.

In this section, we demonstrate why this assumption is incorrect and only one meta-graph is needed. This greatly simplifies the problem we need to solve, discussion of which continues in Section 5.6. To justify the simplification, we assume for this section only that two meta-graphs, named $M$ and $M'$, have been produced from the nodes of $D$ and $D'$ respectively, and that a spanning tree is produced in each. We show that the result is equivalent, in all respects which we consider important, to that of a single spanning tree in the meta-graph based on $D$.

We rely here on the assumption of strict one-to-many alignment ordinality between $D$ and $D'$. This is done to simplify the explanation; in Section 5.7 we show that this assumption is in reality unimportant and that the choice of base dependency tree can be arbitrary.

5.5.1 Mapping edge sets

We begin by observing a number of relative characteristics of the two meta-graphs. We can consider two edges, one in each meta-graph, to be related if the nodes connected by each are themselves aligned. More formally: for any pair of edges $(n_M, m_M)$ and $(n_{M'}, m_{M'})$ in meta-graphs $M$ and $M'$ respectively, we consider the edges to be counterparts if $C(n_M, n_{M'})$ and $C(m_M, m_{M'})$.
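The counterpart test can be written down directly. The following is a minimal sketch, not the implementation used in this work: it assumes the alignment relation $C$ is available as a set of (node in $M$, node in $M'$) pairs, and the function name `are_counterparts` is hypothetical. The node labels are borrowed from the becoming/becomes example used later in this section; the pair aligning "the" with "the" is an assumption made for illustration.

```python
from typing import Set, Tuple

Edge = Tuple[str, str]
Alignment = Set[Tuple[str, str]]  # pairs (node in M, node in M')


def are_counterparts(edge_m: Edge, edge_m_prime: Edge, C: Alignment) -> bool:
    """True if both endpoints of the two edges are aligned, in the order given.

    Follows the definition as written: C(n_M, n_M') and C(m_M, m_M').
    Treating edges as unordered would also require testing the swapped
    orientation; that is left out of this sketch.
    """
    n_m, m_m = edge_m
    n_mp, m_mp = edge_m_prime
    return (n_m, n_mp) in C and (m_m, m_mp) in C


# Assumed alignment for illustration only.
C = {("becoming", "becomes"), ("the", "the")}
print(are_counterparts(("becoming", "the"), ("becomes", "the"), C))
```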

The assumption we have made about the ordinality of aligned nodes provides further information about counterpart edges. Recall our stipulation that no node in one meta-graph $M$ be aligned to more than one in the other, $M'$, coupled with the requirement by the definition of meta-graphs that all nodes must be aligned to at least one in the opposing meta-graph. Given an edge $(n_M, m_M)$ in $M$, we thus have two possibilities for edges in $M'$: there can be either zero edges $(n_{M'}, m_{M'})$ in $M'$ if both $n_M$ and $m_M$ are aligned to the same node in $M'$, or one if they are aligned to separate nodes.

Given a set of edges $T_M$ in $M$, we can use this knowledge to generate a set $T_{M'}$ in $M'$ whose size is at most that of $T_M$. We consider that, due to the shared nature of the generation of their weights, the generation of $T_{M'}$ is a direct byproduct of the generation of $T_M$ rather than a separate process.
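The mapping from one edge set to the other can be sketched as follows. This is an illustration under the section's one-to-many assumption, so that each node of $M$ has exactly one aligned node in $M'$; the dictionary `align`, the function `map_edge_set`, and the node names are all hypothetical.

```python
from typing import Dict, Set, Tuple

Edge = Tuple[str, str]


def map_edge_set(t_m: Set[Edge], align: Dict[str, str]) -> Set[Edge]:
    """Build the edge set in M' induced by an edge set in M.

    Each edge contributes one counterpart, or none when both of its
    endpoints are aligned to the same node in M'.
    """
    t_m_prime: Set[Edge] = set()
    for n, m in t_m:
        n_prime, m_prime = align[n], align[m]
        if n_prime != m_prime:
            t_m_prime.add((n_prime, m_prime))
    return t_m_prime


# Hypothetical alignment: n1 and n2 both map to p, n3 maps to q.
align = {"n1": "p", "n2": "p", "n3": "q"}
t_m = {("n1", "n2"), ("n2", "n3")}
t_m_prime = map_edge_set(t_m, align)
print(t_m_prime, len(t_m_prime) <= len(t_m))
```

Here the edge between `n1` and `n2` has no counterpart, so the resulting set is strictly smaller than the original.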

Further, we consider that it would be unreasonable to produce two edge sets which were not related through this mechanism: it would be meaningless to include any edges in either $T_M$ or $T_{M'}$ without similarly including edges in the other such that the two trees are related through counterpart edges. For example, to include in $T_M$ the edge (becoming, the) in the reference meta-graph in Figure 5.2, one must also include in $T_{M'}$ the edge (becomes, the) in the hypothesis meta-graph, as each edge is only meaningful in the context of the other.

5.5.2 Mapping minimum spanning trees

We now consider a special case of $T_M$ which fulfills the criteria stated above: one representing a minimum spanning tree in $M$. We investigate which of its properties transfer to $T_{M'}$.

First, we trivially observe that the total weight of the edges in $T_{M'}$ must equal that of $T_M$. This is because the edge weights are defined through the alignment relation $C$, the same mechanism used to generate the set $T_{M'}$ itself. Thus, all one-to-one edge mappings must have the same weights. In the case of edges $(n, m)$ in $T_M$ which have no counterpart in $T_{M'}$, the definition of the weight $|(n, m)|$ forces the cost of such edges to be zero. From this, we can show that the combined cost of $T_M$ and $T_{M'}$ is minimal. Given that the weights of the two trees are equal, the total cost cannot be reduced in one tree without reducing it in the other. However, by definition the spanning tree $T_M$ is minimal, so its cost cannot be reduced. If there can be no reduction in the cost of either tree, their combined weight must be minimal.
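The equality of the two total weights can be spelled out as a short derivation. The notation here is introduced only for this sketch and is not taken from the text: $w(T)$ denotes the sum of the weights of the edges in $T$, $T_M^{+}$ the edges of $T_M$ which have a counterpart $e'$ in $T_{M'}$, and $T_M^{0}$ those which have none. The last step assumes that every edge of $T_{M'}$ is the counterpart of exactly one edge in $T_M^{+}$.

```latex
\begin{align*}
w(T_M) &= \sum_{e \in T_M^{+}} |e| + \sum_{e \in T_M^{0}} |e| \\
       &= \sum_{e \in T_M^{+}} |e'| + 0 \\
       &= w(T_{M'})
\end{align*}
```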

Finally, we demonstrate that $T_{M'}$ must represent a connected graph in $M'$. Suppose the contrary: two nodes $n_{M'}, m_{M'} \in N_{M'}$ are not connected by any contiguous path in $T_{M'}$. We observe that their aligned counterparts $n_M, m_M$ in $M$, such that $C(n_M, n_{M'})$ and $C(m_M, m_{M'})$, must be connected in $T_M$ due to its nature as a spanning tree, with $P_M$ representing the set of edges in the (unique) path between them.

For each edge $(n_M, m_M)$ in $P_M$ there are two possibilities, either of which allows us to consider the counterparts of its endpoints in $M'$ to be connected: either those counterparts are the same node, or they are connected by an edge which must be in $T_{M'}$ in order to allow $(n_M, m_M)$ itself to be in $T_M$. Note that the contiguity of the edges in $P_M$ requires that every intermediate node be part of exactly two edges; as these nodes transfer uniquely to their counterpart nodes in $M'$, the corresponding path $P_{M'}$ must similarly be contiguous. Thus, the set of all counterparts of edges in $P_M$ must connect $n_{M'}$ and $m_{M'}$, and we have a contradiction.
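The connectedness argument can be checked mechanically on small cases. The sketch below is illustrative only: it assumes a hypothetical dictionary `align` mapping each node of $M$ to its single aligned node in $M'$, takes a spanning tree of $M$ as a list of edges, and uses a breadth-first search to test whether the induced edge set connects every node of $M'$. All names and the example data are invented.

```python
from collections import defaultdict, deque
from typing import Dict, Iterable, Set, Tuple

Edge = Tuple[str, str]


def is_connected(nodes: Set[str], edges: Iterable[Edge]) -> bool:
    """Breadth-first check that every node is reachable from an arbitrary start."""
    if not nodes:
        return True
    adjacency = defaultdict(set)
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)
    start = next(iter(nodes))
    seen = {start}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for neighbour in adjacency[current]:
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return nodes <= seen


def image_edges(tree_m: Iterable[Edge], align: Dict[str, str]) -> Set[Edge]:
    """Counterparts in M' of the edges of a tree in M; collapsed edges are dropped."""
    return {(align[n], align[m]) for n, m in tree_m if align[n] != align[m]}


# Hypothetical example: the path n1 - n2 - n3 - n4 spans M.
align = {"n1": "p", "n2": "p", "n3": "q", "n4": "r"}
tree_m = [("n1", "n2"), ("n2", "n3"), ("n3", "n4")]
tree_m_prime = image_edges(tree_m, align)
nodes_m_prime = set(align.values())
print(is_connected(nodes_m_prime, tree_m_prime))  # prints True
```

A check of this kind only confirms the property for the cases tried; the general claim rests on the argument in the text.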

With the properties of minimality and connectedness being shared between $T_M$ and $T_{M'}$, we can consider that the two represent the best possible solution to our problem. Most importantly, the fact that both are connected graphs ensures that we have considered every error relative to every other, so that no potential errors are ignored. The minimality of their combined cost, and the nature of $T_M$ as a spanning tree without cycles, ensure that we have duplicated errors as little as reasonably possible.

[Figure 5.5 image not recovered. Panels: "Meta-graph (Hypothesis)" and "Meta-graph (Reference)". Labels appearing in the image: 0, a, b, c, d, e.]

Figure 5.5: Sample meta-graphs, using a similar visual convention to Figure 5.1, with only black edges included in possible minimum spanning trees. Weights are indicated next to their respective edges.

Note that it is entirely possible for $T_{M'}$ to contain cycles, for the simple reason that it can contain fewer nodes than $T_M$. If multiple nodes in $M$ map to single nodes in $M'$, as