3.4 Tools used
3.4.3 Dependency structure
Once input sentences have been tokenised and those tokens assigned part-of-speech tags, they can be parsed to produce the dependency structures we rely on. These consist of trees in which each edge indicates a semantic link between two words. For example, a noun may be the subject or direct object (NSUBJ or DOBJ) of a verb, and may itself be qualified by a determiner (DET) or an adjectival modifier (AMOD). The root of a dependency tree is typically the main verb of its sentence.
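To make this structure concrete, the following minimal Python sketch stores a dependency tree as a list of tokens, each pointing to its parent. It is an illustration only: the Token class, the ROOT label and the example sentence are assumptions made for the sketch, and are not taken from our tools or experiments.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Token:
    index: int            # 1-based position of the word in the sentence
    word: str
    head: Optional[int]   # index of the parent token; None for the root
    label: str            # dependency label on the edge to the parent


# Synthetic sentence (an assumption for illustration): "The small dog chased a cat"
tokens = [
    Token(1, "The", 3, "DET"),
    Token(2, "small", 3, "AMOD"),
    Token(3, "dog", 4, "NSUBJ"),
    Token(4, "chased", None, "ROOT"),   # "ROOT" is a placeholder label
    Token(5, "a", 6, "DET"),
    Token(6, "cat", 4, "DOBJ"),
]


def root(tokens: List[Token]) -> Token:
    """Return the single token that has no parent."""
    roots = [t for t in tokens if t.head is None]
    if len(roots) != 1:
        raise ValueError("a dependency tree must have exactly one root")
    return roots[0]


def children(tokens: List[Token], parent: Token) -> List[Token]:
    """Return the direct dependents of `parent`, in sentence order."""
    return [t for t in tokens if t.head == parent.index]
```

Here the verb is the root, the two nouns attach to it as subject and direct object, and each noun carries its own determiner or adjectival modifier.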
While several syntactic frameworks exist, as discussed in Section 2.1.1, we have selected dependency structures for a number of simple reasons. First, like context-free structures, dependency trees are able to represent complex linguistic phenomena using relatively lightweight grammars and quick parsers. The considerations of size and speed are far from trivial, affecting how practicable any evaluation tool is in the notoriously data-intensive field of translation.
A second relevant feature of dependency parsing is its goal of capturing the key semantic relations between words in parsed sentences. While some other formalisms such as CFGs attempt primarily to describe the nature of words and phrases in a sentence, dependency structures attempt to encode the purpose thereof. We believe that this perspective is more closely tied with the problem of translation evaluation: two translations should contain elements which, while not necessarily exactly matching in nature, perform the same functions.
Our decision is in keeping with the general trend of translation evaluation, with several existing tools relying on the technique as discussed in Section 2.2.3. However, should another formalism be considered relevant to the techniques we employ, those techniques could be adapted without excessive work: nothing in our tools is inherently tied to dependency structures.
Visual representation
A number of representations of dependency trees exist in the literature, including purely textual representations [He, 2010], linear textual sentences with arcs to indicate dependencies [Clark et al., 2002; Nivre, 2003], visual tree structures akin to those used for context-free grammars [Hajič et al., 2012] or even combinations of these [Gómez-Rodríguez et al., 2011].
In the various examples of tree structures throughout this document, we have chosen to use the structure exemplified in Figure 3.1. A dependency label, indicating the nature of a dependency relation, may be shown as a string of grey capital letters adjacent to a black line between the parent and child of that link. Such parents and children may be shown in black to indicate that they are not aligned with any word in another tree, while such alignments, introduced in Section 3.4.5 and first shown in Chapter 4 (page 45), are indicated through matching colours.
A number of the examples we use are based on sentences used within our experiments (Chapters 6-7). This is the case, for example, with the two examples shown in Figure 3.1. However, occasionally we have constructed synthetic example sentences or phrases, intended to highlight certain aspects of the tree(s) or algorithm(s) in question.
The original text from which a dependency parse tree was produced can be read directly from such figures. This is done by considering the nodes (words) strictly from left to right, irrespective of vertical position within the tree. For clarity, original sentences are also shown underneath the visual tree structures in most cases.
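The left-to-right reading described above can be sketched in a few lines of Python. This is a minimal illustration rather than part of our tools: the Node class, its position field and the attachments chosen for Sentence 1 are assumptions made for the example, and are not read off Figure 3.1.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    word: str
    position: int                       # horizontal position in the figure
    children: List["Node"] = field(default_factory=list)


def read_sentence(root: Node) -> str:
    """Collect every node, then order by horizontal position only."""
    nodes, stack = [], [root]
    while stack:
        node = stack.pop()
        nodes.append(node)
        stack.extend(node.children)
    # Depth in the tree is ignored: only the left-to-right position matters.
    return " ".join(n.word for n in sorted(nodes, key=lambda n: n.position))


# Assumed attachments for Sentence 1, for illustration only.
tree = Node("started", 3, [
    Node("Mälkki", 2, [Node("Ms", 1)]),
    Node("career", 5, [Node("her", 4)]),
    Node("as", 6, [Node("cellist", 8, [Node("a", 7)])]),
    Node(".", 9),
])

print(read_sentence(tree))
```

The traversal order is irrelevant here, since the final sort by position alone determines the order of the words.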
Note that while dependency labels are included in all the parse trees we use, they are core to the functionality of only one of our tools. DTED (Chapter 4) discards the information they contain, and consequently all figures in its chapter omit the dependency labels from any parse trees shown.
Projects used
Figure 3.1: Two sample dependency trees. Dependency labels are distinguished from words by grey capital letters.
Sentence 1: Ms Mälkki started her career as a cellist.
Sentence 2: A few years ago textile designer Kati Reuter revitalised the historic snowball lace.
To generate the parse trees we use in our experiments, we have used two well-known tools, chosen both for their high quality and popularity in other projects and for their opposing and thus complementary approaches to parsing. We were able to choose from a number of high-profile dependency parsers which are available, including the Malt parser [Nivre et al., 2006], the Stanford Parser [Klein and Manning, 2003b] and the Berkeley parser [Petrov and Klein, 2007].
Each of these is an ongoing, long-term collaborative project, with numerous releases over several years. Each is separated into two primary components, with the main executable typically governing both the parsing of a given sentence and the generation of the second component: information specific to the language being parsed, usually produced using the techniques mentioned in Section 2.1.2. Such models can either be obtained ready for immediate use from the same sources as the parser itself, or can be generated by the end user from a bespoke treebank.
We have chosen to use the first two tools mentioned above: the dependency parser implemented in the MaltParser framework [Nivre et al., 2006], and that produced as part of Stanford CoreNLP [Manning et al., 2014].
The Stanford Parser was originally built as an unlexicalized probabilistic context-free grammar parser [Klein and Manning, 2003b], intended to demonstrate the viability of the unlexicalized approach. In this approach, the parser annotates phrasal subtrees according to function words (for, to, etc.) rather than simple head nodes, that is, the nodes considered to best represent the nature of a subtree.
The Stanford Parser was then extended to allow conversion from the original phrase-structure trees to dependency relations [de Marneffe et al., 2006]. This is done in two steps. First, the semantic head of each subtree is calculated; this often differs from the syntactic head produced in the original parse. Next, the types of relations between heads and their erstwhile phrasal siblings are calculated using pattern-matching techniques.
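The two-step conversion can be illustrated with a toy Python sketch. It is not the Stanford implementation: the head rules, the label patterns, the tree encoding and the example sentence are all hypothetical, and are far simpler than those of the real converter. In particular, the sketch does not model cases where the semantic and syntactic heads differ.

```python
# A phrase-structure node is (label, [children]); a leaf is (pos_tag, word).
TREE = ("S", [
    ("NP", [("DT", "The"), ("JJ", "small"), ("NN", "dog")]),
    ("VP", [("VBD", "chased"),
            ("NP", [("DT", "a"), ("NN", "cat")])]),
])

# Step 1 rules (hypothetical): child categories that may supply a phrase's head.
HEAD_RULES = {"S": ["VP"], "VP": ["VBD", "VP"], "NP": ["NN", "NP"]}

# Step 2 patterns (hypothetical): (parent category, sibling category) -> label.
LABEL_PATTERNS = {("S", "NP"): "NSUBJ", ("VP", "NP"): "DOBJ",
                  ("NP", "DT"): "DET", ("NP", "JJ"): "AMOD"}


def is_leaf(node):
    return isinstance(node[1], str)


def head_child(node):
    """Step 1: pick the child that supplies the head of this phrase."""
    label, kids = node
    for wanted in HEAD_RULES[label]:
        for kid in kids:
            if kid[0] == wanted:
                return kid
    raise ValueError(f"no head rule matches {label}")


def head_word(node):
    """Follow head children down to a single word."""
    while not is_leaf(node):
        node = head_child(node)
    return node[1]


def to_dependencies(node, deps=None):
    """Step 2: label the link from each head to its phrasal siblings."""
    if deps is None:
        deps = []
    if is_leaf(node):
        return deps
    label, kids = node
    chosen = head_child(node)
    for kid in kids:
        if kid is not chosen:
            relation = LABEL_PATTERNS.get((label, kid[0]), "DEP")
            deps.append((relation, head_word(chosen), head_word(kid)))
        to_dependencies(kid, deps)
    return deps


for relation, head, dependent in to_dependencies(TREE):
    print(relation, head, dependent)
```

Each emitted triple gives a relation type, the head word and the dependent word; siblings matching no pattern fall back to a generic DEP label in this sketch.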
The approach of the MaltParser is rather different from that of the Stanford project, focusing on deterministic shift-reduce parsing rather than the more common probabilistic approach. Unlike the Stanford system, the MaltParser does not rely on a grammar per se, instead using a series of learned mappings from parser states to appropriate actions for the parser to take. Nonetheless, both forms of linguistic data are learned from gold-standard treebanks, with both of the systems we use having been trained on the highly popular Penn Treebank [Marcus et al., 1993].
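The idea of mapping parser states to actions can be sketched as follows. This is a simplified illustration and not MaltParser itself: a hand-written lookup table (POLICY) stands in for the learned model, the parser state is reduced to the part-of-speech tags of the two topmost stack items, and the example sentence and its tags are assumptions.

```python
# Toy stand-in for the learned model (hypothetical): maps a parser state,
# summarised as the POS tags of the two topmost stack items, to an action
# and a dependency label. A real model is trained on a treebank.
POLICY = {
    ("DT", "NN"): ("LEFT-ARC", "DET"),
    ("JJ", "NN"): ("LEFT-ARC", "AMOD"),
    ("NN", "VBD"): ("LEFT-ARC", "NSUBJ"),
    ("VBD", "NN"): ("RIGHT-ARC", "DOBJ"),
}
SHIFT = ("SHIFT", None)

WORDS = ["The", "small", "dog", "chased", "a", "cat"]
TAGS = ["DT", "JJ", "NN", "VBD", "DT", "NN"]


def parse(words, tags):
    """Deterministic shift-reduce parse: one action per state, no search."""
    stack, buffer, arcs = [], list(range(len(words))), []
    while buffer or len(stack) > 1:
        action, label = SHIFT
        if len(stack) >= 2:
            state = (tags[stack[-2]], tags[stack[-1]])
            action, label = POLICY.get(state, SHIFT)
        if action == "SHIFT":
            if not buffer:
                raise ValueError("policy has no action for this state")
            stack.append(buffer.pop(0))      # move next word onto the stack
        elif action == "LEFT-ARC":
            dependent = stack.pop(-2)        # second-top depends on top
            arcs.append((label, words[stack[-1]], words[dependent]))
        else:                                # RIGHT-ARC
            dependent = stack.pop()          # top depends on second-top
            arcs.append((label, words[stack[-1]], words[dependent]))
    return arcs


for label, head, dependent in parse(WORDS, TAGS):
    print(label, head, dependent)
```

Because every state maps to exactly one action, the parser makes a single left-to-right pass without backtracking; no grammar is consulted at any point.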
We believe that these two systems represent reliable and high-quality approaches, ensuring that the parses we use are as legitimate as possible. They are, however, different enough from each other to allow for variety in those parses insofar as the individual sentences permit. We are thus confident that they will, as intended, allow us to observe the effects of different parses on our tools.
Figure 3.2: Original and flattened versions of a sample dependency tree
Panels: Reference; Reference (flattened)