2.2 Related work
2.4.3 Visualisation
An indispensable tool for working with DMRS is a visualiser. We relied heavily on Demophin28, a DELPH-IN web demo that produces DMRS graphs directly from ACE output. The library pydmrs includes a tool by Matic Horvat, which adapts Demophin to display DMRS from XML. All the DMRS figures presented in this thesis were produced by one of these tools.
26 https://github.com/delph-in/pydelphin
27 https://github.com/delph-in/pydmrs
28 https://github.com/goodmami/demophin or a version hosted on the University of Washington
Chapter 3
Rule-based DMRS chunking
In this thesis we explore the concept of a semantic chunk as a self-contained unit of meaning that can be processed independently without loss of useful information. In Chapter 2 we considered this definition in the context of existing semantic terminology. The remainder of the thesis focuses on practical aspects of semantic chunking for DMRS and surface string representations.
The experiments described in this chapter evaluate our earlier claim that syntactic clauses are suitable candidates for semantic chunks (§2.1.3). We start by investigating finite clauses with a subject-verb structure in a closed set of grammatical constructions (§3.1). Finite clauses express propositions and are the most self-contained type of clauses, since they can occur on their own as standalone sentences and are valid inputs of tasks such as parsing. At the same time, they are the most restrictive interpretation of semantic chunks (§2.1.2).
The first chunking method we propose acts as a proof of concept for the task. The system described below is based on a set of hand-crafted rules for detecting chunking opportunities in DMRS graphs. The rules cover a selection of grammatical constructions which introduce coordinated and subordinated finite clauses. In §3.5 we examine the produced chunks and then demonstrate how they can be applied to realization (§3.6). Although realization is the primary target task of our chunking system, we introduce parsing as an additional objective, which is possible because of the close relationship between the two operations within the *MRS framework (§2.4.1). The resulting semantic chunks become the basis of our investigation into surface-based semantic chunking in Chapter 4.
The prototype chunking system described in this chapter has its origins in the author’s Master’s dissertation, and the related preliminary results were published at the 2016 ACL student workshop as Graph- and surface-level chunking (Muszyńska, 2016). The realization experiments from §3.6 were published at the 10th International Conference on Natural Language Generation (INLG ’17) as Realization of long sentences using chunking (Muszyńska and Copestake, 2017).
3.1 Finite clause chunks
A key property of a semantic chunk is its independence from the rest of the sentence. We discussed in Chapter 2, in particular §2.1.4, how this constraint affects the choice of suitable candidates for chunks. The experiments in this chapter assume realization (§2.4.1) as the target application of semantic chunking. The chart generation used in the *MRS framework leverages the bidirectional nature of the ERG and can be considered the inverse of parsing. The generator finds potential surface strings matching the input structure and attempts to match their grammar-based analyses to the original one. If the input MRS does not obey the grammar constraints, its realization fails.
Taking this into account, we choose to err on the side of strictness and maximally restrict the form of allowed DMRS chunks. A valid semantic chunk for the purpose of the rule-based prototype chunker is a subgraph of a DMRS representing a finite clause in a subject-verb form. Since finite clauses of this form can exist as standalone sentences, they can be straightforwardly processed by the generator, assuming they are sourced from a well-formed parse.
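As a minimal sketch of this definition, the check below tests whether a candidate set of nodes forms a connected subgraph with exactly one tensed head that has a subject argument inside the subgraph. It is a simplified illustration and not the rule set used by the prototype chunker: the `Link` class, the function names, the link labels and the use of `ARG1` as a stand-in for the subject are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Link:
    start: int   # id of the source node
    end: int     # id of the target node
    label: str   # illustrative label, e.g. "ARG1/NEQ" or "RSTR/H"


def is_connected(members, links):
    """True if the member nodes form one component, ignoring link direction."""
    members = set(members)
    if not members:
        return False
    adjacency = {m: set() for m in members}
    for link in links:
        if link.start in members and link.end in members:
            adjacency[link.start].add(link.end)
            adjacency[link.end].add(link.start)
    seen, stack = set(), [next(iter(members))]
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(adjacency[node] - seen)
    return seen == members


def is_clause_chunk(head, members, links, tensed):
    """Simplified test: one tensed head, a subject inside the chunk, connected."""
    members = set(members)
    # Assumption: the only tensed node in the chunk is its head.
    one_tensed_head = (members & set(tensed)) == {head}
    # Assumption: the subject is the target of the head's ARG1 link.
    has_subject = any(
        link.start == head and link.end in members and link.label.startswith("ARG1")
        for link in links
    )
    return one_tensed_head and has_subject and is_connected(members, links)


# Toy graph for "The dog barked and the cat slept."
# 1 _the_q, 2 _dog_n_1, 3 _bark_v_1, 4 _and_c, 5 _the_q, 6 _cat_n_1, 7 _sleep_v_1
LINKS = [
    Link(1, 2, "RSTR/H"),
    Link(3, 2, "ARG1/NEQ"),
    Link(4, 3, "L-INDEX/NEQ"),
    Link(4, 7, "R-INDEX/NEQ"),
    Link(5, 6, "RSTR/H"),
    Link(7, 6, "ARG1/NEQ"),
]
TENSED = {3, 7}

print(is_clause_chunk(3, {1, 2, 3}, LINKS, TENSED))        # True
print(is_clause_chunk(3, set(range(1, 8)), LINKS, TENSED))  # False: two tensed nodes
```

The check is deliberately strict. For instance, it rejects any candidate that contains a second tensed node, which mirrors the preference for precision stated above but would need relaxing for clauses with finite complements.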
At the same time, we choose the chunk form with a secondary objective in mind. Semantic chunks created in this chapter become the starting point of the experiments described in the next one. In particular, we use them as a training dataset for a semi-supervised surface chunking model (§4.3). Leveraging the bidirectional nature of the ERG, we choose parsing as the target task of surface chunking, as the symmetry of the two processes suggests that they can benefit from similar chunks. The focus on chunking precision ensures a high-quality dataset and consequently fewer errors propagating down the pipeline.
Finite clauses are represented in DMRS by semantic subgraphs centred around a tensed node (§2.3.4). Because of the way in which the ERG encodes copular constructions (§2.3.1), the node can be a verb, an adjective or a preposition. In this chapter we exclude subjunctive and relative clauses from consideration. Subjunctive clauses, such as the subordinate clause in Example 53 (he should default on the loan = (if) he ever happens to default), are often indistinguishable in their surface form from declarative clauses with a modal verb (= it would be wise of him to default). Without the disambiguating context of the rest of the sentence, the correct analysis cannot be guaranteed.
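The idea of a clause centred on a tensed verb, adjective or preposition can be illustrated with a short sketch that picks out candidate clause heads from a list of nodes. The `Node` class, the set of tense values treated as finite, and the parsing of the part of speech from predicate names of the form `_lemma_pos_sense` are assumptions made for this example. It does not attempt to filter out the subjunctive and relative clauses excluded above.

```python
from dataclasses import dataclass
from typing import Optional

# Assumption: these tense values mark a node as tensed; "untensed" does not.
FINITE_TENSES = {"past", "pres", "fut"}
# Verbs, adjectives and prepositions can head a finite clause.
HEAD_POS = {"v", "a", "p"}


@dataclass(frozen=True)
class Node:
    nodeid: int
    pred: str                    # e.g. "_bark_v_1"
    tense: Optional[str] = None  # None when the node carries no tense


def pred_pos(pred: str) -> Optional[str]:
    """Return the POS field of a predicate like '_lemma_pos_sense', else None."""
    if not pred.startswith("_"):
        return None
    parts = pred[1:].split("_")
    return parts[1] if len(parts) >= 2 else None


def is_clause_head(node: Node) -> bool:
    """True for a tensed verb, adjective or preposition node."""
    return node.tense in FINITE_TENSES and pred_pos(node.pred) in HEAD_POS


def clause_heads(nodes):
    """Ids of all candidate finite-clause heads, in input order."""
    return [node.nodeid for node in nodes if is_clause_head(node)]


# Toy nodes for "The cat is in the garden and the dog wants to sleep."
# The copula is not a node of its own: the preposition carries the tense.
NODES = [
    Node(1, "_the_q"),
    Node(2, "_cat_n_1"),
    Node(3, "_in_p", "pres"),
    Node(4, "_the_q"),
    Node(5, "_garden_n_1"),
    Node(6, "_and_c"),
    Node(7, "_the_q"),
    Node(8, "_dog_n_1"),
    Node(9, "_want_v_1", "pres"),
    Node(10, "_sleep_v_1", "untensed"),
]

print(clause_heads(NODES))  # [3, 9]
```

In this toy input the infinitival `_sleep_v_1` is skipped because it is untensed, while the tensed preposition `_in_p` is accepted as the head of the copular clause.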