Comparing Dependency and Constituent Syntax for Frame-semantic Analysis

(1)

Comparing Dependency and Constituent Syntax for Frame-semantic Analysis

Richard Johansson, Pierre Nugues

Lund University

{richard, pierre}@cs.lth.se

Abstract

We address the question of which syntactic representation is best suited for role-semantic analysis of English in the FrameNet paradigm. We compare systems based on dependencies and constituents, and a dependency syntax with a rich set of grammatical functions with one with a smaller set. Our experiments show that dependency-based and constituent-based analyzers give roughly equivalent performance, and that a richer set of functions has a positive influence on argument classification for verbs.

1. Introduction

The role-semantic paradigm (Gildea and Jurafsky (2002), inter alia) is a prominent model in automatic semantic anal-ysis with a wide range of proposed applications. With a few exceptions, role-based semantic analysis relies cru-cially (Gildea and Palmer, 2002; Punyakanok et al., 2005) on some sort of syntactic representation as input. This en-ables the analyzer to extract syntactic features that are used by statistical classifiers. The connection between syntax and semantics has also been noted in annotation projects of semantic treebanks; for instance, the SALSA project (Bur-chardt et al., 2006) and the Prague Dependency Treebank (Hajiˇc, 1998) have annotated semantic structures on top of syntactic treebanks.

The CoNLL 2004 and 2005 shared tasks (Carreras and Màrquez, 2005) were the first events that conducted an impartial evaluation of role-semantic labelers on the Prop-Bank corpus. More recently, SemEval organized a simi-lar evaluation (Baker, 2007) using the FrameNet corpus. While nearly all participants in the CoNLL shared tasks used a constituent representation that underlies the Prop-Bank annotation, the best-performing system (Johansson and Nugues, 2007b) of SemEval 2007 used dependency graphs.

These evaluations are not directly comparable however. They use different corpora, annotation, and training meth-ods and from these results, it is difficult to conclude what is the optimal representation to use in the parsing step. Some grammatical features used in constituents and dependencies may be directly equivalent. Conversely, some other features are tied to a specific representation. In addition, the contri-bution of the features and their behavior as a function of the training set and its size would still need more exploration. This paper addresses the question of how the syntactic representation influences the performance of automatic se-mantic analysis in the paradigm of Frame Sese-mantics (Fill-more, 1976; Baker et al., 1998). To study the impact of syntactic representations, we performed a set of experi-ments in which we compare the performance of constituent-based and dependency-constituent-based semantic analyzers on the three main subtasks of FrameNet-based role-semantic anal-ysis.

2. Automatic Frame-Semantic Analysis

The task of frame-semantic analysis is usually divided into three main subtasks: detection and disambiguation of target

words, detection of semantic arguments, and finally classi-fication of arguments (see Figure 1).

Target word detection and frame assignment

Do I want him to see me ?

Argument detection

Do [I] [want] [him] [to [see] [me] ] ?

PERCEPTION_EXPERIENCE

DESIRING

Argument classification

Do [I] [want] [him] [to [see] [me] ] ?

DESIRING

EXPERIENCER FOCAL_P. EVENT

PHENOMENON PERCEIVER_PASSIVE

Do I [want] him to [see] me ?

DESIRING

Figure 1: The stages in the frame-semantic structure extrac-tion process.

Although it is generally agreed that the best performance is obtained when the tasks are solved jointly, it is easiest from an engineering point of view, and computationally less ex-pensive, to treat them as independent tasks that are solved sequentially.

(2)

(3)

(4)

(5)

Charles J. Fillmore. 1976. Frame semantics and the nature of lan-guage. Annals of the New York Academy of Sciences: Confer-ence on the Origin and Development of Language, 280:20–32. Daniel Gildea and Daniel Jurafsky. 2002. Automatic labeling of

semantic roles.Computational Linguistics, 28(3):245–288. Daniel Gildea and Martha Palmer. 2002. The necessity of

syntac-tic parsing for predicate argument recognition. InProceedings of ACL.

Jan Hajiˇc. 1998. Building a syntactically annotated corpus: The Prague Dependency Treebank. InIssues of Valency and Mean-ing, pages 106–132.

Richard Johansson and Pierre Nugues. 2007a. Extended constituent-to-dependency conversion for English. In Proceed-ings of NODALIDA.

Richard Johansson and Pierre Nugues. 2007b. Semantic structure extraction using nonprojective dependency tress. In Proceed-ings of SemEval-2007.

Ryan McDonald and Fernando Pereira. 2006. Online learning of approximate dependency parsing algorithms. InProceedings of EACL.

Joakim Nivre, Johan Hall, Jens Nilsson, Atanas Chanev, Gül¸sen Eryi˘git, Sandra Kübler, Svetoslav Marinov, and Erwin Marsi. 2007. MaltParser: A language-independent system for data-driven dependency parsing. Natural Language Engineering, 13(2):95–135.

Vasin Punyakanok, Dan Roth, and Wen-tau Yih. 2005. The ne-cessity of syntactic parsing for semantic role labeling. In Pro-ceedings of IJCAI.