5.1.2 Feature-based Entity Typing

A typical supervised entity typing system extracts features for the given mention m and predicts entity types t̂ based on these features. During training, the system receives the true entity types t and uses this supervision signal to adjust its parameters, with the aim of making better predictions in the future.
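The predict-then-adjust loop described above can be illustrated with a minimal sketch. It assumes a multi-label perceptron with one weight vector per entity type; this is only one possible choice of learner, not the method of any particular system discussed here. The class name `PerceptronTyper`, the feature names, and the toy training data are hypothetical.

```python
class PerceptronTyper:
    """Toy multi-label perceptron: one sparse weight vector per entity type."""

    def __init__(self, types):
        self.types = list(types)
        # weights[entity_type][feature_name] -> float (missing means 0.0)
        self.weights = {t: {} for t in self.types}

    def score(self, features, entity_type):
        w = self.weights[entity_type]
        return sum(w.get(f, 0.0) * v for f, v in features.items())

    def predict(self, features):
        """Return every type whose score is strictly positive."""
        return {t for t in self.types if self.score(features, t) > 0}

    def update(self, features, true_types):
        """Predict, compare with the true types, and adjust the weights."""
        predicted = self.predict(features)
        for t in self.types:
            if t in true_types and t not in predicted:
                delta = 1.0    # missed a true type: raise its score
            elif t not in true_types and t in predicted:
                delta = -1.0   # predicted a wrong type: lower its score
            else:
                continue       # this type was handled correctly
            w = self.weights[t]
            for f, v in features.items():
                w[f] = w.get(f, 0.0) + delta * v
        return predicted


# Hypothetical training data: (feature dict, set of true types).
mary = {"suffix=ary": 1.0, "capitalized": 1.0}
oxford = {"suffix=ord": 1.0, "capitalized": 1.0}
data = [(mary, {"/person"}), (oxford, {"/location"})]

typer = PerceptronTyper(["/person", "/location"])
for _ in range(5):                 # a few passes over the toy data
    for features, true_types in data:
        typer.update(features, true_types)

print(typer.predict(mary), typer.predict(oxford))
```

On this toy data the shared feature `capitalized` ends up with zero weight for both types, while the two suffix features carry the distinction.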

Features are designed to automatically capture as much relevant information about the given entity mention as possible. Entity typing features fall into three main categories:

1. Mention surface features are based on the observation that named entity mentions exhibit certain regularities. For example, Mary often refers to a person, Western person names often consist of two capitalized words, place names in England have suffixes like -ford and -shire, and the suffix -osis is common in the names of medical conditions and biochemical processes. To exploit these regularities, systems extract surface features from the mention, such as the mention text itself, its first few characters, its last few characters, whether it is capitalized, or its length.

2. Mention context features exploit coherence between entity mentions and their surrounding context. For example, consider the following sentence:

(96) After a fierce campaign, X won the election by only one vote.

Here, phrases from the political domain, namely “campaign”, “win the election”, and “vote”, appear in the context of the entity mention X and suggest that X is a politician.

3. World knowledge features make use of resources beyond the text itself. A simple way of incorporating world knowledge is the use of name lists and gazetteers, which allow checking, for example, whether a mention contains a common first name (e.g. Mary), place name (London), company type (Inc., Ltd.), or title (Dr., Esq.).
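The three feature categories can be sketched as simple extraction functions. The following is a minimal illustration under several assumptions: the function names, the feature naming scheme, the affix length, the context window size, and the tiny gazetteers are all hypothetical, and the placeholder X from example (96) is replaced by the invented mention "Mary Smith".

```python
# Hypothetical toy gazetteers; real systems use much larger name lists.
GAZETTEERS = {
    "has_first_name": {"mary", "stephen", "steven"},
    "has_place_name": {"london", "oxford"},
    "has_company_type": {"inc.", "ltd."},
    "has_title": {"dr.", "esq."},
}


def surface_features(mention, affix_len=3):
    """Features of the mention string itself."""
    feats = {
        "text=" + mention.lower(): 1.0,
        "prefix=" + mention[:affix_len].lower(): 1.0,
        "suffix=" + mention[-affix_len:].lower(): 1.0,
        "length": float(len(mention)),
    }
    if mention[:1].isupper():
        feats["capitalized"] = 1.0
    return feats


def context_features(tokens, start, end, window=5):
    """Bag of words within `window` tokens around the mention tokens[start:end]."""
    left = tokens[max(0, start - window):start]
    right = tokens[end:end + window]
    return {"context=" + tok.lower(): 1.0 for tok in left + right}


def gazetteer_features(mention):
    """World knowledge: does any mention token occur in a name list?"""
    feats = {}
    for tok in mention.lower().split():
        for name, entries in GAZETTEERS.items():
            if tok in entries:
                feats[name] = 1.0
    return feats


def extract_features(tokens, start, end):
    """Combine all three feature categories into one sparse feature dict."""
    mention = " ".join(tokens[start:end])
    feats = {}
    feats.update(surface_features(mention))
    feats.update(context_features(tokens, start, end))
    feats.update(gazetteer_features(mention))
    return feats


sentence = "After a fierce campaign , Mary Smith won the election by only one vote ."
tokens = sentence.split()
feats = extract_features(tokens, 5, 7)   # tokens[5:7] is "Mary Smith"
for name in sorted(feats):
    print(name, feats[name])
```

A sparse dictionary of named features keeps each feature interpretable, which matches the symbolic character of these hand-crafted representations.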

Having investigated the use of world knowledge in chapter 3 and the use of context in chapter 4, in this chapter we turn our focus to mention surface features.

The mention surface features described above are intuitive and interpretable, but suffer from the drawback of symbolic representation: Knowing that Stephen is a person does not help to decide whether Steven is a person as well, since these names are treated as discrete, opaque objects without inherent similarities. While it is possible to define similarity measures such as string edit distance, these measures are not robust. For example, Stephen and Steven have a small string edit distance of 2 and would correctly be judged as similar. However, the number seven and the city name Steuben also have a string edit distance of 2 to Steven, so according to this criterion they would be classified as persons as well.
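The distances in this example can be checked with a short sketch. It assumes the standard case-sensitive Levenshtein distance, in which insertions, deletions, and substitutions each cost 1; the function name `edit_distance` is chosen here for illustration.

```python
def edit_distance(a, b):
    """Case-sensitive Levenshtein distance between strings a and b."""
    # prev[j] holds the distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        curr = [i]  # turning a[:i] into the empty string takes i deletions
        for j, char_b in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,                        # delete char_a
                curr[j - 1] + 1,                    # insert char_b
                prev[j - 1] + (char_a != char_b),   # substitute or match
            ))
        prev = curr
    return prev[-1]


for name in ["Stephen", "seven", "Steuben"]:
    print(name, edit_distance(name, "Steven"))
# Stephen 2
# seven 2
# Steuben 2
```

All three strings are equally close to Steven under this measure, although only one of them is a person name.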

While early coarse-grained and fine-grained entity typing systems (McCallum and Li, 2003; X. Ling and Weld, 2012) employed manually defined surface features like the ones described above, more recent neural approaches overcome some of the limits of symbolic representation via a combination of neural networks and automatically learned distributional word representations.

104 Chapter 5. Multilingual Entity Typing with Subword Units

[Figure: image from Shimaoka et al. (2017), "Neural Architectures for Fine-grained Entity Type Classification", Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 1, Long Papers, pages 1271–1280, Valencia, Spain, April 3-7, 2017. The image includes that paper's Figure 1, an illustration of the attentive encoder neural model predicting fine-grained semantic types for the mention "New Zealand" in the expression "a match series against New Zealand is held on Monday". Component labels: Word Embeddings, LSTM Layers, Attentions, Context Representation, Mention Representation, Output. Predicted types: /organization, /organization/sports_team.]

FIGURE 5.2: A typical neural entity typing architecture. See text for details. Image source: Shimaoka et al. (2017).

In document Aspects of Coherence for Entity Analysis (Page 116-118)