4.5 Do Selectional Preferences Improve Coreference Resolution?
5.1.2 Feature-based Entity Typing
A typical supervised entity typing system extracts features for the given mention m and predicts entity types t̂ based on these features. During training, the system then receives the true entity types t and uses this supervision signal to adjust its parameters, with the aim of making better predictions in the future.
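The predict-then-adjust loop described above can be illustrated with a minimal sketch. It assumes a multi-label perceptron with one weight vector per entity type, which is only one of many possible learners and is not claimed to be the model used by any system cited in this chapter. The class name PerceptronTyper and the feature strings are hypothetical.

```python
from collections import defaultdict


class PerceptronTyper:
    """Illustrative multi-label perceptron: one weight vector per entity type."""

    def __init__(self, types):
        self.types = list(types)
        # weights[entity_type][feature] -> float, implicitly 0.0 for unseen features
        self.weights = {t: defaultdict(float) for t in self.types}

    def score(self, features, entity_type):
        w = self.weights[entity_type]
        return sum(w.get(f, 0.0) for f in features)

    def predict(self, features):
        """Return the predicted type set (t-hat): all types with a positive score."""
        return {t for t in self.types if self.score(features, t) > 0}

    def update(self, features, gold_types):
        """Compare the prediction with the true types t and adjust the weights."""
        predicted = self.predict(features)
        for t in self.types:
            if t in gold_types and t not in predicted:
                delta = 1.0   # missed a true type: raise its score
            elif t not in gold_types and t in predicted:
                delta = -1.0  # predicted a wrong type: lower its score
            else:
                continue
            for f in features:
                self.weights[t][f] += delta
        return predicted


# Toy training data: (features of mention m, true types t)
typer = PerceptronTyper(["person", "location"])
training_data = [
    ({"suffix=ford"}, {"location"}),
    ({"text=Mary"}, {"person"}),
]
for features, gold in training_data:
    typer.update(features, gold)
```

After a single pass over this toy data, the sketch assigns the location type to mentions carrying the feature suffix=ford and the person type to mentions carrying text=Mary; mentions with only unseen features receive no type.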
Features are designed to automatically capture as much relevant information about the given entity mention as possible. Entity typing features fall into three main categories:
1. Mention surface features are based on the observation that named entity mentions exhibit certain regularities. For example, Mary often refers to a person, Western person names often consist of two capitalized words, place names in England have suffixes like -ford and -shire, and the suffix -osis is common in the names of medical conditions and biochemical processes. To exploit these regularities, systems extract surface features from the mention, such as the mention text itself, its first few characters, its last few characters, whether it is capitalized, or its length.
2. Mention context features exploit coherence between entity mentions and their surrounding context. For example, consider the following sentence:
(96) After a fierce campaign, X won the election by only one vote.
Here, phrases from the political domain, namely “campaign”, “win the election”, and “vote”, appear in the context of the entity mention X and suggest that X is a politician.
3. World knowledge features make use of resources beyond the text itself. A simple way of incorporating world knowledge is the use of name lists and gazetteers, which allow checking, for example, whether a mention contains a common first name (e.g. Mary), place name (London), company type (Inc., Ltd.), or title (Dr., Esq.).
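The three feature categories above can be sketched as simple extraction functions. This is a minimal illustration under stated assumptions, not the feature set of any particular system: the function names, the feature string format, the affix length, the context window size, and the tiny gazetteers are all hypothetical, and mention length is measured in whitespace-separated tokens.

```python
# Hypothetical toy gazetteers; real systems would load much larger lists.
FIRST_NAMES = {"Mary", "Stephen", "Steven"}
PLACE_NAMES = {"London"}
COMPANY_TYPES = {"Inc.", "Ltd."}
TITLES = {"Dr.", "Esq."}


def surface_features(mention, affix_len=3):
    """Category 1: features of the mention string itself."""
    tokens = mention.split()
    return {
        f"text={mention}",
        f"prefix={mention[:affix_len]}",
        f"suffix={mention[-affix_len:]}",
        f"capitalized={all(tok[:1].isupper() for tok in tokens)}",
        f"length={len(tokens)}",
    }


def context_features(tokens, start, end, window=3):
    """Category 2: lowercased words within `window` tokens of the span [start, end)."""
    left = tokens[max(0, start - window):start]
    right = tokens[end:end + window]
    return {f"ctx={word.lower()}" for word in left + right}


def gazetteer_features(mention):
    """Category 3: membership of mention tokens in external name lists."""
    tokens = mention.split()
    lists = {
        "first_name": FIRST_NAMES,
        "place_name": PLACE_NAMES,
        "company_type": COMPANY_TYPES,
        "title": TITLES,
    }
    return {
        f"gaz={name}"
        for name, entries in lists.items()
        if any(tok in entries for tok in tokens)
    }


# Example (96): the mention X occupies the token span [5, 6).
sentence = "After a fierce campaign , X won the election by only one vote .".split()
ctx = context_features(sentence, start=5, end=6, window=8)
```

With a window of 8 tokens, the context features for X in example (96) include ctx=campaign, ctx=election, and ctx=vote, the political-domain cues mentioned above. Calling surface_features("Oxford", affix_len=4) yields the suffix feature suffix=ford.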
Having investigated the use of world knowledge in chapter 3 and the use of context in chapter 4, in this chapter we turn our focus to mention surface features.
The mention surface features described above are intuitive and interpretable, but suffer from the drawback of symbolic representation: knowing that Stephen is a person does not help decide whether Steven is a person as well, since these names are treated as discrete, opaque objects without inherent similarities. While it is possible to define similarity measures such as string edit distance, these measures are not robust. For example, Stephen and Steven have a small string edit distance of 2 and would correctly be judged similar, but by the same criterion the number seven and the city name Steuben, which also have a string edit distance of 2 to Steven, would be classified as persons as well.
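The edit distances in this example can be checked with a short sketch. It assumes the standard Levenshtein distance (unit cost for insertion, deletion, and substitution) and a case-sensitive comparison, under which seven differs from Steven by one substitution and one insertion. The function name edit_distance is chosen for illustration.

```python
def edit_distance(a, b):
    """Levenshtein distance between strings a and b (case-sensitive)."""
    # prev[j] holds the distance between the current prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        curr = [i]
        for j, char_b in enumerate(b, start=1):
            substitution_cost = 0 if char_a == char_b else 1
            curr.append(min(
                prev[j] + 1,                      # delete char_a
                curr[j - 1] + 1,                  # insert char_b
                prev[j - 1] + substitution_cost,  # substitute or match
            ))
        prev = curr
    return prev[-1]


for name in ["Stephen", "seven", "Steuben"]:
    print(name, edit_distance(name, "Steven"))
```

All three strings are at distance 2 from Steven, so a distance threshold that accepts Stephen as similar also accepts seven and Steuben.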
While early coarse-grained and fine-grained entity typing systems (McCallum and Li, 2003; X. Ling and Weld, 2012) employed manually defined surface features like the ones described above, more recent neural approaches overcome some of the limits of symbolic representation via a combination of neural networks and automatically learned distributional word representations.
[Figure image: first page of Shimaoka et al. (2017), "Neural Architectures for Fine-grained Entity Type Classification", Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 1, Long Papers, pages 1271–1280, Valencia, Spain, April 3-7, 2017. The diagram on that page shows an attentive encoder model with the components Word Embeddings, LSTM Layers, Attentions, Context Representation, Mention Representation, and Output, predicting the types /organization and /organization/sports_team for the mention “New Zealand” in “a match series against New Zealand is held on Monday”.]
FIGURE 5.2: A typical neural entity typing architecture. See text for details. Image source: Shimaoka et al. (2017).