
2.4 Coreference resolution approaches

2.4.1 Classification models

This section introduces the three state-of-the-art classification models: mention-pair, ranker, and entity-mention. Each model is explained in detail, together with its strengths and weaknesses.

Mention-pairs

Classifiers based on the mention-pair model determine whether two mentions corefer or not. To do so, a feature vector is generated for a pair of mentions using, for instance, the features listed in Section 2.3.3. Given these features as input, the classifier returns a class: CO (coreferent), or NC (not coreferent). In many cases, the classifier also returns a confidence value about the decision taken. The class and the confidence value of each evaluated pair of mentions will be taken into account by the linking process to obtain the final result.
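The pairwise decision described above can be illustrated with a minimal sketch. The features, the weights, and the logistic scoring are hypothetical choices made for this example; they are not the feature set of Section 2.3.3 nor the classifier of any cited system.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Mention:
    text: str
    gender: str  # "female", "male", or "unknown"
    number: str  # "singular" or "plural"


def agreement(a, b):
    """Return 1.0 if the values agree, -1.0 if they clash, 0.0 if unknown."""
    if "unknown" in (a, b):
        return 0.0
    return 1.0 if a == b else -1.0


def pair_features(antecedent, anaphor):
    """Build a small, illustrative feature vector for a pair of mentions."""
    last_a = antecedent.text.lower().split()[-1]
    last_b = anaphor.text.lower().split()[-1]
    return [
        1.0 if last_a == last_b else 0.0,               # last-token match
        agreement(antecedent.gender, anaphor.gender),   # gender agreement
        agreement(antecedent.number, anaphor.number),   # number agreement
    ]


# Hypothetical hand-set parameters; a real system would learn them from data.
WEIGHTS = [2.0, 1.5, 1.0]
BIAS = -0.5


def classify_pair(antecedent, anaphor):
    """Return (label, confidence), where label is "CO" or "NC"."""
    features = pair_features(antecedent, anaphor)
    score = BIAS + sum(w * x for w, x in zip(WEIGHTS, features))
    p_co = 1.0 / (1.0 + math.exp(-score))
    label = "CO" if p_co >= 0.5 else "NC"
    confidence = p_co if label == "CO" else 1.0 - p_co
    return label, confidence


alice = Mention("Alice Smith", "female", "singular")
a_smith = Mention("A. Smith", "unknown", "singular")
he = Mention("he", "male", "singular")

print(classify_pair(alice, a_smith))  # label and confidence for the pair
print(classify_pair(alice, he))
```

The label and the confidence returned for each pair are the two values that the linking process would consume.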

Many systems based on the mention-pair model use decision trees for classification (Aone and Bennett, 1995; McCarthy and Lehnert, 1995; Soon et al., 2001), but many other binary classifiers can be found in state-of-the-art systems. Such classifiers include RIPPER (Ng and Cardie, 2002b), maximum entropy (Nicolae and Nicolae, 2006; Denis and Baldridge, 2007; Ji et al., 2005), TiMBL (Klenner and Ailloud, 2008), perceptrons (Bengtson and Roth, 2008), and support vector machines (Yang et al., 2006).

Figure 2.14: A pairwise classifier does not have enough information to classify pairs (“A. Smith,” “he”) and (“A. Smith,” “she”).

Figure 2.15: Green edges mean that both mentions corefer, and red edges mean the opposite. An independent classification of (“A. Smith,” “he”) and (“A. Smith,” “she”) produces contradictions.

The mention-pair model has two main weaknesses: a lack of contextual information and contradictions in classifications. Figure 2.14 shows an example of lack of information. The figure is a representation of a document with four mentions (“Alice Smith,” “A. Smith,” “he,” “she”). The edges between mentions represent the classification in a mention-pair model: green means that the classifier returns the CO class, and red (also marked with an X) means that it returns the NC class. In this case, the lack of information is due to the impossibility of determining the gender of “A. Smith.” Next, Figure 2.15 shows a possible scenario with contradictions. In this scenario, the classifier has determined that the pairs (“A. Smith,” “he”) and (“A. Smith,” “she”) corefer, which causes contradictions when generating the final coreference chains, given that the pairs (“Alice Smith,” “he”) and (“he,” “she”) do not corefer.
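The contradiction described above can be detected mechanically: if the CO decisions are closed under transitivity, any NC pair that ends up inside the same chain is inconsistent. The following is a minimal sketch using a union-find structure. The function name `find_contradictions` is hypothetical, and the label for (“Alice Smith,” “she”) is an assumption, since the text does not state it.

```python
def find_contradictions(mentions, decisions):
    """Return the NC pairs that fall inside a chain implied by the CO pairs.

    `decisions` maps a (mention, mention) tuple to "CO" or "NC".
    """
    parent = {m: m for m in mentions}

    def find(m):
        # Follow parent links to the chain representative (with path halving).
        while parent[m] != m:
            parent[m] = parent[parent[m]]
            m = parent[m]
        return m

    # Merge the chains of every pair classified as coreferent.
    for (a, b), label in decisions.items():
        if label == "CO":
            parent[find(a)] = find(b)

    # An NC pair whose two mentions share a chain contradicts the CO links.
    return sorted(
        pair
        for pair, label in decisions.items()
        if label == "NC" and find(pair[0]) == find(pair[1])
    )


mentions = ["Alice Smith", "A. Smith", "he", "she"]
decisions = {
    ("Alice Smith", "A. Smith"): "CO",
    ("A. Smith", "he"): "CO",
    ("A. Smith", "she"): "CO",
    ("Alice Smith", "he"): "NC",
    ("he", "she"): "NC",
    ("Alice Smith", "she"): "CO",  # assumed label, not given in the text
}

print(find_contradictions(mentions, decisions))
# [('Alice Smith', 'he'), ('he', 'she')]
```

Detecting the inconsistency is only half of the problem; deciding which of the conflicting decisions to discard is left to the linking process.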

Rankers

The ranker model overcomes the lack of contextual information found in the mention-pair model. Instead of directly considering whether mi and mj corefer, more perspective can be achieved by looking for the best candidate from a group of mentions to corefer with an active mention.

The first approach towards the ranker model was the twin-candidate model proposed by Yang et al. (2003) (motivated by Connolly et al. (1997)). The model formulates the problem as a competition between two candidates to be the antecedent of the active mention. Suppose that mk is the active mention. The classifier must determine which of the candidates mi and mj would make the best antecedent. So, in this case, the output classes are not CO and NC but 1 or 2, which indicates the preferred mention between mi and mj to corefer with mk. The linking process may use this information to avoid errors and

An extension of the twin-candidate model is to consider all the candidates at once and rank them in order to find the best one (Denis and Baldridge, 2008). This method can obtain more accurate results than the twin-candidate model because it provides a more appropriate context, in which all the candidate mentions are considered at the same time.
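The two formulations can be contrasted with a short sketch. The `compatibility` score is a hypothetical stand-in for a learned model, and the round-robin tournament is only one possible way for a linking process to combine the 1/2 decisions; neither is claimed to reproduce the cited systems.

```python
from itertools import combinations


def compatibility(candidate, active):
    """Hypothetical score: higher means a better antecedent for `active`."""
    score = 0.0
    if "unknown" in (candidate["gender"], active["gender"]):
        score += 0.5  # gender cannot be checked
    elif candidate["gender"] == active["gender"]:
        score += 2.0  # genders agree
    # Slightly prefer candidates that are closer to the active mention.
    score -= 0.1 * (active["position"] - candidate["position"])
    return score


def twin_candidate(m_i, m_j, m_k):
    """Return 1 if m_i is the preferred antecedent of m_k, otherwise 2."""
    return 1 if compatibility(m_i, m_k) >= compatibility(m_j, m_k) else 2


def resolve_by_tournament(candidates, active):
    """Run every pairwise competition and return the candidate with most wins."""
    wins = [0] * len(candidates)
    for i, j in combinations(range(len(candidates)), 2):
        preferred = twin_candidate(candidates[i], candidates[j], active)
        wins[i if preferred == 1 else j] += 1
    best = max(range(len(candidates)), key=wins.__getitem__)
    return candidates[best]


def resolve_by_ranking(candidates, active):
    """Score all candidates at once and return the top-ranked one."""
    return max(candidates, key=lambda c: compatibility(c, active))


candidates = [
    {"text": "Alice Smith", "gender": "female", "position": 0},
    {"text": "A. Smith", "gender": "unknown", "position": 1},
    {"text": "he", "gender": "male", "position": 2},
]
she = {"text": "she", "gender": "female", "position": 3}

print(resolve_by_tournament(candidates, she)["text"])  # Alice Smith
print(resolve_by_ranking(candidates, she)["text"])     # Alice Smith
```

Because this toy score is a single number per candidate, both strategies agree here. A learned twin-candidate classifier gives no such guarantee, since its pairwise preferences need not be transitive.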

Ranker models are strongly linked with backward search approaches: they select the best possible candidate to corefer with an active mention. Note, however, that a candidate is always selected, so rankers cannot determine whether an active mention starts a new chain, i.e., has no antecedents. Consequently, using these models may require a previous process that classifies mentions as anaphoric (having antecedents) or not anaphoric (the first mention of a new entity). This process is usually called an anaphoric filter, and is described in detail in Section 2.4.2.
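The interaction between the filter and the ranker can be sketched as follows. Both `toy_is_anaphoric` and `toy_score` are hypothetical placeholders (a pronoun check and a distance preference); the actual filters are the subject of Section 2.4.2.

```python
PRONOUNS = {"he", "she", "it", "they"}


def toy_is_anaphoric(mention):
    """Hypothetical filter: only pronouns are treated as anaphoric."""
    return mention[1].lower() in PRONOUNS


def toy_score(candidate, active):
    """Hypothetical ranker: prefer the closest preceding candidate."""
    return -(active[0] - candidate[0])


def resolve(active, candidates, is_anaphoric, score):
    """Return the chosen antecedent, or None if the mention starts a new chain."""
    # The filter runs first, because the ranker alone would always pick someone.
    if not candidates or not is_anaphoric(active):
        return None
    return max(candidates, key=lambda c: score(c, active))


# Mentions are (position, text) tuples in document order.
print(resolve((1, "she"), [(0, "Alice Smith")], toy_is_anaphoric, toy_score))
# (0, 'Alice Smith')
print(resolve((2, "the report"), [(0, "Alice Smith"), (1, "she")],
              toy_is_anaphoric, toy_score))
# None
```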

Entity-mention

We have so far described classification models based on mentions. Even in pairwise or groupwise classifiers (i.e., mention-pair and ranker models), the active mention is always evaluated with the other mentions in the document. The main difference is the number of mentions involved in each classification. In this section, the model changes towards the concept of entity.

A partial entity is a set of mentions considered coreferent during resolution. The entity-mention model classifies a partial entity and a mention, or two partial entities, as coreferent or not. In some models, a partial entity even has its own properties or features defined in the model in order to be compared with the mentions. Due to the information that a partial entity gives to the classifier, in most cases this model overcomes the lack of information and contradiction problems of the mention-based models. For example, a partial entity may include the mentions “Alice Smith” and “A. Smith,” whose genders are “female” and “unknown” respectively. In this case, the partial entity is more likely to be linked with the subsequent mention “she” than with “he” (Figures 2.14 and 2.15).

Many approaches use the entity-mention model combined with different linking processes (Yang et al., 2008; Luo et al., 2004; Lee et al., 2011; Yang et al., 2004; McCallum and Wellner, 2005). The features used for entity-mention models are almost the same as those used for mention-based models. The only difference is that the value of an entity feature is determined by considering the particular values of the mentions belonging to it.
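As an illustration of how an entity feature can be derived from the values of its mentions, the sketch below aggregates gender over a partial entity and uses it to test compatibility with a new mention. The aggregation rule and the names `entity_gender` and `is_compatible` are assumptions made for this example, not the definition used by any cited system.

```python
def entity_gender(entity_mentions):
    """Combine the mention-level gender values into one entity-level value."""
    known = {m["gender"] for m in entity_mentions} - {"unknown"}
    if not known:
        return "unknown"          # no mention provides gender information
    if len(known) == 1:
        return next(iter(known))  # all informative mentions agree
    return "conflict"             # the mentions disagree


def is_compatible(entity_mentions, mention):
    """Check whether a mention's gender is compatible with the partial entity."""
    gender = entity_gender(entity_mentions)
    if gender == "conflict":
        return False
    if "unknown" in (gender, mention["gender"]):
        return True
    return gender == mention["gender"]


partial_entity = [
    {"text": "Alice Smith", "gender": "female"},
    {"text": "A. Smith", "gender": "unknown"},
]
she = {"text": "she", "gender": "female"}
he = {"text": "he", "gender": "male"}

print(entity_gender(partial_entity))       # female
print(is_compatible(partial_entity, she))  # True
print(is_compatible(partial_entity, he))   # False
```

Evaluated on its own, “A. Smith” is compatible with both pronouns; evaluated as part of the partial entity, it inherits the gender contributed by “Alice Smith,” which matches the example of Figures 2.14 and 2.15.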
