
Chapter 6 Maximum Entropy-based Answer Extraction Model

6.3 Features

Surface Features. We incorporate four types of surface features into the Maximum Entropy model.

• Expected Answer Type Matching Features: If the semantic category of the answer candidate agrees with the expected answer type (EAT) of the question, the EAT feature fires. The identification of the question's expected answer type was discussed in Section 3.1.2, and the recognition of the answer candidate's semantic category was discussed in Section 3.4.

• Orthographic Features: These capture the surface format of the answer candidate, such as capitalization, digits and length. They let the model judge what a proper answer looks like from the word format alone, since the semantic category of an answer candidate will not always be recognized correctly.

• POS Features: For certain question types, if the words in the answer candidate belong to a certain POS type, a POS feature fires. These features also serve as a backup when semantic category recognition fails.

• Surface Pattern Matching: Because some questions are asked very frequently in TREC, we build question patterns that map high-frequency questions to classes, and extract answers for these question classes using answer patterns. Surface pattern matching is discussed in Section 3.1.4. Once question and answer pattern matching succeeds, the surface pattern matching feature fires.

Table 6.2 lists some examples of surface features. All of them are binary features. In addition, many other features, such as the answer candidate frequency, can be extracted from the Sentence Retrieval output and are regarded as indicative evidence for Answer Extraction (Ittycheriah and Roukos, 2002). However, in this thesis we focus on the Answer Extraction module independently, so we do not incorporate such features in the current model.

Table 6.2: Surface Features ("ac" denotes the answer candidate)

| Features | Examples | Explanation |
|---|---|---|
| EAT matching | EAT DAT | ac type matches the EAT (DATE) of question |
| | EAT PERSON | ac type matches the EAT (PERSON) of question |
| | EAT DISTANCE | ac type matches the EAT (DISTANCE) of question |
| Orthographic | SSEQ Q | ac is a subsequence of question |
| | CAP EAT LOC | ac is capitalized and the EAT is LOCATION |
| | LNGlt3 EAT PER | ac length is less than 3 and the EAT is PERSON |
| POS | CD EAT NUM | syn. tag of ac is CD and the EAT is NUMBER |
| | NNP EAT PER | syn. tag of ac is NNP and the EAT is PERSON |
| Surface Pattern Matching | SUR PTN | question and answer pattern matching succeeds |
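As a rough illustration of how binary surface features of this kind might fire, the sketch below builds a feature set for one answer candidate. It is not the thesis implementation: the `Candidate` class, the underscore-joined feature names, the use of the full EAT label in each name, the substring test for SSEQ, and the token-based length test are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    text: str       # answer candidate (ac) string
    sem_type: str   # recognized semantic category, e.g. "PERSON" (assumed input)
    pos_tags: list  # POS tag of each token in the candidate (assumed input)


def surface_features(cand, question, eat, pattern_matched=False):
    """Return the set of binary surface features that fire for one candidate."""
    fired = set()
    tokens = cand.text.split()

    # EAT matching: candidate category agrees with the expected answer type.
    if cand.sem_type == eat:
        fired.add(f"EAT_{eat}")

    # Orthographic features (assumed tests), conjoined with the EAT.
    if cand.text.lower() in question.lower():
        fired.add("SSEQ_Q")
    if tokens and all(t[:1].isupper() for t in tokens):
        fired.add(f"CAP_EAT_{eat}")
    if len(tokens) < 3:  # assumption: length is counted in tokens
        fired.add(f"LNGlt3_EAT_{eat}")

    # POS features: one feature per distinct tag, conjoined with the EAT.
    for tag in set(cand.pos_tags):
        fired.add(f"{tag}_EAT_{eat}")

    # Surface pattern matching: fires when question and answer patterns match.
    if pattern_matched:
        fired.add("SUR_PTN")
    return fired
```

A feature that is absent from the returned set simply has value 0 for that candidate, which matches the binary nature of the features in Table 6.2.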

Dependency Relation Features. The extraction of dependency relation information is discussed in Chapter 4 and consists of dependency relation pattern matching and dependency relation correlation. Both are based on comparing the dependency relations of the question with those of the answer sentence.

Dependency relation pattern matching was discussed in Section 4.3. We extract question patterns and answer patterns from the training data. These patterns are then used to extract answers for an unseen question. We first match the unseen question against the question patterns. Once a matching question pattern is found, the answer patterns evoked by that question pattern are matched in turn to pinpoint proper answers. A string kernel, which calculates the similarity between two sequences, is used to allow tolerant answer pattern matching instead of exact matching. The feature value is set to the answer pattern matching score. The experiments (Section 7.3.2) will evaluate the coverage of the pattern sets and the performance of the two pattern matching methods.
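To make the idea of tolerant matching concrete, the following sketch scores two relation sequences with a cosine-normalized p-spectrum kernel. This is a stand-in chosen for brevity: the kernel actually defined in Section 4.3 is not reproduced here, and the function names and the relation labels in the usage note are hypothetical.

```python
from collections import Counter
from math import sqrt


def ngrams(seq, p):
    """Count the contiguous length-p subsequences of a relation sequence."""
    return Counter(tuple(seq[i:i + p]) for i in range(len(seq) - p + 1))


def spectrum_kernel(a, b, p=2):
    """Unnormalized p-spectrum kernel: dot product of n-gram count vectors."""
    ca, cb = ngrams(a, p), ngrams(b, p)
    return sum(count * cb[gram] for gram, count in ca.items())


def pattern_match_score(a, b, p=2):
    """Cosine-normalized kernel value in [0, 1]; 0.0 if either spectrum is empty."""
    denom = sqrt(spectrum_kernel(a, a, p) * spectrum_kernel(b, b, p))
    return spectrum_kernel(a, b, p) / denom if denom else 0.0
```

With this stand-in, `pattern_match_score(["subj", "obj", "mod"], ["subj", "obj", "pcomp"])` gives a partial score where exact matching would reject the pair outright, which is the behaviour the soft matching is meant to provide.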

Q: What party led Australia from 1983 to 1996?
Target: party
Topic: Australia
Constraint: 1983; 1996
Verb: lead

Figure 6.1: Examples of question phrase types

Dependency relation correlation was discussed in Section 4.4. Dependency relation paths in the question are first paired with paths in the answer sentence according to the mapping of question key words. A dynamic time warping algorithm is then applied to align the paired relation paths and calculate their correlation. The correlation of two paths relies on the correlations of individual relations, which are statistically estimated from training data. Finally, the correlation score is used as the feature value. Two factors are considered to affect relation path comparison: question phrase type and path length. For each question, we divide the question phrases into four types: target, topic, constraint and verb. Figure 6.1 shows an example of each question phrase type.

• Target is a word that indicates the expected answer type of a question, such as "party" in "What party led Australia from 1983 to 1996?".

• Topic is the event or person that a question is about, such as the word "Australia" in the above example question. Intuitively, it is the most important phrase of a question.

• Constraint is any question phrase other than the topic, such as "1983" and "1996".

• Verb is the main verb of a question, such as "lead".
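The alignment of a paired question path and answer path by dynamic time warping, described before the list of phrase types, can be sketched as follows. This is a minimal similarity-maximizing variant under stated assumptions: the `corr` table of relation-pair correlations is a hypothetical input standing in for the statistics estimated in Section 4.4, identical relations are assumed to score 1, and the returned score is not yet normalized by path length.

```python
def relation_sim(r1, r2, corr):
    """Look up the correlation of two relations; identical relations score 1."""
    if r1 == r2:
        return 1.0
    return corr.get((r1, r2), corr.get((r2, r1), 0.0))


def path_correlation(q_path, a_path, corr):
    """Align two relation paths with DTW and return the best cumulative score."""
    n, m = len(q_path), len(a_path)
    if n == 0 or m == 0:
        return 0.0
    neg_inf = float("-inf")
    # best[i][j]: best score aligning the first i question relations
    # with the first j answer relations.
    best = [[neg_inf] * (m + 1) for _ in range(n + 1)]
    best[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = relation_sim(q_path[i - 1], a_path[j - 1], corr)
            # Standard DTW moves: match both, or stretch one of the paths.
            prev = max(best[i - 1][j - 1], best[i - 1][j], best[i][j - 1])
            best[i][j] = prev + step
    return best[n][m]
```

Because every relation on each path must be aligned to at least one relation on the other, an extra relation on the answer path lowers the score only if it correlates poorly with its neighbours on the question path.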

Furthermore, since a shorter path indicates a closer relation between two phrases, we discount the path matching score by dividing it by the question path length. Lastly, we sum the discounted path matching scores for each type of question phrase and fire the sum as a feature, such as

• Target Ptn=p, where "p" is the pattern matching score for question target words.

• Topic Cor=c, where "c" is the path correlation value for question topic words.

In total, 8 dependency relation features fire for each answer candidate: Target Ptn, Topic Ptn, Constraint Ptn, Verb Ptn, Target Cor, Topic Cor, Constraint Cor and Verb Cor.
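The discounting and summation that produce these 8 features can be sketched as below. The input layout (a list of dictionaries with `type`, `length`, `ptn` and `cor` keys) and the underscore-joined feature names are assumptions made for the example, as is applying the same length discount to both the pattern matching and the correlation scores.

```python
PHRASE_TYPES = ("Target", "Topic", "Constraint", "Verb")


def dependency_features(paths):
    """Sum length-discounted scores per question phrase type.

    `paths` is a list of dicts with keys 'type' (one of PHRASE_TYPES),
    'length' (question path length, at least 1), and the raw scores
    'ptn' (pattern matching) and 'cor' (path correlation).
    """
    feats = {f"{t}_{kind}": 0.0 for t in PHRASE_TYPES for kind in ("Ptn", "Cor")}
    for p in paths:
        # Shorter question paths indicate closer relations, so divide by length.
        feats[f"{p['type']}_Ptn"] += p["ptn"] / p["length"]
        feats[f"{p['type']}_Cor"] += p["cor"] / p["length"]
    return feats
```

Phrase types with no matched path keep the value 0.0, so every answer candidate always receives the same 8 real-valued features.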

Semantic Structure Matching Features. FrameNet-style semantic role information was discussed in Chapter 5. We present an automatic method for semantic role assignment based on comparing the dependency relation paths attested in FrameNet annotations with those found in raw text. We formalize the search for an optimal role assignment as an optimization problem in a bipartite graph. This formalization allows us to find an exact, globally optimal solution. In addition, soft labeling is enabled in the optimization, which goes some way towards addressing the coverage problem of FrameNet. Finally, semantic structure matching is formulated as a graph matching problem, and the matching score is used as the value of the semantic structure matching feature.
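The notion of a globally optimal role assignment in a bipartite graph can be illustrated with a small sketch. It uses exhaustive search over one-to-one assignments purely for clarity; the optimization method of Chapter 5 is not reproduced here, soft labeling is omitted, and the `scores` matrix of argument-role compatibilities is a hypothetical input.

```python
from itertools import permutations


def best_role_assignment(scores):
    """Exhaustively find the one-to-one assignment with the highest total score.

    `scores[i][j]` is the compatibility of argument i with role j. The matrix
    must be non-empty and have at least as many roles as arguments.
    """
    n_args, n_roles = len(scores), len(scores[0])
    best_total, best_perm = float("-inf"), None
    # Each permutation assigns a distinct role index to every argument.
    for perm in permutations(range(n_roles), n_args):
        total = sum(scores[i][perm[i]] for i in range(n_args))
        if total > best_total:
            best_total, best_perm = total, perm
    return best_perm, best_total
```

For `scores = [[0.5, 0.25, 0.0], [1.0, 0.5, 0.25]]`, assigning each argument its best remaining role in order would give argument 0 the first role, whereas the global optimum gives that role to argument 1. This is the kind of case where an exact solution differs from a greedy one. Exhaustive search is only feasible for very small graphs.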

Finally, the ME-based ranking model incorporates the surface features, dependency relation features and semantic structure matching features to rank answer candidates.
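A minimal sketch of such a ranking step is given below, assuming a two-class conditional Maximum Entropy model (correct versus incorrect) in which the features of the incorrect class are zero, so that the probability reduces to a logistic function of the weighted feature sum. The function names, feature names and weights are illustrative only; the trained weights and the exact model formulation of the thesis are not reproduced here.

```python
from math import exp


def me_probability(features, weights):
    """P(correct | candidate) under an assumed two-class conditional ME model.

    `features` maps feature name -> value (1.0 for a binary feature that
    fires, a real-valued score otherwise); `weights` maps name -> lambda.
    """
    z = sum(weights.get(name, 0.0) * value for name, value in features.items())
    return 1.0 / (1.0 + exp(-z))


def rank_candidates(candidates, weights):
    """Sort (text, features) pairs by descending model probability."""
    scored = [(text, me_probability(feats, weights)) for text, feats in candidates]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

Binary surface features and real-valued dependency or semantic matching scores enter the same weighted sum, which is what lets one model combine all three feature groups.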

