Implicit Causal Association (ICA) - Extraction of Background Knowledge

Chapter 3 Knowledge Acquisition for Verb-Verb Pairs

3.2 Extraction of Background Knowledge

3.2.2 Implicit Causal Association (ICA)

In this section, we propose a metric, Implicit Causal Association (ICA), to address the training data sparseness discussed in the previous section. The metric relies on functions that identify the roles of events in a causal relation. We first briefly describe these roles and then define ICA.

• Roles of Events in a Causal Relation: Each of the two events in a causal relation can be assigned either the cause role or the effect role. In example (3) from Section 3.1.1, the verb appearing after “because” represents the cause event and the verb appearing before “because” represents the effect event. These roles are shown below:

5. A Michigan woman lost custody of her young daughter because she placed the child in day care while attending college classes. ($e_{place}$, $R_C$)

6. A Michigan woman lost custody of her young daughter because she placed the child in day care while attending college classes. ($e_{lose}$, $R_E$)

The notations $R_C$ and $R_E$ represent the cause and effect roles, respectively. Table 3.3 shows the assignment of roles to the events connected by the unambiguous discourse markers. We used these discourse markers to generate the $Explicit_{e_{v_i}\text{-}e_{v_j}}$ training corpus.

We use core features of events to determine the likelihood of their roles in causation. These features include the lemmas, part-of-speech tags, and all WordNet senses of both verbs and their arguments (i.e., subject and object). Next, we use these features to handle training data sparseness.

Discourse Marker           Roles Information
Because                    ($e_{v_{before}}$, $R_E$), ($e_{v_{after}}$, $R_C$)
For this (that) reason     ($e_{v_{before}}$, $R_C$), ($e_{v_{after}}$, $R_E$)
Consequently               ($e_{v_{before}}$, $R_C$), ($e_{v_{after}}$, $R_E$)
As a consequence of        ($e_{v_{before}}$, $R_E$), ($e_{v_{after}}$, $R_C$)
As a result of             ($e_{v_{before}}$, $R_E$), ($e_{v_{after}}$, $R_C$)

Table 3.3: A list of causal discourse markers and the assignment of roles to the events of causal relations signaled by these markers. The event $e_{v_{before}}$ ($e_{v_{after}}$) is represented by the verb appearing before (after) the causal discourse marker in text.
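The role assignments in Table 3.3 amount to a lookup from discourse marker to a pair of roles. The following is a minimal sketch of that lookup, not the authors' implementation; the names `MARKER_ROLES` and `assign_roles` are hypothetical, and the sketch assumes the two verbs around the marker have already been identified.

```python
# Hypothetical encoding of Table 3.3; all names here are illustrative.
CAUSE, EFFECT = "R_C", "R_E"

# marker -> (role of the event before the marker, role of the event after it)
MARKER_ROLES = {
    "because": (EFFECT, CAUSE),
    "for this reason": (CAUSE, EFFECT),
    "for that reason": (CAUSE, EFFECT),
    "consequently": (CAUSE, EFFECT),
    "as a consequence of": (EFFECT, CAUSE),
    "as a result of": (EFFECT, CAUSE),
}


def assign_roles(verb_before, marker, verb_after):
    """Return [(verb, role), (verb, role)] for the verbs around a causal marker."""
    role_before, role_after = MARKER_ROLES[marker.lower()]
    return [(verb_before, role_before), (verb_after, role_after)]
```

For the sentence in examples (5) and (6), `assign_roles("lose", "because", "place")` labels "lose" as the effect event and "place" as the cause event.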

• Handling of Training Data Sparsity: To deal with the problem of training data sparsity, we define the metric ICA as follows:

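$$\mathrm{ICA}(v_i\text{-}v_j) = \frac{1}{|VP|} \sum_{I_{v_i\text{-}v_j} \in VP} \bigl( \mathrm{CD}(v_i\text{-}v_j) \times \mathrm{CI} \times \mathrm{ERM}_{e_{v_i}\text{-}e_{v_j}} \bigr) \tag{3.11}$$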

where CD and CI are defined earlier and ERM determines the likelihood of the roles of events in a causal relation. We remind the reader that CD is the unsupervised causal dependency of a verb-verb pair and CI is the tendency of instance I of a verb-verb pair to belong to the cause class rather than the non-cause class, computed using the full set of features from Section 3.1.2.

Events Roles Matching ($\mathrm{ERM}_{e_{v_i}\text{-}e_{v_j}}$) (equations 3.12 and 3.13) is the negative log-likelihood of events $e_{v_i}$ and $e_{v_j}$ appearing in the cause or effect role, determined using the causal training instances of the $Explicit_{e_{v_i}\text{-}e_{v_j}}$ corpus and the core features of events discussed above.

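$$\mathrm{ERM}_{e_{v_i}\text{-}e_{v_j}} = -1.0 \times \max\bigl( S(e_{v_i}, R_C) + S(e_{v_j}, R_E),\; S(e_{v_i}, R_E) + S(e_{v_j}, R_C) \bigr) \tag{3.12}$$

$$S(e_{v_i}, R_C) = \sum_{k=1}^{n} \log P(f_k \mid R_C), \qquad S(e_{v_j}, R_E) = \sum_{k=1}^{n} \log P(f_k \mid R_E) \tag{3.13}$$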

Here, $S(e_{v_i}, R_C)$ is the score of $e_{v_i}$ being a cause event and $S(e_{v_j}, R_E)$ is the score of $e_{v_j}$ being an effect event. $S(e_{v_i}, R_E)$ and $S(e_{v_j}, R_C)$ are calculated similarly, and the maximum of the two sums is taken. A high ERM score indicates that an event-event pair (the verbs and their arguments) matches poorly with the explicit contexts of the causal training instances of the $Explicit_{e_{v_i}\text{-}e_{v_j}}$ corpus. A high ERM score for an event-event pair has one of the following two interpretations: (A) it is a non-causal pair, or (B) it is a causal pair, but this pair and the pairs semantically close to it rarely appear in explicit and unambiguous causal contexts. In the metric ICA, CD($v_i$-$v_j$) × CI is used as a guiding score to interpret the ERM scores as follows:

– If CD($v_i$-$v_j$) × CI has a high score, then the value of ERM is not penalized by this guiding score, because ERM’s value can be interpreted using (B) above.

– If CD($v_i$-$v_j$) × CI has a low score, then the value of ERM is penalized by this guiding score, because $e_{v_i}$-$e_{v_j}$ can be a non-causal pair according to interpretation (A) above.
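Equations 3.11 to 3.13 can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the authors' implementation: the function names are hypothetical, events are represented as plain lists of feature strings, and P(f_k | R) is estimated from feature counts with add-one smoothing, which the source does not specify.

```python
import math
from collections import Counter


def train_role_model(instances):
    """Count features per role from (features, role) pairs, role in {"R_C", "R_E"}."""
    counts = {"R_C": Counter(), "R_E": Counter()}
    for features, role in instances:
        counts[role].update(features)
    vocab = set(counts["R_C"]) | set(counts["R_E"])
    return counts, vocab


def score(features, role, counts, vocab):
    """S(e, R) = sum_k log P(f_k | R) (eq. 3.13), with add-one smoothing (assumption)."""
    total = sum(counts[role].values())
    denom = total + len(vocab) + 1  # the extra 1 reserves mass for unseen features
    return sum(math.log((counts[role][f] + 1) / denom) for f in features)


def erm(feats_i, feats_j, counts, vocab):
    """Eq. 3.12: negative of the better of the two possible role assignments."""
    i_cause = score(feats_i, "R_C", counts, vocab) + score(feats_j, "R_E", counts, vocab)
    i_effect = score(feats_i, "R_E", counts, vocab) + score(feats_j, "R_C", counts, vocab)
    return -1.0 * max(i_cause, i_effect)


def ica(cd, instances):
    """Eq. 3.11 for one verb-verb pair.

    cd is CD(v_i-v_j); instances is a list of (ci, erm_value) tuples,
    one per instance I of the pair in VP.
    """
    if not instances:
        raise ValueError("a verb-verb pair needs at least one instance")
    return sum(cd * ci * e for ci, e in instances) / len(instances)
```

Because the guiding score CD × CI multiplies ERM, a pair with a low guiding score contributes little to ICA even when its ERM is high, which corresponds to interpretation (A); a pair with a high guiding score keeps most of its ERM value, which corresponds to interpretation (B).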

ICA acts as a boosting factor for finding causal verb-verb pairs that remain undiscovered because of training data sparseness. We selected the top 500 verb-verb pairs ranked by the metric ICA. Examples of causal verb-verb pairs among these 500 are shoot-hold, fall-break, develop-provide, hit-hold, and break-make. These pairs are not included in the top 500 pairs ranked by the metric ECA, owing to training data sparseness. We also observed some false positives among the top 500 pairs, for example cut-raise, carry-leave, fall-boost, give-take, and raise-lower. Some of these pairs contain nearly antonymous verbs (e.g., cut-raise, carry-leave, fall-boost, raise-lower) or verbs in a temporal-only relation (e.g., give-take).

In the next chapter, we empirically evaluate the metric ICA by using the causal associations of verb-verb pairs derived from it in our model for identifying causality. We also define a Boosted Causal Association (BCA) metric by adding ICA to the original ECA metric as follows:

$$\mathrm{BCA}(v_i\text{-}v_j) = \frac{1}{|VP|} \sum_{I_{v_i\text{-}v_j} \in VP} \Bigl[ \bigl( \mathrm{CD}(v_i\text{-}v_j) \times \mathrm{CI} \bigr) + \bigl( \mathrm{CD}(v_i\text{-}v_j) \times \mathrm{CI} \times \mathrm{ERM}_{e_{v_i}\text{-}e_{v_j}} \bigr) \Bigr] \tag{3.14}$$
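As a rough illustration of equation 3.14 and of the top-500 selection, the sketch below computes BCA for one pair and ranks pairs by score. The function names and the input layout are hypothetical, and the sketch assumes that the sum in equation 3.14 covers both terms, as the description of BCA as ECA plus ICA suggests.

```python
def bca(cd, instances):
    """Eq. 3.14 for one verb-verb pair.

    cd is CD(v_i-v_j); instances is a list of (ci, erm_value) tuples,
    one per instance of the pair. Assumes the sum spans both terms.
    """
    if not instances:
        raise ValueError("a verb-verb pair needs at least one instance")
    total = sum(cd * ci + cd * ci * e for ci, e in instances)
    return total / len(instances)


def top_k(scores, k=500):
    """Return the k highest-scoring verb-verb pairs, best first."""
    return sorted(scores, key=scores.get, reverse=True)[:k]
```

Under that assumption, each instance contributes CD × CI × (1 + ERM), so BCA never falls below the ECA-style term for non-negative inputs.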

Using the above metrics, we acquire the likelihood that each verb-verb pair encodes causation and store this information in a resource called the knowledge base of causal associations of verb-verb pairs (i.e., $KB_c$).

In document Mining novel sources of knowledge to identify causal information in text (Page 44-47)