Assessment of the Knowledge of Causal Semantics of Nouns

Chapter 6 Identifying Causality in Verb-Noun Pairs

6.2.3 Assessment of the Knowledge of Causal Semantics of Nouns

In this section, we first supply information about the semantic classes of nouns with a high and low tendency to encode causation, and then add information about the association of metonymies with the noun phrases.

| Classifier | Score | SC | +NER | +SCN¬M_WNETnp | +SCN¬M_FNET-WNETnp |
|---|---|---|---|---|---|
| NB | Accuracy | 28.86 | 48.13 | 62.12 | 71.86 |
| NB | Precision | 13.52 | 17.34 | 21.23 | 26.18 |
| NB | Recall | 92.59 | 89.50 | 80.86 | 75.30 |
| NB | F-score | 23.60 | 29.05 | 33.63 | 38.85 |
| MaxEnt | Accuracy | 61.46 | 67.61 | 75.53 | 80.73 |
| MaxEnt | Precision | 19.46 | 22.22 | 26.88 | 32.02 |
| MaxEnt | Recall | 71.60 | 69.13 | 61.72 | 55.55 |
| MaxEnt | F-score | 30.60 | 33.63 | 37.45 | 40.63 |

Table 6.3: The performance of the supervised classifier (SC) and of the model after the addition of information about the semantic classes of nouns. The column +NER represents the model in which the semantic classes of nouns are identified by relying on NER alone. The term SCN¬M refers to the model that has information about the semantic classes of nouns but not yet about metonymies. The semantic classes of nouns are acquired using NER and a supervised classifier for the labels Cnp and ¬Cnp, trained on either the WNETnp or the FNET-WNETnp corpus. The first (second) block of rows presents results for the supervised classifier (SC) built with the NB (MaxEnt) classification algorithm.

For the current model, we predict the labels Cnp and ¬Cnp for the noun phrases using the named entity recognizer (Finkel et al., 2005) and a supervised classifier trained on either the WNETnp or the FNET-WNETnp corpus (see section 5.2.1). The WNETnp training corpus consists of instances of unambiguous nouns (or noun phrases), extracted from WordNet and labeled Cnp or ¬Cnp. These nouns are unambiguous because all of their senses originate from the same semantic hierarchy. The FNET-WNETnp training corpus, in contrast, contains instances of both ambiguous and unambiguous noun phrases, and is therefore representative of a real data set.
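The notion of an unambiguous noun used for WNETnp can be illustrated with a minimal sketch. It assumes that each noun comes with a list of its senses, each already mapped to the root of its semantic hierarchy. The nouns, the root names and the helper functions below are hypothetical placeholders, not entries taken from WordNet and not the extraction procedure of section 5.2.1.

```python
from typing import Dict, List, Tuple


def is_unambiguous(sense_roots: List[str]) -> bool:
    """Return True if the noun has senses and all of them share one hierarchy root."""
    return len(set(sense_roots)) == 1


def split_by_ambiguity(inventory: Dict[str, List[str]]) -> Tuple[List[str], List[str]]:
    """Split nouns into (unambiguous, ambiguous) lists, each sorted by name."""
    unambiguous = sorted(n for n, roots in inventory.items() if is_unambiguous(roots))
    ambiguous = sorted(n for n, roots in inventory.items() if not is_unambiguous(roots))
    return unambiguous, ambiguous


# Hypothetical toy inventory: noun -> hierarchy root of each of its senses.
TOY_INVENTORY = {
    "noun_a": ["hierarchy_1", "hierarchy_1"],  # every sense under the same root
    "noun_b": ["hierarchy_1", "hierarchy_2"],  # senses under different roots
    "noun_c": ["hierarchy_2"],                 # a single sense
}

UNAMBIGUOUS, AMBIGUOUS = split_by_ambiguity(TOY_INVENTORY)
```

Under this reading, only nouns such as `noun_a` and `noun_c` would qualify for a WNETnp-style corpus, whereas a corpus like FNET-WNETnp would also admit `noun_b`.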

Table 6.3 reports the performance of our model after the addition of information about the semantic classes of nouns. We use the term “SCN¬M” to refer to the model that has information about the semantic classes of nouns but not yet about metonymies. With this information, our model gains substantially in both accuracy and F-score. Predicting the semantic classes of nouns with a classifier trained on the FNET-WNETnp corpus yields larger improvements in performance than the models that rely only on NER or on the WNETnp training corpus for this purpose.
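The F-scores reported in Table 6.3 can be checked against the precision and recall values in the same table. The sketch below assumes the F-score is the usual balanced harmonic mean of precision and recall, which the text does not state explicitly. The function and variable names are illustrative, and the numbers are copied from Table 6.3.

```python
def f_score(precision: float, recall: float) -> float:
    """Balanced F-score (harmonic mean of precision and recall), in the same units."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (classifier, model) -> (precision, recall, reported F-score), copied from Table 6.3.
TABLE_6_3 = {
    ("NB", "SC"): (13.52, 92.59, 23.60),
    ("NB", "+NER"): (17.34, 89.50, 29.05),
    ("NB", "+SCN-notM_WNETnp"): (21.23, 80.86, 33.63),
    ("NB", "+SCN-notM_FNET-WNETnp"): (26.18, 75.30, 38.85),
    ("MaxEnt", "SC"): (19.46, 71.60, 30.60),
    ("MaxEnt", "+NER"): (22.22, 69.13, 33.63),
    ("MaxEnt", "+SCN-notM_WNETnp"): (26.88, 61.72, 37.45),
    ("MaxEnt", "+SCN-notM_FNET-WNETnp"): (32.02, 55.55, 40.63),
}


def max_deviation(rows) -> float:
    """Largest absolute gap between the recomputed and the reported F-score."""
    return max(abs(f_score(p, r) - reported) for p, r, reported in rows.values())


MAX_DEVIATION = max_deviation(TABLE_6_3)
```

Small deviations are expected because the precision and recall values in the table are themselves rounded to two decimals.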

As Table 6.3 shows, the model +SCN¬M_FNET-WNETnp gains 15.25 (10.03) percentage points in F-score over the supervised classifier built using the NB (MaxEnt) classification algorithm. Similarly, the model +SCN¬M_FNET-WNETnp gains 43 (19.27) percentage points in accuracy over the supervised classifier implemented via NB (MaxEnt). However, the recall drops by roughly 16 to 17 percentage points with both the NB and MaxEnt supervised classifiers. We have observed that the information about the semantic classes of nouns helps remove many false positives, which leads to a significant rise in accuracy, precision and F-score. But our model does not yet have information about the association of metonymies, and this leads to many false negatives in the predictions of the model +SCN¬M_FNET-WNETnp.

| Classifier | Score | SC | +SCN¬M | +SCNM1 | +SCNM1GR | +SCNM1GR+M2 |
|---|---|---|---|---|---|---|
| NB | Accuracy | 28.86 | 71.86 | 71.35 | 71.42 | 71.64 |
| NB | Precision | 13.52 | 26.18 | 26.29 | 26.34 | 27.54 |
| NB | Recall | 92.59 | 75.30 | 78.39 | 78.39 | 85.18 |
| NB | F-score | 23.60 | 38.85 | 39.37 | 39.44 | 41.62 |
| MaxEnt | Accuracy | 61.46 | 80.73 | 80.65 | 80.73 | 81.02 |
| MaxEnt | Precision | 19.46 | 32.02 | 32.41 | 32.52 | 34.09 |
| MaxEnt | Recall | 71.60 | 55.55 | 58.02 | 58.24 | 64.19 |
| MaxEnt | F-score | 30.60 | 40.63 | 41.59 | 41.68 | 44.53 |

Table 6.4: The performance of the supervised classifier (SC) and of the model after the addition of information about the semantic classes of nouns with no knowledge of metonymies (SCN¬M), information about the semantic classes of nouns and metonymies derived via method M1 (SCNM1), information about the semantic classes of nouns and metonymies derived via method M1GR (SCNM1GR), and information about the semantic classes of nouns and metonymies derived via methods M1GR and M2 (SCNM1GR+M2).
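The gains and losses quoted for +SCN¬M_FNET-WNETnp are absolute differences between the SC column and the last column of Table 6.3. A minimal sketch of that arithmetic follows. The dictionary layout, the ASCII key `SCN_notM_FNET_WNETnp` and the `gain` helper are illustrative assumptions, and the values are copied from the table.

```python
# Scores copied from Table 6.3 (SC column and the +SCN¬M_FNET-WNETnp column).
# "SCN_notM_FNET_WNETnp" is an ASCII stand-in for the model name used in the text.
SCORES = {
    "NB": {
        "SC": {"accuracy": 28.86, "recall": 92.59, "f_score": 23.60},
        "SCN_notM_FNET_WNETnp": {"accuracy": 71.86, "recall": 75.30, "f_score": 38.85},
    },
    "MaxEnt": {
        "SC": {"accuracy": 61.46, "recall": 71.60, "f_score": 30.60},
        "SCN_notM_FNET_WNETnp": {"accuracy": 80.73, "recall": 55.55, "f_score": 40.63},
    },
}


def gain(classifier: str, metric: str,
         model: str = "SCN_notM_FNET_WNETnp", baseline: str = "SC") -> float:
    """Absolute difference (model minus baseline), rounded to two decimals."""
    scores = SCORES[classifier]
    return round(scores[model][metric] - scores[baseline][metric], 2)


for classifier in SCORES:
    for metric in ("f_score", "accuracy", "recall"):
        print(f"{classifier:7s}{metric:9s}{gain(classifier, metric):+.2f}")
```

A negative value marks a loss, which is how the drop in recall shows up next to the gains in accuracy and F-score.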

Next, we add information about metonymies to avoid as many false negatives as possible. Our objective is to recover recall without reducing accuracy, precision or F-score. Table 6.4 reports the results after the addition of information about the metonymies associated with the noun phrases. In section 5.2.2, we introduced two methods for metonymy resolution: the method that employs verb frames is referred to as M1, and the method that employs prepositions is referred to as M2.

For the method M1, we also evaluate the performance of our model when metonymies are identified using only the core grammatical (dependency) relations of subject, object and agent. This method is referred to as M1GR, where GR = {csubj, csubjpass, nsubj, nsubjpass, xsubj, dobj, iobj, pobj, agent}.
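The restriction to core grammatical relations can be sketched as a simple filter over dependency triples. The relation names are those listed in GR. The `Dependency` triple layout, the `core_dependencies` helper and the example triples are assumptions made for illustration. How M1 uses verb frames is described in section 5.2.2 and is not reproduced here.

```python
from typing import Iterable, List, NamedTuple

# The set GR of core grammatical relations, as listed in the text.
GR = frozenset({
    "csubj", "csubjpass", "nsubj", "nsubjpass", "xsubj",
    "dobj", "iobj", "pobj", "agent",
})


class Dependency(NamedTuple):
    """One dependency triple; this layout is an assumption of the sketch."""
    relation: str
    governor: str
    dependent: str


def core_dependencies(dependencies: Iterable[Dependency]) -> List[Dependency]:
    """Keep only the triples whose relation belongs to GR, preserving order."""
    return [dep for dep in dependencies if dep.relation in GR]


# Hypothetical parse of a sentence with a subject, an object and two modifiers.
EXAMPLE = [
    Dependency("nsubj", "verb", "subject_noun"),
    Dependency("dobj", "verb", "object_noun"),
    Dependency("amod", "object_noun", "adjective"),
    Dependency("advmod", "verb", "adverb"),
]

CORE = core_dependencies(EXAMPLE)
```

With this filter, only the subject and object triples of the example would be passed on to metonymy resolution, and the modifier relations are ignored.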

Table 6.4 shows that the addition of information about metonymies allows the model +SCNM1GR+M2 to achieve a gain of 2.77 (3.90) percentage points in F-score over +SCN¬M with the NB (MaxEnt) supervised classifier. The method M1GR of metonymy resolution alone recovers 3.09 (2.69) percentage points of recall over +SCN¬M with the NB (MaxEnt) supervised classifier. Moreover, the method M1GR + M2 recovers more than 8 percentage points of recall over +SCN¬M, and these improvements in recall do not come at the cost of precision, accuracy or F-score. In fact, the method M1GR + M2 allows the model +SCNM1GR+M2 to boost the precision of the model by more

| Classifier | Score | SC | +SCNM | +SCNM + ¬Cev = {R} | +SCNM + ¬Cev = {R, I S} |
|---|---|---|---|---|---|
| NB | Accuracy | 28.86 | 71.64 | 73.26 | 73.62 |
| NB | Precision | 13.52 | 27.54 | 28.63 | 28.84 |
| NB | Recall | 92.59 | 85.18 | 83.95 | 83.33 |
| NB | F-score | 23.60 | 41.62 | 42.70 | 42.85 |
| MaxEnt | Accuracy | 61.46 | 81.02 | 81.46 | 81.46 |
| MaxEnt | Precision | 19.46 | 34.09 | 34.78 | 34.78 |
| MaxEnt | Recall | 71.60 | 64.19 | 64.19 | 64.19 |
| MaxEnt | F-score | 30.60 | 44.53 | 45.11 | 45.11 |

Table 6.5: The performance of the supervised classifier (SC) and the model after the addition of knowledge of causal semantics of nouns (SCNM where M = M1GR+M2), knowledge of causal semantics of verbs with ¬Cev = {R} and ¬Cev = {R, I S}.

In document Mining novel sources of knowledge to identify causal information in text (Page 111-114)