4.2.2.3.2 Applying WordNet::QueryData for the Semantic Tagging of the

In this research, the WordNet::QueryData module is applied to determine the semantic classes of the target words that are commonly used to write SMART objectives. This is a partial WSD task, in which the target words in the given objective sentences are labelled with the lexicographer semantic classes available in WordNet. The following paragraphs describe in more detail how the WordNet::QueryData module is used in this research.

As explained in the previous section (Section 4.2.2.2), the SR-AW algorithm was applied to automatically find the WordNet sense of every word in the objectives, based on the context in which the word occurs. However, this algorithm neither specifies the relations between words or synsets nor finds the semantic categories of synsets. Since this study requires enriching the objectives with more general semantic information and disambiguating the words in objective sentences at the semantic class level, an approach for the automatic semantic class classification of words in text is needed.

Thus, the methodology applied here builds on the disambiguation output of the SR-AW algorithm, which contains all the words in the given objectives together with their assigned senses. Specifically, the sense that the SR-AW algorithm assigns to each target word in the objectives is passed to the WordNet::QueryData module, which uses information from WordNet to determine the domain category (class) of the specified synset.

In the previous section, the SR-AW algorithm processed and disambiguated the objective sentence “To increase PC gross-sales by 15% in 2009 by making discounts on the overstocked items.” and retrieved the correct senses for the target words “increase” and “gross-sales” in this sentence (increase#v#2 and gross_sales#n#1, respectively).

To retrieve a semantic class for each target word (e.g. “increase”, “gross-sales”) in the given objective sentences, the ‘lexname’ function in WordNet::QueryData is applied to access the WordNet database and return the semantic category (WordNet lexicographer semantic class) of a specified synset (e.g. increase#v#2, gross_sales#n#1).

Therefore, a Perl script is written for each occurrence of a target word in the given objective sentences, based on the disambiguation result produced by the SR-AW algorithm for that occurrence; the word category and the sense number of each target word occurrence are specified in the script. The following is the Perl script created for the occurrence of the target word “increase” in the above objective sentence:

Example 3:

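use WordNet::QueryData;

# Load the WordNet database through the module.
my $wn = WordNet::QueryData->new;

# Print the lexicographer class of sense 2 of the verb "increase".
print "Category: ", join(", ", $wn->lexname("increase#v#2", "dmnc")), "\n";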

As the above Perl script shows, the ‘lexname’ function takes two arguments ("increase#v#2" and "dmnc"). The first argument ("increase#v#2") gives the sense that the SR-AW disambiguation algorithm retrieved for the occurrence of the target word “increase” in the above objective. It consists of the word “increase”, its POS tag, which is verb (‘v’), and its sense number, which is ‘2’. The second argument gives the relation name, “dmnc”. This semantic relation indicates the domain category of a synset (e.g. verb.change, verb.possession, verb.creation). Running the Perl script of Example 3 in a Perl editor produced the following output:

Category: verb.change

Here the program prints the detected lexicographer semantic class of the target word “increase”, which is “verb.change”. The ‘lexname’ function in WordNet::QueryData has exploited the “dmnc” semantic relation to retrieve the lexicographer semantic class of the specified synset ("increase#v#2") from WordNet. The WordNet lexicographer semantic category of the target word “increase” in the given sentence is therefore identified correctly.
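The word#pos#sense format of the first argument, described above, can be unpacked mechanically. The following is a minimal sketch in plain Perl; the helper name parse_sense is hypothetical and is not part of WordNet::QueryData, and the POS letters n, v, a and r are the ones assumed for the sense strings.

```perl
use strict;
use warnings;

# Hypothetical helper: split a sense string of the form word#pos#sense
# into its three parts. Returns an empty list when the string does not
# follow that pattern.
sub parse_sense {
    my ($sense_string) = @_;
    if ($sense_string =~ /^([^#]+)#([nvar])#(\d+)$/) {
        return (word => $1, pos => $2, sense => $3 + 0);
    }
    return ();
}

my %parts = parse_sense("increase#v#2");
print "word=$parts{word} pos=$parts{pos} sense=$parts{sense}\n";
# prints: word=increase pos=v sense=2
```

A check of this kind can catch malformed sense strings, for example one with a missing sense number, before they are passed to ‘lexname’.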

To determine the semantic class of the occurrence of the target word “gross-sales” in the above objective sentence, the last line of the Perl script of Example 3 is set to “print "Category: ", join(", ", $wn->lexname("gross_sales#n#1", "dmnc")), "\n";”. Executing the modified Perl script gives the following output:

Category: noun.possession

Here WordNet::QueryData has used the ‘lexname’ function and the semantic relation “dmnc” to return the lexicographer semantic category of the target word “gross-sales” from WordNet. The retrieved lexicographer semantic category of the specified synset ("gross_sales#n#1") is “noun.possession”. The target word “gross-sales” in the given sentence has thus been tagged with the correct WordNet semantic class.
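The two lookups above differ only in the sense string, so they could also be run in a single loop instead of editing the last line of the script each time. The sketch below illustrates this under stated assumptions: the names tag_senses and %reported_class are hypothetical, the ‘lexname’ call simply mirrors the form used in Example 3, and the fallback table holds only the two classes reported in this section so that the sketch still runs where WordNet::QueryData or the WordNet database is not installed.

```perl
use strict;
use warnings;

# Fallback table (assumption): the classes reported in this section, used
# only when WordNet::QueryData or the WordNet data cannot be loaded.
my %reported_class = (
    "increase#v#2"    => "verb.change",
    "gross_sales#n#1" => "noun.possession",
);

# Hypothetical helper: look up each sense string with $lookup (a code
# reference) and return a list of [sense, class] pairs. The class is
# undef when the lookup returns nothing.
sub tag_senses {
    my ($lookup, @senses) = @_;
    my @tagged;
    for my $sense (@senses) {
        my ($class) = $lookup->($sense);    # first value returned, if any
        push @tagged, [ $sense, $class ];
    }
    return @tagged;
}

# Try to load WordNet; $wn stays undefined if the module or data is missing.
my $wn = eval { require WordNet::QueryData; WordNet::QueryData->new };

# Use the same call as Example 3 when WordNet is available, otherwise
# fall back to the table above.
my $lookup = defined $wn
    ? sub { $wn->lexname($_[0], "dmnc") }
    : sub { $reported_class{ $_[0] } };

for my $pair (tag_senses($lookup, "increase#v#2", "gross_sales#n#1")) {
    my ($sense, $class) = @$pair;
    print "$sense\tCategory: ", (defined $class ? $class : "unknown"), "\n";
}
```

Passing the lookup as a code reference keeps the loop independent of how the class is obtained, which also makes it easy to check the loop without a WordNet installation.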

The objective sentence “To attain at least £888 sales for mobiles by June of next year.” was processed and disambiguated by the SR-AW algorithm in the previous section. The disambiguation result showed that the target word “attain” was assigned the right sense (“attain#v#1”), while the target word “sales” was disambiguated incorrectly (“sale#n#3”) in this objective sentence.

The created Perl script for the occurrence of the target word “attain” which occurs in the above objective sentence is as follows:

Example 4:

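use WordNet::QueryData;

# Load the WordNet database through the module.
my $wn = WordNet::QueryData->new;

# Print the lexicographer class of sense 1 of the verb "attain".
print "Category: ", join(", ", $wn->lexname("attain#v#1", "dmnc")), "\n";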

In the above Perl script, the ‘lexname’ function, which takes two arguments ("attain#v#1" and "dmnc"), is employed to access WordNet and retrieve the lexicographer semantic category of the synset “attain#v#1” by exploiting the semantic relation “dmnc”. The output of executing the above Perl script is:

Category: verb.social

Here the program prints the retrieved lexicographer semantic category of the target word “attain”, which is “verb.social”. The target word “attain” in the above sentence has been assigned the correct WordNet semantic class.

To determine the semantic class of the occurrence of the target word “sales” in the above objective sentence, the last line of the Perl script of Example 4 is set to “print "Category: ", join(", ", $wn->lexname("sale#n#3", "dmnc")), "\n";”. The modified Perl script is then run in a Perl editor and produces the following output:

Category: noun.act

Here the program prints the detected lexicographer semantic class of the target word “sales”, which is “noun.act”. The retrieved semantic class of the target word “sales” (synset: “sale#n#3”) is incorrect for the context in which it occurs, since the word “sales” in the above sentence denotes possession and not acts or actions.
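The comparison made above between a retrieved class and the class that the context calls for can be written down as a small check. The sketch below is illustrative only: the helper name mismatches is hypothetical, the retrieved classes are the four outputs reported in this section, and the expected class “noun.possession” for “sales” is an assumption drawn from the remark that the word denotes possession in this sentence.

```perl
use strict;
use warnings;

# Each row: sense string, class retrieved in this section, class expected
# from the context. The expected class for sale#n#3 is an assumption.
my @rows = (
    [ "increase#v#2",    "verb.change",     "verb.change"     ],
    [ "gross_sales#n#1", "noun.possession", "noun.possession" ],
    [ "attain#v#1",      "verb.social",     "verb.social"     ],
    [ "sale#n#3",        "noun.act",        "noun.possession" ],
);

# Hypothetical helper: return the rows whose retrieved class differs
# from the expected class.
sub mismatches {
    my @candidates = @_;
    return grep { $_->[1] ne $_->[2] } @candidates;
}

for my $row (mismatches(@rows)) {
    print "$row->[0]: retrieved $row->[1], expected $row->[2]\n";
}
# prints: sale#n#3: retrieved noun.act, expected noun.possession
```

Because the semantic class is looked up from the sense chosen by the SR-AW algorithm, a wrong sense carries through to the class, as the “sales” example shows.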