Measuring Semantic Coverage

(1)

Measuring Semantic Coverage

Sergei Nirenburg, Kavi M a h e s h and S t e p h e n Beale

C o m p u t i n g R e s e a r c h L ~ b o r ~ t o r y N e w M e x i c o St~tte U n i v e r s i t y

bt~s C r u c e s , N M 8 8 0 0 3 - 0 0 0 1

USA

s e r g e i ; m ~ h e s h ; s b { ~ c r l . n m s u . e d u

A b s t r a c t

T h e developlnent of natural language processing systems is currently driven to a large extent by measures of knowledge- base size and coverage of individual phe- n o m e n a relative to a corpus. While these measures have led to significant advances for knowledge-lean applications, they do not adequately m o t i v a t e progress in c o m p u t a t i o n a l semantics leading to the development of large-scale, general purpose N L P systems. In this article, we argue t h a t depth of semantic representation is essential for covering a broad range of p h e n o m e n a in the c o m p u t a - tional t r e a t m e n t of language and propose (lepth as an i m p o r t a n t additional dimension for measuring the semantic cover-

age of N L P systems. We propose an

operationalization of this measure and show how to characterize an NLP system along the dimensions of size, corpus coverage, and depth. T h e proposed framework is illustrated using sever~fl promi- nent N L P systems. We hope the preliminary proposals m a d e in this article will lead to prolonged debates in the field and will continue to be refined.

1 M e a s u r e s of Size versus

M e a s u r e s of D e p t h

Evaluation of current and potential performance of' an N L P system or m e t h o d is of crucial impor- tance to researchers, developers and users. Cur- rent performance of systems is directly measured using a variety of tests and techniques. Often, as in the case of machine translation or information extraction, an entire "industry" of evaluation gets developed (see, for example, A R P A M T Evalua- tion; MUC-4 ). Measuring the performance of an N L P m e t h o d , approach or technique (and through it the promise of a s y s t e m based on it) is more difficult, as j u d g m e n t s m u s t be m a d e about " b l a m e assigmnent" and the i m p a c t of improving a variety

of system components on the overall future performance. One of the widely accepted measures of potential performance i m p r o v e m e n t is the fea- sibility of scaling up the static knowledge sources of an NLP system its g r a m m a r s , lexicons, worht knowledge bases and other sets of language descriptions (the reasoning being that the larger the s y s t e m ' s g r a m m a r s and lexicons, the greater per- centage of input they would be able to m a t c h and, therefore, the better the performance of the systeml). As a result, a system would be considered very promising if its knowledge sources could be significantly scaled up at a reasonable expense. Natm'ally, the expense is lowest if acquisition is performed automatically. This consideration and the recent resurgence of corpus-based methods heighten the interest in the a u t o m a t i o n

of knowledge acquisition, llowever, we believe

that such acquisition should not 1)e judged solely by the utility of acquired knowledge for ~ particular application.

A preliminary to the sealability estimates is a j u d g m e n t of the current coverage of a s y s t e m ' s

static knowledge sources. Unfortunately, judg-

ments based purely on size ace often mislead- ing. While they m a y be sufficiently straightfor- ward for less km)wledgeAntensive methods used in such applications as information extraction and retrieval, part of speech tagging, bilingual corpus alignment, and so on, the saute is not true about more rule- and knowledge-based methods (such as syntactic parsers, semantic analyzers, semantic lexicons, ontological world models, etc.). It ix widely accepted, for instance, t h a t j u d g m e n t s of the coverage of a syntactic g r a m m a r in terms of the number of rules are tlawed. It is s o m e w h a t less self-evident, however, that the number of lexicon entries or ontology concepts is not an adequate measure of the quality or coverage of NLP

a Incidentally, this consideration eontributes to evMuation of current perforntance as well. In the ab- sence of actual evaluation results, it is customary to c|aim the utility of the system by simply mention- ing tit(: size of its knowledge sources (e.g., "over 550 grammar rules, over 50,000 concepts in the ontology and over 100,00(I word senses in the dictionary").

(2)

systems. A.n adequate measure of these must ex- amine not only size and its scalability, but also depth of knowledge along with its scalability. In addition, these size and depth measures cannot be generalized over the whole system, but must be directly associated with individual areas t h a t cover the b r e a d t h of N L P problems (i.e. morphol- ogy, word-sense ambiguity, semantic dependency, coreference, discourse, semantic inference, etc.). And finally, the m o s t helpfld m e a s u r e m e n t s will not judge the s y s t e m solely as it stands, but must in some way reflect the u l t i m a t e potential of the system, along with a quantification of how far additional work aimed at size and depth will bring a b o u t a d v a n c e m e n t toward t h a t potential.

In this article, we a t t e m p t to formulate measures of coverage i m p o r t a n t to the development and evaluation of semantic systems. We proceed h'om the a s s u m p t i o n t h a t coverage is a function of not only the n u m b e r of elements in (i.e., size of) a static knowledge source but also of the a m o u n t of information (i.e., d e p t h ) and the types of in- f o r m a t i o n (i.e., b r e a d t h ) contained in each such element. Static size is often emphasized in evalu- ations with no attention paid to the often very in- significant a m o u n t of information associated with each of the m a n y "labels" or primitive symbols. We snggest a starting framework for measuring size together with other significant dimensions of semantic coverage. In particular, the evaluation measures we propose reflect the necessary contribution of the depth and breadth of semantic descriptions. D e p t h and b r e a d t h of semantic description are essential for progress in com- p u t a t i o n a l semantics and, ultimately, for building large-scale, general purpose N L P systems. Of course, for a n u m b e r of applications a very limited semantic analysis (e.g., in t e r m s of, say, a dozen separate features) m a y be adequate for sufficiently high performance. However, in the long run, progress towards the u l t i m a t e goal of NLP is not possible without depth and breadth in semantic description and analysis.

There is a well-known belief t h a t it is not appropriate to measure success of N L P using field- internal criteria. Its adherents m a i n t a i n t h a t N L P should be evaluated exclusively through evaluating its applications: information retrieval, machine translation, robotic planning, h u m a n - c o m p u t e r interaction, etc. (see, for: example, the Proc. of the Active N L P Workshop; A R P A M T Evaluation). This m a y be true for N L P users, but developers m u s t have internal measures of success. This is because it is very difficult to assign blame for the success or failure of an application on specific components of an N L P system. For example, in reporting on the MUC-3 evaluation efforts, Lehnert and Sundheim (1991) write:

A wide range of language processing strategies was employed by the top-

scoring systems, indicating t h a t m a n y natnral language-processing techniques provide a viable foundation for sophis- ticated text analysis. Further evaluation is needed to produce a more detailed as- sessment of the relative merits of specific technologies and establish true performance limits tbr a u t o m a t e d information extraction. [emphasis added.]

Thus,

evaluating the information extraction application did not provide constructive criticism on particular NLP techniques to enable advances in the state of the art. Also, evaluating an application does not directly contribute to progress in NLP as such. This is in part because a m a j o r i t y of current and exploratory NLP systems are not complete enough to fit an application but rather are devoted to one or more of a variety of components of a comprehensive NLP system (static e.g., lexicons, g r a m m a r s , etc.; or dynamic e.g., an algorithm for" treating m e t o n y m y in English).

1.1 C u r r e n t M e a s u r e s o f C o v e r a g e

Success in NLP (including semantic analysis and related areas) is currently measured by the following criteria:

• Size of static knowledge sources: A mere

nmnber indicating the size of a knowledge source does not tell us much about the coverage of the system, let alone its semantic capa- bilities. For example, m o s t machine readable dictionaries (MRI)) are larger t h a n computational lexicons but they are not usable for: computational semantics.

(3)

l ' h e n o m e l)t~.'ired S l a t e

" C . r r e n t S t a t e

iiiii

, , -

~ ~ : : . : i " K n o w l e d e Base

Figure 1: Dimensions of Semantic Coverage: (hlr- rent and Desired l)irections

processing became exponentially expensive as a result.

Figure 1 shows the dimensions of size and b r e a d t h (or phenomenon coverage) along tit(', horizontal plane. D e p t h (or richness) of a semantic s y s t e m is shown on the vertical axis. We believe t h a t recent progress in NLP with its emphasis on corpus linguistics and statistic~d m e t h o d s has re- suited in a significant spread akmg the horizontal plane but little been done to grow the Iield in the vertical dimension. Figure 1 also shows the desired state of c o m p u t a t i o n a l semantics advmlced alor|g each of the three dimensions shown. If

We proceed from the assumption that high- quality NLI ) systems require o p t i m u m coverage on all three scales, the|| apparently different roads (-an be taken to t h a t target. T h e s p e e t r m n of choices ranges f r o m developing all three dimensions more or less simultaneously to taking care

of t h e m in turn. As is often the case in king-

t e r m high-risk enterprises, inany researchers opt to start out with acquisition work which promises s h o r t - t e r m gains on one of the coverage dimensions, with little thought a b o u t further steps. Of_ ten the reason they cite can be s u m m a r i z e d by the phrase "Science is the art of the possible." This position is quite defensihle .... if no claims are m a d e a b o u t broad semantic (:overage. Indeed, it is quite legitimate to study a particular language p h e n o m e n o n exclusively or to cover large chunks of the lexis of a language in a shallow man-

ner. IIowever, for practical gains in large-scale

computational-selnantie applications one needs to achieve results on each of the three dimensions of coverage.

1 . 2 D e s i d e r a t a f o r L a r g e - S c a l e C o l n p u t a t i o n a l S e m a n t i c s

Once the initial knowledge acquisition canq)aign for a I)articular apt)lication has been concluded, the following crucial scalability issues 2 ira|st be addressed, if any t|nderstanding of the longer-term significance of the research is sought:

• domain independence: scalability to n e w (lo-

mains; general-purpose Nl,l )

• language independence: sealability across

languages

• phenolnenon coverage: sealability to new

phenomena; going beyond core semantic analysis; ease of integrating component pro- eesses and resources.

• application-independence: sealability to new applications; toolkit o[' NLP techniques applicable to any t~sk.

We believe that coverage in terms of the det)th and breadth of the knowledge given to an NLI ) system is m a n d a t o r y for attaining the above goals in the long run. Such coverage is best esti(nated not in terms of raw sizes of lexicons or world models but rather through the availability in t h e m of information necessary for the t r e a t m e n t of a w>

riety of l)henomena in natural language issues

related to semantic dependency bull(ling, lexical disambiguation, semantic constraint tracking and relaxation (for the cases of unexpected input, including non-li~eral language as well as t r e a t m e n t of unknown lexis), reference, p r a g m a t i c i m p a c t and discourse structure. The resolution of these issues is at the core of t)ost-syntactic text processing. We believe that one can treat the al)ove phe- n o m e n a only by acquiring a broad range of relevant knowledge elements for the system. One nse- flfl measure for sufficiency of infbrmation would be an analysis of kinds of knowledge necessary to gen- erate a text (or (liMog) meaning representation. For applications in which more procedural computational semantics is l)refl~'rable, a correspond ing measure of sutliciency should be developed.

There exist other, broader desiderata which are applicable to any All systetn. T h e y include con- cerns about system robustness, correctness, and efficiency which are orthogonal to the above issues. EquMly i m p o r t a n t but more broadly applicable are considerations of economy and ease of

acquisition of knowledge sources for example,

reducing the size of knowledge bases and sharing knowledge across applications.

2At present, se~dability is considered in the field ahnost exclusively ~ts propagation o[ the nulnber of entries in the NLP knowledge bases, not the quantity and quality of information inside each such entry.

[image:3.596.58.271.111.298.2]

(4)

2 H o w t o R e a s o n a b o u t D e p t h , B r e a d t h a n d S i z e

A useful measure of semantic coverage must in- volve m e a s u r e m e n t along each of the three dimensions with respect to correctness (or success rate) and efficiency (or speed). In this first a t t e m p t at a qualitative metric, we list questions relevant for assigning qualitative ("tendency") scores to an N L P s y s t e m to measure its semantic coverage. Our experience over the years has led us to the following sets of criteria for measuring semantic coverage. Itowever, we understand t h a t the following are not complete or unique; they are representative of the types of issues t h a t are relevant to measuring semantic coverage.

2.1 L e x i c a l C o v e r a g e

• ' l b w h a t extent do entries share semantic primitives (or concepts) to represent word meanings? W h a t is the relation between the n u m b e r of semantic primitives defined and the n u m b e r of word senses covered?

• W h a t is the size of the semantic zones of the entry? tlow m a n y semantic features are covered?

• How m a n y word senses from standard h u m a n - o r i e n t e d dictionaries are covered in the NLP-oriented lexicon entry?

• W h a t types of information are included?

- seleetional restrictions

- constraint relaxation information

syntax-semantics linking - collocations

- procedural a t t a c h m e n t s for contextual processing

-- stylistic p a r a m e t e r s

- aspectual, temporal, m o d a l and attitu- dinal meanings

- o t h e r idiosyncratic information about

the word

• and, finally, the total n u m b e r of entries in the lexicon.

2.2 O n t o l o g i c a l C o v e r a g e

T h e total n u m b e r of primitive labels in a world model is not a useful measure of the semantic coverage of a system. At least the following considerations m u s t be factored in:

• T h e n u m b e r of properties and links defined for an individual concept

• N u m b e r of types of non-taxonomic relationships a m o n g concepts

• Average n u m b e r of links per concept: "con- nectivity"

• T y p e s of knowledge included: defaults, selec- tional constraints, complex events, etc.

• Ratio of number of entries in a lexicon to number of concepts in the ontology

• and, finally, total number of concepts in the ontology.

2.3 M e a s u r i n g B r e a d t h o f M e a n i n g

R e p r e s e n t a t i o n s

A p a r t from lexical and ontological coverage, the depth and breadth of the meaning representations constructed by a system are good indicators of the overall semantic coverage of the system. Tile number of different types of meaning elements included fl'om the following set provides a reasonable measure of coverage:

• A r g u m e n t structure only

• T e m p l a t e filling only

• Events and participants

• T h e m a t i c role assignments

• T i m e and t e m p o r a l relations

• Aspect

• Properties: attributes of events and objects; relations between events and objects.

• R,eference and coreference

• Attitude, modality, stylistics

• Quantitative, comparative, and other m a t h e - matical relations

* Textual relations and other discourse relations

• Multiple ambiguous interpretations

* Propositional and story/dialog structure

3 M e a s u r i n g S e m a n t i c C o v e r a g e : E x a m p l e s

Figure 2 shows the a p p r o x i m a t e position of several well-known approaches and systems (including a possible Cyc-based system) in the 3- dimensional space of semantic coverage. We have chosen representative systems fl'om the different approaches for lack of precise terms to n a m e tlle approaches.

How do the approaches illustrated in Figure 2 rate with respect to the metrics suggested above'? When estimating their profiles, we thought either about some representative systems belonging to an approach or thought of the properties of a pro- totypical system in a particular p a r a d i g m if no examples presented themselves readily. In the in- terests of space, we consider the above criteria for measuring semantic coverage but only provide brief summaries of how each system or approach is located along the dimensions of depth, b r e a d t h and size.

(5)

+

+++++++++{

!!}:

i +

:-+:::+ ::+.

++++++5++ i+~i++++++

+

[image:5.596.57.273.108.288.2]

++++++++5+i+i+?+++++++++++5+++++i+++ +++5+i+ ++i +++ ++++++++++i~ + + +ii+i{5++++++51+5~+++++++ ~!:+:: :, +77:+}+ }+++ ++

... ~'ii ... , ... Ix.~,tA ...

i{ {! ii

ii',

:: :: ¢ {

iii{i{5~!{iii!!5 ~:i! 5{ 5! 7!!5 ~ ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: ~ i i { {i{i:!Si{ i

:~:s~;ii~:::::!:!~:!i ~ ~/:i

! ! !i!::?!: !:::::~ :: = :: ~ : ::: :::::::::::::::::: :::::::::::::::::::::::::::::::::::: ~ 5~!:: !:!: ~ :~ ~:! ~s : ~ ~::: ?~:!i.'.,~-: :::: ... :":~ ::: :: ,:

{!~i{~i: ::::: :~::::::::: ::::::::::::~::~:. :::::::::::::::::::::::::::::: ~ ::::::::::::::::::::::::::: 2: :::::!:!::!!!:i: ::::::::::::::::::::::::::::::::::::::::::: ::::::::::::::::::::::::::::::::::::::::::::::::: :, :. :::i:2: ::::!:: ::f? ::::i: :

... I "2; ": ... .~*~ ... : ~ ... ~ ...

?!:5!ii!i!iii!:ili ~iiiiliiii:i:i :i:i:i:i:i:5:!:i:?i:!~ '!!!!!ii . ~ ::::: :::: ::£,~. :::::::::::::::::::::::::

:{::{!{:::i:i::{!;:i:i ~:~:i:i::::i{i:: ii:~!~:i~[::i .::::::::: t,,:xx::::::::::: :::::::::::::::::::::::::

::::::::::~:: :,"~::::::hl :::::::::::::::% :::::::::++::~:::i:i:::::!!: z.. :::::::::::::::::::::::::::

i~ili~}iii::iil)+::i: ~ti{i::~:ii::;:i: ~}:!}i}:.~}!!~i!ii~i} ~ .ilili~ii~iii~i #. ~:i:~:}ii~ v+} } if<

::::::::::::5::::: ;{:::::::::::::::::: ::::::::::::::::::::::: :::::::::: :~::::::::::: qX :::::::::::::::::::::::

a

{ {

+++++++++++++++++++{++t++++++++++++++++~+-:+ +~++++++i~}+++++++i~2~+++;++g~i +++++++++~++i)+ +~;~+i++ii++++)Y + +++ +++ :+~++

:: ::: :::::: ;;.;~ ;;~a::::: ~,<E'I:~ :::: N~i~++:i/~i~: :: :~ 'a::+" *~::i:i:i :;+JlKI ~ +p-'~' ~:~:~9.:~ : :: ~i<~. +.~.~t : :::::

Figure 2: l)imensions of Semantic Coverage: Cur-+ rent att(t l)esired l)ireetions

domain- and task-del)endent, AI-style, schema- based N L P systc'm. It m a y I>e considered an ex- treme e x a m p l e of a system with deep, rich knowl edge of its, rather narrow, worhl in which cowwing language p h e n o m e n a is nee(ted only inasmuch as it supports general reasoning. Boris was able to process a very smMl n u m b e r of texts sutticiently for

its goMs. T h e coverage of

phenomena was

strictly

utilitarian (which, we believe, is quite appropriate). lilt was not demoimtratext t h a t Boris can be scaled up to (:over a signiticant part of the English lexicon.

As an e x a m p l e of an early knowletlge-basetl MT

system (thai, is,

unlike the above, a system whose goals were mainly computational-lit|guistic) we chose the K B M T - 8 9 system ( G o o d m a n and Niren- burg, 1991). It covered its small corpus relatively completely and described the necessary phenomena relatively fldly, lilt was a p r i m a r y goal of this line of research to b e g i n m e e t i n g the a b o v e criteria for semantic coverage.

A p n t a t i v e NIA ) system based on the

(~yc

project has been selected as a prototyl)e for sys- t e m s not devised h)r a particular application. The Cyc large-scMe knowledge base ]|as significant a m o u n t s of deep knowledge, llowever, it is not clear whether the knowledge is apl>licM)le in a straightff)rward m a n n e r to deal with a range of linguistic phenomena. T h e big question for this kind of s y s t e m is whether it is, in fact, possible, to acquire knowledge without a reference to an intended application.

A purely corpus--based, statisticM approach to N L P , on the other hand, has an extremely narrow range of knowledge, but, m a y haw; a large

size. For example, snch a system m a y have a

large lexicon with only word frequency and col- location i n f o r m a t i o n in each entry. A l t h o u g h sta- tistical m e t h o d s h a v e been s h o w n to work o n s o m e

problenm and applications, they are typically ap- plied to one or two phenomena at a time. It is not rlear that statistical information acquired tbr one probleln (such as sense disambiguation) is of use in hmtdling other problems (such as processing non-literal expressions).

Mixed-strategy NI,I ~ systems are epitomized by I'angloss (199d), a multi-engine translation sys- t e m in which semantic processing is only one of the possible translation engines. The semantics engine of this system is equipped with a large-size ontology of over 50,000 entries (Knight and link, t99d) which is nsed essentially as an am:hot for mapl)ing lexicM traits front the 8otlrce to the tar.- gel, language. As shown in I,'igure 2, Pangloss has a large size and covers a good range of l)het,)m. em~ as well. llowew',r, there is little information (only taxonomic and partonolnie relationships) in each concept in its Sensus ontology. T h e limited depth constrains the ultimate potentia.1 of the systetn as a sentatd,ic and pragmatic processor. I"or exatnple, there is no hfl'ortnal;ion in its knowledge sources to make judgements about constr~fint re-

laxal,ion

to process non-literal expressions snch as metonymies and metal~hors.

The Mikrokosntos

system (e.g.,

Onyshkevych

an(t Nirenburg, 1994), has a t t e m p t e d to cover each dimension equally well. Its knowledge bases and text meaning representations are rather deep and of nontrivial sizes. It has been designed froln the start to deal with a comprehensiw; range ot'seman-. tic p h e n o m e n a including the linldng of syntax attd semantics, (-ore semantic analysis, sense disam- l)iguation, I)rocessing non-literM expressions, SLIt({ so on, althongh not all of them have yet been im plemented.

Front the abow'~ examples, it is clea.r t h a t having good coverage along one or two of the three dimensions is not good enough for meeting the long-term goMs of NI,P. Poor coverage of language p h e n o m e n a (i.e., poor brea, dth) indicates t h a t the acquired knowh;dge, even when it is deep and large in size, m a y not be applicable to other phenomena and m a y not transfer to other applications. Poor depth suggests t h a t knowledge and processing techniques are either application- or language- specific and limits the ultimate potential of the system in solving semantic problems. Depth and breadth are of course of little use if the system cmmot bc scaled up to a signilicant size. More- over, as already noted, cow;rage in depth, breadth, and size must all be achievetl in conjnnction with maintaining good me, asnres of correctness, et[i- ciency, and robustness.

4 D i s c u s s i o n a n d C o n c l u s i o n s

All oft-quoted objection to having deep semantic (:overage is the dilliculty in scMing up such a sys- t e m along the dimension of size. This is a valid

concern, llowever, the situation (:an be amelio-

(6)

rated to a large extent by developing a methodology (see, e.g., Mahesh and Nirenburg, 1995) for constraining knowledge acquisition to minimally

meet semantic processing needs. Such concen-

tration of effort will allow knowledge acquirers to have spend a fraction of the effort that must go into building a general machine-tractable en- cyclopedia of knowledge and yet to attain significant coverage of language phenomena. Significant scale-up can be accomplished under such a constraint without jeopardizing the high values on the depth and breadth scales.

Size is important in NLP. But size alone is not a sufficient metric for evaluating semantic coverage. Focusing on size to the exclusion of other criteria has biased the field away from semantic solutions to NLP problems. We have made a first step in formulating a more appropriate and complete set of measures of semantic coverage. Depth and breadth of knowledge necessary to cover a wide range language phenomena are at least as important to NLP as size. The discussion of pe- culiarities of the various approaches should be ex- panded in at least two directions - greater detail of description and analysis of the relative ditficulty of reaching the set goal of attaining an optimum value on each of the three measurement scales. We hope that this paper will elicit interest in contin- ned discussion of the issues of coverage measurement, which, in turn, will lead to better -- quantitative as well as qualitative- measures, including a methodology for comparing lexicons and ontolo- gies.

Acknowledgments

Many thanks to Yorick Wilks for his constructive criticism.

R e f e r e n c e s

Active NLP Workshop: Working Notes from the AAAI Spring Symposium "Active NLP: Natural Language Understanding in Integrated Systems" March 21-23, 1994, Stanford University, Califor- nia (Also available as a Technical Report from the American Association for Artificial Intelligence).

AI{PA MT Evaluation: Report of the Advanced Research Projects Agency, Machine Translation Program System Evaluation, May-August 1993.

Goodman, K. and S. Nirenburg (eds.) (1991).

The K B M T Project: A Case Study in Knowledge- Based Machine Translation. San Marco, CA: Morgan Kaufmann.

Knight, K. and Luk, S. K. (1994). Building a Large-Scale Knowledge Base for Machine Trans- lation. In Proc. Twelfth National Conf. on Arti- ficial Intelligence, (AAAI-94).

Leant, D. B. and Guha, R. V. (1990). Building

Large Knowledge-Based Sysiems. Reading, MA: Addison-Wesley.

Lehnert, W. G., Dyer, M. G., Johnson, P. N., Yang, C. J., and Harley, S. (1983). BORIS - An Experiment in In-I)epth Understanding of Narra-

tives. Artificial Intelligence, 20(1):15-62.

Lehnert, W. G. and Sundheim, B. (1991). A performance evaluation of text-analysis technolo-

gies. AI Magazine, 12(3):81-94.

Mahesh, K. and Nirenburg, S. (1995). A situ- ated ontology for practical NLP. In Proceedings of the Workshop on Basic Ontological Issues in Knowledge Sharing, International Joint Confer- ence on Artificial Intelligence (IJCAI-95), Mon- treal, Canada, August 1995.

MUC-4: Proc. Fourth Message Understanding Conference (MUC-4), June 1992. Defense Ad- vanced Research Projects Agency.Morgan Kanf- mann Publishers.

Onyshkevych, B. and Nirenburg, S. (1994). The lexicon in the scheme of KBMT things. Technical Report MCCS-94-277, Computing Research Lab- oratory, New Mexico State University. Also to appear in Machine rlh'anslation.

Pangloss. (1994). The PANGLOSS Mark Ill Machine Translation System. A Joint Technical Report by NMSU CRL, USC ISI and CMU CMT, Jan. 1994.