Operating Statistics for the Transformational Question Answering System

(1)

Operating Statistics for

The Transformational Question Answering System

F r e d J . D a m e r a u

M a t h e m a t i c a l S c i e n c e s D e p a r t m e n t I B M T h o m a s J. W a t s o n R e s e a r c h C e n t e r

P o s t O f f i c e B o x , 2 1 8

Y o r k t o w n H e i g h t s , N e w Y o r k 1 0 5 9 8

This paper presents a statistical summary o f the use o f the Transformational Question Answering ( T Q A ) system by the City o f White Plains Planning Department during the year 1 9 7 8 . A c o m p l e t e record o f the 7 8 8 questions submitted to the system that year is included, as are separate listings o f s o m e o f the problem inputs. Tables summarizing the performance o f the system are also included and discussed. In general, performance o f the system was sufficiently good that we believe that the approach being followed is a viable one, and are continuing to develop and extend the system.

Introduction

Natural language question answering systems have been the subject of much research over approximately 20 years. In only very few cases have such systems been exposed to real users trying to solve real p r o b - lems, another example perhaps being Krause (1979). In an attempt to see if a useful natural language query system could be built for an application which existed i n d e p e n d e n t l y of the r e s e a r c h p r o g r a m , an a p p r o a c h was made to the Planning D e p a r t m e n t of the City of White Plains asking them to take part in such an experiment. Their incentive was the free use of an interactive query facility which would allow them to explore their data base more freely than the b a t c h c o m p u t e r facility run by the city could do. The remainder of the paper describes the user e n v i r o n m e n t , describes the operation of the T Q A system, and discusses the operating results in the first year of operation, 1978.

The User Environment

W h e n the experiment was first being discussed, the Planning D e p a r t m e n t h a d five professionals plus the services of a consultant more or less constantly available. The main files used for planning were the

parcel

file

and the

geobase file,

which had been c o n v e r t e d to machine readable form under a grant from the D e p a r t - ment of Housing and U r b a n Development. D e p a r t - ment members were very familiar with these files, in m a n y instances knowing the c o r r e s p o n d e n c e b e t w e e n codes in the file and their English equivalents.

The parcel file c o n t a i n e d a record for each parcel of land in the city, a p p r o x i m a t e l y 10,000 of them, which had a taxable existence. E a c h record c o n t a i n e d the a c c o u n t number, the block, the owner, the address, land use information, n u m b e r of dwelling units, area, taxes, and the like. The geobase file c o n t a i n e d a record for each city block, telling what census tract it was in, what traffic zone, what n e i g h b o r h o o d association, etc. Selected summaries of these files had been p r e p a r e d by the City's c o m p u t i n g center, and the files themselves had been printed in a couple of large vol- umes. A d hoe queries required special programs to be written by the c o m p u t i n g center, with sufficient delay that this was seldom done. Thus, we were told that during the 1974 gasoline shortage, the parcel file was searched b y hand on the land use code field to find the locations of all the gas stations so that police could be routed there to direct traffic. O t h e r uses of the files are a p p a r e n t f r o m i n s p e c t i o n of the questions asked of the system, a sample of which are given in Figure 1.

A l t h o u g h the terminal was located in an o p e n area available to all m e m b e r s of the department, it turned out that most of the questions were asked by one of the d e p a r t m e n t members, w h o had been designated as our liaison. The reason for this was never clear, but m a y have simply been normal reluctance to experiment with radically new t e c h n o l o g y on the part of the older m e m b e r s of the d e p a r t m e n t . O t h e r p e r s o n n e l did sometimes use the system, but under his sign-on code,

Copyright 1981 by the Association for Computational Linguistics. Permission to copy without fee all or part of this material is granted provided that the copies are not made for direct commercial advantage and the

Journal

reference and this copyright notice are included on the first page. To copy otherwise, or to republish, requires a fee and/or specific permission.

0 3 6 2 - 6 1 3 X / 8 1 / 0 1 0 0 3 0 - 1 3 5 0 1 . 0 0

(2)

Where are the parcels w i t h a LUC o f 641 ? What s i n g l e f a m i l y houses in Fisher H i l l have

exemptions g r e a t e r than $0 ?

How many two f a m i l y houses are t h e r e in the Oak Ridge Residents Assn. ?

Who owns the parcels in subplanning area 7.60 ? Where are the churches on North S t r e e t ? What parcels on Stevens St. have a LUC o f 116 ? What parcel in ward 1 does c i t y own ?

P r i n t the parcel area o f the LUC'S 300 - 399 ! What is the assessment o f the p a r c e l s in the

B a t t l e H i l l Assn. having more than 3 d w e l l i n g u n i t s ? Where is M a r t i n ' s parcel ?

Where are the apartment d w e l l i n g s which have more than 50 u n i t s which are more than 6 s t o r i e s high on Lake St. ? Where is the Calvary B a p t i s t Church parcels ?

Where is parcel 50550006401 ? What p r o p e r t i e s does Longo own ?

Where are the apartment b u i l d i n g s having less than 8401 sq f t ground f l o o r area ?

Figure 1. Example inputs to TQA.

so that the actual user for a n y question c a n n o t be

d e t e r m i n e d f r o m the log in m o s t cases. During the

course of the experiment, a new planning director was appointed, who c h a n g e d the mission of the d e p a r t m e n t somewhat. The result was that m e m b e r s of the planning d e p a r t m e n t did f e w e r of the conventional planning activities than formerly, with results that are ap- p a r e n t in the statistics that follow.

The TQA System

The T Q A system, originally n a m e d R E Q U E S T , has b e e n u n d e r d e v e l o p m e n t f o r s o m e time at the I B M T h o m a s J. W a t s o n R e s e a r c h Center. An e x p e r i m e n t a l a p p l i c a t i o n using business statistics had b e e n quite

satisfying. F o r the White Plains application, m a j o r

additions were made to the lexicon, new g r a m m a r rules extending c o v e r a g e were written, a new c o m p o n e n t for interfacing the g r a m m a r and a data base was developed, and an interface was built to an existing data

base m a n a g e m e n t system, the RSS. T h e system was

installed in White Plains late in 1977 for final debug- ging, and turned o v e r to the planners at the beginning of 1978. It was d i s c o n n e c t e d at the end of 1979 p a r t - ly for legal reasons and partly b e c a u s e we felt little new could be learned b y leaving it there.

A generalized flow diagram of the T Q A s y s t e m is given in Figure 2, and an e x a m p l e of processing in Figures 3a-g. The structures printed in Figure 3a are a b r a c k e t e d terminal string r e p r e s e n t a t i o n of structures which are stored and m a n i p u l a t e d as trees by the processing programs. T h e trees, t o g e t h e r with their associ- ated c o m p l e x features, for the e x a m p l e are shown in Figures 3b-g. These structures are the o u t p u t s at simi-

I n p u t l I

IPreprocessor( < . . . Lexicon

l

[ L i s t o f l e x i c a l t r e e s i

. . .

I T r a n s f o r m a t i o n a l p a r s e r l < . . . . S t r i n g t r a n s f o r m a t i o n s . . .

l

I L i s t o f t r e e s l

. . .

I C o n t e x t f r e e p a r s e r l < . . . Context f r e e phrase . . . s t r u c t u r e r u l e s

l

r L i s t o f s u r f a c e t r e e s i

. . .

I T r a n s f o r m a t i o n a l p a r s e r l < . . . . Inverse t r a n s f o r m a t i o n a l . . . grammar

J

J Underlying s t r u c t u r e ( s ) l

. . .

I T r a n s f o r m a t i o n a l p a r s e r l < . . . . Data base s p e c i f i c . . . t r a n s f o r m a t i o n a l r u l e s

I

I Query s t r u c t u r e ( s ) I

. . .

ISemantic i n t e r p r e t e r J < . . . Semantic r u l e s

. . . < - - I

i l

I Logical f o r m ( s )

l l

. . . I

I E v a l u a t o r l < . . . Data base

[image:2.612.340.574.49.558.2]

I i Answer

Figure 2. Flow diagram of TQA.

larly n a m e d points of the flow diagram in Figure 2.

I n p u t f r o m an I B M 3275 display station is fed to the p r e p r o c e s s o r , which s e g m e n t s the input c h a r a c t e r

string into words and p e r f o r m s lexical lookup. The

process of lookup is c o m p l i c a t e d s o m e w h a t by a provi- sion for s y n o n y m and phrase r e p l a c e m e n t . Words like

" c a r " and " a u t o m o b i l e " are c h a n g e d to a u t o , and

strings like "gas s t a t i o n " are f r o z e n into single lexical

units. The o u t p u t f r o m the lexical lookup is a list of

(3)

Fred J. D a m e r a u Operating Statistics for the Transformational Question A n s w e r i n g S y s t e m

TYPE NEXT QUESTION.

Where are the gas stations in t r a f f i c zone 579 ?

STRING TRANSFORMATIONS:

((AT ((WH SOME) (PLACE X11))) BE THE

((GAS_STATION =$17) X4) IN (TRAFFIC_ZONE 579) ?)

SURFACE STRUCTURES:

I. ((AT ((WH SOME) (PLACE X ] I ) ) ) BE (THE (((GAS_STATION =$17) X4) (IN (TRAFFIC_ZONE 579

2. ((AT ((WH SOME) (PLACE X11))) BE (THE ((GAS STATION =$17) X4)) (IN (TRAFFIC_ZONE 579

UNDERLY I NG STRUCTURES :

I. (BD LOCATED (THE (((GAS STATION =$17) X4) (* BD LOCATED X4 (TRAFFIC ZONE 579) BD * ) ) ) ((WH SOME) (PLACE X11)) BD)

QUERY STRUCTURES:

1. (BD ADDRESS ((WH SOME) (THING X l l ) ) (THE (((GAS STATION 553) X4)

(* BD TRAFFIC ZONE 579 X4 BD * ) ) ) BD)

LOGICAL FORM:

(setx 'Xll

' ( f o r a t l e a s t @I 'X63 (setx 'X4

'(and

( t e s t f c t '@579 '('TRAFZ X4 '@1976) '= )

( t e s t f c t '0553 '('LUC X4 '@1976) ' : ) ) )

' ( t e s t f c t X11

'('ADDRESS X63 '@1976) ' : ) ) )

ANSWER:

ADDRESS 1976

2 06300 02000 122 S LEXINGTON AV

2 06600 00100 101W POST RD

2 05500 09300 102 W POST RD 2 07100 03300 109 W POST RD 2 07100 02900 115 W POST RD

) ) ?

?)

Figure 3a. Short trace of e x a m p l e q u e s t i o n s h o w i n g m a j o r i n t e r m e - diate s t r u c t u r e s .

in its detail but still valid in main outline, is given in R o b i n s o n (1973). W i t h o u t going into great detail, one can see in Figure 3b that "gas s t a t i o n " and " t r a f f i c

z o n e " h a v e b e e n m a d e into single units. T h e node

= $ 1 7 in the entry for gas station is a m a c r o standing

for a bundle of semantic features. M a n y of the f e a -

ture n a m e s should be obvious, but, e.g., P L stands for

" p l a c e " , P A G f o r parcel aggregate, i.e., an aggregate

of s e p a r a t e parcels, and C I N S for "cardinal insertion',

i.e., can be followed by a cardinal n u m b e r .

( (where

( ((RP ((+ WH) (+ LOC))) ((NOM)

((NOUN ((+ REL) (- HU) (+ PL))) ((INDEX ( ( - CONST)))

( ( x o ) ) ) ) ) ) ) ) (are

( ((BAUX ( ( - PAST) (- SG))) ( ( B E ) ) ) ) )

(the

(((gET) ( ( T H E ) ) ) ) ) (gas_station

(((NOM)

((NOUN ( ( - HU) (- SG) (+ PL) (+ SV))) ((v)

((v) ((GAS STATION))) ((=S17)))

((INDEX ((- CONST)))) ) ) ) ) (in

(((PREP) ( ( I N ) ) ) ) ) ( t r a f f i c zone

(((NOM)

((NOUN ( ( - HU) (+ SG) (+ PL) (+ PAG) (+ GEO))) ((V ((+ OGEN) (+ POBJ)))

((TRAFFIC ZONE)) )

((INDEX ( ( - CONST) (+ CINS)))) ) ) ) )

(579

( ((VADJ ((+ ADJ) (+ CARD) (+ D3))) ( ( 5 7 9 ) ) ) ) )

(?

[image:3.612.242.528.49.527.2]

( ((PUNCT ((+ QUES))) ( ( ? ) ) ) ) ) )

Figure 3b. List of lexical trees.

T h e list of lexical trees is input to a set of string

transformations, d e s c r i b e d in Plath ( 1 9 7 4 ) . T h e s e t r a n s f o r m a t i o n s o p e r a t e o n a d j a c e n t lexical items to deal with p a t t e r n s of classifiers, ordinal n u m b e r s , s t r a n d e d prepositions, and the like. T h e e f f e c t of this p h a s e is to reduce the n u m b e r of surface parses and the a m o u n t of w o r k d o n e in the t r a n s f o r m a t i o n a l cy- cle. R e f e r r i n g to Figure 3c, n o t e that " 5 7 9 " has b e e n i n c o r p o r a t e d with " t r a f f i c z o n e " u n d e r a single node,

P R O P N O M .

T h e resulting list of trees is input to a c o n t e x t free

parser, which p r o d u c e s a set of surface trees. In the

e x a m p l e , t w o s u r f a c e t r e e s are p r o d u c e d , s h o w n in

Figures 3d and 3e. T h e trees differ in the point of

a t t a c h m e n t of the p h r a s e "in traffic z o n e 5 7 9 " . In

structure 1, it is a t t a c h e d to the N P " t h e gas s t a t i o n " , and in structure 2 it is directly u n d e r the S node.

T h e r e c o g n i z e r a t t e m p t s to find an underlying

structure f o r e a c h s u r f a c e t r e e (Plath 1973, 1976). Typically, only o n e of a set of surface trees will result in an underlying structure. In the e x a m p l e , the struc-

(4)

((AT ((WH SOME) (PLACE X I I ) ) ) BE THE

((GAS_STATION :$17) X4) IN (TRAFFIC ZONE 579) ?) ((ST)

((PP)

((NSPREP) ((AT))) ((NP)

((NOM)

((V ((+ ADJ) (+ QUANT))) ((WH))

((SOME))) ((NOM)

((NOUN ((+ SG) (- HU) (+ PL))) ((V) ((PLACE)))

((INDEX ((- CONST))) ((Xll))) ) ) ) ) ) ((BAUX ((- PAST) (- SG)))

( ( B E ) ) ) ((DET) ((THE))) ((NOM)

((NOUN ((- HU) (- SG) (+ PL) (+ SV))) ((v)

((V) ((GAS STATION))) ( ( = $ 1 7 ) ) )

((INDEX ((- CONST))) ( ( X 4 ) ) ) ) ) ((NSPREP) ((IN))) ((PROPNOM)

((NOUN ((- HU) (+ SG) (+ PL) (+ PAG) (+ GEO))) ((V ((+ OGEN) (+ POBJ)))

((TRAFFIC ZONE)) ) ((INDEX ((+ CONST)))

((VADJ ((+ ADJ) (+ CARD) (+ D3))) ( ( 5 7 9 ) ) ) ) ) )

((PUNCT ((+ QUES))) ( ( ? ) ) ) )

Figure 3c. L i s t of t r e e s a f t e r s t r i n g t r a n s f o r m a t i o n s .

ture in which the prepositional phrase was a t t a c h e d to "gas station" survives. The underlying structures are similar to those p r o p o s e d under the variant of trans-

f o r m a t i o n a l g r a m m a r called generative semantics. This

is not the place to d e f e n d that particular t h e o r y or its use on the T Q A system; suffice it to say that a considerable b o d y of g r a m m a t i c a l w o r k on English has b e e n done in a style c o m p a t i b l e with this theory. To simpli- fy, every S in the underlying structure has a predicate

and s o m e n u m b e r of n o u n p h r a s e a [ g u m e n t s . T h e

n o u n p h r a s e s m a y d o m i n a t e i m b e d d e d S's. In the

e x a m p l e , the t o p level p r e d i c a t e is L O C A T E D with

arguments of "gas s t a t i o n " and " s o m e place". The

NP for "gas station" d o m i n a t e s an S which also has a main predicate of L O C A T E D and a r g u m e n t s of " X 4 " , i.e., the s a m e index as that for "gas s t a t i o n " , and "traffic z o n e 5 7 9 " , i.e., the gas station located in traffic zone 579. Notice that the parser has supplied b o t h

I. (CAT ((WH SOME) (PLACE X l I ) ) ) BE (THE

(((GAS_STATION =$17) X4) (IN (TRAFFIC_ZONE 579)))) ?) (($1)

((PP)

((NOM)

(iV ((+ ADJ) (+ QUANT))) ((WH))

((SOME ) ((NOM)

((NOUN (+ SG) (- HU) (+ PL))) ((V) (PLACE)))

((INDEX ((- CONST))) ( ( X l i ) ) ) ) ) ) ) ) ((BAUX ((- PAST) (- SG)))

( ( B E ) ) ) ((NP)

((DETX)

((DET) ((THE)))) ((NOMX)

((NOMZ) ((NOM)

((NOUN ((- HU) (- SG) (+ PL) (+ SV))) ((v)

((V) ((GAS_STATION))) ((=$17)))

((INDEX ((- CONST)))

((x4))) ) )

((Z1) ((PP3X)

((PP)

((NSPREP) ((IN))) ((RPX)

((NP) ((PROPNOM)

((VADJ ((+ ADd) (+ CARD) (+ D3))) ( ( 5 7 9 ) ) ) ) ) ) ) ) ) ) ) ) ) ) ((PUNCT ((+ QUES)))

( ( ? ) ) ) )

Figure 3d. S u r f a c e s t r u c t u r e t r e e 1.

instances of " l o c a t e d " . A fuller e x p l a n a t i o n is given

in Plath (1973, 1976).

T h e data b a s e has no predicate, i.e., column h e a d -

ing, n a m e d " l o c a t e d " . E v e n if it did, the two L O -

(5)

Fred J. Damerau Operating Statistics for the Transformational Question Answering System

2. ((AT ((WH SOME) (PLACE X11))) BE (THE

((GAS STATION =$17) X4)) (IN (TRAFFIC ZONE 579)) ?) (($1)

((PP)

((NOM)

((V ((+ ADJ) (+ QUANT))) ((WH))

((SOME))) ((NOM)

((INDEX ((- CONST))) ( ( x 1 1 ) ) ) ) ) ) ) ) ((BAUX ((- PAST) (- SG)))

( ( B E ) ) ) ((NP)

((DETX)

((DET) ( ( T H E ) ) ) ) ((NOMX)

((NOM)

((NOUN ((- HU) (- SG) (+ PL) (+ SV))) ((v)

((V) ((GAS_STATION))) ((=S17)))

((INDEX ((- CONST))) ( ( X 4 ) ) ) ) ) ) ) ((PP4X)

((PP)

((NSPREP) ((IN))) ((RPX)

((NP) ((PROPNOM)

((TRAFFIC_ZONE))) ((INDEX ((+ CONST)))

((VADJ ((+ ADJ) (+ CARD) (+ D3))) ( ( 5 7 9 ) ) ) ) ) ) ) ) ) ) ((S2PCX)

((PUNCT ((+ QUES))) ( ( ? ) ) ) ) )

Figure 3e. Surface s t r u c t u r e tree 2.

m a y not m a t c h a particular data base organization. There are m a n y approaches one could take to make this match. We have chosen to implement the match- ing function as a separate transformational c o m p o n e n t in the g r a m m a r ( D a m e r a u 1977). The u n d e r l y i n g structure itself is t h e r e f o r e input to the t r a n s f o r m a - tional recognizer, using a (small) set of grammar rules tailored to a specific data base and produces a query structure. Q u e r y structures are similar to underlying structures in form, but reflect the particular meaning

I. (BD LOCATED (THE (((GAS_STATION :S17) X4) (* BD LOCATED X4 (TRAFFIC ZONE 579) BD * ) ) ) ((WH SOME) (PLACE X11)) BD)

(($I ((- PAST) (+ WH) (+ QUES) (+ TOP))) ((BD))

((V ((+ ADJ) (+ EN) (+ LOC) (+ TEMP) (+ PSUBJ) (+ POBJ)))

((LOCATED))) ((NP)

((DET) ((THE))) ((NOM)

((NOM)

((NOUN ((- HU) (- SG) (+ PL) (+ SV))) ((v)

((V) ((GAS STATION))) ( ( : $ 1 7 ) ) )

((INDEX ((- CONST))) ( ( X 4 ) ) ) ) ) ((Si)

((BD))

((V ((+ ADd) (+'EN) (+ LOC) (+ TEMP) (+ PSUBJ) (+ POBJ)))

((LOCATED))) ((NP)

((NOM)

((NOUN ((- HU) (- SG) (+ PL) (+ SV))) ((INDEX ((- CONST)))

( ( X 4 ) ) ) ) ) ) ((NP ((+ IN) (+ LOC)))

((NOM)

(iV ((+ ADJ) (+ CARD) (+ D3))) ( ( 5 7 9 ) ) ) ) ) ) )

BD)) ) ) ) ( NP ((+ AT) (+ LOC)))

((NOM)

((V ((+ ADJ) (+ QUANT))) ((WH))

((SOME))) ((NOM)

((INDEX ((- CONST))) ( ( X l l ) ) ) ) ) ) ) ( ( B D ) ) )

Figure 3f. U n d e r l y i n g s t r u c t u r e .

constraints resulting from the f o r m a t and c o n t e n t of a given data base. In Figure 3g, one can see that the first instance of L O C A T E D has been c h a n g e d to the

(6)

1. (BD ADDRESS ((WH SOME) (THING X l l ) ) (THE (((GAS STATION 553) X4)

(* BD TRAFFIC ZONE 579 X4 BD * ) ) ) BD)

(($I ((- PAST) (+ WH) (+ QUES) (+ TOP))) ((BD))

((V ((+ ADJ) (+ EN) (+ LOC) (+ TEMP) (+ PSUBJ) (+ POBJ))) ((ADDRESS)))

((NP ((+ AT) (+ LOC))) ((NOM)

((V ((+ ADJ) (+ QUANT))) ((WH))

((SOME))) ((NOM)

((NOUN ((+ SG) (- HU) (+ PL))) ((V) ((THING)))

((INDEX ((- CONST))) ( ( X l l ) ) ) ) ) ) ) ((NP)

((DET) ((THE))) ((NOM)

((NOM)

((NOUN ((- HU) (- SG) (+ PL) (+ SV))) (V)

((V) ((GAS_STATION))) ((LUC) ( ( 5 5 3 ) ) ) ) (INDEX ((- CONST)))

( ( X 4 ) ) ) ) ) ((s1

((BD))

((V ((+ ADJ) (+ EN) (+ LOC) (+ TEMP) (+ PSUBJ) (+ POBJ)))

((TRAFFIC ZONE)) ) ((NP ((+ IN) (+ LOC)))

((NOM)

((NOUN ((- HU) (+ SG) (+ PL) (+ PAG) (+ GEO))) ((INDEX ((+ CONST)))

((V ((+ ADJ) (+ CARD) (+ D3))) ( ( 5 7 9 ) ) ) ) ) ) )

((NP) ((NOM)

((NOUN ((- HU) (- SG) (+ PL) (+ SV))) ((INDEX ((- CONST)))

( ( x 4 ) ) ) ) ) ) ( ( B D ) ) ) ) )

((BD)))

Figure 3g. Query structure.

predicate A D D R E S S and the second instance to the predicate TRAFFIC ZONE.

The query structure tree is processed by a K n u t h - style semantic interpreter (Petrick 1977), producing a logical form. A logical form can best be thought of, in our context, as a retrieval expression which is to be

evaluated, producing an answer to the English input query. Referring to Figure 3a, the logical form can be read as:

Find the set of thiags X l l such that for at least l of the things X63 in the set of things X4 where the traffic zone of X4 is 579 and the L U C (land use code) is 0553, (viz., the set of a c c o u n t n u m b e r s having b o t h these properties), the address of X63 is X11.

The process of answer e x t r a c t i o n from the data base is accomplished by a combination of LISP and P L / ! programs ( D a m e r a u 1978), and an experimental relational data base m a n a g e m e n t system called Rela- tional Storage System (RSS) (Astrahan, et al., 1976). The RSS provides the capability to g e n e r a t e a data base of n-ary relations, with indexes on any field of the relation, and low-level access c o m m a n d s like O P E N , N E X T , C L O S E , with appropriate parameters, to retrieve information from such a data base. This particular data base had just one relation of 40 col- umns. The LISP programs examine the logical form to establish relationships b e t w e e n variables and to gener- ate requests to the data base c o m p o n e n t to find items with specified properties. In the example, one retrieval request would find the qualifying account numbers, i.e., the X4s, and a s e c o n d request would find the addresses of those a c c o u n t numbers.

Notice that the logical form simply specifies a set of addresses as the answer. This is clearly unsatisfac- tory, and the data base interface program supplies the a c c o u n t number for each address as part of the answer. The long numbers in the answer are the parcel identifiers, (ward-block-lot) referred to above as account numbers and sometimes called lot numbers.

All the processing modules are under the control of a driver module which maintains c o m m u n i c a t i o n with the user, calls the processors in the correct sequence, and tests for errors.

U s a g e S t a t i s t i c s

(7)

Table I. Number of events.

JAN FEB MAR APR MAY dUN JUL AUG SEP OCT NOV DEC TOTAL

QUERIES 45 161 180 52 45 114 87 31 32 20 13 8 788

COMPLETED 18 92 127 44 28 66 68 22 17 17 9 5 513

LOOKUP FAILURES 3 19 17 5 10 19 31 8 I 2 3 I 119

PARSING FAILURES 14 26 38 4 4 29 11 7 11 0 3 0 147

NOTHING IN DATA BASE 12 8 12 3 1 2 13 5 I I 3 0 61

ABORTED 3 9 14 2 5 9 4 2 I 2 1 I 53

USER COMMENTS 9 10 16 7 3 10 21 11 3 1 4 1 96

USER MESSAGES 0 3 0 0 0 3 4 I 0 0 0 0 11

OPERATOR MESSAGES 14 6 10 2 2 0 2 3 2 0 4 0 45

PROGRAM ERRORS I 12 9 1 6 5 0 0 1 1 0 3 39

USER CANCELLED I 5 I 1 2 5 4 0 2 0 0 0 21

LEXICAL CHOICE 6 32 11 4 10 31 6 3 1 11 3 I 119

Table II. Percentage of events to number of queries.

JAN FEB MAR APR MAY dUN JUL AUG SEP OCT NOV DEC TOTAL

COMPLETED 40.0 57.1 70.6 84.6 62.2 57.9 78.2 71.0 53.1 85.0 69.2 62.5 65.1

LOOKUP FAILURES 6.7 11.8 9.4 9.6 22.2 16.7 35.6 25.8 3.1 10.0 23.1 12.5 15.1

PARSING FAILURES 31.1 16.1 21.1 7.7 8.9 25.4 12.6 22.6 34.4 0.0 23.1 0.0 18.7

NOTHING IN DATA BASE 26.7 5.0 6.7 5.8 2.2 1.8 14.9 16.1 3.1 5.0 23.1 0.0 7.7

ABORTED 6.7 5.6 7.8 3.8 11.1 7.9 4.6 6.5 3.1 10.0 7.7 12.5 6.7

USER COMMENTS 20.0 6.2 8.9 13.5 6.7 8.8 24.1 35.5 9.4 5.0 30.8 12.5 12.2

USER MESSAGES 0.0 1.9 0.0 0.0 0.0 2.6 4.6 3.2 0.0 0.0 0.0 0.0 1.4

OPERATOR MESSAGES 31.1 3.7 5.6 3.8 4.4 0.0 2.3 9.7 6.3 0.0 30.8 0.0 5.7

PROGRAM ERRORS 2.2 7.5 5.0 1.9 13.3 4.4 0.0 0.0 3.1 5.0 0.0 37.5 4.9

USER CANCELLED 2.2 3.1 0.6 1.9 4.4 4.4 4.6 0.0 6.3 0.0 0.0 0.0 2.7

LEXICAL CHOICE 13.3 19.9 6.1 7.7 22.2 27.2 6.9 9.7 3.1 55.0 23.1 12.5 15.1

for a suggestion. If t h a t s u g g e s t i o n w o r k e d , this w o u l d inflate the p e r c e n t a g e of success s o m e w h a t .

The T Q A s y s t e m i n c o r p o r a t e s t w o l o g g i n g f a c i l i - ties. O n e of t h e m is a v e r b a t i m r e c o r d of all o u t p u t t h a t a p p e a r e d on the user terminal. The o t h e r is a m u c h m o r e c o m p r e h e n s i v e t r a c e of the s y s t e m f l o w while it p r o c e s s e d e a c h question. The p r i m a r y use of the s e c o n d t r a c e was to allow us to i s o l a t e the n a t u r e of the p r o b l e m s which a r o s e with a view to c o r r e c t i o n . This file, h o w e v e r , c o n t a i n s m u c h i n t e r e s t i n g d e t a i l a b o u t the a m o u n t of c o m p u t e r time, (as o p p o s e d to e l a p s e d t i m e ) , used in e a c h p r o c e s s i n g step, the n u m - b e r of i n t e r m e d i a t e s t r u c t u r e s c r e a t e d , i n d i c a t i o n s as to e x a c t l y which step c a u s e d a failure, a n d the like. U n f o r t u n a t e l y , the s e c o n d file was n o t o r i g i n a l l y m e a n t to be a m e n a b l e to m a c h i n e p r o c e s s i n g , a n d its f o r m a t was c h a n g e d a few times during the c o u r s e of the y e a r . In a d d i t i o n , w h e n the c o m p u t i n g s y s t e m f a i l e d , this s e c o n d file was s o m e t i m e s lost, a l t h o u g h the basic log file s e l d o m was. A s a result, the s t a t i s - tics in the t a b l e s m a y be in e r r o r b y a p e r c e n t a g e p o i n t or two. In any case, the e r r o r is not s u f f i c i e n t l y large to a f f e c t any of the m a j o r c o n c l u s i o n s , w h i c h are qual- itative at best.

T a b l e I lists the raw n u m b e r s , b y m o n t h and t o - t a l e d for the y e a r , for a n u m b e r of the e v e n t s which o c c u r in s y s t e m o p e r a t i o n . P e r c e n t a g e s for e a c h e v e n t r e l a t i v e to the n u m b e r of queries are given in T a b l e II.

Q U E R I E S r e f e r s t o u s e r i n p u t s t e r m i n a t e d b y a q u e s t i o n m a r k or e x c l a m a t i o n point. C O M P L E T E D is the n u m b e r of queries which r e s u l t e d in access to the d a t a b a s e a n d a r e s u l t i n g r e s p o n s e to the user. L O O K U P F A I L U R E S is the n u m b e r of times the user t y p e d a w o r d which was not in the l e x i c o n a n d was not c a p i t a l i z e d . Such w o r d s w e r e d i s p l a y e d b a c k , with the o p t i o n of c h a n g i n g the w o r d e n t e r e d or e n t e r i n g an e n t i r e new question. P A R S I N G F A I L U R E m e a n s t h a t the s y s t e m d i d n o t r e a c h the p o i n t of p r o d u c i n g a q u e r y s t r u c t u r e . N O T H I N G I N D A T A B A S E m e a n s t h a t a logical f o r m was g e n e r a t e d a n d a q u e r y issued to S y s t e m / R , b u t n o t h i n g s a t i s f i e d the query. This c o u l d h a p p e n f o r a v a r i e t y of r e a s o n s , i n c l u d i n g an e r r o n e o u s parse, a w r o n g logical f o r m , a m i s t a k e in the p r o g r a m w h i c h g e n e r a t e d the s e a r c h r e q u e s t , or a real case of missing d a t a . D e t a i l e d a n a l y s i s of e a c h case w o u l d be r e q u i r e d to b e sure w h a t p r o p o r t i o n fell i n t o e a c h c a t e g o r y .

T h e o t h e r c a t e g o r i e s are m o r e o r less s e l f - e x p l a n a t o r y . A B O R T E D is an i n d i c a t i o n t h a t q u e r y

(8)

processing did not reach a normal termination, usually b e c a u s e of a m a c h i n e failure, i.e., h a r d w a r e or system software, but s o m e t i m e s b e c a u s e of a p r o b l e m in the T Q A code. The U S E R C O M M E N T S line comes f r o m a feature we had included in the system in an a t t e m p t to collect o n - g o i n g user reaction to s y s t e m behavior. At the end of e a c h question, b e f o r e p r o d u c i n g the message " T Y P E N E X T Q U E S T I O N " , we displayed a

request for c o m m e n t on the preceding question. It

turned out that users found this something of a nui- sance, since mostly they were satisfied with the answer, so we made an early change to the terminal read p r o g r a m such that a reply to the p r o m p t i n g message for c o m m e n t s which ended in " ? " would be t r e a t e d as a null c o m m e n t and a n o t h e r question, in effect making the c o m m e n t optional. Student users t o o k the p r o m p t s o m e w h a t m o r e seriously than the regular e m p l o y e e s and m a d e m o r e c o m m e n t s . The two M E S S A G E cate- gories refer, to a s y s t e m facility enabling a user to send messages to the T Q A o p e r a t o r and the reverse. O n e can see that it is usually used b y the T Q A o p e r a t o r as a s o m e w h a t m o r e c o n v e n i e n t m e a n s t h a n the tele- p h o n e for advising a user a b o u t the nature of a p r o b - lem or for warning that the system was to be stopped, and the like. P R O G R A M E R R O R is an error in the T Q A p r o g r a m which was d e t e c t e d and caused a query to a b o r t but left the s y s t e m able to a c c e p t a n o t h e r input. T h e r e is also a facility, t a b u l a t e d under U S E R C A N C E L L E D , allowing a user to cancel a q u e r y at a n y time by pressing a special k e y on the t e r m i n a l k e y b o a r d . This is used w h e n the user realizes he m a d e a mistake or when processing seems to be going on too

long. T h e c a t e g o r y L E X I C A L C H O I C E indicates

h o w o f t e n the s y s t e m a s k e d the user to clarify the

m e a n i n g of a word. F o r e x a m p l e , " a r e a " can be

"parcel a r e a " or " f l o o r a r e a " and if the s y s t e m c a n n o t disambiguate by the context, the user is s h o w n the two

choices and asked to " T y p e A or B". The m o s t fre-

q u e n t ambiguities h a v e to do with duplication of names for streets, n e i g h b o r h o o d s and schools.

As can be seen in line 1 of T a b l e I, there was a drastic fall-off in usage in the latter part of the year.

This has two p r i m a r y causes. In the first place, the

planning office was being rebuilt during the period and the w o r k space was o f t e n disrupted. The m a j o r cause, as m e n t i o n e d above, was a r e - o r i e n t a t i o n of the planning d e p a r t m e n t activities b y a newly a p p o i n t e d direc-

tor. T h e d e p a r t m e n t m e m b e r s n o w spend the m a j o r

part of their time on administrative activities; planning activities, like land analyses and drafting of new z o n - ing districts, are n o w carried out b y o f f - p r e m i s e s c o n - sultants who do not have r e a d y access to the terminal. F r o m the latter p a r t of 1978 through 1979, the s y s t e m was used only i n t e r m i t t e n t l y b y the planners, w h o o c c a s i o n a l l y n e e d e d s o m e of the basic land r e c o r d

data, and b y o t h e r s , like the fire d e p a r t m e n t , w h o w a n t e d to h a v e a list of tall buildings.

T h e r e is an o b v i o u s rise in U S E R C O M M E N T S , L O O K U P F A I L U R E S , a n d P A R S I N G F A I L U R E S during June, July, and August. During this period, the planning d e p a r t m e n t h a d a n u m b e r of student interns w h o were using the system, s o m e t i m e s in a play mode. It is easy to u n d e r s t a n d that new users exploring the system would have m o r e than an ordinary n u m b e r of

parsing failures. T h e increase in lookup failures is a

little m o r e puzzling, although part of the increase has the s a m e cause, i.e., some of the questions were out- side the d o m a i n , and c o n s e q u e n t l y the words used were not in the lexicon.

T h e large p e r c e n t a g e for the N O T H I N G I N D A T A B A S E c a t e g o r y in J a n u a r y c o m e s in large part f r o m two queries which had four underlying structures, all of which resulted in the answer " N O T H I N G I N T H E D A T A B A S E . " This was caused b y a p r o b l e m in the g r a m m a r which has since b e e n corrected. Some of the o t h e r instances of this message during the year resulted f r o m an inadvertently, and u n f o r t u n a t e l y , successful

e x a m p l e of subtle user training or conditioning. We

discovered that users t e n d e d to r e s p o n d to the s y s t e m b y echoing, s o m e t i m e s e x a c t l y and s o m e t i m e s , only partially, w h a t the s y s t e m h a d p r i n t e d to t h e m , no m a t t e r if their initial phrasing had b e e n a c c e p t e d or not. Some input word sequences are t r e a t e d as phras- es by the strategy of scanning an input q u e r y against a phrase lexicon first b e f o r e looking in the regular lexi-

con. An e n t r y like " g a s s t a t i o n " c a m e out of the

phrase lookup r e p r e s e n t e d as " G A S S T A T I O N " for

p u r p o s e s of lexical lookup. T h e users would s o m e -

times input that, or s o m e variant like " G a s S t a t i o n " ,

which would be t a k e n as a p r o p e r name. T h e result

would then be i n t e r p r e t e d as a query having to do with

an o w n e r n a m e d " G a s Station". We n o w echo what

the user has typed, or some variant which will be ac- c e p t a b l e if typed. M o s t of the instances of the c a t e g o - ry N O T H I N G I N D A T A B A S E really are the r e s p o n s e to a request for i n f o r m a t i o n which is not in the data

base. M a n y of these are r e q u e s t s f o r i n f o r m a t i o n

a b o u t p e o p l e w h o are n o t o w n e r s of p r o p e r t y , or a b o u t addresses which are not legitimate.

A p a r t f r o m the drastic fall in usage at the end of the y e a r discussed above, there a p p e a r s not to be a n y trend in a n y of the other m e a s u r e s of s y s t e m o p e r a - tion. The sequence of p e a k s and valleys m a y be char- acteristic of s y s t e m s like this, or m a y simply result f r o m insufficient time for the s y s t e m to settle down.

Operating Characteristics

Cumulative statistics for the s y s t e m o p e r a t i n g characteristics are f o u n d in T a b l e s I I I and IV and Figures

4 t h r o u g h 8 for each s y s t e m c o m p o n e n t . T h e histo-

(9)

T a b l e III. Q u e r y / l o g i c a l f o r m t i m e - N o . o f s t r u c t u r e s .

1 2 3 4 5 >

TOTAL YEAR

QUERY STRUCTURE TIME (SEC) 497 53 22 16 3

LOGICAL FORM TIME (SEC) 536 24 4 2 12

NO. OF SURFACE STRUCTURES 470 146 38 9 3

NO. OF UNDERLYING STRUCTURES 581 7 0 2 0

NO. OF QUERY STRUCTURES 581 7 0 2 0

NO. OF LOGICAL FORMS 571 5 0 2 0

NO. OF ANSWERS 513 0 0 0 0

T a b l e IV. Q u e r y / l o g i c a l f o r m t i m e - N o . o f s t r u c t u r e s b y m o n t h .

1 2 3 4 5 >

MAY

JUNE

JULY

QUERY STRUCTURE TIME (SEC) 58 10 I I 0

LOGICAL FORM TIME (SEC) 69 O 0 0 0

QUERIES

200

150

1001

501 X XX XX XX XX XXX XXX XXX XXX XXX XXX XXX XXXX

XXXXX XXXXXX XXXXXXX XXXXXXXXXX

lXXXXXXXXXXXXXXXXXX XXXXX XXX X

51 101 151 201 251 301

>

[image:9.612.283.549.55.647.2]

MINUTES

Figure 4 . E l a p s e d t i m e - T o t a l .

QUERIES

2501

I

Ix

IX

and a long tail. ( T h e a p p a r e n t spike at the right-hand end of the " N u m b e r of lines in a n s w e r " is not real; t h a t point is really " a n s w e r s of 50 lines or m o r e " . T h e right-most point of all the histograms should be read in a similar way, i.e., as including the total of all values in the r e m a i n d e r of the tail.)

Tables III and I V contain t w o kinds of information. T h e first two lines under " T o t a l Y e a r " in T a b l e I I I and u n d e r each m o n t h in T a b l e IV show the n u m b e r of queries w h o s e m a c h i n e p r o c e s s i n g time required the n u m b e r of seconds shown by the column head. Thus, in T a b l e III, 4 queries t o o k 3 seconds of processing

time to g e n e r a t e a logical form. A table f o r m was

used for this i n f o r m a t i o n b e c a u s e a h i s t o g r a m f o r " t i m e to p r o d u c e a q u e r y s t r u c t u r e " and " t i m e to p r o d u c e a logical f o r m " would h a v e b e e n simply a spike. T h e o t h e r lines in the tables are counts, so that,

IX

501X

IX X

IX X X

IXXXX XXX X

IXXXX XXXXXXXXXXXXXXXXXXXXXXXX X X XX XX X XX

. . .

51 101 151 201 251 301 351 401 451 501

Figure 5. N u m b e r o f l i n e s in a n s w e r - T o t a l .

again in T a b l e III, 7 queries had 2 underlying structures.

F r o m the first two lines of T a b l e III, it c a n be seen that neither the q u e r y structure nor the logical f o r m a c c o u n t for very m u c h of the processing time, m o s t l y

(10)

QUERIES 2001

X

150 X

X X X X

100 X

XXXX XXXX XXXX XXXXXX 501XXXXXX I XXXXXXX I XXXXXXX I XXXXXXXXXX

IXXXXXXXXXXXXXXXXXX XXX XXXXX

.... ... ... ... ... ..

51 101 151 201 251 301

>

SECONDS

QUERIES I

4001

I I

IX IX

3501X IX

IX 50IX

Ix IXX IXXXX

IXXXXXXXXXXXXX x x x x xxxx x

. ... ... ... ... ... ....

51 101 151 201 251 301

>

SECONDS

Figure 8. Answer processing - Total.

Figure 6. Generation of surface structures - Total.

QUERIES I

1501

100 x x x xxxx x x x x x x x x x x xxxxx 501 xxxxx

I x x x x x x x I x x x x x x x xx IXXXXXXXXXXX

IXXXXXXXXXXXXXXXXX XXXXX XXXXXX xx x

.... .... .... ... .... .... .... .... .... .... .... .... ...

51 lOl 151 201 251 301 351 4Ol 451 501

>

SECONDS

Figure 7. Transformational parsing - Total.

less than 1 second, and Figure 8 shows that answer processing, i.e., data base lookup and printing, is also not a major time user. F r o m Figures 6 and 7 it can be seen that surface structure parsing, which includes the

time for string t r a n s f o r m a t i o n s , and t r a n s f o r m a t i o n a l processing both typically consume around 4 seconds. T h e r e f o r e , total machine time for a typical q u e r y is around 10 seconds, although extreme cases can take much longer.

F r o m Figure 4, it can be seen that elapsed time for a query is around 3 minutes, although there is again a long tail. Elapsed time depends primarily on machine load and user behavior at the terminal. The c o m p u t e r on which the system o p e r a t e d was an IBM 3 7 0 / 1 6 8 with an attached processor, 8 m e g a b y t e s of m e m o r y and extensive peripheral storage, operating under the V M / 3 7 0 time-sharing system. During business hours, there were usually in excess of 200 users logged on to the system, so any one user received only a small share of the resources. Besides queuing for the C P U and memory, this system developed queues on an I B M 3850 Mass Storage System, on which the T Q A data base was stored. With all delays a c c o u n t e d for, it could easily take several minutes of elapsed time to accumulate 10 or 15 seconds of C P U time.

[image:10.612.318.485.55.333.2]

(11)

Fred J. Damerau Operating Statistics for the Transformational Question Answering System

T h e user is, of course, not c o n c e r n e d with his o w n

delays, but only with the system delay. In order to

k e e p him i n f o r m e d of progress through the query, and give him assurance that the c o m p u t e r is still operating, a C P U time clock is displayed in the lower right corner of the screen, along with an indication of the process-

ing phase being carried out at that time. In general,

these users did not b e c o m e c o n c e r n e d with r e s p o n s e time, possibly b e c a u s e t h e y had no e x p e r i e n c e with

o t h e r interactive c o m p u t e r systems. H o w e v e r , if the

s y s t e m was slow, a user might well choose to look up a single piece of i n f o r m a t i o n in the printed listing of the d a t a base rather t h a n ask through the c o m p u t e r system.

Figure 5 shows that m o s t queries have a one line

a n s w e r , b u t t h a t the tail is v e r y long. G e n e r a l l y

speaking, users were interested in totals or averages of d a t a fields like "dwelling units" or " p a r c e l a r e a " for aggregates of parcels specified b y land use, neighbor- hood, census tract, and the like, o f t e n in c o m b i n a t i o n . Questions of this kind are not readily a n s w e r e d f r o m the printed copies of the parcel file, and specific p a t - terns do not occur o f t e n enough to m a k e it worthwhile to ask for a b a t c h p r o g r a m to be written by the City c o m p u t e r staff.

T h e yearly s u m m a r y statistics discussed a b o v e c o n - ceal the considerable variation which can be e n c o u n -

tered o v e r shorter time periods. T a b l e IV shows the

m o n t h l y statistics, c o r r e s p o n d i n g to T a b l e III for M a y ,

J u n e and the m o r e typical July. T h e p e r c e n t a g e of

queries with m o r e than one surface structure is clearly

v e r y high in M a y a n d June. This results in longer

a v e r a g e times for query structure processing and logi-

cal f o r m processing, as seen also in T a b l e IV. T h e

e f f e c t is e v e n m o r e o b v i o u s in the histograms for surface structure processing, Figures 9-11, and underlying structure processing, Figures 12-14. I n s p e c t i o n of the logs for M a y and J u n e shows that m o s t of this e f f e c t results f r o m only a few questions, e.g., " W h a t is the area of the x family houses in the y z o n e ? " , r e p e a t e d for a n u m b e r of c o m b i n a t i o n s of x and y. It h a p p e n s that o n e of the planners was checking figures to be included in a table in a large r e p o r t during this period. Sequences of questions like this are, of course, of little help in s y s t e m d e v e l o p m e n t , but do serve to build confidence in the utility of the system.

QUERIES

15

I0 X

X

XX

XXX

XXX X

XXX X X

XXXXXX XXXX X X X XX

...

51 101 151 201 251 301

>

[image:11.612.341.483.65.336.2] [image:11.612.336.483.393.608.2]

SECONDS

Figure 9. G e n e r a t i o n of s u r f a c e s t r u c t u r e s - May.

QUERIES

I0

XXX X

XXXX X

XXXXX X X

XXXXXX X X

XXXXXX XX X X

XXXXXXXXXX XXX XX

XXXXXXXXXXX XXX X XX

XXXXXXXXXXXXXXXXX XXX XXX

...

51 101 151 201 251 301

>

SECONDS

The Input Queries

Figure 1 given earlier shows a small selection of the queries put to the s y s t e m during 1978, to give the reader an idea of the kinds of questions asked b y the users. Additionally, there are four lengthy a p p e n d i c e s

which appear only in the microfiche supplement to this issue of the Journal. A p p e n d i x A p r e s e n t s the entire set of queries s u b m i t t e d to the s y s t e m during 1978, (but n o t e the caveat at the beginning regarding possi-

Figure 10. G e n e r a t i o n of s u r f a c e s t r u c t u r e s - June.

ble missing data b e c a u s e of lost 10gs). Sentences pre- ceded b y a C in A p p e n d i x A were c o m p l e t e d success- fully, b u t o n e m u s t r e m e m b e r t h a t " N O T H I N G I N T H E D A T A B A S E " was c o u n t e d as a successful answer b y the data reduction p r o g r a m e v e n t h o u g h there m a y h a v e b e e n s o m e earlier p r o b l e m in the query. A p p e n d i x B is a list of s e n t e n c e s which did n o t parse

(12)

QUERIES

I I

i I 301

I X

151 X

I X I X I X

I XX

101 XXX

I XXXX

5l XXXXX I XXXXXX I XXXXXXX

I XXXXXXXX

I XXXXXXXXX X

. . .

51 101 151 201 251 301

>

SECONDS

Figure 11. G e n e r a t i o n of surface s t r u c t u r e s - July.

X

201 X

QUERIES

25

X

X X

/

X X X X X X

X XX XXX X X XX XX

. . . .

51 101 151 201 251 301 351 401 451 501

>

SECONDS

QUERIES

15

X

X X

101 X X

I X X

I XX X X

I XXX X X

I XXX X X

51 XXX X X

I XXX X X X

I XXX XX XX X X

I XXXXXXXXXX X X XX X

JXXXXXXXXXXXX XXXX X X XXXXXX XX X

. . . .

51 101 151 201 251 301 351 401 451 501

>

SECONDS

Figure 13. T r a n s f o r m a t i o n a l p a r s i n g - June.

QUERIES

201

I i

I

151 X

I XX

I XXX

101 XXX

I XXX X I XXXXX I XXXXX

I XXXXX

51 XXXXX

I XXXXX X I XXXXX X

I XXXXXXXXXX X

I XXXXXXXXXXXX X

. ... ... ... ... ... ... ..

51 101 151 201 251 301 351 401 451 501

>

SECONDS

[image:12.612.55.198.48.415.2] [image:12.612.53.555.55.723.2] [image:12.612.325.548.63.336.2]

(13)

Fred J. D a m e r a u Operating Statistics f o r t h e Transformational Question Answering System

at their first submission, and A p p e n d i x C is a list which failed for some reason other than parsing at their first submission. Appendix D is a list of sentences for which the answer was " N O T H I N G I N T H E D A T A B A S E . " In most cases where this answer was not correct, the problem was in dealing with searches on persons' names. The treatment used presently is still not c o m p l e t e l y satisfactory, but has b e e n im- proved since the initial version. Names in the data base occur in a great variety of patterns, more than we felt worth devising procedures to handle. In the three latter appendices, those sentences which are satisfacto- rily answered by the system of M a y 14, 1979 are pre- ceded by an X. ( A n y necessary spelling corrections or ambiguity resolution were supplied.) A little over 60 percent of the failing sentences are processed correctly by this version. As can be seen, many of the remain- ing queries are so flawed that no system should be expected to answer them.

The set of successful queries is not really indicative of the full coverage of the system. There are a number of permissible c o n s t r u c t i o n s which the planners simply never used, so the subset is s o m e w h a t richer than shown. On the other hand there are large numbers of c o n s t r u c t i o n s involving p e r s o n a l p r o n o u n s , t h r e e - a r g u m e n t c o m p a r a t i v e s , q u a n t i f i c a t i o n with " e a c h " , etc., which the planners knew the system did not handle, and which they c o n s e q u e n t l y did not use.

Conclusions

The m o t i v a t i o n for publishing this p a p e r with its long appendices was to make it possible for a reader to come to some c o n c l u s i o n of his o w n regarding the utility of the T Q A system or something like it in a real environment. F o r reasons mentioned above, none of us thinks this experiment provides a definitive test for our system or even that very strong claims a b o u t percent of success can be made. F o r such purposes, a controlled experiment using a fixed system is clearly necessary, and we hope to make such a test in the near future. H o w e v e r , those of us working on the project are e n c o u r a g e d by the results summarized here, and even more by our conversations with the users of the system, who have been very positive. Within the limi- tations established by the subset of English that the system can recognize, it appears that n o n - d a t a processing professionals can extract useful i n f o r m a t i o n from their files with almost no training. A number of the gaps in our system can certainly be plugged if we can find the resources to work on them.

Our next step will be to make the T Q A system a front end to an existing formal query language system, p r o b a b l y S Q L - b a s e d System R, developed at the I B M San Jose Research L a b o r a t o r y , a version of which has been a n n o u n c e d as an I B M P r o g r a m P r o d u c t for use under the D O S / V S operating system. Such a step will

permit us to devote all of our resources to the problems of language and k n o w l e d g e r e p r e s e n t a t i o n , in- stead of spending part of that effort on data m a n a g e - ment. B e y o n d that, we will c o n s i d e r seriously the question of moving to different environments, both to new situations in the city planning domain and to c o m - pletely different domains, without having to rewrite all or even a major portion of our base system.

Acknowledgements

Besides Petrick, Plath and the author, the T Q A project currently includes Mark Pivovonsky, w h o has done the systems p r o g r a m m i n g for the project. Since this d o c u m e n t summarizes one y e a r ' s activity, it neces- sarily reflects the c o n t e n t of m a n y discussions with these individuals, and some of their opinions and in- sights on the nature of these problems are i n c o r p o r a t - ed herein. The w r o n g ones naturally are mine.

References

Astrahan, M.M.; Blasgen, M.W.; Chamberlin, D.D.; Eswaran, K.P.; Gray, J.N.; Griffiths, P.P.; King, W.F.; Lorie, R.A.; McJones, J.; Mehl, J.W.; Putzolu, G.R.; Traiger, I.L.; Wade, B.W.; Wat- son, V. (1976). System R: Relational Approach to Database Management. A C M Transactions on Database Systems, 1 (21), pp. 97-137.

Damerau, Fred J. (1977). Advantages of a Transformational Grammar for Question Answering. Proceedings o f the Fifth International Joint Conference on Artificial Intelligence, vol 1, p.

192.

Damerau, Fred J. (1978). The Derivation of Answers from Logical Forms in a Question Answering System. American Journal o f Computational Linguistics, Microfiche 75, pp. 3-42.

Krause, Juergen. (1979). Results of a User Study with the 'User Specialty Languages' System and Consequences for the Archi- tecture of Natural Language Interfaces. Technical Report 79.04.003, IBM Heidelberg Scientific Center, May 1979. Petrick, Stanley R. (1977). Semantic Interpretation in the Request

System. In Computational and Mathematical Linguistics, Pro- ceedings o f the International Conference on Computational Linguis- tics, Pisa, pp. 585-610.

Plath, Warren J. (1973). Transformational Grammar and Transfor- mational Parsing in the R E Q U E S T System. I B M Research Report R C 4396, Thomas J. Watson Research Center, Yorktown Heights, N.Y.

Plath, Warren J. (1974). String Transformations in the REQUEST System. American Journal o f Computational Linguistics, Micro- fiche 8.

Plath, Warren J. (1976). REQUEST: A Natural Language Question-Answering System. I B M Journal o f Research and Development, 20:4, (July 1976), pp. 326-335.

Robinson, Jane J. (1973). An Inverse Transformational Lexicon. In Natural Language Processing, Randall Rustin, ed., Algorithm- ics Press, Inc., New York, pp. 43-60.

Fred J. Damerau is a Research S t a f f M e m b e r at the I B M Thomas J. Watson Research Center in Yorktown

Heights, New York. He received the Ph.D. degree in

linguistics f r o m Yale University in 1966.