Text Generation

2. Text Generation

William Mann, Chairperson

Information Sciences Institute (ISI) at the University of Southern California

Marina del Rey, CA 90291

Panelists

Madeline Bates, Bolt Beranek and Newman

Barbara Grosz, SRI International

David D. McDonald, University of Massachusetts

Kathleen R. McKeown, University of Pennsylvania

William Swartout, Information Sciences Institute

2.1 Introduction

This report consists of two documents describing the state of the art of computer generation of natural language text. Both were prepared by a panel of individuals who are active in research on text generation. The first document assesses the state of the art, identifying four kinds of technical developments which will shape the art in the coming decade: linguistically justified grammars, knowledge representation methods, models of the reader, and models of discourse. The second document is a comprehensive bibliography on text generation, the first of its kind. In addition to citations of documents, it includes descriptions of ongoing research efforts.

2.2 Assessing Text Generation Technology

Our goal here is to assess the state of the art of text generation for two purposes: to help people who intend to apply that art in the near future and to aid in the design or selection of appropriate research.

This assessment covers all of the technical methods by which computer programs create and present English text in their outputs. (For simplicity we always call the output language English.) Because text generation has not always been taken seriously from a technical point of view, it has been actively pursued only recently as a topic in artificial intelligence. As a result of this late start, much of the technology available for application today is still rather superficial. However, text generation is now such an active research topic that this superficial technology will soon be surpassed. (The last part of this report contains an extensive bibliography on the subject.)

2.3 What Techniques Are Now Available for Use in System Designs?

Two kinds of practical text generation techniques are already in general use and fairly well understood. The first is displaying previously prepared text (or canned text), and the second is producing text by direct translation of knowledge structures.

The simplest and most commonly used way to have a computer system produce text is for the implementers of the system to figure out in advance what sorts of English output will be required and then store it as text strings. The computer merely displays the text that has been stored. (For example, almost all error messages are produced in this way.) It is relatively easy to have a program produce English in this way, and the text can be complex and elegantly written if desired. Unfortunately, because the text strings can be changed independently of any knowledge structures the program might use, there is no guarantee of consistency between what the program does and what it says it does. Another problem with canned text is that all questions and answers must be anticipated in advance; for large systems, that may prove to be impossible. Finally, since one text string looks like any other as far as the computer is concerned, the computer program cannot easily have a conceptual model of what it is saying. This means that one should not expect to see much closure: satisfying 100 needs for text will not make the second 100 much easier.
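The canned-text approach and two of its weaknesses can be illustrated with a minimal Python sketch. Everything in it is hypothetical and chosen only for illustration: the message keys, the stored strings, and the `retry_limit` function stand in for whatever a real system would contain.

```python
# Hypothetical canned-text table: every message is written in advance
# by the implementers and stored as an opaque string.
CANNED_TEXT = {
    "disk_full": "The disk is full. Delete some files and try again.",
    "no_such_file": "The file you named does not exist.",
    # This string once described retry_limit() but was never updated.
    "retry_help": "The system retries a failed transfer 3 times.",
}


def say(message_key: str) -> str:
    """Return the stored string; the program cannot inspect what it means."""
    return CANNED_TEXT[message_key]


def retry_limit() -> int:
    """Actual program behavior, which changes independently of the text."""
    return 5


if __name__ == "__main__":
    print(say("retry_help"))  # what the program says it does
    print(retry_limit())      # what the program actually does
```

Nothing ties the string under `retry_help` to the value returned by `retry_limit`, so the two can drift apart unnoticed, and a request for any key that was not anticipated simply fails with a `KeyError`.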

that it is readily understood. Finally, systems employing this technique typically have had very little linguistic knowledge, so they have produced text that is verbose, stilted, and redundant, although readable.

Practical, near-term applications of text generation will share certain characteristics:

1. They require short texts: one to three sentences.

2. They have well-elaborated program data structures corresponding in fairly simple ways to the desired texts.

3. The important knowledge can be represented well with present techniques; it does not involve the difficulties listed in section 2.6.

4. Limited fluency of output is acceptable.

Some so-called "expert systems" that explain their reasoning in English have these characteristics.
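As a rough illustration of producing a short text by direct translation of a knowledge structure, the following Python sketch renders a rule as one English sentence. The `Rule` structure, its field names, and the sentence pattern are assumptions made for this example; they are not taken from any particular expert system.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Rule:
    """Hypothetical rule structure of the kind an expert system might hold."""
    name: str
    premises: Tuple[str, ...]
    conclusion: str


def explain(rule: Rule) -> str:
    """Translate the rule structure directly into one English sentence."""
    if not rule.premises:
        return f"{rule.name} always concludes that {rule.conclusion}."
    conditions = " and ".join(rule.premises)
    return f"{rule.name} concludes that {rule.conclusion} because {conditions}."


if __name__ == "__main__":
    rule = Rule(
        name="Rule 12",
        premises=("the culture is from blood", "the stain is gram-negative"),
        conclusion="the organism may be E. coli",
    )
    print(explain(rule))
```

Because the sentence is built from the same structure the program reasons with, the text stays consistent with the program's behavior. The fixed pattern carries almost no linguistic knowledge, however, so longer outputs assembled this way would likely read as stilted and repetitive.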

We believe that text-producing systems of the future will continue to include processes that produce text by translating knowledge structures. However, they will be integrated with other processes that use extensive linguistic knowledge, a discourse model, a model of the reader, and enhanced knowledge representations.

Because of the limited capabilities of present techniques, a new project aiming to produce a benchmark application program in the text generation area would currently be counterproductive, since it would produce little or no transferable technology and would detract from the community's ability to make progress on the general problem.

2.4 Basic Components for a Text Generation Facility

How can the very limited capabilities now available be developed into fluent, powerful text generation methods that are easily applied to new tasks? The next few sections describe the kinds of methods that are needed and are being developed.

The underlying model presumed here, which present research is moving toward, has the following characteristics:

1. Responsibility for text generation is in a text generation module rather than being scattered at the points of use.

2. A major portion of the text generation module is portable and is developed cumulatively through many systems. The portable components include a grammar that encodes general knowledge of English and processes that handle linguistic, task-independent information.

We feel strongly that a competent text generation facility must have the following four identifiable components, and that limitations on these will be limitations on the overall state of the art for the foreseeable future:

1. A comprehensive, linguistically justified grammar.

2. A knowledge-representation formalism that can encode diverse kinds of information.

3. A model of the intended reader of the text.

4. A model of discourse structure and control.

Each of these draws on existing noncomputational precedents, and each requires some special adaptation to the text generation task.

Below we describe each of these basic components in a form that it might achieve in five to ten years of research. (These descriptions are followed by a projection of the practical alternatives available to system designers five years hence.)

2.5 Linguistically Justified Grammars for Text Generation

Grammars are ordinarily developed by linguists over periods of ten to twenty or more years, in departments of linguistics. The best ones may be written by a single individual, but they reflect the ideas of dozens or hundreds of people who have contributed to refining particular forms.

Present practice in linguistics emphasizes carefully reasoned development of small fragments of grammars. Hence comprehensive, linguistically justified grammars, the sort we need, are very rare.

Several linguistic traditions (some associated with computation and some not) are particularly likely to produce suitably refined, comprehensive grammars for text generators. They are:

1. The systemic tradition, founded by Michael A. K. Halliday around 1961.

2. The transformational tradition, decisively articulated by Noam Chomsky starting in 1957.

3. The Generalized Phrase Structure Grammar tradition, currently associated with Gerald Gazdar.

4. The ATN tradition, begun by Bill Woods and now being developed by him and many others.

5. The LSP tradition, developed so far mainly under the direction of Naomi Sager.

6. The Diamond (or Diagram) grammar by Jane Robinson.

2.6 Knowledge Representation Formalisms

Text generation programs cannot improve much on the knowledge they are given. The notation for knowledge must already contain appropriate abstractions in an easily accessible form. Today's notations are relatively good at representing logical formulas and deductive necessities, and also hierarchies of objects. Coverage is particularly weak for these other kinds of knowledge:

1. Time

2. Space

3. Events and actions

4. Cause

5. Collectives

6. Likelihood

7. Obligation

8. Possibility

9. Negation

10. Quantification

11. Continuity and discreteness

2.7 Models of the Reader

Text prepared without considering the reader is uniformly awful. Programs must have explicit models of the reader that encode (or make available) at least the following four kinds of information:

1. What is obvious, including common factual knowledge and certain "obvious" inferable information. Obviousness does not agree with logical validity.

2. What has already been told, and what is obvious from that.

3. What others believe, including mutual beliefs and beliefs about the writer's belief.

4. What is currently in the reader's attention.

Beyond these, the program should be able to reason about belief and intent.
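A minimal Python sketch of an explicit reader model follows. It covers only the first two kinds of information listed above (what is obvious and what has already been told) and treats facts as plain strings. The class name, method names, and example facts are hypothetical, and the sketch makes no attempt at inference, beliefs of others, or attention.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class ReaderModel:
    """Hypothetical reader model: facts are opaque strings, no inference."""
    obvious: Set[str] = field(default_factory=set)  # common knowledge
    told: Set[str] = field(default_factory=set)     # already conveyed

    def needs_saying(self, fact: str) -> bool:
        """A fact is worth stating only if it is neither obvious nor told."""
        return fact not in self.obvious and fact not in self.told

    def record(self, fact: str) -> None:
        """Remember that the fact has now been conveyed to the reader."""
        self.told.add(fact)


def select_facts(facts: List[str], reader: ReaderModel) -> List[str]:
    """Keep, in order, the facts the reader still needs, updating the model."""
    chosen = []
    for fact in facts:
        if reader.needs_saying(fact):
            chosen.append(fact)
            reader.record(fact)
    return chosen


if __name__ == "__main__":
    reader = ReaderModel(obvious={"files are stored on disk"})
    facts = ["files are stored on disk", "the disk is full", "the disk is full"]
    print(select_facts(facts, reader))
```

Even this crude filter removes restatements of common knowledge and repeated facts. A serious model would also need to derive what becomes obvious from what has been told, which simple set membership cannot do.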

2.8 Models of Discourse

This is another linguistic matter, distinct from grammar. Running text has subtle interactions between its parts. When we generate multisentential text, we need a set of principles for organizing it. A few linguists and philosophers are making important contributions, but far more work is needed. Again, to develop effective models of discourse, research projects will need to have linguists on the staff.

An adequate discourse model will include some representation of at least the following:

1. The structures that can be built out of sentences and larger units.

2. The needs of the writer that each discourse structure meets.

3. The principal effects that the use of each structure produces.

4. The effects of various discourse structures on the reader's attention.

2.9 Relating the Basic Components to Text Generation

How are these basic components related to the whole task? Why are they necessary, and how does their quality affect what can be done?

2.9.1 The Text Quality Limitations of Grammars

In order to generate text that is not awkward or misleading, one must be able to control a wide variety of language effects at the sentence level. Effects will be present in the mind of the reader in any case, and so the program must either control them or take serious risks of misunderstanding and error. The effects are produced by the arrangements of words used, and so a theory of the arrangements of words is needed in order to achieve control. Theories of the arrangements of words are (or include) grammars. The ability of a text generation system to express many different ideas well will be limited to the different effects controllable through its grammar.

Use of an ad hoc grammar limits the generator to expressing a narrow range of ideas. It may do well in a short, carefully planned demonstration, but it will be too narrow for many practical purposes.

We can think of the grammar as a bottleneck or filter at the output of the text generator. Only those expressive techniques that the system can control through its grammar will be used.

2.9.2 The Knowledge Representation

The knowledge representation frameworks in a program limit the range of things that the program can usefully operate upon. Since a text generator must create text out of some knowledge representation, it is likewise limited.

Limitations on knowledge representations include two important kinds:

1. Presence of abstractions: are the concepts that must be conveyed in text actually symbolized?

2. Ease of access: is there a fast, uniform method for finding the symbols that represent particular concepts?

We can think of these limits as a bottleneck or filter at the input of the text generator. Only those concepts that pass through the filter will appear in the text.
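The two bottlenecks described in sections 2.9.1 and 2.9.2 can be pictured as a pair of filters around the generator. The Python sketch below is only a schematic illustration under strong simplifying assumptions: concepts are named by strings, the knowledge representation is a dictionary from concept names to phrases, and the grammar is reduced to the set of concept kinds it can express. All names and data are hypothetical.

```python
from typing import Dict, List, Set


def surviving_content(
    wanted: List[str],
    knowledge: Dict[str, str],
    expressible: Set[str],
) -> List[str]:
    """Return the phrases that pass both the input and the output filter."""
    # Input filter: a concept must actually be symbolized in the knowledge base.
    symbolized = [concept for concept in wanted if concept in knowledge]
    # Output filter: the grammar must be able to express that kind of concept.
    return [knowledge[concept] for concept in symbolized if concept in expressible]


if __name__ == "__main__":
    # Hypothetical knowledge base with no representation of cause.
    knowledge = {
        "agent": "the operator",
        "action": "restarted the printer",
        "time": "at noon",
    }
    # Hypothetical ad hoc grammar with no way to express time.
    expressible = {"agent", "action"}
    wanted = ["agent", "action", "time", "cause"]
    print(surviving_content(wanted, knowledge, expressible))
```

In this toy run the cause is lost at the input because it was never symbolized, and the time is lost at the output because the grammar cannot express it, so only the agent and action phrases reach the text.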

2.9.3 Models of the Reader or System User

may be useless. (With canned text, this problem is usually avoided because the writer knows a great deal about the reader's knowledge.) To take the reader's knowledge into account requires an explicit model of that knowledge.

Of the four kinds of knowledge previously identified in section 2.7, the most critical for basic text quality are the knowledge of what is obvious and the knowledge of what has already been conveyed.

2.9.4 Models of Discourse

We know that single sentences are too limited to express some things. Moving to multisentential text necessarily creates discourse, which involves many kinds of effects that programs cannot yet control. For example, putting one sentence after another can be used to express time sequence, deductive necessity, cause, exemplification or other relationships, without any words being used to express the relation.

Creating these effects when they are desired, and avoiding them when they are not, requires explicit models of discourse phenomena. At a higher level, sequences of sentences and paragraphs of a text must be organized in a principled way. This also requires explicit discourse models. Until such models are developed, texts will be awkward and misunderstanding will be common.

2.10 Designing in 1986 for Practical Text Generation

What sort of practical application of text generation will be possible in five years? We expect the designer to be in the following situation:

• There will be several examinable systems with developed methods for creating the four basic components. For each kind of component, there will be some attractive precedents for future work. No one system will have a thoroughly elaborated approach to all four.

• System design based on adaptation of these precedents will be possible. The design work will involve creating "handcrafted" systems that embody and reconcile the good techniques. It will require the personal attention of computer scientists, linguists, and programmers who have been involved in the prior research.

• The resulting system can be expected to create acceptable, effective texts, limited by quality considerations to be about one page in length.

For the message-system problem used as a focal problem for the workshop, there were two tasks identified for text generation: a task of reporting system status and a task of reporting how and to whom particular messages would be relevant for an identified collection of people. For both of these tasks it seems feasible for design of a practical text generation module to begin in five years. However, it is questionable whether adequate techniques would be available to determine what message relevance to report.

2.11 Present Research Status

The most influential research in the next few years will be focused on the four basic components: linguistically justified grammars, knowledge representation formalisms, models of readers, and models of discourse structure and control. Part of the effort will go into developing these components individually, part into learning how to combine them.

Appropriately, most of the current effort is going into either developing single components or combining two of them. Although there are several institutions and individuals working on all four of these components, no one has yet demonstrated a system in which all four approach the scopes of action indicated for them above.

Many important topics are being neglected for lack of research support. (There is no lack of interested people; natural language processing continually generates high interest in the AI community. We are not sure whether there is a shortage of interested qualified people.)

More information on the state of the art and current activity can be found in the bibliography on text generation, the last part of this report, which includes a section on research in progress.

2.12 Text Generation Bibliography

This bibliography was prepared in connection with the authors' report on the state of the art in text generation. It includes published works on generation of natural language text by computer programs as well as some prior noncomputational work that has been used as a basis for such computer programs. It is not exhaustive in any sense, and no evaluation is implied by the presence or absence of a citation of any particular publication.

Allen, J. 1978 Recognizing Intention in Dialogue. Ph.D. thesis. University of Toronto.

Appelt, D.E. 1980a A planner for reasoning about knowledge and action. In Proceedings of the First Annual National Conference on Artificial Intelligence. Stanford University (August).

Appelt, D.E. 1980b Problem solving applied to language generation. In Proceedings of the Eighteenth Annual Meeting of the Association for Computational Linguistics.

Appelt, D.E. 1981 Planning natural language utterances to satisfy multiple goals. Forthcoming Ph.D. thesis. Stanford University.

Badler, N.I. 1975 The conceptual description of physical activities. In Proceedings of the Thirteenth Annual Meeting of the Association for Computational Linguistics.

Bates, M. 1980 Language instruction without pre-stored examples. In Proceedings of the Third Canadian Symposium on Instructional Technology (February).

Bates, M. and Ingria, R. 1981a Controlled transformational sentence generation. In The Nineteenth Annual Meeting of the Association for Computational Linguistics. Stanford University (June).

Bates, M. et al. 1981b Generative tutorial systems. In Proceedings of the 1981 ADCIS Conference (March).

Bates, M. et al. 1981c ILIAD Final Report. Bolt Beranek and Newman, Inc., Technical Report 4771 (September).

Berry, M. 1975 Introduction to Systemic Linguistics: Structures and Systems. B.T. Batsford, Ltd., London.

Berry, M. 1977 Introduction to Systemic Linguistics: Levels and Links. B.T. Batsford, Ltd., London.

Birnbaum, L., Flowers, M., and McGuire, R. 1980 Towards an A.I. model of argumentation. In Proceedings of the First National Conference on Artificial Intelligence. Stanford University (August).

Boguraev, B.K. 1979 Automatic Resolution of Linguistic Ambiguities. Computational Laboratory, Cambridge University, England, Technical Report 11 (August).

Bossie, S. 1981 A Tactical Component for Text Generation: Sentence Generation Using a Functional Grammar. University of Pennsylvania, Technical Report MS-CIS-81-5.

Brown, G.P. 1974 Some problems in German to English Machine Translation. Massachusetts Institute of Technology, Technical Report 142 (December). Project MAC.

Brown, R.H. 1974 Use of multiple-body interrupts in discourse generation. Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, Bachelors Degree thesis.

Bruce, B.C. 1975 Generation as a social action. In Proceedings of Theoretical Issues in Natural Language Processing - I (TINLAP). Cambridge, MA (June) 64-67.

Bruce, B.C., Collins, A., Rubin, A.D., and Gentner, D. 1978 A Cognitive Science Approach to Writing. Bolt Beranek and Newman, Inc., Technical Report 89 (June).

Carbonell, J.R. and Collins, A.M. 1973 Natural semantics in artificial intelligence. In Proceedings of the Third International Joint Conference on Artificial Intelligence. Stanford, CA: 344-351.

Carr, B. and Goldstein, I. 1977 Overlays: A Theory of Modeling for Computer Aided Instruction. Massachusetts Institute of Technology, Artificial Intelligence Laboratory, Memo 406 (February).

Chafe, W.L. 1977 Creativity and verbalization and its implications for the nature of stored knowledge. In Freedle, R.O., Ed., Discourse Processes: Advances in Research and Theory. Volume 1: Discourse Production and Comprehension. Ablex, NJ: 41-55.

Chafe, W.L. 1979 The flow of thought and the flow of language. In Givon, T., Ed., Syntax and Semantics. Volume 12: Discourse and Syntax. Academic Press, NY.

Chester, D. 1976 The translation of formal proofs into English. Artificial Intelligence 7 (3) Fall.

Clancey, W.J. 1978a Tutoring Rules for Guiding a Case Method Dialogue. Stanford University, Department of Computer Science, Heuristic Programming Project, Technical Report HPP-78-25 (December). Also International Journal of Man-Machine Studies 11 (1979) 25-49.

Clancey, W.J. 1978b An Antibiotic Therapy Selector Which Provides for Explanation. Stanford University, Technical Report HPP-78-26 (December).

Clancey, W.J. 1979 Dialogue management for rule-based tutorials. In Proceedings of the Sixth International Joint Conference on Artificial Intelligence. Tokyo (August) 155-161.

Clippinger, J.H. 1974 A Discourse Speaking Program as a Preliminary Theory of Discourse Behavior and a Limited Theory of Psychoanalytic Discourse. Ph.D. thesis, University of Pennsylvania.

Clippinger, J.H. 1975 Speaking with many tongues: Some problems in modelling speakers of actual discourse. In Proceedings of Theoretical Issues in Natural Language Processing - I (TINLAP). Cambridge, MA (June) 68-73.

Codd, E.F. et al. 1978 Rendezvous Version 1: An Experimental English-Language Query Formulation System for Casual Users of Relational Databases. IBM Research Laboratory, San Jose, CA. Technical Report RJ2144.

Cohen, P.R. and Perrault, C.R. 1977 Overview of planning speech acts. In Proceedings of the Fifth International Joint Conference on Artificial Intelligence. Cambridge, MA (August).

Cohen, P.R. 1978 On Knowing What to Say: Planning Speech Acts. University of Toronto, Technical Report 118.

Cohen, P.R. and Perrault, C.R. 1979 Elements of a plan-based theory of speech acts. Cognitive Science 3.

Collins, A.M., Passafiume, J., Gould, L., and Carbonell, J.G. 1973 Improving Interactive Capabilities in Computer-Assisted Instruction. Bolt Beranek and Newman, Inc., Technical Report 2631 (August).

Cullingford, R.E., Krueger, M.W., Selfridge, M., and Bienkowsky, M.A. 1981 Automated explanations as a component of a computer-aided design system. To appear in IEEE Transactions on Systems, Man, and Cybernetics.

Danes, F., Ed. 1974 Papers on Functional Sentence Perspective. Academia, Publishing House of the Czechoslovak Academy of Sciences.

Davey, A. 1979 Discourse Production. Edinburgh University Press, Edinburgh.

de Beaugrande, Robert 1980 Advances in Discourse Processes. Volume IV: Text, Discourse, and Process: Toward a Multidisciplinary Science of Texts. Ablex, Norwood, NJ.

de Joia, A. and Stenton, A. 1980 Terms in Systemic Linguistics. Batsford Academic and Educational, Ltd., London.

Dehn, N. 1981a Memory in story invention. In Proceedings of the Third Annual Conference of the Cognitive Science Society. University of California, Berkeley (August).

Dehn, N. 1981b Story generation after TALE-SPIN. In Proceedings of the Seventh International Joint Conference on Artificial Intelligence. University of British Columbia (August).

Fawcett, R.P. 1980 Exeter Linguistic Studies. Volume 3: Cognitive Linguistics and Social Interaction. Julius Groos Verlag Heidelberg and Exeter University.

Fillmore, C.J. 1976 The case for Case reopened. In Cole, P. and Sadock, J.M., Eds., Syntax and Semantics. Volume 8: Grammatical Relations. Academic Press, NY.

Forbus, K. and Stevens, A. 1981 Using Qualitative Simulation to Generate Explanations. Bolt Beranek and Newman, Inc., Technical Report 4490 (March). Also Cognitive Science 3.

Friedman, J. 1969 Directed random generation of sentences. Communications of the ACM 12 (6).

Gabriel, R.P. 1980 An Organization for Programs in Fluid Domains. Ph.D. thesis, Stanford University, 1980.

Goldman, N.M. 1974 Computer Generation of Natural Language from a Deep Conceptual Base. Ph.D. thesis, Stanford University. Stanford Artificial Intelligence Laboratory Memo AIM-247.

Goldman, N.M. 1975a The boundaries of language generation. In Proceedings of Theoretical Issues in Natural Language Processing - I (TINLAP). Cambridge, MA (June) 74-78.

Goldman, N.M. 1975b Conceptual generation. In Schank, R.C., Ed., Conceptual Information Processing. North-Holland, Amsterdam.

Artificial Intelligence Laboratory, Cambridge, MA, memo 495 (October).

Grimes, J.E. 1975 The Thread of Discourse. Mouton, The Hague.

Grishman, R. 1979 Response generation in question-answering systems. In Proceedings of the Seventeenth Annual Meeting of the Association for Computational Linguistics (August) 99-101.

Grosz, B.J. 1979 Utterance and objective: Issues in natural language communication. In Proceedings of the Sixth International Joint Conference on Artificial Intelligence.

Grosz, B.J. 1980 Focusing and description in natural language dialogs. In Joshi, A. et al., Eds., Elements of Discourse Understanding: Proceedings of a Workshop on Computational Aspects of Linguistic Structure and Discourse Setting. Cambridge University Press, Cambridge.

Habel, C., Schmit, A., and Schweppe, H. 1977 On Automatic Paraphrasing of Natural Language Expressions. Technische Universiteit, Berlin, Technical Report 3 / / 1 7. Semantic Network Project.

Halliday, M.A.K. 1961 Categories of the theory of grammar. Word 17.

Halliday, M.A.K. 1967a Notes on transitivity and theme in English. Journal of Linguistics 3 (1) 37-81.

Halliday, M.A.K. 1967b Notes on transitivity and theme in English. Journal of Linguistics 3 (2) 199-244.

Halliday, M.A.K. 1968 Notes on transitivity and theme in English. Journal of Linguistics 4 (2) 179-215.

Halliday, M.A.K. 1970 Language structure and language function. In Lyons, J., Ed., New Horizons in Linguistics. Penguin.

Halliday, M.A.K. 1976 System and Function in Language. Oxford University Press, London.

Halliday, M.A.K. 1978 Language as Social Semiotic. University Park Press, Baltimore, MD.

Halliday, M.A.K. and Hasan, R. 1976 Cohesion in English. Longman, London. English Language Series, Title No. 9.

Heidorn, G. 1972 Natural Language Inputs to a Simulation Programming System. Naval Postgraduate School, Monterey, CA, Technical Report NPS-55HD72101A.

Heidorn, G. 1975 Augmented phrase structure grammar. In Proceedings of Theoretical Issues in Natural Language Processing - I (TINLAP). Cambridge, MA (June) 1-5.

Herskovits, A. 1973 The Generation of French from Semantic Structure. Stanford Artificial Intelligence Laboratory, Technical Report 212.

Hobbs, J. and Evans, D. 1980 Conversation as planned behavior. Cognitive Science 4, 349-377.

Hudson, R. A. 1971 North-Holland Linguistic Series. Volume 4: English Complex Sentences. North-Holland, London and Amsterdam.

Hudson, R. A. 1976 Arguments for a Non-Transformational Grammar. University of Chicago Press, Chicago, 1976.

Hutchins, W.J. 1971 The Generation of Syntactic Structures from a Syntactic Base. North-Holland, Amsterdam.

Kay, M. 1979 Functional grammar. In Proceedings of the Fifth Annual Meeting of the Berkeley Linguistic Society.

Kempen, G. 1977 Building a psychologically plausible sentence generator. Presented at the Conference on Empirical and Methodological Foundations of Semantic Theories for Natural Language, Nijmegen, The Netherlands.

Kempen, G. and Hoenkamp, E. 1979 A Procedural Grammar for Sentence Production. University of Nijmegen, Department of Psychology, The Netherlands, Technical Report, 1979.

Klein, S. 1965 Automatic paraphrasing in essay format. Mechanical Translation 8 (3).

Klein, S. 1975 Meta-compiling text grammars as a model for human behavior. In Proceedings of Theoretical Issues in Natural Language Processing - I (TINLAP). Cambridge, MA (June) 94-98.

Knaus, R. 1975 Incremental sentence processing. American Journal of Computational Linguistics Fiche 33.

Kripke, S. 1977 Speaker reference and semantic reference. In French, P.A. et al., Eds., Contemporary Perspectives in the Philosophy of Language. University of Minnesota Press, Minneapolis.

Levin, J.A., and Goldman, N.M. 1978 Process Models o f Reference in Context. USC/lnformation Sciences Institute, RR-78-72. Levy, D.M. 1979a Communicative goals and strategies: Between

discourse and syntax. In Givon, T., Ed., Syntax and Semantics. Volume 12: Discourse and Syntax. Academic Press, New York. Levy, D.M. 1979b The Architecture o f the Text. Ph.D. thesis,

Stanford University, Department of Computer Science. Linde C. and Labov, W. 1975 Spatial networks as a site for the

study of language and thought. Language 50 (IV) 924-939. Mann, W.C. and Moore, J.A. 1980 Computer as Author - Results

and Prospects. USC/Information Sciences Institute, RR-79-82. Mann, W.C. and Moore, J.A. 1981a Computer generation of

multiparagraph English text. American Journal o f Computational Linguistics 7 (1) January - March.

Mann, W.C. 1981b Two discourse generators. In The Nineteenth Annual Meeting o f the Association for Computational Linguistics. Sperry Univac.

Matthiessen, C.M.I.M. 1981 A grammar and a lexicon for a text-production system. In The Nineteenth Annual Meeting of the Association for Computational Linguistics. Sperry Univac.

McCoy, K.F. 1981 Automatic Enhancement of a Database Knowledge Representation for Natural Language Generation. University of Pennsylvania, Technical Report MS-CIS-81-6.

McDonald, D.D. 1975a A preliminary report on a program for generating natural language. In Proceedings of the Third International Joint Conference on Artificial Intelligence. Tbilisi, USSR (August).

McDonald, D.D. 1975b A framework for writing generation grammars for interactive computer programs. American Journal of Computational Linguistics Fiche 33.

McDonald, D.D. 1977 Language generation: The linguistics component (short note). In Proceedings of the Fifth International Joint Conference on Artificial Intelligence. Cambridge, MA (August).

McDonald, D.D. 1978 Subsequent references: Syntactic and rhetorical constraints. In Theoretical Issues in Natural Language Processing - 2 (TINLAP). ACM, New York.

McDonald, D.D. 1980a Natural Language Production as a Process of Decision-Making Under Constraints. Ph.D. thesis, Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science. To appear as an MIT Artificial Intelligence Laboratory technical report.

McDonald, D.D. 1980b The role of discourse structure in language production. In The Proceedings of the Third Biannual Meeting of the SCSIO/SCEIO.

McDonald, D.D. 1981 Language production: The source of the dictionary. In The Nineteenth Annual Meeting of the Association for Computational Linguistics. Stanford University (June).

McKeown, K.R. 1979 Paraphrasing using given and new information in a question-answer system. Master's thesis, University of Pennsylvania, Philadelphia. Number MS-CIS-80-13. Also in Proceedings of the Seventeenth Annual Meeting of the Association for Computational Linguistics (August) 67-72.

McKeown, K.R. 1980a Generating Descriptions and Explanations: Applications to Questions about Database Structure. University of Pennsylvania, Technical Report MS-CIS-80-9.

Proceedings of The First Annual National Conference on Artificial Intelligence. Stanford, CA (August) 306-309.

McKeown, K.R. 1981 Generating Natural Language: Deciding What to Say Next. University of Pennsylvania, Technical Report MS-CIS-81-1.

Meehan, J.R. 1975 Using planning structures to generate stories. American Journal of Computational Linguistics Fiche 33.

Meehan, J.R. 1977 TALE-SPIN, an interactive program that writes stories. In Proceedings of the Fifth International Joint Conference on Artificial Intelligence (August).

Moore, R. 1980 Reasoning about Knowledge and Action. SRI International, Artificial Intelligence Center, Technical Note 191.

Perrault, C.R. and Cohen, C.R. 1978 Planning Speech Acts. University of Toronto, Department of Computer Science, Technical Report.

Sacerdoti, E. 1977 A Structure for Plans and Behavior. Elsevier, North-Holland, Amsterdam.

Schank, R., Goldman, N., and Reiger, C. 1975 Inference and paraphrase by computer. Journal of the ACM 22 (3) July, 309-328.

Schlesinger, I.M. 1977 Production and Comprehension of Utterances. Lawrence Erlbaum Associates.

Shapiro, S.C. 1975 Generation as parsing from a network into a linear string. American Journal of Computational Linguistics Fiche 33.

Shapiro, S.C. 1979 Generalized augmented transition network grammars for generation from semantic networks. In Proceedings of the Seventeenth Meeting of the Association for Computational Linguistics (August) 25-29.

Simmons, R. and Slocum, J. 1972 Generating English discourse from semantic networks. Communications of the ACM 15 (10) October, 891-905.

Sleeman, D.J. and Hendley, R.J. 1979 ACE: A system which analyses complex explanations. International Journal of Man-Machine Studies 11 125-144.

Slocum, J. 1973 Question Answering via Canonical Verbs and Semantic Models: Generating English from the Model. University of Texas, Department of Computer Sciences, Austin, Technical Report NL 13.

Slocum, J. 1975 Speech generation from semantic nets. American Journal of Computational Linguistics Fiche 33.

Stevens, A. and C. Steinberg 1981 A Typology of Explanations and its Application to Intelligent Computer Aided Instruction. Bolt Beranek and Newman, Inc., Technical Report 4626 (March).

Swartout, W.R. 1977 A Digitalis Therapy Advisor with Explanations. Massachusetts Institute of Technology, Laboratory for Computer Science, Technical Report (February).

Swartout, W.R. 1981a Producing Explanations and Justifications of Expert Consulting Programs. Massachusetts Institute of Technology, Technical Report MIT/LCS/TR-251 (January).

Swartout, W.R. 1981b Explaining and justifying expert consulting programs. In Proceedings of the Seventh International Joint Conference on Artificial Intelligence. University of British Columbia (August).

Thompson, H.S. 1977 Strategy and tactics: A model for language production. In Papers from the Thirteenth Regional Meeting. Chicago Linguistic Society.

Thompson, H.S. 1980 Stress and Salience in English: Theory and Practice. Xerox Palo Alto Research Center, Technical Report CSL-80-8 (May).

Waltz, D.L. 1978 An English language question answering system for a large relational database. Communications of the ACM 21 (7) July.

Weiner, J.L. 1980 BLAH, a system which explains its reasoning. Artificial Intelligence 15 (November) 19-48.

Winograd, T. 1972 Understanding Natural Language. Academic Press, Edinburgh.

Wong, H.K.T. 1975 Generating English Sentences from Semantic Structures. University of Toronto, Department of Computer Science, Technical Report 84.

Yngve, V.H.A. 1960 A model and a hypothesis for language structure. In Proceedings of the American Philosophical Society, 444-466.

Yngve, V.H.A. 1962 Random generation of English sentences. In The 1961 Conference on Machine Translation of Languages and Applied Language Analysis. Her Majesty's Stationery Office, London.

2.13 Research in Progress

This section describes research in text generation either currently in progress or recently completed but not yet described comprehensively in any publication. Like the set of references, it is not exhaustive.

Barbara Grosz and Doug Appelt (SRI International)

Barbara Grosz and Doug Appelt are developing a problem-solving approach to the design of text, extending prior work by Allen, Cohen, and Perrault. A hierarchical planning system called KAMP (Knowledge and Modalities Planner) is being developed, capable of planning actions that affect another agent's knowledge and wants. It includes critic processes that examine the plan globally for interactions between the effects of actions and propose modifications to the plan that will enable the utterance being planned to realize multiple illocutionary acts. KAMP's knowledge representation is based on Moore's possible worlds semantics approach to reasoning about knowledge and action.

David McDonald

(University of Massachusetts, Amherst)

David McDonald is the author of MUMBLE, a system that performs utterance construction, grammatical smoothing, and maintenance of linguistic constraints for natural language generation by expert programs. MUMBLE is available to interested researchers in the common dialect of LISP machine LISP and NIL. The author is currently extending MUMBLE's grammatical power so that it plans word selection in describing visual scenes and also plans the use of certain connectives such as "but," "also," and "thus."

William Mann and Christian Matthiessen

(Information Sciences Institute)

William Mann, Christian Matthiessen, and others are developing the Penman system to explore the problems of creating a portable text generation facility useful in multiple knowledge domains. Penman will seek to deliver knowledge (in English) from inside a system that was not designed to have a text generation component.

grammar of English has been implemented and is being fitted with semantic parts.

The knowledge representation, which resembles Brachman's early KL-ONE, is being used for both the subject matter of Penman's generation and the text plans by which Penman generates text. The emphasis of the research is on providing fluent English output from an easily controlled source.

Kathleen McKeown

(University of Pennsylvania)

Research is being completed on a text generation system that embodies computational solutions to the questions of what to say next and how to organize it effectively. Two mechanisms are used to handle these problems: (1) rhetorical techniques for communication, encoded as schemas, guide the generation process, and (2) a focusing mechanism helps maintain discourse coherence. Schemas define aspects of discourse structure and are associated with explicit discourse purposes. The focusing mechanism aids in the organization of the message by constraining the selection of what to talk about next to that which ties in with the previous discourse in an appropriate way. This work is being done within the framework of a natural language interface to a database system; the completed system will generate responses to questions about database structure.
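As an illustration only, the interplay of a schema and a focus constraint can be sketched in a few lines of Python. This is not McKeown's implementation: the names `Proposition`, `plan`, the predicate labels, and the sample facts are all hypothetical, and the focus rule used here (prefer a proposition whose topic was mentioned in the previous one) is a deliberately simplified assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Proposition:
    predicate: str       # rhetorical role this fact can play (hypothetical labels)
    topic: str           # entity the proposition is about
    mentions: frozenset  # other entities it brings into the discourse
    text: str


def plan(schema, pool, initial_focus):
    """Fill each schema slot in order, preferring propositions tied to the current focus."""
    focus = {initial_focus}
    remaining = list(pool)
    chosen = []
    for slot in schema:
        candidates = [p for p in remaining if p.predicate == slot]
        if not candidates:
            continue  # in this sketch an unfillable slot is simply skipped
        # Simplified focus constraint: prefer a topic already in focus.
        in_focus = [p for p in candidates if p.topic in focus]
        pick = (in_focus or candidates)[0]
        chosen.append(pick)
        remaining.remove(pick)
        # The next focus is whatever the chosen proposition talked about.
        focus = {pick.topic} | set(pick.mentions)
    return chosen


# Hypothetical schema and fact pool, invented for this example.
schema = ["identification", "attributive", "example"]
pool = [
    Proposition("attributive", "aircraft", frozenset(), "Aircraft have a maximum altitude."),
    Proposition("identification", "ship", frozenset({"vehicle"}), "A ship is a water-going vehicle."),
    Proposition("attributive", "ship", frozenset({"draft"}), "Ships have a draft and a displacement."),
    Proposition("example", "ship", frozenset(), "An aircraft carrier is a kind of ship."),
]
ordered = plan(schema, pool, "ship")
```

In this toy run the schema fixes the order of rhetorical roles, while the focus constraint is what rules out the attributive fact about aircraft in favor of the one about ships.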

Steven Bossie, Kathleen McCoy

(University of Pennsylvania)

Two systems are being developed at the University of Pennsylvania in conjunction with McKeown's text generation system. One, developed by Kathleen McCoy, automatically enhances a metalevel description of a database for use by McKeown's text generation system. This system generates subclasses of classes in a given generalization hierarchy. It uses information in the database and a set of axioms to create the subclasses and select salient information describing the subclass divisions.

Steven Bossie is developing a system that will take the ordered message created by McKeown's text generation system and translate it into English. Bossie's system uses a functional grammar, based on a formalism defined by Kay 1979, which will allow for the direct encoding of focus constraints in the grammar. Thus, eventually, the system will use the focusing information provided by McKeown's system to select syntactic constructions.

Rod McGuire

(Yale)

Working toward his Ph.D. dissertation, Rod McGuire is developing a model of knowledge representation in human memory to account for observed constraints on the content of oral text. In this model, sentences are generated without building syntactic structures. In multisentential text, coherence arises directly from the form of representation in memory and from memory representation traversal algorithms, using a homogeneous representation to cover syntactic structure, rhetorical structure, and text plans.

Madeline Bates, Robert Ingria, and Kirk Wilson

(BBN and Boston University)

The ILIAD system is an intelligent CAI system being developed by Madeline Bates, Robert Ingria, Kirk Wilson (of Learning Tools, Inc., Brookline, Mass.) and others to give instruction and practice in English. The emphasis is not on teaching grammar, but the system needs to have a deep understanding of the syntactic relationships in the sentences used in examples and exercises. For this reason, the heart of the ILIAD system is a sentence generator that is based on the paradigm of transformational grammar.