CoNLL97
Computational Natural Language
Learning
Proceedings of the 1997 Meeting of the ACL
Special Interest Group in Natural Language Learning
Editor: T. Mark Ellison
Madrid, July 11, 1997
© 1997, Association for Computational Linguistics
Order additional copies from:
ACL
P.O. Box 6090
Somerset, NJ, 08875 USA
+1-908-873-3898
PREFACE
The field of computational natural language learning (NLL) is not a new one; research in it has been pursued for more than forty years. The last seven years, however, have seen a growth in interest and, correspondingly, in meetings addressing this topic. These have been held under the auspices of: COLING (The Unfinished Language, 90), DARPA (90/91), AAAI (MLNLO/CNLP, 91/93), IJCAI (NLL, 91), ECML (Machine Learning and Text Analysis, 93), the European Networks of Excellence ELSNET and MLNET (MLNLS, 94), and ESSLLI (96).
This year, however, is the first time that the ACL's special interest group in natural language learning has organised a meeting in conjunction with an (E)ACL conference. The papers contained in this volume are those which have been accepted to the conference.
In this meeting, we have attempted to cover as broad a range of topics within the field as possible. The range extends from message understanding, through word categorization, ambiguity resolution, learner modelling and text segmentation, to the application of neural networks for learning speech and phonology.
The combination of this vibrant field with the occasion of a joint EACL/ACL meeting makes the studies collected in this volume an exciting and stimulating representation of the field.
I would like to take this opportunity to thank my fellows on the program committee, and the other reviewers who contributed to this workshop.
T. Mark Ellison,
University of Edinburgh
Program Committee
Walter Daelemans
Computational Linguistics, Tilburg University PO Box 90153, 5000 LE Tilburg, The Netherlands
walter@kub.nl
T. Mark Ellison
Centre for Cognitive Science, University of Edinburgh, 2 Buccleuch Place, Edinburgh EH8 9LW, Scotland
marke@cogsci.ed.ac.uk
David Powers
Artificial Intelligence Laboratory Department of Computer Science The Flinders University of South Australia
powers@acm.org
The program committee would like to thank the following further reviewers:
Steve Abney, Joe Allen, Aude Billard, Alan Black, Michael Brent, Chris Brew, Michael Collins, James Cussens, Gert Durieux, Steve Finch,
Steven Gillis, John Goldsmith, Julia Hockenmaier, Jim Hurford, Mehmet Kayaalp, Laurence Malloy, Andrei Mikheev, Jeffrey M. Siskind, Jakub Zavrel
CoNLL97
Proceedings of the 1997 Meeting of the ACL Special Interest
Group in Natural Language Learning
Amit Bagga and Joyce Yue Chai
A Trainable Message Understanding System . . . 1-8
Mary Elaine Califf and Raymond J. Mooney
Relational Learning of Pattern-Match Rules for Information Extraction . . . 9-15
Wide R. Hogenhout and Yuji Matsumoto
A Preliminary Study of Word Clustering Based on Syntactic Behavior . . . 16-24
Ji Donghong, He Jun and Huang Changning
Learning New Compositions from Given Ones . . . 25-32
Mehmet Kayaalp, Ted Pedersen and Rebecca Bruce
A Statistical Decision Making Method: A Case Study on Prepositional Phrase Attachment . . . 33-42
Emin Erkan Korkmaz and Göktürk Üçoluk
A Method for Improving Automatic Word Categorization . . . 43-49
Montse Maritxalar, Arantza Diaz de Ilarraza and Maite Oronoz
From Psycholinguistic Modelling of Interlanguage in Second Language Acquisition to a Computational Model . . . 50-59
Laura Mayfield Tomokiyo and Klaus Ries
What makes a word: Learning base units in Japanese for speech recognition . . . 60-69
Ramin Charles Nakisa and Kim Plunkett
Evolution of a Rapidly Learned Representation for Speech . . . 70-79
Miles Osborne and Ted Briscoe
Learning Stochastic Categorial Grammars . . . 80-87
David M. W. Powers
Learning and Application of Differential Grammars . . . 88-96
Jennifer Rodd
Recurrent Neural-Network Learning of Phonological Regularities in Turkish . . . 97-106
Khalil Sima'an
Explanation-Based Learning of Data-Oriented Parsing . . . 107-116
Christoph Tillmann and Hermann Ney
Word Triggers and the EM Algorithm . . . 117-124
Werner Winiwarter and Yahiko Kambayashi
A Comparative Study of the Application of Different Learning Techniques to Natural Language Interfaces . . . 125-135
Jakub Zavrel, Walter Daelemans and Jorn Veenstra
Resolving PP attachment Ambiguities with Memory-Based Learning
Timetable
09:00-09:15 Opening
09:15-09:40 Montse Maritxalar, Arantza Diaz de Ilarraza and Maite Oronoz
From Psycholinguistic Modelling of Interlanguage in Second Language Acquisition to a Computational Model
09:40-10:05 Werner Winiwarter and Yahiko Kambayashi
A Comparative Study of the Application of Different Learning Techniques to Natural Language Interfaces
10:05-10:30 Christoph Tillmann and Hermann Ney
Word Triggers and the EM Algorithm
10:30-10:50 MORNING BREAK
10:50-11:15 Mehmet Kayaalp, Ted Pedersen and Rebecca Bruce
A Statistical Decision Making Method: A Case Study on Prepositional Phrase Attachment
11:15-11:40 Jakub Zavrel, Walter Daelemans and Jorn Veenstra
Resolving PP attachment Ambiguities with Memory-Based Learning
11:40-12:05 Wide R. Hogenhout and Yuji Matsumoto
A Preliminary Study of Word Clustering Based on Syntactic Behavior
12:05-12:30 Miles Osborne and Ted Briscoe
Learning Stochastic Categorial Grammars
12:30-12:55 Khalil Sima'an
Explanation-Based Learning of Data-Oriented Parsing
12:55-14:30 LUNCH
14:30-15:15 SIGNLL Meeting
15:15-15:40 Jennifer Rodd
Recurrent Neural-Network Learning of Phonological Regularities in Turkish
15:40-16:05 Ramin Charles Nakisa and Kim Plunkett
Evolution of a Rapidly Learned Representation for Speech
16:05-16:30 Emin Erkan Korkmaz and Göktürk Üçoluk
A Method for Improving Automatic Word Categorization
16:30-16:50 AFTERNOON BREAK
16:50-17:15 Amit Bagga and Joyce Yue Chai
A Trainable Message Understanding System
17:15-17:40 Mary Elaine Califf and Raymond J. Mooney
Relational Learning of Pattern-Match Rules for Information Extraction
17:40-18:05 Ji Donghong, He Jun and Huang Changning
Learning New Compositions from Given Ones
18:05-18:30 Laura Mayfield Tomokiyo and Klaus Ries
What makes a word: Learning base units in Japanese for speech recognition
18:30-18:55 David M. W. Powers
Learning and Application of Differential Grammars
18:55-19:00 Closing