FSMNLP'98
Proceedings of the
International Workshop on
Finite State Methods in
Natural Language Processing
Edited by
Lauri Karttunen
Kemal Oflazer
June 30-July 1, 1998
Bilkent University
Published by Bilkent University.
ISBN 975-7679-34-8
Copyright © 1998 by Bilkent University
All rights reserved. No part of this publication may be reproduced, stored in a retrieval system or transmitted in any form or by any means, electronic, mechanical, photocopying, recording or otherwise for commercial purposes, without the written permission of the Publisher, Bilkent University Faculty of Engineering, TR-06533 Bilkent, Ankara, Turkey.
No responsibility is assumed by the Publisher for any injury and/or damage to persons or property as a matter of products liability, negligence or otherwise, or from any use or operation of any methods, products, instructions or ideas contained in the material herein.
FSMNLP'98
INTERNATIONAL WORKSHOP
ON
FINITE STATE METHODS
IN
NATURAL LANGUAGE PROCESSING
June 30-July 1, 1998
Bilkent University
Ankara, Turkey
Supported By
EACL - European Chapter of the Association for Computational Linguistics
TÜBİTAK - Turkish Scientific and Technical Research Council
NATO Science for Stability Programme TU-LANGUAGE Project
PROGRAMME AND ORGANIZING COMMITTEE

CO-CHAIRS

Lauri Karttunen        Xerox Research Center Europe
Kemal Oflazer          Bilkent University

Kenneth R. Beesley     Xerox Research Center Europe
Eva Ejerhed            Umeå University
Ronald M. Kaplan       Xerox PARC
George Kiraz           Bell Laboratories
Kimmo Koskenniemi      University of Helsinki
Cláudio L. Lucchesi    University of Campinas
Mark-Jan Nederhof      DFKI
Emmanuel Roche         Teragram Corp.
Gertjan van Noord      University of Groningen

Eric Brill             Johns Hopkins University
Jerry Hobbs            SRI International
Martin Kay             Xerox PARC
András Kornai          Bolt Beranek Newman
Tomasz Kowaltowski     University of Campinas
Mehryar Mohri          AT&T Labs Research
Richard Sproat         Bell Laboratories
Yves Schabes           Teragram Corp.
Atro Voutilainen       University of Helsinki
PREFACE
Recent years have seen a substantial increase in the use of finite-state techniques in many aspects of natural language processing, as mature tools for building large-scale finite-state systems have become available from various research laboratories and universities. This trend was by no means foreseen as recently as ten years ago, given the well-known demonstration by Noam Chomsky in 1957 that finite-state methods are inherently incapable of representing the full richness of constructions in a natural language. Nevertheless, it is now evident that many subsets of natural language are adequately covered by finite-state means, and that in many other areas finite-state approximations of more powerful formalisms are of great practical benefit. The discovery that systems of phonological rewrite rules and optimality constraints are within the finite-state domain has been an important theoretical leap forward. We expect that similar discoveries are yet to be made in other areas of linguistics.
FSMNLP'98, the International Workshop on Finite State Methods in Natural Language Processing, was conceived, with support and motivation from EACL, as a forum to bring together recent contributions in all aspects of the theory and applications of finite-state machinery in language processing. As you may have already observed by reviewing the workshop programme and by looking at the papers to be presented, the mix of contributions is international and shows widespread interest in this technology. The papers range over a variety of topics of great interest to the NLP community.
We thank the members of the Programme Committee who, with their prompt and dedicated reviews of the submissions, enabled us to select the contributions for these proceedings. We thank Bilkent University for providing the facilities and contributing to the logistics of the workshop. We also thank the NATO Science for Stability Programme (Phase III), which let us use funds from the TU-LANGUAGE Project, and TÜBİTAK, the Turkish Scientific and Technical Research Council, for additional financial support. Last but not least, we thank Professor Bülent Özgüç, Dean of the Faculty of Art, Design and Architecture, for his help in designing the cover and in the production of the proceedings.
We hope you enjoy the workshop.
Lauri Karttunen and Kemal Oflazer
WORKSHOP PROGRAMME
and
TABLE OF CONTENTS

TUESDAY, June 30, 1998

Page No.

• 9:45 - 10:00 Opening Remarks

• 10:00 - 11:00 PLENARY TALK: The Proper Treatment of Optimality in Computational Phonology ... (1)
  Lauri Karttunen, Xerox Research Centre Europe, France

• 11:00 - 11:30 BREAK

• 11:30 - 12:00 Context-free Parsing Through Regular Approximation ... (13)
  Mark-Jan Nederhof, DFKI, Germany

• 12:00 - 12:30 Does Tagging Help Parsing? A Case Study on Finite State Parsing ... (25)
  Atro Voutilainen, University of Helsinki, Finland

• 12:30 - 14:00 LUNCH

• 14:00 - 14:30 Robust Parsing Using a Hidden Markov Model ... (37)
  Wide R. Hogenhout, Canon Research Centre Europe, United Kingdom
  Yuji Matsumoto, Nara Institute of Technology, Japan

• 14:30 - 15:00 Incremental Construction of Minimal Acyclic Finite State Automata and Transducers ... (48)
  Jan Daciuk, University of Gdansk, Poland
  Bruce W. Watson, Ribbit Software, Canada and University of Pretoria, South Africa
  Richard E. Watson, Ribbit Software, Canada

• 15:00 - 15:30 Treatment of ε-Moves in Subset Construction ... (57)
  Gertjan van Noord, University of Groningen, The Netherlands

• 15:30 - 16:00 BREAK

• 16:00 - 16:30 Learning Finite State Models for Language Understanding ... (69)
  David Picó and Enrique Vidal, Polytechnic University of Valencia, Spain

• 16:30 - 17:00 A Multilingual Natural Language Interface to Regular Expressions ... (79)
  Aarne Ranta, Xerox Research Centre Europe, France
WEDNESDAY, July 1, 1998

• 10:00 - 10:30 Implementing Voting Constraints with Finite State Transducers ... (91)
  Kemal Oflazer and Gökhan Tür, Bilkent University, Turkey

• 10:30 - 11:00 Feature Structures, Unification and Finite State Transducers ... (101)
  Rémi Zajac, New Mexico State University, USA

• 11:00 - 11:30 Using Genericity to Create Customizable Finite State Tools ... (110)
  Sandro Pedrazzini and Marcus Hoffman, University of Basel and IDSIA, Switzerland

• 11:30 - 12:00 Constraining Separated Morphotactic Dependencies in Finite State Grammars ... (118)
  Kenneth R. Beesley, XRCE, France