Proceedings of the
Second Conference
on
Empirical Methods in
Natural Language Processing
Sponsored by
The Association for Computational Linguistics
and
SIGDAT, a Special Interest Group of the ACL
Held in cooperation with AAAI-97
Edited by
Claire Cardie
and
Ralph Weischedel
August 1-2, 1997
Brown University
Order additional copies from:
ACL
P.O. Box 6090
Somerset, NJ 08875
SPONSORS:
The Association for Computational Linguistics (ACL)
SIGDAT (ACL's SIG for Linguistic Data and Corpus-based Approaches to NLP)
INVITED SPEAKERS:
Tom Mitchell
Michael Mauldin
ORGANIZERS:
Claire Cardie, Chair
Ralph Weischedel, Co-chair
LOCAL ARRANGEMENTS CHAIR:
Eugene Charniak
PROGRAM COMMITTEE:
Ted Briscoe (Cambridge University)
Rebecca Bruce (Southern Methodist University)
Michael Collins (University of Pennsylvania)
Bruce Croft (University of Massachusetts, Amherst)
Carl de Marcken (Massachusetts Institute of Technology)
Joshua Goodman (Harvard University)
Eduard Hovy (USC Information Sciences Institute)
Nancy Ide (Vassar College)
K.L. Kwok (Queens College, CUNY)
John Lafferty (Carnegie Mellon University)
Steve Maiorano (ORD)
Kemal Oflazer (Bilkent University)
Philip Resnik (University of Maryland)
Dekai Wu (Hong Kong University of Science and Technology)
David Yarowsky (Johns Hopkins University)
FURTHER INFORMATION:
Claire Cardie
Department of Computer Science
Cornell University
4142 Upson Hall
Ithaca, NY 14850 USA
e-mail: cardie@cs.cornell.edu
Ralph Weischedel
BBN Systems and Technologies
70 Fawcett Street
Cambridge, MA 02138
USA
e-mail: weischedel@bbn.com
CONFERENCE PROGRAM
Friday, August 1
8:00 - 9:00 Registration and Continental Breakfast
9:00 - 9:15 Welcome
9:15 - 9:40 Adwait Ratnaparkhi
    A Linear Observed Time Statistical Parser Based on Maximum Entropy Models
9:40 - 10:05 Joshua Goodman
    Global Thresholding and Multiple-Pass Parsing
10:05 - 10:30 Carolyn Penstein Rosé and Alon Lavie
    An Efficient Distribution of Labor in a Two Stage Robust Interpretation Process
10:30 - 11:00 Break (held jointly with the Uncertainty in AI Conference)
11:00 - 11:25 Doug Beeferman, Adam Berger, and John Lafferty
    Text Segmentation Using Exponential Models
11:25 - 11:50 Korin Richmond, Andrew Smith, and Einat Amitay
    Detecting Subject Boundaries Within Text: A Language Independent Statistical Approach
11:50 - 12:15 Ido Dagan, Yael Karov, and Dan Roth
    Mistake-Driven Learning in Text Categorization
12:15 - 1:45 CATERED LUNCH
1:45 - 2:45 INVITED TALK, Tom Mitchell
    Machine Learning and Extracting Information from the Web
2:45 - 3:00 Break
3:00 - 3:25 Thorsten Brants, Wojciech Skut, and Brigitte Krenn
    Tagging Grammatical Functions
3:25 - 3:50 Jo Calder
    On Aligning Trees
3:50 - 4:15 Break
4:15 - 4:40 Lawrence Saul and Fernando Pereira
    Aggregate and Mixed-order Markov Models for Statistical Language Processing
4:40 - 5:05 Erika F. de Lima
    Assigning Grammatical Relations with a Back-off Model
5:05 - 5:30 I. Dan Melamed
    Automatic Discovery of Non-Compositional Compounds in Parallel Data
CONFERENCE PROGRAM
Saturday, August 2
8:00 - 8:30 Registration and Continental Breakfast
8:30 - 8:55 Scott W. Bennett, Chinatsu Aone, and Craig Lovell
    Learning to Tag Multilingual Texts Through Observation
8:55 - 9:20 Ellen Riloff and Jessica Shepherd
    A Corpus-Based Approach for Building Semantic Lexicons
9:20 - 9:45 Roberto Basili, Gianluca De Rossi, and Maria Teresa Pazienza
    Inducing Terminology for Lexical Acquisition
9:45 - 10:15 Break
10:15 - 10:40 Paul Thompson and Christopher C. Dozier
    Name Searching and Information Retrieval
10:40 - 11:05 K.L. Kwok
    Lexicon Effects on Chinese Information Retrieval
11:05 - 11:25 Break
11:25 - 11:50 Paola Merlo, Matthew W. Crocker, and Cathy Berthouzoz
    Attaching Multiple Prepositional Phrases: Generalized Backed-off Estimation
11:50 - 12:15 Eric V. Siegel
    Learning Methods for Combining Linguistic Indicators to Classify Verbs
12:15 - 12:30 SIGDAT Business Meeting
12:30 - 1:45 CATERED LUNCH
1:45 - 2:45 INVITED TALK, Michael Mauldin
    The Lycos System: Practical Information Retrieval
2:45 - 3:00 Break
3:00 - 3:25 Andrew Kehler
    Probabilistic Coreference in Information Extraction
3:25 - 3:50 Janyce Wiebe, Tom O'Hara, Kenneth McKeever, and Thorsten Öhrström-Sandgren
    An Empirical Approach to Temporal Reference Resolution
3:50 - 4:15 Break
4:15 - 4:40 Ji Donghong and Huang Changning
    Word Sense Disambiguation Based on Structured Semantic Space
4:40 - 5:05 Ted Pedersen and Rebecca Bruce
    Distinguishing Word Senses in Untagged Text
5:05 - 5:30 Hwee Tou Ng
    Exemplar-Based Word Sense Disambiguation: Some Recent Improvements
TABLE OF CONTENTS
A Linear Observed Time Statistical Parser Based on Maximum Entropy Models
Adwait Ratnaparkhi . . . 1
Global Thresholding and Multiple-Pass Parsing
Joshua Goodman . . . 11
An Efficient Distribution of Labor in a Two Stage Robust Interpretation Process
Carolyn Penstein Rosé and Alon Lavie . . . 26
Text Segmentation Using Exponential Models
Doug Beeferman, Adam Berger, and John Lafferty . . . 35
Detecting Subject Boundaries Within Text: A Language Independent Statistical Approach
Korin Richmond, Andrew Smith, and Einat Amitay . . . 47
Mistake-Driven Learning in Text Categorization
Ido Dagan, Yael Karov, and Dan Roth . . . 55
Tagging Grammatical Functions
Thorsten Brants, Wojciech Skut, and Brigitte Krenn . . . 64
On Aligning Trees
Jo Calder . . . 75
Aggregate and Mixed-order Markov Models for Statistical Language Processing
Lawrence Saul and Fernando Pereira . . . 81
Assigning Grammatical Relations with a Back-off Model
Erika F. de Lima . . . 90
Automatic Discovery of Non-Compositional Compounds in Parallel Data
I. Dan Melamed . . . 97
Learning to Tag Multilingual Texts Through Observation
Scott W. Bennett, Chinatsu Aone, and Craig Lovell . . . 109
A Corpus-Based Approach for Building Semantic Lexicons
Ellen Riloff and Jessica Shepherd . . . 117
Inducing Terminology for Lexical Acquisition
Roberto Basili, Gianluca De Rossi, and Maria Teresa Pazienza . . . 125
Name Searching and Information Retrieval
Paul Thompson and Christopher C. Dozier . . . 134
Lexicon Effects on Chinese Information Retrieval
K.L. Kwok . . . 141
Attaching Multiple Prepositional Phrases: Generalized Backed-off Estimation
Paola Merlo, Matthew W. Crocker, and Cathy Berthouzoz . . . 149
Learning Methods for Combining Linguistic Indicators to Classify Verbs
Eric V. Siegel . . . 156
Probabilistic Coreference in Information Extraction
Andrew Kehler . . . 163
An Empirical Approach to Temporal Reference Resolution
Janyce Wiebe, Tom O'Hara, Kenneth McKeever, and Thorsten Öhrström-Sandgren . . . 174
Word Sense Disambiguation Based on Structured Semantic Space
Ji Donghong and Huang Changning . . . 187
Distinguishing Word Senses in Untagged Text
Ted Pedersen and Rebecca Bruce . . . 197
Exemplar-Based Word Sense Disambiguation: Some Recent Improvements
Hwee Tou Ng . . . 208
AUTHOR INDEX
Einat Amitay . . . 47
Chinatsu Aone . . . 109
Roberto Basili . . . 125
Doug Beeferman . . . 35
Scott W. Bennett . . . 109
Adam Berger . . . 35
Cathy Berthouzoz . . . 149
Thorsten Brants . . . 64
Rebecca Bruce . . . 197
Jo Calder . . . 75
Huang Changning . . . 187
Matthew W. Crocker . . . 149
Ido Dagan . . . 55
Erika F. de Lima . . . 90
Gianluca De Rossi . . . 125
Ji Donghong . . . 187
Christopher C. Dozier . . . 134
Joshua Goodman . . . 11
Yael Karov . . . 55
Andrew Kehler . . . 163
Brigitte Krenn . . . 64
K.L. Kwok . . . 141
John Lafferty . . . 35
Alon Lavie . . . 26
Craig Lovell . . . 109
Kenneth McKeever . . . 174
I. Dan Melamed . . . 97
Paola Merlo . . . 149
Hwee Tou Ng . . . 208
Tom O'Hara . . . 174
Thorsten Öhrström-Sandgren . . . 174
Maria Teresa Pazienza . . . 125
Ted Pedersen . . . 197
Fernando Pereira . . . 81
Adwait Ratnaparkhi . . . 1
Korin Richmond . . . 47
Ellen Riloff . . . 117
Carolyn Penstein Rosé . . . 26
Dan Roth . . . 55
Lawrence Saul . . . 81
Jessica Shepherd . . . 117
Eric V. Siegel . . . 156
Wojciech Skut . . . 64
Andrew Smith . . . 47
Paul Thompson . . . 134
Janyce Wiebe . . . 174