Introduction to the CoNLL-2000 Shared Task: Chunking

In: Proceedings of CoNLL-2000 and LLL-2000, pages 127-132, Lisbon, Portugal, 2000.

Erik F. Tjong Kim Sang
CNTS - Language Technology Group, University of Antwerp
erikt@uia.ua.ac.be

Sabine Buchholz
ILK, Computational Linguistics, Tilburg University
s.buchholz@kub.nl

Abstract

We describe the CoNLL-2000 shared task: dividing text into syntactically related non-overlapping groups of words, so-called text chunking. We give background information on the data sets, present a general overview of the systems that have taken part in the shared task and briefly discuss their performance.

1 Introduction

Text chunking is a useful preprocessing step for parsing. There has been a large interest in recognizing non-overlapping noun phrases (Ramshaw and Marcus (1995) and follow-up papers) but relatively little has been written about identifying phrases of other syntactic categories. The CoNLL-2000 shared task attempts to fill this gap.

2 Task Description

Text chunking consists of dividing a text into phrases in such a way that syntactically related words become members of the same phrase. These phrases are non-overlapping, which means that one word can only be a member of one chunk. Here is an example sentence:

[NP He ] [VP reckons ] [NP the current account deficit ] [VP will narrow ] [PP to ] [NP only £ 1.8 billion ] [PP in ] [NP September ] .

Chunks have been represented as groups of words between square brackets. A tag next to the open bracket denotes the type of the chunk.
As far as we know, there are no annotated corpora available which contain specific information about dividing sentences into chunks of words of arbitrary types. We have chosen to work with a corpus with parse information, the Wall Street Journal (WSJ) part of the Penn Treebank II corpus (Marcus et al., 1993), and to extract chunk information from the parse trees in this corpus. We will give a global description of the various chunk types in the next section.

3 Chunk Types

The chunk types are based on the syntactic category part (i.e. without function tag) of the bracket label in the Treebank (cf. Bies (1995) p.35). Roughly, a chunk contains everything to the left of and including the syntactic head of the constituent of the same name. Some Treebank constituents do not have related chunks. The head of S (simple declarative clause) for example is normally thought to be the verb, but as the verb is already part of the VP chunk, no S chunk exists in our example sentence.

Besides the head, a chunk also contains premodifiers (like determiners and adjectives in NPs), but no postmodifiers or arguments. This is why the PP chunk only contains the preposition, and not the argument NP, and the SBAR chunk consists of only the complementizer.

There are several difficulties when converting trees into chunks. In the most simple case, a chunk is just a syntactic constituent without any further embedded constituents, like the NPs in our examples. In some cases, the chunk contains only what is left after other chunks have been removed from the constituent, cf. "(VP loves (NP Mary))" above, or ADJPs and PPs below. We will discuss some special cases during the following description of the individual chunk types.

3.1 NP

Our NP chunks are very similar to the ones of Ramshaw and Marcus (1995). Specifically, possessive NP constructions are split in front of the possessive marker (e.g.
[NP Eastern Airlines ] [NP ' creditors ]) and the handling of coordinated NPs follows the Treebank annotators. However, as Ramshaw and Marcus do not describe the details of their conversion algorithm, results may differ in difficult cases, e.g. involving NAC and NX.¹

An ADJP constituent inside an NP constituent becomes part of the NP chunk:

(NP The (ADJP most volatile) form) → [NP the most volatile form ]

3.2 VP

In the Treebank, verb phrases are highly embedded; see e.g. the following sentence which contains four VP constituents. Following Ramshaw and Marcus' V-type chunks, this sentence will only contain one VP chunk:

((S (NP-SBJ-3 Mr. Icahn) (VP may not (VP want (S (NP-SBJ *-3) (VP to (VP sell ...))))) . ))

[NP Mr. Icahn ] [VP may not want to sell ] ...

It is still possible however to have one VP chunk directly follow another: [NP The impression ] [NP I ] [VP have got ] [VP is ] [NP they ] [VP 'd love to do ] [PRT away ] [PP with ] [NP it ]. In this case the two VP constituents did not overlap in the Treebank.

Adverbs/adverbial phrases become part of the VP chunk (as long as they are in front of the main verb):

(VP could (ADVP very well) (VP show ...)) → [VP could very well show ] ...

In contrast to Ramshaw and Marcus (1995), predicative adjectives of the verb are not part of the VP chunk, e.g. in "[NP they ] [VP are ] [ADJP unhappy ]".
In inverted sentences, the auxiliary verb is not part of any verb phrase in the Treebank. Consequently it does not belong to any VP chunk:

((S (SINV (CONJP Not only) does (NP-SBJ-1 your product) (VP have (S (NP-SBJ *-1) (VP to (VP be (ADJP-PRD excellent)))))) , but ...

[CONJP Not only ] does [NP your product ] [VP have to be ] [ADJP excellent ] , but ...

¹E.g. (NP-SBJ (NP Robin Leigh-Pemberton) , (NP (NAC Bank (PP of (NP England))) governor) ,) which we convert to [NP Robin Leigh-Pemberton ] , Bank [PP of ] [NP England ] [NP governor ] whereas Ramshaw and Marcus state that '"governor" is not included in any baseNP chunk'.

3.3 ADVP and ADJP

ADVP chunks mostly correspond to ADVP constituents in the Treebank. However, ADVPs inside ADJPs, or inside VPs if in front of the main verb, are assimilated into the ADJP and VP chunk, respectively. On the other hand, ADVPs that contain an NP make two chunks:

(ADVP-TMP (NP a year) earlier) → [NP a year ] [ADVP earlier ]

ADJPs inside NPs are assimilated into the NP. And parallel to ADVPs, ADJPs that contain an NP make two chunks:

(ADJP-PRD (NP 68 years) old) → [NP 68 years ] [ADJP old ]

It would be interesting to see how changing these decisions (as can be done in the Treebank-to-chunk conversion script²) influences the chunking task.

3.4 PP and SBAR

Most PP chunks just consist of one word (the preposition) with the part-of-speech tag IN.
This does not mean, though, that finding PP chunks is completely trivial. INs can also constitute an SBAR chunk (see below) and some PP chunks contain more than one word. This is the case with fixed multi-word prepositions such as such as, because of, due to, with prepositions preceded by a modifier: well above, just after, even in, particularly among, or with coordinated prepositions: inside and outside. We think that PPs behave sufficiently differently from NPs in a sentence for not wanting to group them into one class (as Ramshaw and Marcus did in their N-type chunks), and that on the other hand tagging all NP chunks inside a PP as I-PP would only confuse the chunker. We therefore chose not to handle the recognition of true PPs (preposition + NP) during this first chunking step.

²The Treebank-to-chunk conversion script is available from http://ilk.kub.nl/~sabine/chunklink/

SBAR chunks mostly consist of one word (the complementizer) with the part-of-speech tag IN, but like multi-word prepositions, there are also multi-word complementizers: even though, so that, just as, even if, as if, only if.

3.5 CONJP, PRT, INTJ, LST, UCP

Conjunctions can consist of more than one word as well: as well as, instead of, rather than, not only, but also. One-word conjunctions (like and, or) are not annotated as CONJP in the Treebank, and consequently there are no CONJP chunks in our data.

The Treebank uses the PRT constituent to annotate verb particles, and our PRT chunk does the same. The only multi-word particle is on and off.
This chunk type should be easy to recognize as it should coincide with the part-of-speech tag RP, but through tagging errors it is sometimes also assigned IN (preposition) or RB (adverb).

INTJ is an interjection phrase/chunk like no, oh, hello, alas, good grief!. It is quite rare.

The list marker LST is even rarer. Examples are 1., 2., 3., first, second, a, b, c. It might consist of two words: the number and the period.

The UCP chunk is reminiscent of the UCP (unlike coordinated phrase) constituent in the Treebank. Arguably, the conjunction is the head of the UCP, so most UCP chunks consist of conjunctions like and and or. UCPs are the rarest chunks and are probably not very useful for other NLP tasks.

3.6 Tokens outside

Tokens outside any chunk are mostly punctuation signs and the conjunctions in ordinary coordinated phrases. The word not may also be outside of any chunk. This happens in two cases: Either not is not inside the VP constituent in the Treebank annotation, e.g. in

... (VP have (VP told (NP-1 clients) (S (NP-SBJ *-1) not (VP to (VP ship (NP anything))))))

or not is not followed by another verb (because the main verb is a form of to be). As the right chunk boundary is defined by the chunk's head, i.e. the main verb in this case, not is then in fact a postmodifier and as such not included in the chunk: "... [SBAR that ] [NP there ] [VP were ] n't [NP any major problems ] ."

3.7 Problems

All chunks were automatically extracted from the parsed version of the Treebank, guided by the tree structure, the syntactic constituent labels, the part-of-speech tags and by knowledge about which tags can be heads of which constituents. However, some trees are very complex and some annotations are inconsistent. What to think about a VP in which the main verb is tagged as NN (common noun)?
Either we allow NNs as heads of VPs (not very elegant, but it is what we did) or we have a VP without a head. The first solution might also introduce errors elsewhere. As Ramshaw and Marcus (1995) already noted: "While this automatic derivation process introduced a small percentage of errors on its own, it was the only practical way both to provide the amount of training data required and to allow for fully-automatic testing."

4 Data and Evaluation

For the CoNLL shared task, we have chosen to work with the same sections of the Penn Treebank as the widely used data set for base noun phrase recognition (Ramshaw and Marcus, 1995): WSJ sections 15-18 of the Penn Treebank as training material and section 20 as test material³. The chunks in the data were selected to match the descriptions in the previous section. An overview of the chunk types in the training data can be found in Table 1. The data sets contain tokens (words and punctuation marks), information about the location of sentence boundaries and information about chunk boundaries. Additionally, a part-of-speech (POS) tag was assigned to each token by a standard POS tagger (Brill (1994) trained on the Penn Treebank). We used these POS tags rather than the Treebank ones in order to make sure that the performance rates obtained for this data are realistic estimates for data for which no treebank POS tags are available.

³The text chunking data set is available at http://lcg-www.uia.ac.be/conll2000/chunking/

  count      %   type
  55081    51%   NP (noun phrase)
  21467    20%   VP (verb phrase)
  21281    20%   PP (prepositional phrase)
   4227     4%   ADVP (adverb phrase)
   2207     2%   SBAR (subordinated clause)
   2060     2%   ADJP (adjective phrase)
    556     1%   PRT (particles)
     56     0%   CONJP (conjunction phrase)
     31     0%   INTJ (interjection)
     10     0%   LST (list marker)
      2     0%   UCP (unlike coordinated phrase)

Table 1: Number of chunks per phrase type in the training data (211727 tokens, 106978 chunks).

In our example sentence in section 2, we have used brackets for encoding text chunks. In the data sets we have represented chunks with three types of tags:

  B-X   first word of a chunk of type X
  I-X   non-initial word in an X chunk
  O     word outside of any chunk

This representation type is based on a representation proposed by Ramshaw and Marcus (1995) for noun phrase chunks. The three tag groups are sufficient for encoding the chunks in the data since these are non-overlapping. Using these chunk tags makes it possible to approach the chunking task as a word classification task. We can use chunk tags for representing our example sentence in the following way:

He/B-NP reckons/B-VP the/B-NP current/I-NP account/I-NP deficit/I-NP will/B-VP narrow/I-VP to/B-PP only/B-NP £/I-NP 1.8/I-NP billion/I-NP in/B-PP September/B-NP ./O

The output of a chunk recognizer may contain inconsistencies in the chunk tags in case a word tagged I-X follows a word tagged O or I-Y, with X and Y being different. These inconsistencies can be resolved by assuming that such I-X tags start a new chunk.

The performance on this task is measured with three rates. First, the percentage of detected phrases that are correct (precision). Second, the percentage of phrases in the data that were found by the chunker (recall).
And third, the F_{β=1} rate, which is equal to (β² + 1) · precision · recall / (β² · precision + recall), with β = 1 (van Rijsbergen, 1975). The latter rate has been used as the target for optimization⁴.

5 Results

The eleven systems that have been applied to the CoNLL-2000 shared task can be divided into four groups:

1. Rule-based systems: Vilain and Day; Johansson; Déjean.
2. Memory-based systems: Veenstra and Van den Bosch.
3. Statistical systems: Pla, Molina and Prieto; Osborne; Koeling; Zhou, Tey and Su.
4. Combined systems: Tjong Kim Sang; Van Halteren; Kudoh and Matsumoto.

Vilain and Day (2000) approached the shared task in three different ways. The most successful was an application of the Alembic parser which uses transformation-based rules. Johansson (2000) uses context-sensitive and context-free rules for transforming part-of-speech (POS) tag sequences to chunk tag sequences. Déjean (2000) has applied the theory refinement system ALLiS to the shared task. In order to obtain a system which could process XML formatted data while using context information, he has used three extra tools. Veenstra and Van den Bosch (2000) examined different parameter settings of a memory-based learning algorithm. They found that the modified value difference metric applied to POS information only worked best.

A large number of the systems applied to the CoNLL-2000 shared task use statistical methods. Pla, Molina and Prieto (2000) use a finite-state version of Markov Models.
Pla, Molina and Prieto started with using POS information only and obtained a better performance when lexical information was used. Zhou, Tey and Su (2000) implemented a chunk tagger based on HMMs. The initial performance of the tagger was improved by a post-process correction method based on error-driven learning and by incorporating chunk probabilities generated by a memory-based learning process. The two other statistical systems use maximum-entropy based methods. Osborne (2000) trained Ratnaparkhi's maximum-entropy POS tagger to output chunk tags. Koeling (2000) used a standard maximum-entropy learner for generating chunk tags from words and POS tags. Both have tested different feature combinations before finding an optimal one and their final results are close to each other.

⁴In the literature about related tasks sometimes the tagging accuracy is mentioned as well. However, since the relation between tag accuracy and chunk precision and recall is not very strict, tagging accuracy is not a good evaluation measure for this task.

  system                          precision   recall   F_{β=1}
  Kudoh and Matsumoto               93.45%    93.51%    93.48
  Van Halteren                      93.13%    93.51%    93.32
  Tjong Kim Sang                    94.04%    91.00%    92.50
  Zhou, Tey and Su                  91.99%    92.25%    92.12
  Déjean                            91.87%    92.31%    92.09
  Koeling                           92.08%    91.86%    91.97
  Osborne                           91.65%    92.23%    91.94
  Veenstra and Van den Bosch        91.05%    92.03%    91.54
  Pla, Molina and Prieto            90.63%    89.65%    90.14
  Johansson                         86.24%    88.25%    87.23
  Vilain and Day                    88.82%    82.91%    85.76
  baseline                          72.58%    82.14%    77.07

Table 2: Performance of the eleven systems on the test data. The baseline results have been obtained by selecting the most frequent chunk tag for each part-of-speech tag.

Three systems use system combination. Tjong Kim Sang (2000) trained and tested five memory-based learning systems to produce different representations of the chunk tags.
A combination of the five by majority voting performed better than the individual parts. Van Halteren (2000) used Weighted Probability Distribution Voting (WPDV) for combining the results of four WPDV chunk taggers and a memory-based chunk tagger. Again the combination outperformed the individual systems. Kudoh and Matsumoto (2000) created 231 support vector machine classifiers to predict the unique pairs of chunk tags. The results of the classifiers were combined by a dynamic programming algorithm.

The performance of the systems can be found in Table 2. A baseline performance was obtained by selecting the chunk tag most frequently associated with a POS tag. All systems outperform the baseline. The majority of the systems reached an F_{β=1} score between 91.50 and 92.50. Two approaches performed a lot better: the combination system WPDV used by Van Halteren and the Support Vector Machines used by Kudoh and Matsumoto.

6 Related Work

In the early nineties, Abney (1991) proposed to approach parsing by starting with finding related chunks of words. By then, Church (1988) had already reported on recognition of base noun phrases with statistical methods. Ramshaw and Marcus (1995) approached chunking by using a machine learning method. Their work has inspired many others to study the application of learning methods to noun phrase chunking⁵. Other chunk types have not received the same attention as NP chunks. The most complete work is Buchholz et al. (1999), which presents results for NP, VP, PP, ADJP and ADVP chunks. Veenstra (1999) works with NP, VP and PP chunks. Both he and Buchholz et al. use data generated by the script that produced the CoNLL-2000 shared task data sets. Ratnaparkhi (1998) has recognized arbitrary chunks as part of a parsing task but did not report on the chunking performance.
Part of the Sparkle project has concentrated on finding various sorts of chunks for the different languages (Carroll et al., 1997).

⁵An elaborate overview of the work done on noun phrase chunking can be found on http://lcg-www.uia.ac.be/~erikt/research/np-chunking.html

7 Concluding Remarks

We have presented an introduction to the CoNLL-2000 shared task: dividing text into syntactically related non-overlapping groups of words, so-called text chunking. For this task we have generated training and test data from the Penn Treebank. This data has been processed by eleven systems. The best performing system was a combination of Support Vector Machines submitted by Taku Kudoh and Yuji Matsumoto. It obtained an F_{β=1} score of 93.48 on this task.

Acknowledgements

We would like to thank the members of the CNTS - Language Technology Group in Antwerp, Belgium and the members of the ILK group in Tilburg, The Netherlands for valuable discussions and comments. Tjong Kim Sang is funded by the European TMR network Learning Computational Grammars. Buchholz is supported by the Netherlands Organization for Scientific Research (NWO).

References

Steven Abney. 1991. Parsing by chunks. In Principle-Based Parsing. Kluwer Academic Publishers.

Ann Bies, Mark Ferguson, Karen Katz, and Robert MacIntyre. 1995. Bracket Guidelines for Treebank II Style Penn Treebank Project. Penn Treebank II cdrom.

Eric Brill. 1994. Some advances in rule-based part of speech tagging.
In Proceedings of the Twelfth National Conference on Artificial Intelligence (AAAI-94). Seattle, Washington.

Sabine Buchholz, Jorn Veenstra, and Walter Daelemans. 1999. Cascaded grammatical relation assignment. In Proceedings of EMNLP/VLC-99. Association for Computational Linguistics.

John Carroll, Ted Briscoe, Glenn Carroll, Marc Light, Dethleff Prescher, Mats Rooth, Stefano Federici, Simonetta Montemagni, Vito Pirrelli, Irina Prodanof, and Massimo Vanocchi. 1997. Phrasal Parsing Software. Sparkle Work Package 3, Deliverable D3.2.

Kenneth Ward Church. 1988. A stochastic parts program and noun phrase parser for unrestricted text. In Second Conference on Applied Natural Language Processing. Austin, Texas.

Hervé Déjean. 2000. Learning syntactic structures with XML. In Proceedings of CoNLL-2000 and LLL-2000. Lisbon, Portugal.

Christer Johansson. 2000. A context sensitive maximum likelihood approach to chunking. In Proceedings of CoNLL-2000 and LLL-2000. Lisbon, Portugal.

Rob Koeling. 2000. Chunking with maximum entropy models. In Proceedings of CoNLL-2000 and LLL-2000. Lisbon, Portugal.

Taku Kudoh and Yuji Matsumoto. 2000. Use of support vector learning for chunk identification. In Proceedings of CoNLL-2000 and LLL-2000. Lisbon, Portugal.

Mitchell P. Marcus, Beatrice Santorini, and Mary Ann Marcinkiewicz. 1993. Building a large annotated corpus of English: the Penn Treebank. Computational Linguistics, 19(2).

Miles Osborne. 2000. Shallow parsing as part-of-speech tagging. In Proceedings of CoNLL-2000 and LLL-2000. Lisbon, Portugal.

Ferran Pla, Antonio Molina, and Natividad Prieto. 2000. Improving chunking by means of lexical-contextual information in statistical language models. In Proceedings of CoNLL-2000 and LLL-2000. Lisbon, Portugal.

Lance A. Ramshaw and Mitchell P. Marcus. 1995. Text chunking using transformation-based learning. In Proceedings of the Third ACL Workshop on Very Large Corpora.
Association for Computational Linguistics.

Adwait Ratnaparkhi. 1998. Maximum Entropy Models for Natural Language Ambiguity Resolution. PhD thesis, Computer and Information Science, University of Pennsylvania.

Erik F. Tjong Kim Sang. 2000. Text chunking by system combination. In Proceedings of CoNLL-2000 and LLL-2000. Lisbon, Portugal.

Hans van Halteren. 2000. Chunking with WPDV models. In Proceedings of CoNLL-2000 and LLL-2000. Lisbon, Portugal.

C.J. van Rijsbergen. 1975. Information Retrieval. Buttersworth.

Jorn Veenstra and Antal van den Bosch. 2000. Single-classifier memory-based phrase chunking. In Proceedings of CoNLL-2000 and LLL-2000. Lisbon, Portugal.

Jorn Veenstra. 1999. Memory-based text chunking. In Nikos Fakotakis, editor, Machine learning in human language technology, workshop at ACAI 99.

Marc Vilain and David Day. 2000. Phrase parsing with rule sequence processors: an application to the shared CoNLL task. In Proceedings of CoNLL-2000 and LLL-2000. Lisbon, Portugal.

GuoDong Zhou, Jian Su, and TongGuan Tey. 2000. Hybrid text chunking. In Proceedings of CoNLL-2000 and LLL-2000. Lisbon, Portugal.
