Fred Karlsson
Department of General Linguistics
University of Helsinki
A PARADIGM-BASED MORPHOLOGICAL ANALYZER
1. Introduction
Computational morphology has advanced by leaps in the past few years. Since the pioneering work of Kay (e.g. Kay 1977), major contributions have been submitted especially by Karttunen (Karttunen & al. 1981) and Koskenniemi (1983). A common linguistic trait of this line of work has been a fairly strict adherence to the basic principles of generative phonology and morphology (especially of the IP type). The theories and models proposed have been decisively based on the notion of rules relating different levels of representation. Typically, the rules describe morphophonological alternations by which surface-level word-forms deviate from postulated lexical or underlying forms. Central concepts have also been the representation of lexicons as tree structures, minilexicons for describing morphotactic structure in terms of pointers to subsequent classes of allowed morphological categories (e.g. Karttunen & al. 1981), and the implementation of IP rules as finite-state transducers. A major achievement was Koskenniemi's (1983) truly bidirectional language-independent formalism for word-form production and recognition. Notions such as intraparadigmatic dependencies between subsets of endings and/or stems, as well as productivity and the mechanisms of lexical extension, have so far played only a minor role (however, cf. Ejerhed and Church's paper in the present volume).
This paper discusses a morphological analyzer called PARMORF that was designed for simulating not IP rules but paradigmatic relationships. One of the most notable recent trends in morphological theory has been the natural morphology advocated especially by Dressler, Mayerthaler, and Wurzel (e.g. Dressler 1985, Wurzel 1984). One of its key concepts is the notion of paradigmatic dependency that has been elaborated especially by Wurzel (also cf. Bybee 1985). This body of work has provided important impetus for the present effort. In particular, it is my intention to explore how feasible a paradigm view of morphology is in building computational models of word-form recognition. Another point of interest is how easily such a model can be designed to incorporate morphological productivity and lexical extension.
An important feature of PARMORF is that it renounces the use of morphophonemic symbols on the lexical level, and also does away with the corresponding phonological rules. Diacritics are used only for the purpose of singling out members of truly nonproductive and closed inflectional types. Whatever morphophonological alternations there are will be expressed by stating intraparadigmatic dependencies between stems and ending classes.
The central property of PARMORF is that the lexicon tree operational in word-form analysis is based on stems that are derived by paradigmatic pattern rules from base forms which may be either entries in the main lexicon or new words that are about to be integrated in the lexicon. The base forms of the lexical entries as such are not directly involved in word-form recognition. The PARMORF main lexicon for Finnish thus contains i.a. the noun lexeme kauppa 'shop' (N.B. in straightforward phonological shape without morphophonemes). For this lexeme, general pattern rules determine four stems with their appropriate morphotactic information (here omitted), viz. kauppa, kaupa, kauppo, and kaupo. These stems are inserted in the tree used for word-form recognition.
It is my hypothesis that once the inflectional behavior of a word is known, recognition of individual instances of it takes place in relation to the concrete stems in the lexicon tree. No (analogues of) IP/IA rules are invoked in the actual process of word-form recognition.
PARMORF embodies the hypothesis that morphological processing in the sense of "applying rules" consists primarily in determining how words so far unknown to the language user are inflected. For any word, this piece of knowledge should be supplied by a working theory of morphological productivity (here formalized as pattern rules). Supposing that all words belonging to unproductive and closed inflectional subclasses are marked in the lexicon, the pattern rules will derive appropriate stem sets for them, and productive default stem sets for all unmarked words (whether in the lexicon or not).
This approach to morphological productivity makes the process of lexical extension fall out from entities already in the grammar. Since word-forms are recognized just by scanning concrete stems and concrete endings, PARMORF should lend itself to psycholinguistic interpretation more directly than models invoking generative rules. These models face the problem of determining how, precisely, phonological rules and their implementation as finite-state automata should be related to real behavior.
2. Lexical representations
There are at least eight ways in which the lexical forms of words may be construed:
(1) Minimal listing of the SPE-type where even distantly related word-forms are derived from a shared lexical source whose composition is claimed to be (systematic-)phonological. This underlying form lexically represents all word-forms (the whole inflectional paradigm). A central goal is to minimize the number of lexemes and to maximize the statement of morphophonological alternations as IP rules. Word-forms are indirectly related to the lexical representations (i.e. derived by rules).
(2) Constrained minimal listing where remotely related (especially morphophonemically irregular) word-forms are not derived from a common source. The number of lexemes postulated is therefore somewhat larger than under (1). The vast majority of words is represented by a unique lexical form as in (1). However, these base-forms as well as the rules are subject to more restricted (naturalness) conditions than are SPE-type rules. This is a modified SPE-position advocated by several variants of natural generative phonology (e.g. Hooper 1976).
(3) Unique lexical forms allowing diacritics and morphophonemes. This position is embodied in most two-level implementations based on Koskenniemi's (1983) model. A lexical form may contain several morphophonemic and diacritic (e.g. juncture) symbols. Otherwise, it resembles (1,2), especially in the use of phonological rules (to be compiled as finite-state automata). Paradigms are represented by a single base-form, as in (1,2).
(4) Stems. This solution is advocated here. I regard all phonologically distinct bound variants of a base-form as separate stems. A stem-based lexicon is bound to be somewhat larger than a lexicon containing unique base forms for most words. One of the present purposes is to explore whether the amount of repetition will be so large as to render this approach unfeasible. It deserves to be stressed that common initial substrings, meanings, category information, syntactic features, etc., in a set of stems manifesting one lexeme will not be repeated but shared in the stem tree. We are thus not heading for a theory involving wholesale listing. No comprehensive stem-based theory of morphology has so far been advanced, apart from the "technical stem" approach (5), and some general mention of (full) stems as a theoretical possibility for lexical representation (e.g. Linell 1979).
(5) Technical stems. This concept refers to the minimal invariant phonological substance occurring in all (full) stems, e.g. kaup in Fi. kauppa. Such technical stems have been used by Hellberg (1978) in his description of Swedish morphology, and by Karttunen & al. (1981) in their Finnish morphology (TEXFIN). In this approach, stem alternations are described e.g. by postulating minilexicons pointed to by the relevant technical stems.
(6) Full listing hypothesis (FLH). FLH claims that all word-forms are listed in the lexicon. This view is widely entertained in psycholinguistic research on word-form recognition (cf. Butterworth 1983). We shall discard this possibility since it leads to implausible consequences for highly inflected languages such as Finnish. Given that a Finnish verb has some 15,000 forms and an English verb fewer than five, FLH entails that learning Finnish verbal morphology would be thousands of times more cumbersome than learning English, and that a Finn would need much more neural space to internalize his verbs than would an Englishman. Furthermore, according to FLH, upon learning a new verb a Finn would have to internally generate all the 15,000 forms, most of which he would never use. All this seems implausible. In the face of these remarks, FLH without precisions and amendments is not acceptable as a general (psycholinguistic) theory of lexical organization. Stems provide a more uniform crosslinguistic characterization of the lexicon. E.g. English and Finnish don't differ decisively in regard to how many stems a word may have. Finnish verbs and nouns have maximally five or six stems.
(7) Semantically feasible word-forms. This would be a more realistic reduced version of FLH (to my knowledge, not yet elaborated). It would claim that the lexicon contains word-forms, but only those that are semantically feasible. Thus, the English lexicon would not (normally) contain e.g. plural forms for proper names or mass words, or personal forms for meteorological verbs.
(8) Prototypical word-forms. Given that most words, due to obvious semantic reasons, favour certain forms (e.g. Fi. local nouns favour the local cases, mass nouns the partitive case, countables the nominative), it is more reasonable to suppose that the core lexicon of a language user contains the very word-forms that he/she has learnt, especially those that are in frequent active use, i.e. the prototypical ones (cf. Karlsson 1985).
Not all of (1-8) are mutually exclusive. Any "realistic" model (i.e. striving not only for system description but also for isomorphy with psycholinguistic facts) must be able to account at least for frequency effects which often manifest themselves on the level of individual word-forms (cf. Garnham 1985:45 for an overview). This would presuppose special treatment (e.g. separate listing) of the most frequent and deeply engraved word-forms, regardless of whether the bulk of the lexemes are represented according to one of the alternatives (1-5). However, in what follows we shall only consider the feasibility of (4).
In approaches (1-3), the basic set-up of word-form processing is this:
    LEXICON(S) (often compiled as tree structures)
    RULES (often implemented as finite-state transducers)
    SURFACE WORD-FORMS
In computational models, the (main and ending) lexicons are normally implemented as trees. These trees are direct operational analogues of the respective lexicons and are therefore the only processually relevant lexical structure. The lexicon list is an epiphenomenon helpful in inspecting the existing stock of words.
The present approach is slightly different. I postulate a main lexicon (list) containing the stock of lexemes. Here, each lexeme is represented as a quintuple:
    <base-form nextLexicon meaning syntFeatures cat>
Each lexeme has a unique base-form consisting of phonemes only. No morphological markings are needed when all stems of a base-form are predictable by general pattern rules. E.g., all Swedish nouns ending in -el, -en, -er lose their -e- in certain morphological environments and therefore no individual base-forms need diacritics. However, predicting the morphophonological behavior of the Finnish inflectional types vesi (nom.) : vede+n (gen.) and lasi (nom.) : lasi+n (gen.) presupposes that the members of the former closed, unproductive, complex class are marked (say, vesi>). Pattern rules tell what special stems -si> nouns have. Unmarked -si nouns constitute the unmarked default pattern.
The Finnish main lexicon thus contains nominal and verbal entries such as the following ones. nextLex will be specified for each stem by the pattern rules, the meaning is here just represented by a translation into English, and the syntactic features occur in bare outline.
    (talo NIL house (Countable ...) N)
    (vesi> NIL water (Mass ...) N)
    (hullu NIL mad NIL A)
    (suuri> NIL big NIL A)
    (raskas> NIL heavy NIL A)
    (kannas NIL isthmus (Concrete ...) N)
    (anta NIL give (Trans AllRection) V)
    (asu NIL live (Intrans IneRection) V)
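
As an illustration only, the quintuple entries above can be sketched as a small Python record type. The class name Lexeme, the field names, and the helper properties are assumptions made for this sketch and are not part of PARMORF itself; feature lists that the entries abbreviate with "..." are left with only the features actually shown.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Lexeme:
    """One main-lexicon entry: <base-form nextLexicon meaning syntFeatures cat>."""
    base_form: str                # phonemes only, plus ">" for marked closed classes
    next_lexicon: Optional[str]   # NIL (None) here; supplied per stem by pattern rules
    meaning: str                  # English gloss standing in for the meaning
    synt_features: Tuple[str, ...]
    cat: str                      # part of speech: N, A, V

    @property
    def is_marked(self) -> bool:
        # A trailing ">" singles out members of closed, unproductive types.
        return self.base_form.endswith(">")

    @property
    def phonemes(self) -> str:
        # The base form without the diacritic.
        return self.base_form.rstrip(">")


# Only the features visible in the entries are listed; the rest are elided there.
MAIN_LEXICON = [
    Lexeme("talo", None, "house", ("Countable",), "N"),
    Lexeme("vesi>", None, "water", ("Mass",), "N"),
    Lexeme("hullu", None, "mad", (), "A"),
    Lexeme("suuri>", None, "big", (), "A"),
    Lexeme("raskas>", None, "heavy", (), "A"),
    Lexeme("kannas", None, "isthmus", ("Concrete",), "N"),
    Lexeme("anta", None, "give", ("Trans", "AllRection"), "V"),
    Lexeme("asu", None, "live", ("Intrans", "IneRection"), "V"),
]

marked = [lx.phonemes for lx in MAIN_LEXICON if lx.is_marked]
print(marked)
```

In this sketch the diacritic is kept as part of the base-form string, so that a pattern-rule coda such as "si>" can be matched against it directly.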
Given the information supplied by each lexical entry, pattern rules compile the stem lexicon tree active in word-form recognition. The stem lexicon is crucially different from the main lexicon list since it contains full sets of stems. The stems of each lexeme share initial substrings, meaning, syntactic features, and part of speech, i.e. all lexical information apart from alternating stem segments is given just once. The core of PARMORF is thus:
    PATTERN RULES (predicting stems)
    STEM TREE
    WORD-FORMS
3. Pattern rules
Pattern rules embody the predictive power of morphology. They are in active use only when a new word is added to the stem tree. Given appropriate information, the stems of a base-form are predicted and inserted in the stem tree. Once a word has been integrated, PARMORF presupposes no more (IP or IA type) processing for recognizing its forms. In many respects, this model is equally applicable to children's acquisition of morphology and to an adult's adding words to his/her lexicon. Note that this model embodies the core of FLH without endless listing of concrete word-forms, but also without rule processing.
The pattern rules also explicate one aspect of paradigm constitution. They determine what stems belong together and also what morphophonological alternations belong together. Such clusterings are at the heart of traditional paradigms.
Pattern rules are IF-THEN-rules obeying the following format where parentheses indicate elements not necessarily used in all pattern rules:
    IF
        base form coda
        part of speech
        (number of syllables)
        (morphosyntactic feature(s))
    THEN
        stem-coda1 + nextLex1
        (stem-coda2 + nextLex2)
        (stem-codaN + nextLexN)
The core of the IF-part is the base-form coda (closely related to Bybee and Slobin's (1982) notion "schema"), i.e. the shortest segment string extracted from the end of the base form that suffices for predicting the stems. The coda is expressed as a sequence of phonemes (plus a diacritic, where needed). The part of speech is also needed by the IF-part. Syllable number is often required, as might be specific morphosyntactic features (e.g. Swedish stem prediction often needs gender). Apart from the number of syllables (which is determined by a separate algorithm) the IF-part information is given in the main lexicon entries.
For new words, this information must be made available by context of use. Evidently, inflectional behavior cannot be predicted without knowledge of part of speech, etc.
The THEN-part provides a set of pairs (at least one) each consisting of a stem-coda and a reference to the appropriate ending lexicon.
The inserted full stems are formed by joining the residue of the base form (i.e. what is to the left of the base-form coda) with each stem-coda in turn. Typical Finnish pattern rules look as follows (by convention, names of ending trees are prefixed by a slash):
    IF
        kko, N, 2
    THEN
        kko /huppu
        ko /hupu

    IF
        ppa, N, 2
    THEN
        ppa /nom/sg/str
        pa /nom/sg/w
        ppo /j/pl
        po /nom/pl/w

    IF
        ppa, N, 3
    THEN
        ppa /nom/sg/str
        pa /nom/sg/w
        ppo /j/pl
        po /ammate

    IF
        ppaa, V
    THEN
        ppaa /loukkaa
        ppa /loukka
        pa /louka
A disyllabic gradable noun ending in -kko thus has two stems and the appropriate ending trees are /huppu and /hupu, respectively. A disyllabic gradable noun in -ppa has four stems. A trisyllabic noun in -ppa has the same four stems but a difference in what endings are allowed in weak grade plurals (ulapoita vs. *kaupoita). A verb ending in -ppaa has three stems, etc.
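
The way a single pattern rule turns a base form into a stem set can be sketched in a few lines of Python. This is a minimal illustration, not the PARMORF implementation: the names PatternRule and derive_stems are hypothetical, and the rule contents are copied from the two -ppa noun rules listed above.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class PatternRule:
    coda: str                          # base-form coda (IF-part)
    cat: str                           # part of speech (IF-part)
    syllables: Optional[int]           # syllable count, where the rule needs it
    then: Tuple[Tuple[str, str], ...]  # (stem-coda, ending lexicon) pairs


PPA_N_2 = PatternRule("ppa", "N", 2, (
    ("ppa", "/nom/sg/str"), ("pa", "/nom/sg/w"),
    ("ppo", "/j/pl"), ("po", "/nom/pl/w"),
))
PPA_N_3 = PatternRule("ppa", "N", 3, (
    ("ppa", "/nom/sg/str"), ("pa", "/nom/sg/w"),
    ("ppo", "/j/pl"), ("po", "/ammate"),
))


def derive_stems(base_form: str, rule: PatternRule) -> List[Tuple[str, str]]:
    """Join the residue left of the base-form coda with each stem-coda."""
    if not base_form.endswith(rule.coda):
        raise ValueError(f"{base_form!r} does not end in coda {rule.coda!r}")
    residue = base_form[: len(base_form) - len(rule.coda)]
    return [(residue + stem_coda, next_lex) for stem_coda, next_lex in rule.then]


for stem, lex in derive_stems("kauppa", PPA_N_2):
    print(stem, lex)
```

For kauppa the sketch yields the four stems kauppa, kaupa, kauppo, and kaupo. For ulappa the trisyllabic rule gives the same four stem shapes, but the weak plural stem points to a different ending lexicon.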
Pattern rules are normally differentiated at least for nouns and verbs, often also for nouns and adjectives (not so in Finnish). All base-form codas generated by the pattern rules for a certain part of speech are inserted into a pattern tree. There is one pattern tree for each distinct part of speech. The segments of each base-form coda are inserted in reversed order, prefixed by an integer indicating the number of syllables where needed. Thus, the strings inserted in the nominal pattern tree for the first three pattern rules just mentioned are okk, app, 3app. THEN-parts are entries under the last node of each identifiable coda in this tree. Once this base-form pattern tree exists for a given part of speech, the stem set for any such base-form is found by picking the longest match in the pattern tree for the search key consisting of the base-form segments in reversed order.
Thus, when the stem set for the noun ulappa is to be determined, a match for the string 3appalu is sought in the pattern tree (the integer "3" having been prefixed by the syllable counting algorithm). The longest match found will be 3app and the corresponding entry is retrieved. The four stems thus determined are inserted in the stem tree and then used in the recognition process. For nongradable trisyllabic nouns in -a the longest match found will be one providing only two stems (-a, -o).
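
The longest-match search in the pattern tree can be sketched as a lookup in a character trie keyed on reversed codas. The sketch below is an illustration under stated assumptions: the class PatternTree and its methods are hypothetical names; the syllable count is passed in by the caller rather than computed, since the syllable counting algorithm is not described here; and codas stored without an integer prefix are assumed to apply when no prefixed coda matches.

```python
from typing import Dict, List, Optional, Tuple

Then = List[Tuple[str, str]]  # (stem-coda, ending lexicon) pairs


class PatternTree:
    """Trie over reversed base-form codas, optionally prefixed by a syllable count."""

    _ENTRY = "THEN"  # multi-character key, so it cannot clash with a segment

    def __init__(self) -> None:
        self.root: Dict = {}

    def insert(self, coda: str, then: Then, syllables: Optional[int] = None) -> None:
        key = ("" if syllables is None else str(syllables)) + coda[::-1]
        node = self.root
        for ch in key:
            node = node.setdefault(ch, {})
        node[self._ENTRY] = then

    def _longest(self, key: str) -> Optional[Tuple[str, Then]]:
        node, best = self.root, None
        for i, ch in enumerate(key):
            if ch not in node:
                break
            node = node[ch]
            if self._ENTRY in node:
                best = (key[: i + 1], node[self._ENTRY])
        return best

    def lookup(self, base_form: str, syllables: int) -> Optional[Tuple[str, Then]]:
        reversed_form = base_form[::-1]
        # Assumption: try the syllable-prefixed key first, then the bare key.
        return self._longest(str(syllables) + reversed_form) or self._longest(reversed_form)


nominal = PatternTree()
nominal.insert("kko", [("kko", "/huppu"), ("ko", "/hupu")])
nominal.insert("ppa", [("ppa", "/nom/sg/str"), ("pa", "/nom/sg/w"),
                       ("ppo", "/j/pl"), ("po", "/nom/pl/w")])
nominal.insert("ppa", [("ppa", "/nom/sg/str"), ("pa", "/nom/sg/w"),
                       ("ppo", "/j/pl"), ("po", "/ammate")], syllables=3)

print(nominal.lookup("ulappa", 3)[0])
print(nominal.lookup("kauppa", 2)[0])
```

With the three nominal rules inserted as okk, app, and 3app, the key 3appalu stops at 3app, while kauppa falls back to the unprefixed coda app. A whole word stored in reversed form would win automatically, because no shorter coda can be a longer match.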
The base-form codas of the pattern rules are expressed as strings of phonemes and, where needed, inflectional diacritics. This leads to repetition especially for words subject to consonant gradation and mutation of the final vowel (both exemplified by ulappa). E.g. up to 15 individual instances of consonant gradation will be separately stated for the paradigms where they actually occur. There are thus some 15 pattern rules for disyllabic nouns ending in -o, viz. -kko, -ppo, -nto, etc.
Deviating from generative practice, I have deliberately chosen not to generalize consonant gradation, vowel mutation, and similar morphophonological alternations across paradigms. At first sight, this seems to lead to prohibitively unilluminating repetition. However, there are positive linguistic arguments in favour of this solution. Particular paradigms might contain morphophonological gaps that should somehow be accounted for. Thus, trisyllabic Fi. nouns allow only a few gradable stops at the final syllable boundary: pp, [?], nt, nk. Nouns of the kaikki>-type disallow i.a. the gradable combinations lt, nt, rt at the syllable boundary. Paradigms like *karampa : karamman, *kanti : kannen are not just accidentally lacking but morphophonologically ungrammatical. That is, individual pattern rules explicitly state the allowed possibilities up to systematic gaps but exclude the latter, thereby accounting for systematic restrictions.
The principle of longest match used in searching the pattern tree gives a convenient and uniform way of handling exceptions. If the inverted form of a whole word is found in the pattern tree, it will by definition be the longest match. Thus, exception features for individual words are generally not needed.
The total number of pattern rules with the above concrete properties invoked in my full description of Finnish nominal and verbal morphology is some 1,130 (600 for nouns, 530 for verbs; some 250 exceptional pronominal forms are not included in the first figure). This number includes all idiosyncrasies (roughly half of these rules concern one item only). Considering that the power of the pattern rule system is such as to predict the inflection of all nouns, adjectives, and verbs in the lexicon, including all exceptions, and the default inflection of any such word not in the lexicon, and furthermore excluding many types of impossible paradigms, we would not regard the number as "prohibitively large", especially when one takes into account that no further morphophonological rules or processing are invoked in word-form recognition. I.e., full productive mastery of Finnish morpho(phono)logy presupposes learning some 1,100 concrete phoneme-level rules.
4. Ending lexicons
Similarly behaving endings are grouped into ending lexicons which are triples with the following structure:
    <name, otherLex, endings>
Each ending lexicon has a name (conventionally prefixed by a slash) normally chosen so as to give a mnemonic hint of what kinds of stems or words it is normally appended to. The component "otherLex" provides a (possibly null) list of other ending lexicons paradigmatically included in the present one. This facility provides a convenient way of stating paradigmatic relationships between distributionally related subsets of endings. Finally, the component "endings" is a (possibly empty) set of endings belonging to the current ending lexicon (possibly empty because an ending lexicon may consist exclusively of references to other ending lexicons under otherLex). Each ending, in turn, is a triple:
<item, nextLex, entry>
where "item" is the ending in phonemic shape, "nextLex" a reference to the next morphotactic position, and "entry" contains a list of morphological categories. Vowel harmony is an exception to the phonemic principle of item structure, i.e. suffix vowel harmony pairs are lexically represented as the archiphonemes A, O, U, which are spelled out as a-ä, o-ö, u-y when the ending lexicons are compiled into trees used in actual processing.
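The two triples and the spelling-out of archiphonemes can be illustrated with a minimal Python sketch. The class names, field names, and the function spell_out are hypothetical choices made for this illustration, not the data structures of the actual program. The sketch also assumes that every capital A, O, or U in an item is an archiphoneme.

```python
from dataclasses import dataclass, field

# Archiphoneme -> (back vowel, front vowel), following the pairs named in the text.
HARMONY = {"A": ("a", "ä"), "O": ("o", "ö"), "U": ("u", "y")}

@dataclass
class Ending:
    item: str        # ending in phonemic shape, possibly containing archiphonemes
    next_lex: str    # name of the lexicon at the next morphotactic position
    entry: tuple     # morphological categories, e.g. ("ADE", "SG")

@dataclass
class EndingLexicon:
    name: str                                       # e.g. "/nom/sg/w"
    other_lex: list = field(default_factory=list)   # names of included lexicons
    endings: list = field(default_factory=list)     # Ending objects; may be empty

def spell_out(item):
    """Return the surface variants of an ending: back-harmonic first, then front."""
    back = "".join(HARMONY[c][0] if c in HARMONY else c for c in item)
    front = "".join(HARMONY[c][1] if c in HARMONY else c for c in item)
    # Endings without archiphonemes have a single surface shape.
    return [back] if back == front else [back, front]

# Usage: the adessive singular ending from the /nom/sg/w lexicon.
adessive = Ending("llA", "/poss4", ("ADE", "SG"))
print(spell_out(adessive.item))
```

Under these assumptions, an ending such as llA yields two surface strings, both of which would be entered into the compiled ending tree, while an ending such as n yields one.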
The endings and entries are often listed as wholes, especially in close-knit combinations of e.g. number and case for nouns. Such combinations are often subject to bidirectional dependencies that are hard to capture otherwise. The /j/pl lexicon below contains good examples of this dependence. The plural allomorph j occurs only if the following ptv. or gen. case morph starts with a vowel, and the latter occur only if pl. precedes. Furthermore, for gradable nouns the -jA, -jen combinations are tied to strong-grade stems only (koivikkojen vs. *koivikojen). This complex paradigmatic interdependence between a certain stem, a certain number morph, and a certain case morph has proven laborious to capture by (morpho)phonological rules. Under the present approach, it suffices to point from one stem to one lexicon.
A psycholinguistic argument for treating (some) ending sequences as wholes comes from the observation that children acquiring inflectional languages seldom make errors involving the order of morphemes in a word (cf. Bybee 1985:114ff. for an overview).
The following are typical examples of ending lexicons. The name is given on the first line, otherLex on the second, and the endings, if any, are indented.
(/nom/sg/str
  (/clit/nom /ill/Vn /poss3)
    (A /poss4 (PTV SG)) (nA /poss4 (ESS SG)))

(/nom/sg/w
  NIL
    (n /clit (GEN SG)) (llA /poss4 (ADE SG)) (ltA /poss4 (ABL SG))
    (lle /poss5 (ALL SG)) (ssA /poss4 (INE SG)) (stA /poss4 (ELA SG))
    (ksi /clit (TRA SG)) (kse /poss6 (TRA SG)) (ttA /poss4 (ABE SG))
    (t /clit (NOM PL)))

(/nom/pl/w
  NIL
    (illA /poss4 (ADE PL)) (iltA /poss4 (ABL PL)) (ille /poss5 (ALL PL))
    (issA /poss4 (INE PL)) (istA /poss4 (ELA PL))
    (iksi /clit (TRA PL)) (ikse /poss6 (TRA PL)) (ittA /poss4 (ABE PL))
    (in /clit (INS PL)) (i /poss2 (INS PL)))

(/i/pl
  NIL
    (iA /poss4 (PTV PL))
    (ien /clit (GEN PL)) (ie /poss2 (GEN PL)) (iin /clit (ILL PL))
    (ii /poss2 (ILL PL)) (inA /poss4 (ESS PL)) (ine /poss6 (COM SG/PL)))

(/j/pl
  NIL
    (jA /poss4 (PTV PL)) (jen /clit (GEN PL)) (je /poss2 (GEN PL))
    (ihin /clit (ILL PL)) (ihi /poss2 (ILL PL))
    (inA /poss4 (ESS PL)) (ine /poss6 (COM SG/PL)))

(/huppu
  (/nom/sg/str /j/pl))

(/hupu
  (/nom/sg/w /nom/pl/w))

(/nom/2s/all
  (/huppu /hupu))

(/puolisko
  (/nom/2s/all /itA/iden))

(/itA/iden
  NIL
    (itA /poss4 (PTV PL)) (iden /clit (GEN PL)) (itten /clit (GEN PL))
    (ide /poss2 (GEN PL)) (itte /poss2 (GEN PL)))
Endings in the same ending lexicon behave alike. An ending lexicon constitutes a kind of "paradigmatic natural class". Thus, /nom/sg/str contains endings occurring after strong-grade sg. stems of (certain) gradable nouns. These endings are ptv. -A and ess. -nA, plus certain clitics, possessives, and illatives included via the specifications in otherLex. /nom/sg/w contains the corresponding weak-grade sg. endings, /nom/pl/w the weak-grade pl. endings.
The paradigm formalism enables us to capture complex intersecting paradigmatic networks by way of otherLex references. Thus, the lexicon /huppu (covering strong-grade sg. and pl. stems like huppu, lakko) contains the members of /nom/sg/str and /j/pl but no endings of its own. /hupu (covering the corresponding sg. and pl. weak-grade stems) contains the members of /nom/sg/w and /nom/pl/w. Then one may continue: /nom/2s/all covers the corresponding non-gradable stems (words like talo, hullu) and is described by referring via otherLex to /huppu and /hupu. Yet another layer may be added by describing trisyllabic non-gradable nouns (e.g. puolisko) as /puolisko, consisting of /nom/2s/all and /itA/iden. This captures the generalization that these nouns depart from the disyllabic ones only in having some more alternative plural endings.
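The layering just described amounts to a recursive traversal of otherLex references. The following Python sketch illustrates it under stated assumptions: the table OTHER_LEX and the function reachable are hypothetical names, the references are copied from the example lexicons, and the three lexicons referenced by /nom/sg/str (whose own contents are not listed in the examples) are stubbed with empty reference lists.

```python
# otherLex references taken from the example lexicons. The lexicons
# /clit/nom, /ill/Vn and /poss3 are not spelled out in the examples, so
# they are stubbed here without further references (an assumption).
OTHER_LEX = {
    "/puolisko": ["/nom/2s/all", "/itA/iden"],
    "/nom/2s/all": ["/huppu", "/hupu"],
    "/huppu": ["/nom/sg/str", "/j/pl"],
    "/hupu": ["/nom/sg/w", "/nom/pl/w"],
    "/nom/sg/str": ["/clit/nom", "/ill/Vn", "/poss3"],
    "/nom/sg/w": [],
    "/nom/pl/w": [],
    "/j/pl": [],
    "/itA/iden": [],
    "/clit/nom": [],
    "/ill/Vn": [],
    "/poss3": [],
}

def reachable(name, seen=None):
    """List a lexicon and every lexicon it includes, depth-first, without repeats."""
    seen = [] if seen is None else seen
    if name in seen:          # guard against shared or circular references
        return seen
    seen.append(name)
    for ref in OTHER_LEX[name]:
        reachable(ref, seen)
    return seen

for lexicon in reachable("/puolisko"):
    print(lexicon)
```

With the stubs above, tracing /puolisko visits 12 lexicons; the full description may well reach more, since the stubbed lexicons can carry references of their own.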
In other words, references via otherLex are recursively broken down by tracing all the lexicons invoked. The hierarchical paradigmatic lexicon network may be displayed as follows:
[Figure: hierarchical network of the ending lexicons referenced from /puolisko. The diagram is not recoverable from the source; legible node labels include /nom/sg/str, /clit/nom, /ill/Vn, and /mainen.]
The full description of Finnish contains 134 ending lexicons. At run-time, two options are available for compiling ending lexicons to trees. In the minimal version, each ending tree contains only the endings listed in the respective lexicon, and when a word-form is to be analyzed, any otherLex references are all checked separately and recursively by jumping from tree to tree. E.g. when the /puolisko tree is consulted, all 13 trees in the display above are run through. The 134 minimal ending trees require some 1,000 nodes. The maximal option lumps together into one tree the endings of the current lexicon plus all endings found by recursively checking the otherLex references (e.g. all 13 trees under /puolisko). In this mode, the lexicon trees require some 8,000 nodes. Of course, using maximal ending trees speeds up the recognition process (roughly by a factor of three).

This kind of paradigmatic description does capture significant generalizations. It also makes interesting predictions, e.g. that paradigm levelling or extension is likely to concern all the members of a given ending lexicon (in due course).
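The trade-off between the minimal and the maximal compilation option can be made concrete with a small Python sketch. It is only an illustration under simplifying assumptions: the three toy lexicons hold a handful of endings in back-vowel spelling, nextLex is omitted, references are assumed to be free of cycles, and all names (LEXICONS, build_trie, analyze_minimal, analyze_maximal) are hypothetical rather than taken from the actual program.

```python
# name -> (otherLex references, own endings as (item, entry) pairs)
LEXICONS = {
    "/hupu": (["/nom/sg/w", "/nom/pl/w"], []),
    "/nom/sg/w": ([], [("n", "GEN SG"), ("lla", "ADE SG"), ("lta", "ABL SG")]),
    "/nom/pl/w": ([], [("illa", "ADE PL"), ("ilta", "ABL PL")]),
}

def build_trie(endings):
    """Compile endings into a letter tree; the key '$' holds the entries."""
    root = {}
    for item, entry in endings:
        node = root
        for ch in item:
            node = node.setdefault(ch, {})
        node.setdefault("$", []).append(entry)
    return root

def count_nodes(trie):
    """Count the root plus all letter nodes below it."""
    return 1 + sum(count_nodes(c) for k, c in trie.items() if k != "$")

def all_endings(name):
    """Own endings plus those of every lexicon reached via otherLex."""
    other, own = LEXICONS[name]
    result = list(own)
    for ref in other:
        result.extend(all_endings(ref))
    return result

def lookup(trie, suffix):
    node = trie
    for ch in suffix:
        node = node.get(ch)
        if node is None:
            return []
    return node.get("$", [])

# Minimal option: one small tree per lexicon, references followed at run-time.
minimal = {name: build_trie(own) for name, (other, own) in LEXICONS.items()}
# Maximal option: each tree already contains everything it includes.
maximal = {name: build_trie(all_endings(name)) for name in LEXICONS}

def analyze_minimal(name, suffix):
    found = list(lookup(minimal[name], suffix))
    for ref in LEXICONS[name][0]:
        found.extend(analyze_minimal(ref, suffix))   # jump from tree to tree
    return found

def analyze_maximal(name, suffix):
    return lookup(maximal[name], suffix)             # a single tree walk

print(sum(count_nodes(t) for t in minimal.values()))  # total nodes, minimal
print(sum(count_nodes(t) for t in maximal.values()))  # total nodes, maximal
```

In this toy setting both options return the same analyses, but the maximal trees use more nodes because shared endings are copied into every including tree, while a lookup needs only one tree walk instead of one walk per referenced lexicon.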
5. Implementation and evaluation
The formalism for expressing pattern rules, stems, and ending lexicons is language-independent. The pattern rules must, of course, be determined by the linguist before they can be read by the program, i.e. before the pattern tree is constructed. The program reads lexical entries of the specified type upon constructing the stem tree.
So far, I have only tested the model on Finnish. The current size of the Finnish main lexicon is roughly 9,000 items (of which 4,300 are nouns and 2,000 verbs). On average, a Finnish noun has 2.5 stems and a verb 3.2 stems (in the sense of phonologically distinct from the base-form). When all stems of these 9,000 items are compiled into the stem tree, its size is roughly 41,000 nodes. A rough comparison to Koskenniemi's (1983; personal communication) Finnish lexicon shows that a full stem-approach less than doubles the number of nodes in the main lexicon tree. I find this rough ratio interesting as it shows that a stem-based lexicon is not prohibitively much larger than a lexicon based on unique lexical forms. For IE languages stem-based lexicons would be even more manageable than in Finnish.
The "cost" of the stem-based approach is thus a doubling of [...] course streamlines and speeds up the actual process of word-form recognition. Using maximal ending trees, word-form recognition over the 9,000-item stem tree takes 30 ms on average (including multiple analyses of homonyms). Short unambiguous words are analyzed in 10-15 ms.
The program provides for productive morphological analysis of any compound just by turning a switch. In normal mode, all analyses are produced. Another switch constrains the analyzer to produce one analysis only. The given efficiency figures pertain to this non-compound mode.
Acknowledgement
I am indebted to Kimmo Koskenniemi and Martti Nyman for insightful comments.
References
Butterworth, Brian 1983. "Lexical Representation". In B. Butterworth, ed., Language Production, Vol. 2, New York: Academic Press, 257-294.
Bybee, J.L. 1985. Morphology. A Study of the Relation between Meaning and Form. Amsterdam: Benjamins.
Bybee, J.L. and Slobin, D.I. 1982. "Rules and schemas in the development and use of the English past tense". Language 58, 265-289.
Dressler, Wolfgang U. 1985. Morphonology. Ann Arbor: Karoma Publishers.
Garnham, Alan 1985. Psycholinguistics. Central Topics. London and New York: Methuen.
Hellberg, Staffan 1978. The Morphology of Present-Day Swedish. Stockholm: Almqvist & Wiksell International.
Hooper, J.B. 1976. An Introduction to Natural Generative Phonology. New York: Academic Press.
Karlsson, Fred 1985. "Paradigms and Word-Forms". Studia gramatyczne 1, 135-154.
Karlsson, Fred & Koskenniemi, Kimmo 1985. "A Process Model of Morphology and Lexicon". Folia Linguistica 19:1/2, 207-229.
Karttunen, Lauri, Root, Rebecca, & Uszkoreit, Hans 1981. "TEXFIN: Morphological Analysis of Finnish by Computer". Paper read at the 71st Annual Meeting of the SASS, Albuquerque, New Mexico.
Kay, Martin 1977. "Morphological and Syntactic Analysis". In A. Zampolli, ed., Linguistic Structures Processing, Amsterdam: North-Holland, 131-234.
Koskenniemi, Kimmo 1983. Twolevel Morphology: A Theory for Auto[...]. University of Helsinki, Department of General Linguistics, Publications No. 11.
Linell, Per 1979. Psychological Reality in Phonology. Cambridge University Press.
Mayerthaler, Willi 1981. Morphologische Natürlichkeit. Wiesbaden: Athenaion.
Wurzel, W.U. 1984. Flexionsmorphologie und Natürlichkeit. Berlin: Akademie-Verlag.