On the Coverage of a Morphological Analyser based on "Svensk Ordbok" [A Dictionary of Swedish]
Anna Sågvall Hein
Uppsala University
Introduction
In the project A Lexicon-oriented Parser for Swedish, a stem dictionary (Sågvall Hein & Sjögreen 1991; Sjögreen, forthcom.) covering the 58,536 entry lemmas of Svensk Ordbok (1986), along with a complete inflectional grammar of Swedish (Sågvall Hein, forthcom.), was generated. This language description together with the Uppsala Chart Processor, UCP (Sågvall Hein 1987), constitutes a morphological analyser of Swedish, henceforth referred to as SMU, short for Swedish Morphology in the UCP framework.
So far, there are no word formation rules in the SMU grammar, and words outside the scope of Svensk Ordbok don't get an analysis. Even though closed in its present version, the coverage of SMU is well-defined; prior to any processing we may consult Svensk Ordbok to find out for any word form whether it will get an analysis or not; the dictionary provides an intuitive, familiar format through which we may explore the (present) competence of the SMU analyser without any prior knowledge of its formalisms or operation. SMU is also well-defined in the sense that for any of its lemmas, Svensk Ordbok provides links to the corresponding lexemes (basic senses), and for each lexeme a definition.
In our ongoing work on a machine-tractable dictionary for Swedish, we are approaching problems concerning the distinction between general and domain-specific vocabulary, and the present coverage of SMU is our starting-point for delimiting a general Swedish vocabulary. For an evaluation of the generality of the dictionary, the analyser has been applied to different sets of Swedish text. For one of them, consisting of the 10,224 most frequent types of the 7.3 million word newspaper corpus of The Language Bank (Gellerstam 1989), the words outside the scope of the analyser have been examined in some detail. Here we will present the results achieved so far, and also discuss their impact on our continued work on the dictionary.
First, however, we will briefly characterize the SMU analyser with regard to morphological descriptions and dictionary representation of inflection.
The morphological descriptions of the SMU analyser
The morphological descriptions generated by the analyser are expressed as attribute-value structures (Sågvall Hein & Ahrenberg 1985; cf. directed acyclic graphs, dags for short, Shieber 1986).
The description in fig. 1 comprises four general attributes (LEM for lemma, WORD.CAT for word category (part of speech), DIC.STEM for dictionary stem, and INFL for inflection type) and four attributes specific to the nouns, i.e. GENDER, NUMB (number), FORM (species), and CASE. The general attributes are present in the descriptions of all the words, regardless of part of speech (noun, adjective, pronoun, verb, adverb, numeral, preposition, conjunction, interjection, article, and infinitive marker).
FESTERNAS
(* = (LEM = FEST.NN
      WORD.CAT = NOUN
      INFL = PATTERN.FILM
      DIC.STEM = FEST
      GENDER = UTR
      NUMB = PLUR
      FORM = DEF
      CASE = GEN))
Figure 1. An analysis of the noun festernas [of the parties]
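As a reading aid, the attribute-value structure of Figure 1 can be sketched as a flat mapping. This is only an illustration: the dictionary layout, the name `FESTERNAS`, and the helper `split_attributes` are assumptions made here, not part of the UCP formalism, and the pattern word is given as it appears to read in the figure.

```python
from typing import Dict, Tuple

# The four general attributes named in the text; every word carries them.
GENERAL_ATTRIBUTES = ("LEM", "WORD.CAT", "DIC.STEM", "INFL")

# Values copied from Figure 1; the flat dict layout is an assumption.
FESTERNAS: Dict[str, str] = {
    "LEM": "FEST.NN",
    "WORD.CAT": "NOUN",
    "INFL": "PATTERN.FILM",  # pattern word as read from the figure (assumed reading)
    "DIC.STEM": "FEST",
    "GENDER": "UTR",
    "NUMB": "PLUR",
    "FORM": "DEF",
    "CASE": "GEN",
}


def split_attributes(analysis: Dict[str, str]) -> Tuple[Dict[str, str], Dict[str, str]]:
    """Separate the general attributes from the part-of-speech specific ones."""
    general = {k: v for k, v in analysis.items() if k in GENERAL_ATTRIBUTES}
    specific = {k: v for k, v in analysis.items() if k not in GENERAL_ATTRIBUTES}
    return general, specific
```

Calling `split_attributes(FESTERNAS)` yields the four general attributes in one mapping and the noun-specific GENDER, NUMB, FORM, and CASE in the other.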
The value of the lemma attribute is identical to the basic form of the lemma with a (two letter) word class marker for the distinction between homograph lemmas, e.g. springa1.nn [chink; slot] and springa2.vb [run]. In addition, the homographs are numbered as they are in Svensk Ordbok (cf. ¹spring/a subst. [noun] and ²spring/a verb), whereby immediate reference to this background material is facilitated. If there are two homograph lemmas of the same word category, the homograph number alone will keep them apart, e.g. bok1.nn [book] (plur. böcker) and bok2.nn [beech (tree)] (plur. bokar). The well-defined lemma marker supports the distinction between external and internal homography, a basis for subsequent lemmatization. Further, it provides a basis for the selection of domain-tuned lemma dictionaries from a general dictionary; the lemmas specific to the domain are recognized by the morphological analysis of texts typical of that domain. The dictionary stem attribute may serve the same function in building a domain-tuned stem dictionary from a general stem dictionary.
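The structure of the lemma marker described above (basic form, optional homograph number, two-letter word class) can be illustrated with a small parser. This is a minimal sketch; the regular expression, the `LemmaId` type, and the function name are hypothetical and only mirror the examples given in the text.

```python
import re
from typing import NamedTuple, Optional


class LemmaId(NamedTuple):
    base: str                 # basic form of the lemma
    homograph: Optional[int]  # homograph number as in Svensk Ordbok, if any
    word_class: str           # two-letter word class marker, e.g. "nn", "vb"


# Basic form, optional homograph number, a dot, and a two-letter marker.
_LEMMA_RE = re.compile(r"^(?P<base>.+?)(?P<num>\d+)?\.(?P<wc>[a-z]{2})$")


def parse_lemma(marker: str) -> LemmaId:
    """Split a lemma marker such as 'springa1.nn' into its parts."""
    match = _LEMMA_RE.match(marker.lower())
    if match is None:
        raise ValueError(f"not a lemma marker: {marker!r}")
    num = match.group("num")
    return LemmaId(match.group("base"), int(num) if num else None, match.group("wc"))
```

With this sketch, `parse_lemma("springa1.nn")` and `parse_lemma("springa2.vb")` differ in both number and word class, whereas `bok1.nn` and `bok2.nn` are kept apart by the homograph number alone.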
The value of the inflection attribute is a pattern word which is also the name of an inflectional rule defined in the grammar. The inclusion of this information in the morphological descriptions provides a basis for frequency studies of inflectional types in current text.
In addition to the general attributes, each part of speech (except for the prepositions and the infinitive marker) is characterised by its own set of attribute-value pairs. In fig. 2 we present the morphological descriptions resulting from the analyses of an adjectival paradigm, the adjective festlig.
FESTLIG
(* = (LEM=FESTLIG.AV WORD.CAT=ADJ INFL=PATTERN.BLEK DIC.STEM=FESTLIG GENDER=UTR NUMB=SING FORM=INDEF COMP=POS))
FESTLIGT
(* = (LEM=FESTLIG.AV WORD.CAT=ADJ INFL=PATTERN.BLEK DIC.STEM=FESTLIG GENDER=NEUTR NUMB=SING FORM=INDEF COMP=POS))
FESTLIGA
(* = (LEM=FESTLIG.AV WORD.CAT=ADJ INFL=PATTERN.BLEK DIC.STEM=FESTLIG COMP=POS NUMB=SING FORM=DEF))
(* = (LEM=FESTLIG.AV WORD.CAT=ADJ INFL=PATTERN.BLEK DIC.STEM=FESTLIG COMP=POS NUMB=PLUR))
FESTLIGE
(* = (LEM=FESTLIG.AV WORD.CAT=ADJ INFL=PATTERN.BLEK DIC.STEM=FESTLIG GENDER=UTR NUMB=SING FORM=DEF SEX=MASC COMP=POS))
FESTLIGARE
(* = (LEM=FESTLIG.AV WORD.CAT=ADJ INFL=PATTERN.BLEK DIC.STEM=FESTLIG COMP=COMP))
FESTLIGAST
(* = (LEM=FESTLIG.AV WORD.CAT=ADJ INFL=PATTERN.BLEK DIC.STEM=FESTLIG COMP=SUP FORM=INDEF))
FESTLIGASTE
(* = (LEM=FESTLIG.AV WORD.CAT=ADJ INFL=PATTERN.BLEK DIC.STEM=FESTLIG COMP=SUP FORM=DEF))
Figure 2. Analyses of the adjective festlig [festive; grand]
Part of speech   Attribute    Values
NN               GENDER       Neutr Utr
                 NUMB         Sing Plur
                 FORM         Indef Def
                 CASE         Basic Gen
                 PROPR        +
                 ABBREV       +
AV               GENDER       Neutr Utr
                 NUMB         Sing Plur
                 FORM         Indef Def
                 COMP         Pos Comp Sup
                 FUNC         Attr Pred
                 SEX          Masc
                 (+ CASE for the description of nominalized adjectives)
VB               TENSE        Pres Sup Pret
                 INFF         Inf PP Ap
                 IMP          +
                 CONJ         +
                 (+ NUMB, GENDER, FORM, FUNC, SEX, CASE for the description of the participles)
AL               GENDER       Neutr Utr
                 NUMB         Sing Plur
                 FORM         Indef Def
PN               PRON.TYPE    Pers Poss Rel
                 ATTR.TYPE    Select Quant Comp
                 DET.TYPE     Tot Det
                 GENDER       Neutr Utr
                 NUMB         Sing Plur
                 FORM         Indef Def
                 CASE         Basic Gen Obj
                 SEX          Masc
                 PARTITIV     +
                 DUAL         +
                 NOUN.INDEF   +
NL               ATTR.TYPE    Select Quant
                 GENDER       Neutr Utr
                 NUMB         Sing Plur
                 FORM         Indef Def
                 CASE         Basic Gen
                 SEX          Masc
AB               COMP         Pos Comp Sup
KN               SUBJU        +
Prepositions, interjections, and the infinitive marker have no attributes.
Table 1. An overview of the attributes assigned by the SMU analyser
Inflection in Svensk Ordbok and in the SMU dictionary
In fig. 3 we present the inflectional variety of the -ar declension as an illustration of the relation between the inflectional format of SOB and that of the SMU stem dictionary. In the simplest case, there is only one stem, in SOB and in SMU, and, further, the stem is identical to the basic (lemma) form of the word (see 1 and 2). In such cases, the entry form of SOB represents at the same time the lemma and the stem, whereas in the SMU formal dictionary the two concepts have to be individually represented. The pattern rules of the SMU dictionary cover the inflectional information given in SOB in terms of word class and significant endings, and also the grammatical background information to which these morphological keys refer, i.e. morphotactic rules determining the inflectional behaviour of the nouns. Sometimes, SOB recognizes a stem identical to an initial substring of the lemma form and delimits it from the succeeding ending by a slash, as in 3-8 and 11. The stem concept thus adopted is the technical stem (Hellberg 1978). The SMU analyser treats these cases in a different way, i.e. by means of a general rewriting rule (4-8) or stem handling operations in the pattern rules (as in 3 and 11). In the first case, the non-vowel stem alternant is regarded as the canonical form of the stem, and its vowel counterpart is reduced to this form by a secondary vowel deletion rule. Thus there is only one stem in the dictionary, and, what is more important, we don't have to accept as endings strings such as eln, lar, nen, nar etc. In the second case, SMU analyses the stem in two steps, i.e. as a dictionary stem followed by a stem building element, e.g. pojk-e, dröm-m, läm-mel. A word form such as pojken is analysed pojk-e+n, lämmeln [...] responsible for the recognition of the stem building segments and their distribution in the paradigm (see further Sågvall Hein, forthcom.).
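The effect of reducing a vowel stem alternant to its non-vowel counterpart can be illustrated with a toy rewrite. This is only a sketch under strong assumptions: the actual rewriting rule is stated in the UCP grammar and is not reproduced in this paper, so the regular expression and the function name `reduce_to_canonical_stem` are hypothetical.

```python
import re

# Assumed toy rule: delete an "e" that stands between a consonant and a
# stem-final l, n, or r (spegel -> spegl, socken -> sockn, finger -> fingr).
_SECONDARY_VOWEL = re.compile(r"(?<=[^aeiouyåäö])e(?=[lnr]$)")


def reduce_to_canonical_stem(stem: str) -> str:
    """Return the non-vowel alternant of a stem, or the stem unchanged."""
    return _SECONDARY_VOWEL.sub("", stem)
```

Under this toy rule a single dictionary stem serves both spegeln and speglar, so the ending inventory only needs n and ar rather than strings such as eln and lar.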
In all, there are 132 (stem) pattern rules for the nouns, 40 for the adjectives, 89 for the verbs, 1 for the articles, 30 for the pronouns, 5 for the numerals, 9 for the adverbs, 2 for the conjunctions, and one each for the prepositions, the interjections, and the infinitive marker. In most cases the analysis is based on one stem, and in these cases, the stem with its pattern rule is a sufficient characterisation of the inflectional behaviour of the lemma as such. If, on the other hand, the lemma is represented by more than one stem in the dictionary (cf. 17 and 18 in fig. 3), the set of stems involved along with their pattern words determine the inflection of the lemma, the lemma inflection as opposed to the stem inflection. Frequency data on the inflectional types of the SMU words have been presented elsewhere (Sågvall Hein & Sjögreen 1991).
     SOB                                                      SMU stem(s)                 SMU lemma.pattern
1    gård subst. ~en ~ar                                      gård                        gård.nn.stol
2    sjö subst. ~n ~ar                                        sjö                         sjö.nn.fru
3    pojk/e subst. ~en ~ar                                    pojk-e                      pojke.nn.gosse
4    speg/el subst. ~eln ~lar                                 speg(e)l                    spegel.nn.nyckel
5    sock/en subst. ~nen ~nar                                 sock(e)n                    socken.nn.öken
6    bott/en subst. ~nen el. =, ~nar                          bott(e)n                    botten.nn.botten
7    myrt/en subst. ~en ~nar                                  myrt(e)n                    myrten.nn.fröken
8    fing/er subst. ~ret el. ~ern ~rar                        fing(e)r                    finger.nn.finger
9    dröm subst. ~men ~mar                                    dröm, dröm-m                dröm.nn.kam
10   läm/mel subst. ~meln ~lar                                läm-mel, läm-l              lämmel.nn.lämmel
11   sum/mer subst. ~mern ~rar                                sum-mer, sum-r              summer.nn.hummer
12   sommar subst. ~(e)n somrar                               som, som-mar                sommar.nn.sommar
13   hammare subst. hammar(e)n, = el. hamrar, best. plur.     ham-r, ham-mar, ham-mare    hammare.nn.kammare
14   himmel subst. himmel(e)n el. himlen, himlar              him-mel, him-l              himmel.nn.himmel
15   afton subst. aftonen aftnar                              aft-on, aft-n               afton.nn.morgon
16   djävul subst. ~en djävlar                                djäv-ul, djäv-l             djävul.nn.djävul
17   moder äv. mor subst. modern mödrar                       moder                       moder.nn.moder
                                                              mödrar                      moder.nn.mödrar
                                                              mor                         moder.nn.far
18   tok subst. ~en ~ar el. tok/er ~ern ~ar                   tok                         tok.nn.stol
                                                              toker                       tok.nn.toker
19   stadgar subst. plur.                                     stadg                       stadgar.nn.vägnar

The Scope of Svensk Ordbok and the SMU analyser
For the recognition, and subsequent examination, of missing entries in the dictionary, we apply the SMU analyser to different sets of Swedish text. So far, we have analysed four text materials of substantial size, i.e. the 10,224 most frequent types of the 7.3 million word newspaper corpus of the Swedish Language Bank (Gellerstam 1989), referred to as PressFreq; the pharmacological text of the Swedish drug catalogue FASS (1985) (660,000 current words); the Professional Prose corpus of the Skrivsyntax project (Teleman 1974), referred to as ProfProse (78,036 current words); and, finally, the corpus of the definitions of Svensk Ordbok, referred to as DefVoc (360,144 current words). ProfProse consists of four types of text of equal size (textbooks, newspapers, debate books, and brochures).
corpus      tokens      types    ty/to   covered        uncovered
PressFreq   5,091,965   10,224   0.002    8,424 (82%)    1,790 (18%)
ProfProse      78,036   13,766   0.18    10,083 (73%)    3,683 (27%)
DefVoc        360,144   43,934   0.12    31,350 (71%)   12,584 (29%)
FASS          664,314   39,884   0.06     9,767 (25%)   30,117 (75%)
Table 2. Results of the application of the SOB-based SMU analyser to four sets of Swedish text
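The derived columns of Table 2 follow from the token, type, and uncovered-type counts. The sketch below recomputes them; the dictionary `TABLE_2` and the function `summarize` are illustrative names introduced here, and because the figures are recomputed from the counts, they may differ slightly from the rounded values printed in the table.

```python
from typing import Dict, Tuple

# corpus: (tokens, types, uncovered types), copied from Table 2
TABLE_2: Dict[str, Tuple[int, int, int]] = {
    "PressFreq": (5_091_965, 10_224, 1_790),
    "ProfProse": (78_036, 13_766, 3_683),
    "DefVoc": (360_144, 43_934, 12_584),
    "FASS": (664_314, 39_884, 30_117),
}


def summarize(tokens: int, types: int, uncovered: int) -> Dict[str, float]:
    """Recompute type/token ratio and lexical coverage for one corpus."""
    covered = types - uncovered
    return {
        "ty/to": types / tokens,
        "covered": covered,
        "covered_pct": 100 * covered / types,
        "uncovered_pct": 100 * uncovered / types,
    }
```

For example, `summarize(*TABLE_2["ProfProse"])` gives 10,083 covered types and a type/token ratio of about 0.18.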
As might be expected, the analyser covers best in relation to the high-frequency words of the newspaper text and worst with respect to the highly domain-specific pharmacological text. In between come the more general text of ProfProse and that of the definition corpus, covered, basically, to the same extent. In tables 3 to 6 we present more detailed data on the results of the processing of each of the text materials. They include information on homography of three kinds, i.e. (lemma) internal, (lemma) external, and mixed (internal and external). (For each text material, we also have detailed data on the different subtypes of homographies that were found, e.g. int. AV, ext. AB/AB, ext. AB/KN, ext. AB/NN, and their frequencies.)
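The three kinds of homography can be told apart from the lemma values of a word form's analyses. The sketch below is one possible reading of the definitions above, assuming that internal homography means several analyses under one lemma, external means one analysis for each of several lemmas, and mixed means both; the function name and the returned labels are hypothetical.

```python
from typing import Sequence


def classify_homography(lemmas: Sequence[str]) -> str:
    """Classify a word form by the LEM values of its analyses (one per parse)."""
    parses = len(lemmas)
    if parses == 0:
        return "uncovered"
    if parses == 1:
        return "unambiguous"
    distinct = len(set(lemmas))
    if distinct == 1:
        return "internal"   # all parses belong to the same lemma
    if distinct == parses:
        return "external"   # every parse belongs to a different lemma
    return "mixed"          # several lemmas, at least one with several parses
```

This reading is consistent with the tables that follow, where word forms with two parses are either internal or external and mixed cases only occur from three parses upward.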
Number of    lexical coverage     textual coverage
parses       types      %         tokens        %
0             1,790    17.5%        299,235     5.9%
1             6,142    60.0%      2,270,961    44.6%
2 (int.)        593     5.8%        152,334     3.0%
2 (ext.)        959     9.4%      1,391,623    27.3%
3 (int.)        158     1.5%         24,934     0.5%
3 (ext.)        191     1.9%        435,066     8.5%
3 (mix.)        206     2.0%         54,176     1.1%
4 (ext.)         33     0.3%         43,814     0.9%
4 (mix.)         69     0.7%         24,544     0.5%
5 (ext.)         14     0.1%        258,713     5.1%
5 (mix.)         54     0.5%         20,382     0.4%
6 (ext.)          6     0.1%         83,669     1.6%
6 (mix.)          7     0.1%         32,118     0.6%
7 (mix.)          1     0.0%             96     0.0%
8 (mix.)          1     0.0%            280     0.0%
Total:       10,224   100.0%      5,091,965   100.0%
Table 3. Results of the application of the SOB-based SMU analyser to PressFreq
The lemma can be unambiguously determined for 6,893 types (67.4%) and 2,447,956 tokens (48.1%). In the section PressFreq words outside the scope of SOB below, we give an account of the kinds of words that got no parses.
Number of    lexical coverage     textual coverage
parses       types      %         tokens     %
0             3,683    26.8%       6,806     8.7%
1             7,828    56.9%      38,015    48.7%
2 (int.)        800     5.8%       2,693     3.5%
2 (ext.)        829     6.0%      17,638    22.6%
3 (int.)        161     1.2%         282     0.4%
3 (ext.)        149     1.1%       6,068     7.8%
3 (mix.)        168     1.2%         758     1.0%
4 (ext.)         24     0.2%         455     0.6%
4 (mix.)         63     0.5%         384     0.5%
5 (ext.)          8     0.1%       3,282     4.2%
5 (mix.)         44     0.3%         266     0.3%
6 (ext.)          3     0.0%       1,201     1.5%
6 (mix.)          4     0.0%         186     0.2%
7 (mix.)          1     0.0%           1     0.0%
8 (mix.)          1     0.0%           1     0.0%
Total:       13,766   100.0%      78,036   100.0%
Table 4. Results of the application of the SOB-based SMU analyser to ProfProse
The lemma can be unambiguously determined for 8,789 types (63.8%) and 40,990 tokens (52.5%). Roughly 13% (479) of the words that got no analysis are numerical expressions.
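The figures for unambiguous lemma determination appear to be the sum of the single-parse row and the internal homograph rows, since internal homographs share one lemma. The sketch below checks this against the ProfProse counts; the layout of `TABLE_4` is an assumption, and the token count for 6 (ext.), which is illegible in the source, is taken here as the value that makes the column sum to the stated total.

```python
from typing import Dict, Tuple

# parses: (types, tokens), copied from Table 4 (ProfProse)
TABLE_4: Dict[str, Tuple[int, int]] = {
    "0": (3683, 6806), "1": (7828, 38015),
    "2 (int.)": (800, 2693), "2 (ext.)": (829, 17638),
    "3 (int.)": (161, 282), "3 (ext.)": (149, 6068), "3 (mix.)": (168, 758),
    "4 (ext.)": (24, 455), "4 (mix.)": (63, 384),
    "5 (ext.)": (8, 3282), "5 (mix.)": (44, 266),
    "6 (ext.)": (3, 1201),  # token count assumed from the column total
    "6 (mix.)": (4, 186),
    "7 (mix.)": (1, 1), "8 (mix.)": (1, 1),
}


def unambiguous_lemma_counts(table: Dict[str, Tuple[int, int]]) -> Tuple[int, int]:
    """Sum types and tokens over the single-parse and internal-homograph rows."""
    rows = [v for k, v in table.items() if k == "1" or k.endswith("(int.)")]
    return sum(types for types, _ in rows), sum(tokens for _, tokens in rows)
```

Dividing the two sums by the column totals (13,766 types and 78,036 tokens) reproduces the 63.8% and 52.5% quoted above.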
Number of    lexical coverage
parses       types      %
0            12,584    28.6%
1            26,348    60.0%
2 (int.)      1,654     3.8%
2 (ext.)      2,205     5.0%
3 (int.)        251     0.6%
3 (ext.)        258     0.6%
3 (mix.)        372     0.8%
4 (int.)          1     0.0%
4 (ext.)         40     0.1%
4 (mix.)        137     0.3%
5 (ext.)          9     0.0%
5 (mix.)         63     0.1%
6 (ext.)          3     0.0%
6 (mix.)          7     0.0%
7 (mix.)          1     0.0%
8 (mix.)          1     0.0%
Total:       43,934   100.0%
Table 5. Results of the application of the SOB-based SMU analyser to DefVoc
The lemmas of 28,254 of the types (64%) can be determined unambiguously. Only 430 (~3.4%) of the 0-parses are numerical expressions.
Number of    lexical coverage
parses       types      %
0            30,117    75.5%
1             7,598    19.1%
2 (int.)        874     0.2%
2 (ext.)        766     0.2%
3 (int.)        147     0.0%
3 (ext.)        116     0.0%
3 (mix.)        142     0.0%
4 (ext.)         17     0.0%
4 (mix.)         52     0.0%
5 (ext.)          6     0.0%
5 (mix.)         42     0.0%
6 (ext.)          3     0.0%
6 (mix.)          3     0.0%
8 (mix.)          1     0.0%
Total:       39,884   100.0%
Table 6. Results of the application of the SOB-based SMU analyser to FASS
The lemmas of 8,619 of the types (22%) can be determined unambiguously. 11,148 (27%) of the 0-parses are numerical expressions or hybrids of numbers, special signs, and single letters, such as 75-80, 85%, 4$8, F100, E218, C++, 0,5-0,7, 20:e etc. (A pharmacological stem dictionary covering the non-numerical words outside the scope of SOB has been built; see Sågvall Hein et al., forthcom.) In fig. 4 we present a drug description from FASS to illustrate the special character of this text.
Abboticin®
Abbott
Dosgranulat 200 mg
Antibiotikum Grupp 7B 3005
Deklaration. 1 dosgranulat innehåller: Erythromycin. aethylsuccin. respond. erythromycin. 200 mg, mannitol. 1,5 g, constit. et aroma q. s.
Egenskaper. Dosgranulaten innehåller erytromycinetylsuccinat motsvarande 200 mg erytromycin. Granulatet löses i litet vatten (2-3 dessertskedar = 20-30 ml). Beredningsformen är speciellt avsedd för barn och är även lämplig som jourförpackning. Beredningen är sockerfri och har körsbärssmak.
Erytromycinetylsuccinat är en ester av erytromycin och efter absorption sker hydrolys till fritt aktivt erytromycin. Se f ö Abboticin tabletter.
Indikationer. Se Abboticin tabletter.
Kontraindikationer. Se Abboticin tabletter.
Försiktighet. Se Abboticin tabletter.
Graviditet och amning. Se Abboticin tabletter.
Biverkningar. Se Abboticin tabletter.
Dosering. Dosen för barn beräknas efter 30-50 mg per kg kroppsvikt och dygn fördelat på 2-4 doseringstillfällen. 1 dosgranulat = 200 mg erytromycin. För barn upp till 4 kg beräknas dosen i det enskilda fallet. Vid kroppsvikt överstigande 4 kg kan om dygnsdosen fördelas på två doseringstillfällen följande schema vanligen tillämpas:
Vikt     Dygnsdosering   mg/kg/dygn   Lämplig förpackning för
kg       dosgranulat                  10 dagars behandling
4-7      ½x2             29-50        1x30
8-14     1x2             29-50        1x30
15-24    2x2             33-53        2x30
25-34    3x2             35-48        2x30
Inträffar gastrointestinala problem rekommenderas uppdelning av dygnsdosen på 3 eller 4 administreringstillfällen. För vuxna och barn över 35 kg ges 3 dosgranulat 3 gånger per dygn. Optimal absorption erhålles om dosen intages omedelbart före måltid.
Interaktion. Se Abboticin tabletter.
Förpackningar och priser. Dosgranulat 200 mg
30 st 55:30
PressFreq words outside the scope of SOB
Proper nouns             1,433   (80.1%)
Abbreviations              137    (7.7%)
Compounds                  127    (7.1%)
Numerical expressions       45    (2.5%)
Derivatives                 20    (1.1%)
Foreign words               17    (0.9%)
Syntagmatic words            5    (0.3%)
Partial phrases              4    (0.2%)
Inflectional forms           2    (0.1%)
Total                    1,790    (100%)
Table 7. Kinds of uncovered types in PressFreq
Proper nouns. The dominating category is that of the proper nouns (including a small number of proper noun abbreviations, e.g. ABF, AIK, DN, DDR, KFUM). No normalization of spelling variation has been carried out so far, so each appearance has been counted as an independent unit, e.g. Anna, anna, and ANNA; Erik and Eric; Bernard and Bernhard; Bengtsson, Bengtson, and B-son; Lidingö and Lid-ö etc. Roughly half of the proper names refer to people (approximately 440 first names and 280 second names).
The high number of proper nouns, in particular personal proper nouns, seems to be a characteristic feature of newspaper text. Most of the proper nouns that we found will be entered into the dictionary with a marking of their origin (PressFreq) as a clue to future work on domain.
Abbreviations. The abbreviation category comprises abbreviations (excl. those of the proper nouns) in various orthographic shapes, e.g. bl.a and bl_a; FEB, feb. and febr; kr and kr. etc. They will all be included in the core of our dictionary (see Östling, this volume), and some of them will be treated as functional core phrases (Sågvall Hein et al. 1990) and represented in the dictionary as such, e.g. bl a and bl. a., d. v. s. and d v s etc. The figures presented in table 9 are, however, based on individual text words, for instance bl, bl., a, a. etc.
Compounds. As is well-known, the Swedish compounds make up an open category, and in table 10 we present an overview of the different kinds of PressFreq compounds that were found to be outside the present scope of the SMU dictionary.
Type of compound   No of members   Examples
NN-NN -> NN        97              arbetsuppgifter, hälso-
NL-NN -> NN        10              andraplats, 30-talet
NN-AV -> AV         6              medelstora, nordöstra
AV-NN -> NN         4              nypris
NL-AV -> AV         3              50-årig
NL-NN-NN -> NN      2              50-årsåldern
AV-VB -> VB         2              nybyggda
NN-NN -> AB         1              förhoppningsvis
P-NN -> AB          1              häromåret
P-NN -> NN          1              överåklagare
P-VB -> VB          1              efteranmäld
NN-NN-VB -> VB      1              text-TV-textat
Total             129
Following Blåberg (1988) we refer the participles to the verb category. Further, prepositions and adverbs are treated as members of one common category (in the compound context), denoted P.
Many of the productive compound types are indicative of domain (economy, politics, social security, sport, culture, weather, TV and broadcasting). The effect of compounding on domain is an important issue in our future work on a domain-sensitive extension of the dictionary. It is one of the criteria that should be taken into account when considering a rule-based (as opposed to a lexicalized) treatment of compounds.
Numerical expressions. The members of the numerical category are 39 expressions consisting of Arabic numerals, and hybrids of numerals, special signs, and single letters, such as 03 111 19840, 1430, 25:E, and D0502. (Some Roman numerals were also found, e.g. VII, but referred to the proper noun category as candidates for (parts of phrasal) proper nouns.)
Derivatives. Most of the derived words outside the scope of SMU (14 of 20) can be found in Svensk Ordbok as morphological examples (see table 11). This means that their existence is confirmed, and that their meanings (definitions) should be derivable from the definitions of the words (lexemes) that they illustrate. When a lemma has more than one lexeme (definition), the morphological example tells us on what definition the meaning of the derived word should be based, at least primarily (see osäker in table 11). Five derivatives are not presented as morphological examples, but are derived from one-lexeme lemmas, and so the definition on which to base their derived meaning is uniquely determined. The remaining case, however, will cause overgeneration: spelmässigt is derived from a noun with 9 definitions and an adjectival suffix with 2 definitions; the derivational power of the lexeme is in no way constrained, and we will have to consider them all equally well fit as bases of the derived words. (In all, there are 39,831 morphological examples in SOB.)
Type                                    Entry in SOB             Morph. ex.
avveckling, avvecklingen                avveckla verb            +
mobbning                                mobba verb               +
utvisning, utvisningar                  utvisa verb              +
sänkning                                sänka verb               + (1st lexeme)
pensionering, pensioneringen            pensionera verb          - (1 lexeme)
enighet                                 enig adj.                +
osäkerhet                               osäker adj.              + (3rd lexeme)
skicklighet                             skicklig adj.            +
trovärdighet                            trovärdig adj.           - (1 lexeme)
öppenhet                                öppen adj.               + (4th lexeme)
författarinnan                          författare subst.        +
socialdemokratisk, socialdemokratiska   socialdemokrat subst.    +
mittfältare, mittfältaren               mittfält subst.          - (1 lexeme)
spelmässigt                             spel subst.              - (9 lexemes)
                                        -mässig                  - (2 lexemes)
Table 9. Uncovered derivatives in PressFreq
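The overgeneration noted for spelmässigt can be made concrete by enumerating the unconstrained pairings of base lexemes and suffix lexemes. The sketch below is illustrative only; the function `candidate_readings` and the numbering of lexemes from 1 are assumptions introduced here.

```python
from itertools import product
from typing import List, Tuple


def candidate_readings(base_lexemes: int, suffix_lexemes: int) -> List[Tuple[int, int]]:
    """Enumerate every (base lexeme, suffix lexeme) pair a derived word could be based on."""
    return list(product(range(1, base_lexemes + 1), range(1, suffix_lexemes + 1)))
```

With 9 lexemes for spel and 2 for -mässig, `candidate_readings(9, 2)` contains 18 pairs, whereas a derivative of a one-lexeme lemma with a one-lexeme suffix has a single candidate.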
Foreign words. [...] phraseological expressions, and so they won't be entered as individual entries into the dictionary. Glasnost, alone, will be entered into the SMU dictionary, and marked with respect to origin (PressFreq).
Syntagmatic words. Four missing types are examples of varying writing conventions, i.e. iväg (cf. i väg), godnatt (cf. god natt), långtifrån (cf. långt ifrån), framförallt (cf. framför allt). To the same category we refer the colloquial gomiddag (cf. god middag). The one-word variants will all be included in the SMU dictionary (the last one marked as colloquial).
Partial phrases. Four missing types are old inflectional forms appearing in phraseological expressions only, i.e. godo (till godo [to someone's credit etc.]; i godo [amicably etc.]), sjöss (till sjöss [at sea]), vintras (i vintras [last winter]), and somras (i somras [last summer]). (Till godo and till sjöss can be found among the examples of ¹god and sjö.) The four expressions will be entered into the SMU dictionary as phrases.
Inflectional forms. Two underived inflectional forms were found to be unaccounted for, i.e. måst (supine of the verb måste [must]) and törs (present tense of the verb tör/as or tordas [dare]), even though törs is part of an example of the verb. Frequent as they are found to be in newspaper text, both forms will be included in the SMU dictionary.
Conclusions
T h e S M U a n a ly se r, o p e ra tin g o n S w e d ish te x t, w o rk s w e ll a s a to o l f o r d istin g u is h in g b e tw e e n g e n e ra l v o c a b u la ry , as d e fin e d b y th e le m m a e n trie s o f S v e n sk O rd b o k (i.e. its e x p lic itly d e fln e d v o c a b u la ry ), a n d w o rd s o u tsid e th a t sc o p e. A s a re s u lt o f th e m o rp h o lo g ic a l a n a ly sis, m e m b e rs o f th e g e n eral v o c a b u la ry a re id e n tifie d and d e sc rib e d in te rm s o f le m m a , p a rt o f s p e e c h , a n d fo rm , an d h o m o g ra p h ie s a re re c o g n iz e d in a c c o rd a n c e w ith th e le m m a d istin c tio n s m a d e in S O B .
The processing of four different Swedish materials has shown that the SOB lexical coverage (in terms of types) ranges from 82% to 25%. The highest figures are valid for high-frequency words of newspaper text, PressFreq, and the lowest ones for highly specialized pharmacological text. In between we find some good 70% relating to general LSP (Language for Special Purposes) text.
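Lexical coverage in terms of types can be illustrated with a minimal sketch. It assumes a hypothetical lookup function that returns a (possibly empty) list of analyses for a word form; the toy lexicon and the name `type_coverage` are illustrative assumptions and are not part of SMU or UCP.

```python
from typing import Callable, Iterable, List, Set, Tuple


def type_coverage(
    word_forms: Iterable[str],
    analyse: Callable[[str], List[str]],
) -> Tuple[float, Set[str]]:
    """Return (share of distinct types with at least one analysis, zero-parse types)."""
    types = set(word_forms)  # coverage is counted over types, not tokens
    if not types:
        return 0.0, set()
    zero_parses = {t for t in types if not analyse(t)}
    return 1 - len(zero_parses) / len(types), zero_parses


# Hypothetical stand-in for the analyser: a word is "analysed" if it is in a toy lexicon.
TOY_LEXICON = {"hus", "bil", "och", "gå"}


def toy_analyse(word: str) -> List[str]:
    return [word] if word in TOY_LEXICON else []


coverage, unanalysed = type_coverage(
    ["hus", "bil", "och", "glasnost", "hus"], toy_analyse
)
print(coverage, unanalysed)
```

Because the repeated token hus is counted once, the figure reflects lexical coverage only; textual (token-based) coverage would need frequency data, which note 4 states were not at hand.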
The words outside the scope of the analyser indicate the domain and type of the analysed text. The large number of numerical expressions (and hybrids of numerals, special signs and single letters), i.e. 27% of the unanalyzed words, stands out in the pharmacological text, as does that of proper nouns (close to 80% of the unanalyzed words) in PressFreq.
The PressFreq zero-parses have been examined in some detail, and categorized into: proper nouns, abbreviations, compounds, numerical expressions, derivatives, foreign words, syntagmatic words, partial phrases, and simple inflectional forms. Abbreviations, syntagmatic words, partial phrases, and inflectional forms form a, basically, closed set of a general character (in all, less than 150 items). They will all be included in the general part of the SMU dictionary (as one-word units or as phrases).
Most of the foreign words (except for glasnost) seem to be part of phraseological expressions (proper nouns), and for the time being they will be disregarded, but glasnost will be entered in the dictionary, marked by origin (PressFreq), as a first clue to domain. The proper nouns, forming a big, but, basically, closed and domain-related category, will be handled in the same manner. The numerical expressions, forming an open category, will be handled by means of rules, defined in the SMU grammar (see Sågvall Hein 1987).
Among the zero-parse derivatives, six types were found, i.e. verb-to-noun by means of the suffix -ing (the process; 8 cases), adj-to-noun by means of the suffix -het (presence of the property; 5 cases), noun-to-adj by means of -isk (the property; 2 cases), noun-to-noun by means of the suffix [...] of -mässig (according to the noun etc.; 1 case). The first four types will be handled by means of word formation rules in the grammar, whereas the remaining two cases will be entered into the dictionary, marked by origin. This treatment is supported by SOB, presenting the first four types as morphological examples.
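A suffix-based word formation rule of the kind mentioned above might be sketched as follows. This is only an illustration under stated assumptions: the rule table covers the three suffixes that are legible in the text (-ing, -het, -isk), the stem lexicon is a hypothetical toy, and plain suffix stripping ignores the stem alternations a real grammar would have to handle. It is not the SMU rule formalism.

```python
from typing import Callable, NamedTuple, Optional


class Derivation(NamedTuple):
    base: str
    suffix: str
    base_pos: str
    derived_pos: str


# (suffix, part of speech of the base, part of speech of the derivative)
RULES = [
    ("ing", "verb", "noun"),  # the process
    ("het", "adj", "noun"),   # presence of the property
    ("isk", "noun", "adj"),   # the property
]


def analyse_derivative(
    word: str, stem_pos: Callable[[str], Optional[str]]
) -> Optional[Derivation]:
    """Strip a known suffix and accept the result if the base is a known stem
    of the part of speech that the rule requires."""
    for suffix, base_pos, derived_pos in RULES:
        if word.endswith(suffix) and len(word) > len(suffix):
            base = word[: -len(suffix)]
            if stem_pos(base) == base_pos:
                return Derivation(base, suffix, base_pos, derived_pos)
    return None


# Hypothetical toy stem lexicon (stem -> part of speech), for illustration only.
TOY_STEMS = {"säker": "adj", "parker": "verb"}

print(analyse_derivative("säkerhet", TOY_STEMS.get))
print(analyse_derivative("glasnost", TOY_STEMS.get))
```

Requiring the base to be a known stem of the right part of speech keeps the rule from accepting arbitrary words that merely end in one of the suffixes.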
The most difficult category to handle is that of the compounds, being an open, productive category, with a complex semantics, dominating the zero-parses of general LSP text (see Sågvall Hein 1990). Further, compounding has a bearing on domain. In our continued work on the dictionary we will approach the problems of the compounds from the point of view of the effect of compounding on domain. The material presented by the application of the SMU analyser to text of different type and domain is a valuable source for such studies.
Notes
1 Cf. SWETWOL by Karlsson (forthcom.) performing rule-based structural analysis of compounds and derivatives.
2 The secondary vowel deletion rule is part of the inflectional grammar and invoked by the dictionary search process.
3 In the sense of dictionary stem.
4 Textual frequency data were not at hand when the analysis was carried out, so only lexical coverage can be accounted for here.
5 In a pilot study of a fragment (2,500 types) of DefVoc, the 572 types outside the scope of SOB were examined (see Sågvall Hein 1990).
6 Even though mittfältare doesn't appear as a morphological example, the relative mittfältsspelare does.
References
Blåberg, O. 1988. A study of Swedish compounds. Umeå University. Department of General Linguistics. Report No 29.
FASS. Farmaceutiska specialiteter i Sverige. 1985. [Pharmaceutical Specialties in Sweden.] LINFO.
Gellerstam, M. 1989. The Language Bank. The Department of Computational Linguistics. University of Gothenburg.
Hellberg, S. 1978. The morphology of present-day Swedish. Stockholm.
Karlsson, F. SWETWOL: A comprehensive morphological analyzer for Swedish. Forthcoming.
Östling, A. A Swedish Core Vocabulary for Machine Translation. This volume.
Shieber, S. 1986. An introduction to unification-based approaches to grammar. CSLI. Lecture Notes Number 4.
Sjögreen, C. 1988. Creating a dictionary from a lexical database. In: Studies in computer-aided lexicology. Stockholm. Pp. 299-338.
Sågvall Hein, A. 1987. Parsing by means of Uppsala Chart Processor (UCP). In: L. Bolc (ed.) Natural language parsing systems. Berlin & Heidelberg. Pp. 203-266.
Sågvall Hein, A. 1988. Towards a comprehensive Swedish parsing dictionary. In: Studies in computer-aided lexicology. Stockholm. Pp. 268-298.
Sågvall Hein, A. The SMU inflectional grammar. Uppsala University. Department of Linguistics. Forthcoming.
Sågvall Hein, A. & Ahrenberg, L. 1985. A parser for Swedish. Status Report for Sve.Ucp. June 1985. Uppsala University. Center for Computational Linguistics. UCDL-R-85-2.
Sågvall Hein, A. & Sjögreen, C. 1991. Ett svenskt stamlexikon för datamaskinell morfologisk analys. En översikt [A Swedish stem dictionary for computational morphological analysis. An overview.] In: M. Thelander et al. (eds.) Svenskans beskrivning 18. Lund. Pp. 348-360.
Sågvall Hein, A., Östling, A. & Wikholm, E. 1990. Phrases in the Core Vocabulary. Uppsala University. Center for Computational Linguistics.
Sågvall Hein, A., Starbäck, P. & Wikholm, E. A pharmacological stem dictionary based on FASS, Pharmacological Specialties in Sweden 1985. Uppsala University. Department of Linguistics. Forthcoming.
Svensk Ordbok. 1986. [A Dictionary of Swedish.] Stockholm.
Teleman, U. 1974. Manual för beskrivning av talad och skriven svenska. Lund.
Anna Sågvall Hein
Uppsala University
Department of Linguistics
Computational Linguistics
Box 513
S-751 20 Uppsala