On the Coverage of a Morphological Analyser based on “Svensk Ordbok” [A Dictionary of Swedish]


Anna Sågvall Hein

Uppsala University

Introduction

In the project A Lexicon-oriented Parser for Swedish, a stem dictionary (Sågvall Hein & Sjögreen 1991; Sjögreen, forthcom.) covering the 58,536 entry lemmas of Svensk Ordbok (1986) was generated, along with a complete inflectional grammar of Swedish (Sågvall Hein, forthcom.). This language description together with the Uppsala Chart Processor, UCP (Sågvall Hein 1987), constitutes a morphological analyser of Swedish, henceforth referred to as SMU, short for Swedish Morphology in the UCP framework.

So far, there are no word formation rules in the SMU grammar, and words outside the scope of Svensk Ordbok don't get an analysis. Even though closed in its present version, the coverage of SMU is well-defined; prior to any processing we may consult Svensk Ordbok to find out for any word form whether it will get an analysis or not; the dictionary provides an intuitive, familiar format through which we may explore the (present) competence of the SMU analyser without any prior knowledge of its formalisms or operation. SMU is also well-defined in the sense that, for any of its lemmas, Svensk Ordbok provides links to the corresponding lexemes (basic senses), and for each lexeme a definition.

In our ongoing work on a machine-tractable dictionary for Swedish, we are approaching problems concerning the distinction between general and domain-specific vocabulary, and the present coverage of SMU is our starting-point for delimiting a general Swedish vocabulary. For an evaluation of the generality of the dictionary, the analyser has been applied to different sets of Swedish text. For one of them, consisting of the 10,224 most frequent types of the 7.3 million word newspaper corpus of The Language Bank (Gellerstam 1989), the words outside the scope of the analyser have been examined in some detail. Here we will present the results achieved so far, and also discuss their impact on our continued work on the dictionary.

First, however, we will briefly characterize the SMU analyser with regard to morphological descriptions and dictionary representation of inflection.

The morphological descriptions of the SMU analyser

The morphological descriptions generated by the analyser are expressed as attribute-value structures (Sågvall Hein & Ahrenberg 1985; cf. directed acyclic graphs, dags, for short, Shieber 1986).


The description in fig. 1 comprises four general attributes, i.e. LEM for lemma, WORD.CAT for word category (part of speech), DIC.STEM for dictionary stem, and INFL for inflection type, and four attributes specific to the nouns, i.e. GENDER, NUMB, FORM (species), and CASE. The general attributes are present in the descriptions of all the words, regardless of part of speech (noun, adjective, pronoun, verb, adverb, numeral, preposition, conjunction, interjection, article, and infinitive marker).

FESTERNAS
(* = (LEM = FEST.NN
      WORD.CAT = NOUN
      INFL = PATTERN.FILM
      DIC.STEM = FEST
      GENDER = UTR
      NUMB = PLUR
      FORM = DEF
      CASE = GEN))

Figure 1. An analysis of the noun festernas [of the parties]
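As a rough illustration of the attribute-value format, the analysis of Figure 1 can be written down as a flat Python dictionary. This is only a sketch for readers: the analyser itself is not written in Python, the helper `split_description` is hypothetical, and the pattern name is taken here to read PATTERN.FILM.

```python
# Illustrative only: a plain dict standing in for the attribute-value
# structure of Figure 1. The names below mirror the attributes in the text.
GENERAL_ATTRIBUTES = ("LEM", "WORD.CAT", "DIC.STEM", "INFL")

festernas = {
    "LEM": "FEST.NN",
    "WORD.CAT": "NOUN",
    "INFL": "PATTERN.FILM",  # assumed reading of the pattern name
    "DIC.STEM": "FEST",
    "GENDER": "UTR",
    "NUMB": "PLUR",
    "FORM": "DEF",
    "CASE": "GEN",
}


def split_description(description):
    """Separate the four general attributes from the part-of-speech specific ones."""
    general = {k: v for k, v in description.items() if k in GENERAL_ATTRIBUTES}
    specific = {k: v for k, v in description.items() if k not in GENERAL_ATTRIBUTES}
    return general, specific


general, specific = split_description(festernas)
print(sorted(general))   # the attributes shared by all parts of speech
print(sorted(specific))  # the attributes specific to the noun
```

Keeping the general attributes apart in this way mirrors the distinction drawn in the text between attributes present for every word and those that depend on the part of speech.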

The value of the lemma attribute is identical to the basic form of the lemma with a (two letter) word class marker for the distinction between homograph lemmas, e.g. springa1.nn [chink; slot] and springa2.vb [run]. In addition, the homographs are numbered as they are in Svensk Ordbok (cf. ¹spring/a subst. [noun] and ²spring/a verb), whereby immediate reference to this background material is facilitated. If there are two homograph lemmas of the same word category, the homograph number alone will keep them apart, e.g. bok1.nn [book] (plur. böcker) and bok2.nn [beech (tree)] (plur. bokar). The well-defined lemma marker supports the distinction between external and internal homography, a basis for subsequent lemmatization. Further, it provides a basis for the selection of domain-tuned lemma dictionaries from a general dictionary; the lemmas specific to the domain are recognized by the morphological analysis of texts typical of that domain. The dictionary stem attribute may serve the same function in building a domain-tuned stem dictionary from a general stem dictionary.
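The structure of the lemma marker (basic form, optional homograph number, two-letter word class) lends itself to a small parsing sketch. The regular expression and the function name `parse_lemma` are assumptions made for illustration; they are not part of SMU or UCP.

```python
import re

# Basic form, optional homograph number, a dot, and a two-letter word class.
LEMMA_MARKER = re.compile(
    r"^(?P<base>.+?)(?P<number>\d+)?\.(?P<word_class>[a-z]{2})$",
    re.IGNORECASE,
)


def parse_lemma(marker):
    """Split a lemma marker such as 'springa1.nn' into its three parts."""
    match = LEMMA_MARKER.match(marker)
    if match is None:
        raise ValueError(f"not a lemma marker: {marker!r}")
    number = match.group("number")
    homograph_number = int(number) if number else None
    return match.group("base"), homograph_number, match.group("word_class")


for marker in ("springa1.nn", "springa2.vb", "bok1.nn", "bok2.nn"):
    print(marker, parse_lemma(marker))
```

With the parts separated, two markers with the same basic form can be told apart either by word class (springa1.nn, springa2.vb) or, within one word class, by homograph number alone (bok1.nn, bok2.nn).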

The value of the inflection attribute is a pattern word which is also the name of an inflectional rule defined in the grammar. The inclusion of this information in the morphological descriptions provides a basis for frequency studies of inflectional types in running text.

In addition to the general attributes, each part of speech (except for the prepositions and the infinitive marker) is characterised by its own set of attribute-value pairs. In fig. 2 we present the morphological descriptions resulting from the analyses of an adjectival paradigm, the adjective festlig [festive; grand].

FESTLIG
(* = (LEM=FESTLIG.AV WORD.CAT=ADJ INFL=PATTERN.BLEK DIC.STEM=FESTLIG GENDER=UTR NUMB=SING FORM=INDEF COMP=POS))

FESTLIGT
(* = (LEM=FESTLIG.AV WORD.CAT=ADJ INFL=PATTERN.BLEK DIC.STEM=FESTLIG GENDER=NEUTR NUMB=SING FORM=INDEF COMP=POS))

FESTLIGA
(* = (LEM=FESTLIG.AV WORD.CAT=ADJ INFL=PATTERN.BLEK DIC.STEM=FESTLIG COMP=POS NUMB=SING FORM=DEF))
(* = (LEM=FESTLIG.AV WORD.CAT=ADJ INFL=PATTERN.BLEK DIC.STEM=FESTLIG COMP=POS NUMB=PLUR))

FESTLIGE
(* = (LEM=FESTLIG.AV WORD.CAT=ADJ INFL=PATTERN.BLEK DIC.STEM=FESTLIG GENDER=UTR NUMB=SING FORM=DEF SEX=MASC COMP=POS))

FESTLIGARE
(* = (LEM=FESTLIG.AV WORD.CAT=ADJ INFL=PATTERN.BLEK DIC.STEM=FESTLIG COMP=COMP))

FESTLIGAST
(* = (LEM=FESTLIG.AV WORD.CAT=ADJ INFL=PATTERN.BLEK DIC.STEM=FESTLIG COMP=SUP FORM=INDEF))

FESTLIGASTE
(* = (LEM=FESTLIG.AV WORD.CAT=ADJ INFL=PATTERN.BLEK DIC.STEM=FESTLIG COMP=SUP FORM=DEF))

Figure 2. Analyses of the adjective festlig [festive; grand]

Part of speech   Attribute    Values

NN               GENDER       Neutr Utr
                 NUMB         Sing Plur
                 FORM         Indef Def
                 CASE         Basic Gen
                 PROPR        +
                 ABBREV       +

AV               GENDER       Neutr Utr
                 NUMB         Sing Plur
                 FORM         Indef Def
                 COMP         Pos Comp Sup
                 FUNC         Attr Pred
                 SEX          Masc
                 (+ CASE for the description of nominalized adjectives)

VB               TENSE        Pres Sup Pret
                 INFF         Inf PP Ap
                 IMP          +
                 CONJ         +
                 (+ NUMB, GENDER, FORM, FUNC, SEX, CASE for the description of the participles)

AL               GENDER       Neutr Utr
                 NUMB         Sing Plur
                 FORM         Indef Def

PN               PRON.TYPE    Pers Poss Rel
                 ATTR.TYPE    Select Quant Comp
                 DET.TYPE     Tot Det
                 GENDER       Neutr Utr
                 NUMB         Sing Plur
                 FORM         Indef Def
                 CASE         Basic Gen Obj
                 SEX          Masc
                 PARTITIV     +
                 DUAL         +
                 NOUN.INDEF   +

NL               ATTR.TYPE    Select Quant
                 GENDER       Neutr Utr
                 NUMB         Sing Plur
                 FORM         Indef Def
                 CASE         Basic Gen
                 SEX          Masc

AB               COMP         Pos Comp Sup

KN               SUBJU        +

Prepositions, interjections, and the infinitive marker have no attributes.

Table 1. An overview of the attributes assigned by the SMU analyser

Inflection in Svensk Ordbok and in the SMU dictionary

In fig. 3 we present the inflectional variety of the -ar declension as an illustration of the relation between the inflectional format of SOB and that of the SMU stem dictionary. In the simplest case, there is only one stem, in SOB and in SMU, and, further, the stem is identical to the basic (lemma) form of the word (see 1 and 2). In such cases, the entry form of SOB represents at the same time the lemma and the stem, whereas in the SMU formal dictionary the two concepts have to be individually represented. The pattern rules of the SMU dictionary cover the inflectional information given in SOB in terms of word class and significant endings, and, also, the grammatical background information to which these morphological keys refer, i.e. morphotactic rules determining the inflectional behaviour of the nouns.

Sometimes, SOB recognizes a stem identical to an initial substring of the lemma form and delimits it from the succeeding ending by a slash, as in 3-8, and 11. The stem concept thus adopted is the technical stem (Hellberg 1978). The SMU analyser treats these cases in a different way, i.e. by means of a general rewriting rule (4-8) or stem handling operations in the pattern rules (as in 3 and 11). In the first case, the non-vowel stem alternant is regarded as the canonical form of the stem, and its vowel counterpart is reduced to this form by a secondary vowel deletion rule. Thus there is only one stem in the dictionary, and, what is more important, we don't have to accept as endings strings such as eln, lar, nen, nar, etc. In the second case, SMU analyses the stem in two steps, i.e. as a dictionary stem followed by a stem building element, e.g. pojk-e, dröm-m, läm-mel. A word form such as pojken is analysed pojk-e+n, lämmeln [...] responsible for the recognition of the stem building segments and their distribution in the paradigm (see further Sågvall Hein, forthcom.).
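The reduction of a vowel stem alternant to its non-vowel canonical form can be illustrated with a deliberately simplified sketch. The function `canonical_stem` and its condition (an e before a final l, n or r) are assumptions made for this example; the actual rewriting rule of the SMU grammar is not reproduced here.

```python
def canonical_stem(stem):
    """Reduce a vowel alternant (spegel) to its non-vowel form (spegl).

    Simplifying assumption: the deleted vowel is an e standing directly
    before a final l, n or r.
    """
    if len(stem) > 3 and stem[-2] == "e" and stem[-1] in "lnr":
        return stem[:-2] + stem[-1]
    return stem


for form in ("spegel", "socken", "finger", "spegl"):
    print(form, "->", canonical_stem(form))
```

As written, the sketch would also shorten words in which the e is not a secondary vowel (it turns sten into stn), so a realistic rule has to be restricted to the lemmas whose pattern licenses the alternation.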

In all, there are 132 (stem) pattern rules for the nouns, 40 for the adjectives, 89 for the verbs, 1 for the articles, 30 for the pronouns, 5 for the numerals, 9 for the adverbs, 2 for the conjunctions, and one each for the prepositions, the interjections, and the infinitive marker. In most cases the analysis is based on one stem, and in these cases, the stem with its pattern rule is a sufficient characterisation of the inflectional behaviour of the lemma as such. If, on the other hand, the lemma is represented by more than one stem in the dictionary (cf. 17 and 18 in fig. 3), the set of stems involved along with their pattern words determine the inflection of the lemma, the lemma inflection as opposed to the stem inflection. Frequency data on the inflectional types of the SMU words have been presented elsewhere (Sågvall Hein & Sjögreen 1991).

     SOB                                                     SMU
No   entry                                                   stem       lemma        pattern

1    gård subst. -en -ar                                     gård       gård.nn      stol
2    sjö subst. -n -ar                                       sjö        sjö.nn       fru
3    pojk/e subst. -en -ar                                   pojk-e     pojke.nn     gosse
4    speg/el subst. -eln -lar                                speg(e)l   spegel.nn    nyckel
5    sock/en subst. -nen -nar                                sock(e)n   socken.nn    öken
6    bott/en subst. -nen el. =, -nar                         bott(e)n   botten.nn    botten
7    myrt/en subst. -en -nar                                 myrt(e)n   myrten.nn    fröken
8    fing/er subst. -ret el. -ern -rar                       fing(e)r   finger.nn    finger
9    dröm subst. -men -mar                                   dröm       dröm.nn      kam
                                                             dröm-m
10   läm/mel subst. -meln -lar                               läm-mel    lämmel.nn    lämmel
                                                             läm-l
11   sum/mer subst. -mern -rar                               sum-mer    summer.nn    hummer
                                                             sum-r
12   sommar subst. -(e)n somrar                              som        sommar.nn    sommar
                                                             som-mar
13   hammare subst. hammar(e)n, = el. hamrar, best. plur.    ham-r      hammare.nn   kammare
                                                             ham-mar
                                                             ham-mare
14   himmel subst. himmel(e)n el. himlen, himlar             him-mel    himmel.nn    himmel
                                                             him-l
15   afton subst. aftonen aftnar                             aft-on     afton.nn     morgon
                                                             aft-n
16   djävul subst. -en djävlar                               djäv-ul    djävul.nn    djävul
                                                             djäv-l
17   moder äv. mor subst. modern mödrar                      moder      moder.nn     moder
                                                             mödrar     moder.nn     mödrar
                                                             mor        moder.nn     far
18   tok subst. -en -ar el. tok/er -ern -ar                  tok        tok.nn       stol
                                                             toker      tok.nn       toker
19   stadgar subst. plur.                                    stadg      stadgar.nn   vägnar

The Scope of Svensk Ordbok and the SMU analyser

For the recognition, and subsequent examination, of missing entries in the dictionary, we apply the SMU analyser to different sets of Swedish text. So far, we have analysed four text materials of substantial size, i.e. the 10,224 most frequent types of the 7.3 million word newspaper corpus of the Swedish Language Bank (Gellerstam 1989), referred to as PressFreq, the pharmacological text of the Swedish drug catalogue FASS (1985) (660,000 running words), the Professional Prose corpus of the Skrivsyntax project (Teleman 1974), referred to as ProfProse (78,036 running words), and, finally, the corpus of the definitions of Svensk Ordbok, referred to as DefVoc (360,144 running words). ProfProse consists of four types of text of equal size (textbooks, newspapers, debate books, and brochures).

corpus      tokens      types    ty/to   covered        uncovered

PressFreq   5,091,965   10,224   0.002   8,434 (82%)    1,790 (18%)
ProfProse   78,036      13,766   0.18    10,083 (73%)   3,683 (27%)
DefVoc      360,144     43,934   0.12    31,350 (71%)   12,584 (29%)
FASS        664,314     39,884   0.06    9,767 (25%)    30,117 (75%)

Table 2. Results of the application of the SOB-based SMU analyser to four sets of Swedish text
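The derived columns of Table 2 follow directly from the token, type and covered-type counts. The sketch below recomputes them for the ProfProse and DefVoc rows; the function name `coverage` and the choice of rounding are assumptions of this example.

```python
def coverage(tokens, types, covered):
    """Recompute the type/token ratio and the lexical coverage percentages."""
    uncovered = types - covered
    return {
        "ty/to": round(types / tokens, 2),
        "covered %": round(100 * covered / types),
        "uncovered %": round(100 * uncovered / types),
    }


prof_prose = coverage(tokens=78_036, types=13_766, covered=10_083)
def_voc = coverage(tokens=360_144, types=43_934, covered=31_350)
print("ProfProse", prof_prose)
print("DefVoc   ", def_voc)
```

Both rows come out as printed in the table (0.18, 73% and 27% for ProfProse; 0.12, 71% and 29% for DefVoc).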

As might be expected, the analyser covers best in relation to the high-frequency words of the newspaper text and worst with respect to the highly domain-specific pharmacological text. In between come the more general text of ProfProse and that of the definition corpus, covered, basically, to the same extent. In tables 3 to 6 we present more detailed data on the results of the processing of each of the text materials. They include information on homography of three kinds, i.e. (lemma) internal, (lemma) external, and mixed (internal and external). (For each text material, we also have detailed data on the different subtypes of homographies that were found, e.g. int. AV, ext. AB/AB, ext. AB/KN, ext. AB/NN, and their frequencies.)
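The three kinds of homography can be made concrete with a short sketch that classifies the parses of one word form by the lemmas they point to. The reading used here is an assumption: internal means that all parses share one lemma, external that every parse has a different lemma, and mixed that both situations occur. The function name `homography_kind` and the lemma markers in the last call are hypothetical.

```python
def homography_kind(lemmas):
    """Classify the parses of a word form, given one lemma per parse."""
    if len(lemmas) < 2:
        return "none"
    distinct = len(set(lemmas))
    if distinct == 1:
        return "internal"   # several forms of one and the same lemma
    if distinct == len(lemmas):
        return "external"   # every parse belongs to a different lemma
    return "mixed"


# festliga: definite singular and plural of the same adjective lemma
print(homography_kind(["festlig.av", "festlig.av"]))
# springa: noun lemma versus verb lemma
print(homography_kind(["springa1.nn", "springa2.vb"]))
# hypothetical three-way case with two parses under one lemma
print(homography_kind(["x1.nn", "x1.nn", "x2.vb"]))
```

Under this reading a word form with exactly two parses can never be mixed, which agrees with the absence of a "2 (mix.)" row in the tables.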

Number of    lexical coverage     textual coverage
parses       types      %         tokens       %

0            1,790      17.5%     299,235      5.9%
1            6,142      60.0%     2,270,961    44.6%
2 (int.)     593        5.8%      152,334      3.0%
2 (ext.)     959        9.4%      1,391,623    27.3%
3 (int.)     158        1.5%      24,934       0.5%
3 (ext.)     191        1.9%      435,066      8.5%
3 (mix.)     206        2.0%      54,176       1.1%
4 (ext.)     33         0.3%      43,814       0.9%
4 (mix.)     69         0.7%      24,544       0.5%
5 (ext.)     14         0.1%      258,713      5.1%
5 (mix.)     54         0.5%      20,382       0.4%
6 (ext.)     6          0.1%      83,669       1.6%
6 (mix.)     7          0.1%      32,118       0.6%
7 (mix.)     1          0.0%      96           0.0%
8 (mix.)     1          0.0%      280          0.0%

Total:       10,224     100.0%    5,091,965    100.0%

Table 3. Results of the application of the SOB-based SMU analyser to PressFreq

The lemma can be unambiguously determined for 6,893 types (67.4%) and 2,447,956 tokens (48.1%). In PressFreq words outside the scope of SOB below, we give an account of the kinds of words that got no parses.
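The figure of 6,893 types is consistent with counting the word forms that received a single parse together with those showing internal homography only, since in both cases all parses point to one lemma. The sketch below reproduces the arithmetic for the PressFreq types; that this is how the figure was obtained is an inference from the numbers, not a statement in the text.

```python
# PressFreq type counts for the rows whose parses all share one lemma.
types_with_one_lemma = {
    "1": 6_142,
    "2 (int.)": 593,
    "3 (int.)": 158,
}
total_types = 10_224

unambiguous = sum(types_with_one_lemma.values())
share = round(100 * unambiguous / total_types, 1)
print(unambiguous, share)
```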

Number of    lexical coverage     textual coverage
parses       types      %         tokens     %

0            3,683      26.8%     6,806      8.7%
1            7,828      56.9%     38,015     48.7%
2 (int.)     800        5.8%      2,693      3.5%
2 (ext.)     829        6.0%      17,638     22.6%
3 (int.)     161        1.2%      282        0.4%
3 (ext.)     149        1.1%      6,068      7.8%
3 (mix.)     168        1.2%      758        1.0%
4 (ext.)     24         0.2%      455        0.6%
4 (mix.)     63         0.5%      384        0.5%
5 (ext.)     8          0.1%      3,282      4.2%
5 (mix.)     44         0.3%      266        0.3%
6 (ext.)     3          0.0%      1,201      1.5%
6 (mix.)     4          0.0%      186        0.2%
7 (mix.)     1          0.0%      1          0.0%
8 (mix.)     1          0.0%      1          0.0%

Total:       13,766     100.0%    78,036     100%

Table 4. Results of the application of the SOB-based SMU analyser to ProfProse

The lemma can be unambiguously determined for 8,789 types (63.8%) and 40,990 tokens (52.5%). Roughly 13% (479) of the words that got no analysis are numerical expressions.

Number of    lexical coverage
parses       types      %

0            12,584     28.6%
1            26,348     60.0%
2 (int.)     1,654      3.8%
2 (ext.)     2,205      5.0%
3 (int.)     251        0.6%
3 (ext.)     258        0.6%
3 (mix.)     372        0.8%
4 (int.)     1          0.0%
4 (ext.)     40         0.1%
4 (mix.)     137        0.3%
5 (ext.)     9          0.0%
5 (mix.)     63         0.1%
6 (ext.)     3          0.0%
6 (mix.)     7          0.0%
7 (mix.)     1          0.0%
8 (mix.)     1          0.0%

Total:       43,934     100.0%

Table 5. Results of the application of the SOB-based SMU analyser to DefVoc

The lemmas of 28,254 of the types (64%) can be determined unambiguously. Only 430 (~3.4%) of the 0-parses are numerical expressions.

Number of    lexical coverage
parses       types      %

0            30,117     75.5%
1            7,598      19.1%
2 (int.)     874        2.2%
2 (ext.)     766        1.9%
3 (int.)     147        0.4%
3 (ext.)     116        0.3%
3 (mix.)     142        0.4%
4 (ext.)     17         0.0%
4 (mix.)     52         0.1%
5 (ext.)     6          0.0%
5 (mix.)     42         0.1%
6 (ext.)     3          0.0%
6 (mix.)     3          0.0%
8 (mix.)     1          0.0%

Total:       39,884     100.0%

Table 6. Results of the application of the SOB-based SMU analyser to FASS

The lemmas of 8,619 of the types (22%) can be determined unambiguously. 11,148 (27%) of the 0-parses are numerical expressions or hybrids of numbers, special signs, and single letters, such as 75-80, 85%, 4$8, F100, E218, C-t-+, 0,5-0,7, 20:e etc. (A pharmacological stem dictionary covering the non-numerical words outside the scope of SOB has been built (see Sågvall Hein et al., forthcom.).) In fig. 4 we present a drug description from FASS to illustrate the special character of this text.

Abboticin®
Abbott

Dosgranulat 200 mg

Antibiotikum   Grupp 7B 3005

Deklaration. 1 dosgranulat innehåller: Erythromycin. aethylsuccin. respond. erythromycin. 200 mg, mannitol. 1,5 g, constit. et aroma q. s.

Egenskaper. Dosgranulaten innehåller erytromycinetylsuccinat motsvarande 200 mg erytromycin. Granulatet löses i litet vatten (2-3 dessertskedar = 20-30 ml). Beredningsformen är speciellt avsedd för barn och är även lämplig som jourförpackning. Beredningen är sockerfri och har körsbärssmak.

Erytromycinetylsuccinat är en ester av erytromycin och efter absorption sker hydrolys till fritt aktivt erytromycin. Se f ö ABBOTICIN tabletter.

Indikationer. Se ABBOTICIN tabletter.
Kontraindikationer. Se ABBOTICIN tabletter.
Försiktighet. Se ABBOTICIN tabletter.
Graviditet och amning. Se ABBOTICIN tabletter.
Biverkningar. Se ABBOTICIN tabletter.

Dosering. Dosen för barn beräknas efter 30-50 mg per kg kroppsvikt och dygn fördelat på 2-4 doseringstillfällen. 1 dosgranulat = 200 mg erytromycin. För barn upp till 4 kg beräknas dosen i det enskilda fallet. Vid kroppsvikt överstigande 4 kg kan om dygnsdosen fördelas på två doseringstillfällen följande schema vanligen tillämpas:

Vikt     Dygnsdosering                Lämplig förpackning för
kg       dosgranulat    mg/kg/dygn    10 dagars behandling

4-7      ½x2            29-50         1x30
8-14     1x2            29-50         1x30
15-24    2x2            33-53         2x30
25-34    3x2            35-48         2x30

Inträffar gastrointestinala problem rekommenderas uppdelning av dygnsdosen på 3 eller 4 administreringstillfällen. För vuxna och barn över 35 kg ges 3 dosgranulat 3 gånger per dygn. Optimal absorption erhålles om dosen intages omedelbart före måltid.

Interaktion. Se ABBOTICIN tabletter.

Förpackningar och priser. Dosgranulat 200 mg   30 st   55:30
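The category of numerical expressions and hybrids of numbers, special signs, and single letters reported for FASS can be approximated by a simple test on the token string. The criterion below (at least one digit, and no run of two or more letters) is an assumption made for this sketch, as is the function name `is_numerical_hybrid`; it is not the procedure used in the study.

```python
import re

# A maximal run of letters (Unicode-aware: word characters minus digits and "_").
LETTER_RUN = re.compile(r"[^\W\d_]+")


def is_numerical_hybrid(token):
    """True for tokens made of digits, special signs and single letters only."""
    has_digit = any(ch.isdigit() for ch in token)
    single_letters_only = all(len(run) == 1 for run in LETTER_RUN.findall(token))
    return has_digit and single_letters_only


for token in ("75-80", "85%", "F100", "E218", "0,5-0,7", "20:e", "erytromycin", "200mg"):
    print(token, is_numerical_hybrid(token))
```

A filter of this kind would let the numerical material be set aside before the remaining unanalysed words are examined for inclusion in a domain dictionary.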

PressFreq words outside the scope of SOB

Proper nouns             1,433   (80.1%)
Abbreviations              137    (7.7%)
Compounds                  127    (7.1%)
Numerical expressions       45    (2.5%)
Derivatives                 20    (1.1%)
Foreign words               17    (0.9%)
Syntagmatic words            5    (0.3%)
Partial phrases              4    (0.2%)
Inflectional forms           2    (0.1%)

Total                    1,790    (100%)

Table 7. Kinds of uncovered types in PressFreq

Proper nouns. The dominating category is that of the proper nouns (including a small number of proper noun abbreviations, e.g. ABF, AIK, DN, DDR, KFUM). No normalization of spelling variation has been carried out, so far, so each appearance has been counted as an independent unit, e.g. Anna, anna, and ANNA; Erik and Eric; Bernard and Bernhard; Bengtsson, Bengtson, and B-son; Lidingö and Lid-ö etc. Roughly half of the proper names refer to people (approximately 440 first names and 280 surnames).

The high number of proper nouns, in particular personal proper nouns, seems to be a characteristic feature of newspaper text. Most of the proper nouns that we found will be entered into the dictionary with a marking of their origin (PressFreq) as a clue to future work on domain.

Abbreviations. The abbreviation category comprises abbreviations (excl. those of the proper nouns) in various orthographic shapes, e.g. bl.a and bl_a, FEB, feb. and febr, kr and kr. etc. They will all be included in the core of our dictionary (see Östling, this volume), and some of them will be treated as functional core phrases (Sågvall Hein et al. 1990) and represented in the dictionary as such, e.g. bl a and bl. a., d. v. s. and d v s etc. The figures presented in table 7 are, however, based on individual text words, for instance bl, bl., a, a. etc.

Compounds. As is well-known, the Swedish compounds make up an open category, and in table 8 we present an overview of the different kinds of PressFreq compounds that were found to be outside the present scope of the SMU dictionary.

Type of compound     No of members   Examples

NN-NN -> NN          97              arbetsuppgifter, hälso-[...]
NL-NN -> NN          10              30-talet
NN-AV -> AV          6               medelstora, nordöstra
AV-NN -> NN          4               nypris
NL-AV -> AV          3               50-årig
NL-NN-NN -> NN       2               50-årsåldern
AV-VB -> VB          2               nybyggda
NN-NN -> AB          1               förhoppningsvis
P-NN -> AB           1               häromåret
P-NN -> NN           1               överåklagare
P-VB -> VB           1               efteranmäld
NN-NN-VB -> VB       1               text-TV-textat

Total                129

Table 8. Uncovered compounds in PressFreq

Following Blåberg (1988) we refer the participles to the verb category. Further, prepositions and adverbs are treated as members of one common category (in the compound context), denoted P.

Many of the productive compound types are indicative of domain (economy, politics, social security, sport, culture, weather, TV and broadcasting). The effect of compounding on domain is an important issue in our future work on a domain-sensitive extension of the dictionary. It is one of the criteria that should be taken into account when considering a rule-based (as opposed to a lexicalized) treatment of compounds.

Numerical expressions. The members of the numerical category are 39 expressions consisting of Arabic numerals, and hybrids of numerals, special signs, and single letters, such as 03 111 19840, 1430, 25:E, and D0502. (Some Roman numerals were also found, e.g. VII, but referred to the proper noun category as candidates for (parts of phrasal) proper nouns.)

Derivatives. Most of the derived words outside the scope of SMU (14 of 20) can be found in Svensk Ordbok as morphological examples (see table 9). This means that their existence is confirmed, and that their meanings (definitions) should be derivable from the definitions of the words (lexemes) that they illustrate. When a lemma has more than one lexeme (definition), the morphological example tells us on what definition the meaning of the derived word should be based, at least primarily (see osäker, in table 9). Five derivatives are not presented as morphological examples, but derived from one-lexeme lemmas, and so the definition on which to base their derived meaning is uniquely determined. The remaining case, however, will cause overgeneration: spelmässigt is derived from a noun with 9 definitions and an adjectival suffix with 2 definitions; the derivational power of the lexeme is in no way constrained, and we will have to consider them all equally well fit as bases of the derived words. (In all, there are 39,831 morphological examples in SOB.)

Type                                    Entry in SOB            Morph. ex.

avveckling, avvecklingen                avveckla verb           +
mobbning                                mobba verb              +
utvisning, utvisningar                  utvisa verb             +
sänkning                                sänka verb              + (1st lexeme)
pensionering, pensioneringen            pensionera verb         - (1 lexeme)
enighet                                 enig adj.               +
osäkerhet                               osäker adj.             + (3rd lexeme)
skicklighet                             skicklig adj.           +
trovärdighet                            trovärdig adj.          - (1 lexeme)
öppenhet                                öppen adj.              + (4th lexeme)
författarinnan                          författare subst.       +
socialdemokratisk, socialdemokratiska   socialdemokrat subst.   +
mittfältare, mittfältaren               mittfält subst.         - (1 lexeme)
spelmässigt                             spel subst.             - (9 lexemes)
                                        -mässig                 - (2 lexemes)

Table 9. Uncovered derivatives in PressFreq

[...] phraseological expressions, and so they won't be entered as individual entries into the dictionary. Glasnost, alone, will be entered into the SMU dictionary, and marked with respect to origin (PressFreq).

Syntagmatic words. Four missing types are examples of varying writing conventions, i.e. iväg (cf. i väg), godnatt (cf. god natt), långtifrån (cf. långt ifrån), framförallt (cf. framför allt). To the same category we refer the colloquial gomiddag (cf. god middag). The one-word variants will all be included in the SMU dictionary (the last one marked as colloquial).

Partial phrases. Four missing types are old inflectional forms appearing in phraseological expressions only, i.e. godo (till godo [to someone's credit etc.]; i godo [amicably etc.]), sjöss (till sjöss [at sea]), vintras (i vintras [last winter]), and somras (i somras [last summer]). (Till godo and till sjöss can be found among the examples of god and sjö.) The four expressions will be entered into the SMU dictionary as phrases.

Inflectional forms. Two underived inflectional forms were found to be unaccounted for, i.e. måst (supine of the verb måste [must]) and törs (present tense of the verb tör/as or tord/as [dare]), even though törs is part of an example of the verb. Frequent as they are found to be in newspaper text, both forms will be included in the SMU dictionary.

Conclusions

The SMU analyser, operating on Swedish text, works well as a tool for distinguishing between general vocabulary, as defined by the lemma entries of Svensk Ordbok (i.e. its explicitly defined vocabulary), and words outside that scope. As a result of the morphological analysis, members of the general vocabulary are identified and described in terms of lemma, part of speech, and form, and homographies are recognized in accordance with the lemma distinctions made in SOB.

The processing of four different Swedish materials has shown that the SOB lexical coverage (in terms of types) ranges from 82% to 25%. The highest figures are valid for high frequency words of newspaper text, PressFreq, and the lowest ones for highly specialized pharmacological text. In between we find some good 70% relating to general LSP (Language for Special Purpose) text.
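The notion of lexical coverage in terms of types can be illustrated by a minimal Python sketch. It is not the SMU implementation: the function name `type_coverage`, the toy lexicon, and the toy text are assumptions made for illustration, and the lexicon is simplified to a plain set of word forms, whereas the analyser itself works from a stem dictionary and an inflectional grammar.

```python
def type_coverage(tokens, lexicon_forms):
    """Return (covered, total, ratio) counted over distinct word-form types.

    tokens        -- running word forms of a text
    lexicon_forms -- set of word forms assumed to receive an analysis
                     (hypothetical stand-in for the analyser)
    """
    types = {t.lower() for t in tokens}               # collapse tokens to types
    covered = {t for t in types if t in lexicon_forms}
    total = len(types)
    ratio = len(covered) / total if total else 0.0
    return len(covered), total, ratio


# Toy data, invented for illustration only.
lexicon = {"öppen", "spel", "god", "natt"}
text = ["God", "natt", "glasnost", "spel", "god", "iväg"]

covered, total, ratio = type_coverage(text, lexicon)
print(covered, total, ratio)
```

Counting types rather than tokens means that a frequent uncovered word form lowers the figure only once, which is why note 4 distinguishes lexical coverage from textual frequency.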

The words outside the scope of the analyser indicate domain and type of the analysed text. The big amount of numerical expressions (and hybrids of numerals, special signs and single letters), i.e. 27% of the unanalyzed words, stands out in the pharmacological text, as does that of proper nouns (close to 80% of the unanalyzed words) in PressFreq.

The PressFreq zero-parses have been examined in some detail, and categorized into: proper nouns, abbreviations, compounds, numerical expressions, derivatives, foreign words, syntagmatic words, partial phrases, and simple inflectional forms. Abbreviations, syntagmatic words, partial phrases, and inflectional forms form a, basically, closed set of a general character (in all, less than 150 items). They will all be included in the general part of the SMU dictionary (as one-word units or as phrases).

Most of the foreign words (except for glasnost) seem to be part of phraseological expressions (proper nouns), and so far, they will be disregarded, but glasnost will be entered in the dictionary, marked by origin (PressFreq), as a first clue to domain. The proper nouns, forming a big, but, basically, closed and domain-related category, will be handled in the same manner. The numerical expressions, forming an open category, will be handled by means of rules, defined in the SMU grammar (see Sågvall Hein 1987).
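Why an open category such as numerical expressions calls for rules rather than dictionary entries can be shown with a small Python sketch. The pattern below is an assumption chosen for illustration; it is not one of the rules of the SMU grammar, which are stated in the UCP formalism, and it covers only simple cases (digits mixed with a few special signs, optionally followed by a single letter).

```python
import re

# Hypothetical pattern: at least one digit; digits and a few special signs,
# optionally followed by one single letter (a simple "hybrid").
NUMERIC = re.compile(r"^(?=.*\d)[\d.,:/%+\-]+[A-Za-zÅÄÖåäö]?$")


def is_numerical_expression(token):
    """True if the token looks like a numerical expression or simple hybrid."""
    return bool(NUMERIC.match(token))


for token in ["27%", "2,5", "5a", "glasnost", "%"]:
    print(token, is_numerical_expression(token))
```

A single pattern of this kind accepts an unbounded set of word forms, which no finite list of dictionary entries could enumerate.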

Among the zero-parse derivatives, six types were found, i.e. verb-to-noun by means of the suffix -ing (the process; 8 cases), adj-to-noun by means of the suffix -het (presence of the property; 5 cases), noun-to-adj by means of -isk (the property; 2 cases), noun-to-noun by means of the suffix


of -mässig (according to the noun etc.; 1 case). The first four types will be handled by means of word formation rules in the grammar, whereas the remaining two cases will be entered into the dictionary, marked by origin. This treatment is supported by SOB, presenting the first four types as morphological examples.
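The idea of a word formation rule that licenses a derivative from a stem already present in the dictionary can be sketched in Python as follows. This is an illustration under stated assumptions, not the SMU grammar: the rule table, the function name `analyse_derivative`, and the toy stem dictionary are hypothetical, and only the suffixes -het and -isk are included, since they attach to the unchanged stem in the examples of Table 9.

```python
# Hypothetical rule table: (suffix, part of speech of the base, part of speech of the result)
SUFFIX_RULES = [
    ("het", "adj", "noun"),   # adj-to-noun, e.g. trovärdig -> trovärdighet
    ("isk", "noun", "adj"),   # noun-to-adj, e.g. socialdemokrat -> socialdemokratisk
]


def analyse_derivative(word, stems):
    """Return (base, suffix, result_pos) for every rule whose base is a known stem."""
    analyses = []
    for suffix, base_pos, result_pos in SUFFIX_RULES:
        if word.endswith(suffix):
            base = word[: -len(suffix)]
            if stems.get(base) == base_pos:   # base must be in the stem dictionary
                analyses.append((base, suffix, result_pos))
    return analyses


# Toy stem dictionary using bases from Table 9.
stems = {"trovärdig": "adj", "öppen": "adj", "socialdemokrat": "noun"}

print(analyse_derivative("trovärdighet", stems))
print(analyse_derivative("socialdemokratisk", stems))
print(analyse_derivative("glasnost", stems))
```

Requiring the base to be a known stem of the right part of speech keeps such rules from analysing arbitrary strings that merely end in a suffix.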

The most difficult category to handle is that of the compounds, being an open, productive category, with a complex semantics, dominating the zero-parses of general LSP text (see Sågvall Hein 1990). Further, compounding has a bearing on domain. In our continued work on the dictionary we will approach the problems of the compounds from the point of view of the effect of compounding on domain. The material presented by the application of the SMU analyser to text of different type and domain is a valuable source for such studies.

Notes

1 Cf. SWETWOL by Karlsson (forthcom.), performing rule-based structural analysis of compounds and derivatives.

2 The secondary vowel deletion rule is part of the inflectional grammar and is invoked by the dictionary search process.

3 In the sense of dictionary stem.

4 Textual frequency data were not at hand when the analysis was carried out, so only lexical coverage can be accounted for here.

5 In a pilot study of a fragment (2,500 types) of DefVoc, the 572 types outside the scope of SOB were examined (see Sågvall Hein 1990).

6 Even though mittfältare doesn't appear as a morphological example, the relative mittfältsspelare does.

References

Blåberg, O. 1988. A study of Swedish compounds. Umeå University. Department of General Linguistics. Report No 29.

FASS. Farmaceutiska specialiteter i Sverige. 1985. [Pharmaceutical Specialties in Sweden.] LINFO.

Gellerstam, M. 1989. The Language Bank. The Department of Computational Linguistics. University of Gothenburg.

Hellberg, S. 1978. The morphology of present-day Swedish. Stockholm.

Karlsson, F. SWETWOL: A comprehensive morphological analyzer for Swedish. Forthcoming.

Östling, A. A Swedish Core Vocabulary for Machine Translation. This volume.

Shieber, S. 1986. An introduction to unification-based approaches to grammar. CSLI. Lecture Notes Number 4.

Sjögreen, C. 1988. Creating a dictionary from a lexical database. In: Studies in computer-aided lexicology. Stockholm. Pp. 299-338.

Sågvall Hein, A. 1987. Parsing by means of Uppsala Chart Processor (UCP). In: L. Bolc (ed.) Natural language parsing systems. Berlin & Heidelberg. Pp. 203-266.

Sågvall Hein, A. 1988. Towards a comprehensive Swedish parsing dictionary. In: Studies in computer-aided lexicology. Stockholm. Pp. 268-298.


Sågvall Hein, A. The SMU inflectional grammar. Uppsala University. Department of Linguistics. Forthcoming.

Sågvall Hein, A. & Ahrenberg, L. 1985. A parser for Swedish. Status Report for Sve.Ucp. June 1985. Uppsala University. Center for Computational Linguistics. UCDL-R-85-2.

Sågvall Hein, A. & Sjögreen, C. 1991. Ett svenskt stamlexikon för datamaskinell morfologisk analys. En översikt. [A Swedish stem dictionary for computational morphological analysis. An overview.] In: M. Thelander et al. (eds.) Svenskans beskrivning 18. Lund. Pp. 348-360.

Sågvall Hein, A., Östling, A. & Wikholm, E. 1990. Phrases in the Core Vocabulary. Uppsala University. Center for Computational Linguistics.

Sågvall Hein, A., Starbäck, P. & Wikholm, E. A pharmacological stem dictionary based on FASS, Pharmacological Specialties in Sweden 1985. Uppsala University. Department of Linguistics. Forthcoming.

Svensk Ordbok. 1986. [A Dictionary of Swedish.] Stockholm.

Teleman, U. 1974. Manual för beskrivning av talad och skriven svenska. Lund.

Anna Sågvall Hein
Uppsala University
Department of Linguistics
Computational Linguistics
Box 513
S-751 20 Uppsala
