1 Introduction
1.4 Thesis structure
This thesis is structured in six chapters. Chapter 2 includes a review of systems processing collateral text for video retrieval, suggesting the potential of using Cross-Document Coreference applications for improved video retrieval. Section 2.1 presents current video retrieval systems processing collateral text, resulting in the conclusion that more collateral texts can be used for richer video annotations and improved video retrieval. More advanced information extraction solutions can be employed for video retrieval, such as cross-document coreference. Given the range of collateral texts for films describing the same stories, Section 2.2 presents the current state of the art in cross-document coreference algorithms between the same text types. These algorithms are manually simulated and evaluated on a pair of different text types, such as plot summary and audio description, suggesting that they need to be extended, as the narrative discourses conveyed in collateral texts for films can differ significantly in the amount and kinds of information they include, although they tell the same story.
Texts describing a specific domain include special terms to refer to it. As collateral texts for films describe film content, they may include lexical regularities. Chapter 3 investigates whether lexical regularities are included in collateral texts for films. To test this hypothesis, we analyse and compare the language used by audio description and plot summaries. In Section 3.1, a corpus of 45 audio description scripts for films has been gathered, spread over nine genre categories according to the suggestions of expert audio describers, based on the kinds of information included. The word frequency analysis of the language of audio description shows that audio description includes frequent words referring to entities, e.g. locations and characters, and words referring to events related to sight and motion. In Section 3.2, a corpus of 111 plot summaries has been collected, including the same and additional films to the corpus of audio description, divided into the same genre categories. The investigation of the language used in plot summaries shows that these texts talk about central characters, their family relations, their goals or beliefs and major events of the story plot. The comparison of the two corpora in Section 3.3 presents the challenge of cross-document coreference between two different text types for films. Both plot summary and audio description use the same words to refer to characters but different words to refer to the same events.
We hypothesise that since both texts use lexical regularities to describe film content, they may present different lexical regularities to refer to the same frequent events. Chapter 4 discusses lexical regularities in the audio description referring to frequent plot summary events. To investigate this hypothesis, first the ten most frequent events are identified in a plot summary corpus of 54 texts and the same events are identified in an audio description corpus narrating the same films, Section 4.1. The correlated fragments are also characterised according to cross document structure theory relations and the event aspect. A method to identify lexical regularities is presented in Section 4.2, detecting words occurring more frequently in the corpora than in the general language included in the British National Corpus and appearing in more than 50% of the correlated audio description fragments. The results in Section 4.3 show that six out of ten most frequent plot summary events presented lexical regularities in audio description, whereas the rest were found to be irregular in the way they are described. The discussion in 4.4 focuses on whether the findings can be used for general solutions to cross-document coreference between different text types or only concern a closed set of events.
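The two criteria named for Section 4.2 (higher relative frequency than in general language, and presence in more than 50% of the correlated fragments) can be illustrated with a minimal sketch. This is not the procedure of Section 4.2 itself: the function name `lexical_regularities`, the whitespace tokenisation, the fallback frequency for unseen words and the toy reference frequencies are all assumptions made for illustration, and the reference values are not taken from the British National Corpus.

```python
from collections import Counter


def lexical_regularities(fragments, reference_rel_freq,
                         min_coverage=0.5, default_ref=1e-9):
    """Return words that are relatively more frequent in the fragments
    than in a reference frequency list AND occur in more than
    `min_coverage` of the fragments.

    All names and defaults here are hypothetical; tokenisation is a
    plain lowercase whitespace split.
    """
    tokenised = [frag.lower().split() for frag in fragments]
    total_tokens = sum(len(tokens) for tokens in tokenised)
    if not tokenised or total_tokens == 0:
        return []

    # Raw token counts across all fragments.
    token_counts = Counter(w for tokens in tokenised for w in tokens)
    # Number of fragments in which each word appears at least once.
    fragment_counts = Counter(w for tokens in tokenised for w in set(tokens))

    regular = []
    for word, count in token_counts.items():
        rel_freq = count / total_tokens
        ref_freq = reference_rel_freq.get(word, default_ref)
        coverage = fragment_counts[word] / len(tokenised)
        if rel_freq > ref_freq and coverage > min_coverage:
            regular.append(word)
    return sorted(regular)


# Toy audio description fragments correlated with one event (invented).
fragments = [
    "she looks at him",
    "he looks away",
    "they walk out",
    "she looks down",
]
# Toy general-language relative frequencies (invented, not BNC figures).
reference = {"looks": 0.001, "she": 0.01, "he": 0.02, "at": 0.02}

print(lexical_regularities(fragments, reference))
```

In this toy input only "looks" satisfies both criteria: "she" appears in exactly half of the fragments, which does not exceed the 50% threshold.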
An evaluated solution for Cross-Document Coreference on different types of collateral texts for films is presented in Chapter 5. Four heuristics for Cross-Document Coreference, deriving from the corpora analysis in Chapters 3 and 4, are described in Section 5.1, extending current algorithms for the same text types. In Section 5.2, the gold standard data set of cross-document coreference pairs populated by the answers of five annotators shows that cross-document coreference is a hard and time-consuming task for humans, who reviewed and consolidated their answers to finally produce the gold standard data set. The evaluation of the four heuristics against the gold standard data set in Section 5.3 shows that automating cross-document coreference between different kinds of texts is a challenging task, which can at least achieve a Recall of 33% and a Precision of 50%. The discussion in 5.4 shows that a combination of all heuristics considering the event temporal aspect is the best possible approach suggested by this thesis and suggests ways of improving the heuristics.
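For readers unfamiliar with the two measures, the following minimal sketch shows how Precision and Recall are conventionally computed when a set of predicted coreference pairs is compared against a gold standard set. The pair identifiers are invented for illustration and are not drawn from the thesis data; the function name `precision_recall` is likewise an assumption.

```python
def precision_recall(predicted_pairs, gold_pairs):
    """Standard set-based Precision and Recall over coreference pairs.

    Each pair is a hashable tuple such as
    (plot_summary_fragment_id, audio_description_fragment_id).
    """
    predicted = set(predicted_pairs)
    gold = set(gold_pairs)
    true_positives = len(predicted & gold)
    # Precision: share of predicted pairs that are in the gold standard.
    precision = true_positives / len(predicted) if predicted else 0.0
    # Recall: share of gold standard pairs that were predicted.
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


# Invented identifiers: "ps" = plot summary, "ad" = audio description.
gold = {("ps1", "ad3"), ("ps2", "ad7"), ("ps3", "ad9"), ("ps4", "ad2")}
predicted = {("ps1", "ad3"), ("ps2", "ad7"), ("ps3", "ad1")}

precision, recall = precision_recall(predicted, gold)
print(f"Precision: {precision:.2f}, Recall: {recall:.2f}")
```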
Chapter 6 includes the summary of what steps have been followed and what has been discovered in the previous chapters, Section 6.1, and future opportunities, Section 6.2. The future suggestions focus on applying the heuristics on other kinds of collateral texts for films, as well as other kinds of different text genres describing the same story. Future recommendations also concern cross-media coreference which takes advantage of cross-document coreference for video retrieval and summarisation. Finally the findings in Chapters 3 and 4 may be of importance for narratologists and cognitive scientists, concerning the way events are expressed and conceived, as well as artificial intelligence applications concerning event representation for computing narrative in multimedia systems.