Learning to Select a Good Translation

(1)

Dan Tidhar and Uwe Kussner

Technische Universitat Berlin

FachbereichInformatik

Franklinstr. 28/29

D-10587Berlin

Germany

fuk|[email protected]

Abstract

Within the machine translation system

Verb-mobil, translation is performed simultaneously

by four independent translation modules. The

four competing translationsare combined bya

selection module so as to form a single

opti-mal output for each input utterance. The

se-lection module relies on condence valuesthat

are delivered together with each of the

alter-native translations. Since the condence v

al-ues are computed by four independent

mod-ules that are fundamentally dierent from one

another, they are not directly comparable and

needtoberescaledinordertogaincomparative

signicance. In this paper we describe a

ma-chinelearningmethodtailoredtoovercomethis

diÆculty by using o-line human feedback to

determine an appropriate condence rescaling

scheme. Additionally, we describe some other

sources of information that areused for

select-ingbetweenthecompetingtranslations,and

de-scribe the way in which the selection process

relates to qualityof servicespecications.

1 Introduction

Verbmobil (Wahlster, 2000) is a speech to

speech machine translation system, aimed at

handling a wide range of spontaneous speech

phenomena within the restricted domain of

travel planning and appointment scheduling

dialogues. For the language pairs

English-German and German-English, four dierent

translationmethodsareappliedinparallel,thus

increasing the system's robustness and

versa-tility. Since exactly one translation should be

produced for each input utterance, a selection

procedureisnecessary. Inorderto benetmore

from this diversity,the alternative translations

are furthermore combined within the

bound-pound translations. Combining translations

fromdierentsourceswithinamulti-threadMT

systemhasalreadyprovedbenecialinthepast

(FrederkingandNirenburg,1994). Ourpresent

work diersfromthe work reported inthere in

several ways (apart from the trivial fact that

we use `four heads'rather thanthree). Firstly,

weattempt to investigate asystematicsolution

totheincomparabilityofthevariouscondence

values. Secondly, as we deal with speech to

speechratherthantext to text translation,

dif-ferentsegmentationsforeachgiveninputstring

are allowed, making the segment combination

process signicantlymore complicated.

1.1 Incomparability

Eachtranslationmodulecalculatesacondence

value for each translation that it produces.

However, sincethevarioustranslationmethods

are fundamentally dierent from one another,

the resulting condence values cannot be

di-rectly compared across modules. Whereas we

do assume a general correspondence between

condencevaluesandtranslationqualitywithin

each one of the modules, there is no guaranty

whatsoeverthatahighvaluedeliveredbya

cer-tainmodulewouldindeedsignifyabetter

trans-lationwhencomparedwithanothervalue,even

a much lower one, which was delivered by

an-other module. An additional step needs to be

taken in order to make the condence values

comparable withone another.

1.2 Working Hypotheses

It should be noted that one of our working

hypotheses, namely, that condence values do

generally reect translation quality, also

com-pensates to a certain extent for the lack of a

wide range theory of translation, according to

(2)

annotators, sincethe applicable criteria are

di-verse, and at the absence of a comprehensive

translation theory, very often lead to

contra-dicting conclusions. This diÆculty is partially

dealtwithinsection4.1below,butforpractical

reasons we tend to accept the need to rely on

human judgment,partially theoryassisted and

partiallyintuitive,as inevitable. Another

pre-supposition that underlies the current work is

thatthedesirablerescalingcan bewell

approx-imated by means of linear polynomials. This

assumption allows us to remain withinthe

rel-atively friendly realm of linear equations

(al-beit inconsistent), and reects two basic

guid-ing principles: rstly,that the rescaling is

mo-tivated by pragmatical needs, rather than by

descriptive aspirations, and secondly, that it

should notcontradict the presupposed

correla-tionbetweencondenceandqualitywithineach

module, which implies that the rescaling

func-tionsshouldbemonotonous.

2 The Various Translation Paths

The Verbmobil system includes four

indepen-dent translations paths that operate in

paral-lel. The input shared by all paths consists

of sequences of annotated Word Hypotheses

Graphs (WHG), produced by the speech

rec-ognizer. Each translation module chooses

in-dependently a path through the WHG, and a

possible segmentation according to its

gram-mar and to the prosody information (Buckow

et al., 1998). This implies that even though

all translation modules share the same input

datastructure,boththechoseninputstringand

its chosen segmentation may well dier across

modules. Thissection providesthe readerwith

very brief descriptions of the dierent

trans-lation subsystems, along with their respective

condencevaluecalculation methods.

The ali subsystem implements an

exam-ple based translation approach.

Con-dencevaluesarecalculatedaccordingtothe

matching-level of the input string with its

counterpartsinthedatabase.

The stattrans (Och et al., 1999)

sub-system is a statistical translation system.

Condencevaluesarecalculatedaccording

translationmodel.

The syndialog(Kippet al.,1999)

subsys-temisadialogueactbasedtranslation

sys-tem. Here the translation invariant

con-sists of a recognizeddialogue act, together

with its extracted propositional content.

The condencevalue reects the

probabil-ity that the dialogue act was recognized

correctly,togetherwiththeextenttowhich

the propositional content was successfully

extracted.

The deep translation path in itself

con-sists of multiple pipelined modules:

lin-guisticanalysis, semantic construction,

di-alogue and discoursesemantics,and

trans-fer (Emele and Dorna, 1996) and

gener-ation (Kilger and Finkler, 1995)

compo-nents. The transfer module receives

dis-ambiguation informationfrom the context

(Koch et al., 2000) and dialogue modules.

Thelinguisticanalysispartconsistsof

sev-eral parsers which,in turn,also operate in

parallel(Rulandetal.,1998). Theyinclude

an HPSG parser, a Chunk Parser and a

statistical parser,all producingdata

struc-tures of the same kind, namely, the

Verb-mobilInterfaceTerms(VITs)(Schiehlenet

al., 2000). Thus, withinthe deep

process-ing path, a selection problem arises,

simi-lar to thelarger scale problem of selecting

the best translation. This internal

selec-tion process withinthedeep pathis based

on aprobabilisticVIT model. Condence

values withinthe deeppath are computed

according totheamountofcoverage ofthe

inputstring by theselected parse, and are

subject to modicationsas a byproduct of

combining and repairing rules that

oper-ate withinthesemantics mechanism.

An-other source of information which is used

for calculating the `deep' condence

val-ues is the generation module, which

es-timates the percentage of each transfered

VIT which can be successfully realized in

thetarget language.

Althoughallcondencevaluesarenallyscaled

to the interval [0;100] by their respective

(3)

tudesthataredirectlycomparablewithone

an-other. As expected, our experience has shown

that when condence valuesare taken as such,

withoutanyfurthermodication,their

compar-ative signicance isindeedvery limited.

3 The Selection Procedure

In order to improve their comparative

signi-cance, the delivered condencevalues c(s), for

each given segment s, are rescaled by linear

functions oftheform:

ac(s)+b : (1)

Note that each input utterance is decomposed

into several segments independently,and hence

potentiallydierently,byeachofthetranslation

paths. The dierent segments are then

com-bined to form a datastructure which, by

anal-ogy to Word Hypotheses Graph, can be called

Translation AlternativesGraph(TAG).Thesize

ofthisgraphisboundby4 n

,whichisreachedif

all translationpaths happen to choosean

iden-tical partition into exactly n segments. The

following vectorial notation was adopted in

or-dertosimplifythesimultaneousreferencetoall

translation paths. The linear coeÆcients are

represented by the following four-dimensional

vectors:

~a= 0

B

@ a

al i

a

syndial og

a

stattrans

a

deep

1

C

A ~

b= 0

B

@ b

al i

b

syndial og

b

stattrans

b

deep

1

C

A :

(2)

Single vector components can then be referred

tobysimpleprojections,ifwerepresentthe

dif-ferenttranslationpathsasorthogonal unit

vec-tors,sothat~sdenotesthevector corresponding

to themodule by which s had been generated.

The normalized condence is then represented

by:

(~ak(s)+ ~

b)~s : (3)

In order to express the desirable favoring of

translations with higher input string coverage,

the compared magnitudes are actually the

(rescaled) condence values integrated with

respect to the time axis, rather than the

(rescaled) condence values as such. Let ksk

be the length of a segment s of the input

andSeq2 SEQanyparticularsequence.

We dene the normalized condence of

Seq as follows:

C(Seq)= X

s2Seq

((~ac(s)+ ~

b)~s)ksk

This induces the following order relation:

seq

1

C seq

2 def

= C(seq

1

)C(seq

2 )

Based on this relation, we dene the set B of

best sequencesasfollows:

B(SEQ) = fseq2SEQ jseqis a maximum

elementin (SEQ;

C

)g : (4)

The selection procedure consists in generating

thevariouspossiblesequences, computingtheir

respectivenormalizedcondencevalues,and

ar-bitrarily choosing a member of the set of best

sequences. It should be noted that not all

sequences need to be actually generated and

tested, due to the incorporation of Dijkstra's

wellknown \Shortest Path" algorithm (e.g. in

(Cormenet al.,1989)).

4 The Learning Cycle

Learningthe rescalingcoeÆcients isperformed

o-line, and should normally take place only

once, unlessnew trainingdata is assembled,or

new criteria for the desirable system behavior

have beenformulated. The learningcycle

con-sistsofincorporatinghumanfeedback(training

set annotation) and nding a set of rescaling

coeÆcients so asto yield a selection procedure

withoptimalorclosetooptimalaccordwiththe

human evaluation. Atrainingset, consisting of

test dialogues that cover the desirable system

functionality, is fed through the system, while

separatelystoring theoutputsproduced bythe

various translation modules. These are then

subject to two phases of annotation (see

sec-tion 4.1), resulting in a set of `best' sequences

oftranslatedsegmentsforeach inpututterance.

The next task is to determine the

appropri-ate linear rescaling, that would maximize the

accord between the rescaled condence values

andthepreferencesexpressedbythose`best'

se-quences. Inordertodothat,werstgeneratea

large set of inequalities as described in section

(4)

As mentioned above, evaluating alternative

translations is a complex task, which

some-times appears to be diÆcult even for specially

trained people. When one alternative seems

highlyappropriateandalltheothersareclearly

wrong,avigilantannotatorwouldnormally

en-counter very little diÆculty. But when all

op-tionsfallwithinthereasonablerealmand dier

only slightly from one another, or even more

so, when all options are far from perfect, each

having its uniquely combined weaknesses and

advantages| whatcriterionshouldbeusedby

the annotator to decide which weaknesses are

more crucialthantheothers? Ourhuman

feed-backcycleistwofold: rst,theoutputsofthe

al-ternative translations paths are annotated

sep-arately, so as to enable the calculation of the

`o-line condence values' as described below.

For each dialogue turn, all possible

combina-tions of translated segments that cover the

in-putarethengenerated. Foreachofthose

possi-ble combinations, an overall o-line condence

value is calculated, in a similar way to which

the `on-line' condence is calculated (see

sec-tion 3), leaving out the rescaling coeÆcients,

but keeping the time axis integration. These

segmentcombinationsarethenpresentedtothe

annotators for a second round, sorted

accord-ingtotheirrespectiveo-linecondencevalues.

Theannotatorisrequestedat thisstage merely

to select the best segment combination, which

would normally be one of the rst to appear

on the list. The rst annotation stage may be

described as `theory assisted annotation', and

thesecondisitsmoreintuitivecomplement. To

assist the rst annotation round we have

com-piledasetofannotationcriteria,anddesigneda

specializedannotationtoolfortheirapplication.

These criteria direct the annotator's attention

to`essentialinformationitems',andrefertothe

number of such items that have been deleted,

inserted or maintained during the translation.

Other criteria are the semantic and syntactic

correctness of the translated utterance as well

as those of the source utterance. The separate

annotationofthese criteriaallowsusto express

the`o-line condence'astheirweighted linear

combination. Thedierentweightscan beseen

asimplicitlyestablishingamethod of

quantify-tactical correctness, or the transmission of all

essentialinformationitems. Usingthevague

no-tionof `translationquality'asa singlecriterion

wouldhavedenitelycaused agreatdivergence

inpersonalannotationstyleandpreferences,as

can bevery wellexemplied by thecase of the

dialogueactbasedtranslation: somepeoplend

wordbywordcorrectnessofatranslationmuch

more important than the dialogue act

invari-ance, while others argue exactly the opposite

(Schmitz,1997),(Schmitz and Quantz, 1995).

4.2 Generating Inequalities

Once the best segment sequences for each

ut-terancehavebeendeterminedbythecompleted

annotation procedure, a set of inequalities is

createdusingthelinearrescaling coeÆcientsas

variables. Thisisdonesimplybystatingthe

re-quirementthatthenormalizedcondencevalue

of the best segment sequence should be better

than the normalized condence values of each

one of the other possible sequences. For each

utterance with n possible segment sequences,

thisrequirementisexpressedby(n 1)

inequal-ities. It is worth mentioning at this point that

itsometimes occurs duringthe second

annota-tionphase,thatnumeroussequencesrelatingto

thesameutteranceareconsidered`equallybest'

by the annotator. In such cases, when not all

sequences are concerned but only a subset of

all possible sequences, we have allowed the

an-notator to select multiple sequences as `best',

correspondingly multiplying the number of

in-equalities that are introducedbythe utterance

in question. These multiple sets are known in

advanceto be inconsistent, as they in fact

for-mulate contradictory requirements. Since the

optimization procedure attempts to satisfythe

largest possiblesubset of inequalities,the

logi-calrelationbetweensuchcontradictingsets can

be seenas disjunctionratherthanconjunction,

and they do seem to contribute to the

learn-ing process,because thedierent `equallybest'

sequences are still favored in comparison to all

other sequencesrelatingto thesame utterance.

The overall resulting set of inequalities is

nor-mallyverylarge,andcanbeexpectedtobe

con-sistent only in a very idealized world, even in

the absence of `equallybest' annotations. The

(5)

which is the fact that the original condence

values, as usefulas they may be,are

neverthe-less far from reecting the human annotation

and evaluation results, whichare, furthermore,

not always consistent among themselves. The

restof thelearningprocessconsistsintryingto

satisfyasmanyinequalitiesaspossiblewithout

reaching a contradiction.

4.3 Optimization Heuristics

The problem of nding the best rescaling

co-eÆcients reduces itself, under the above

men-tioned presuppositions, to that of nding the

maximalconsistentsubsetofinequalitieswithin

alarger,mostlikelyinconsistent,setoflinear

in-equalities,and solvingit. In (Amaldiand

Mat-tavelli, 1997), the problemof extracting

close-to-maximumconsistent subsystemsfrom an

in-consistent linear system (MAX CS) is treated

as part of a strategy for solving the problem

of partitioning an inconsistent linear system

intoaminimalnumberofconsistentsubsystems

(MIN PCS). Both problems are NP-hard, but

through a thermal variation of previous work

by (Agmon, 1954) and (Motzkin and

Schoen-berg, 1954), a greedy algorithm is formulated

by (Amaldi and Mattavelli, 1997), which can

serve asan eective heuristic for obtaining

op-timalorneartooptimalsolutionsforMAXCS.

Implementing thisalgorithmintheC language

enabled us to complete the learning cycle by

nding a set of coeÆcients that maximizes, or

at least nearly maximizes, the accord of the

rescaled condence values with the judgment

providedbyhumanannotators.

5 Additional Information Sources

Independently of the condence rescaling

pro-cess,wehavemadeseveralattemptsto

incorpo-rateadditionalinformationinordertorenethe

selection procedure. Some of these attempts,

such as using probabilistic language model

in-formation, orinferringfrom thelogicalrelation

between the approximated propositional

con-tents of neighboring utterances (e.g. trying to

eliminate contradiction), have not been

fruit-ful enough to be worth full description in the

present work. The following two sections

de-scribe two attempts that do seem to be worth

Ourexperienceshowsthatthetranslation

qual-ity that is accomplished by the dierent

mod-ules varies, among the rest, according to the

dialogue act at hand. This seems to be

par-ticularly true for syndialog , the dialogue act

based translation path. Those dialogue acts

that normallytransmitvery littlepropositional

content,orthosethattransmitnopropositional

content at all, are normally handled better

by syndialog compared to dialogue acts that

transmit more information (such as INFORM,

which can in principle transmit any

proposi-tion). The dialogue act recognition algorithm

used by syndialog does not compute the

sin-glemostlikelydialogact,butrathera

probabil-itydistributionofallpossibledialogueacts 1

We

represent thedialogue act probability

distribu-tion fora given segment s by thevector ~

da(s),

where each component denotes the conditional

probabilityof a certain dialogue act, given the

segment s:

~

da(s)= 0

B

@

P(suggestjs)

P(rejectjs)

P(greetjs)

.

1

C

A

: (5)

Thevectors~aand ~

bfromsection3aboveare

re-placedbythematrices ~

Aand ~

B whichare

sim-ply a concatenation of the respective dialogue

act vectors.

Let ~

A s

= ~

A ~

da(s),and ~

B s

= ~

B ~

da(s).

The normalizedcondencevalue,with

incorpo-rated dialogue act informationcan then be

ex-pressed as:

C(Seq)= X

s2Seq ((

~

A s

c(s)+ ~

B s

)~s)ksk : (6)

5.2 Disambiguation Information

Withinthedeeptranslationpath,severaltypes

of underspecication are used for representing

ambiguities (Kussner, 1997), (Kussner, 1998),

(Emele and Dorna, 1998). Whenever an

ambi-guity hasto be resolved in order forthe

trans-lation to succeed, resolution is triggeredon

de-mand (Buschbeck-Wolf, 1997). Several types

1

(6)

Verb-module (Koch et al., 2000), which uses various

knowledge sources in conjunction for resolving

anaphorical and lexical ambiguities. Examples

for such knowledge sources are world

knowl-edge, knowledge about the dialogue state, as

well as various sorts of morphological,

syntac-tic and semantic information. Since thedeep

translation path is the only one that includes

contextual disambiguation,its condencevalue

is incremented by the selection module

when-ever suchambiguitiesoccur.

6 Quality of Service Parameters

Translation quality isperhapsthe most

signi-cant Quality ofService (QoS)parameter asfar

as MT systems are concerned. The selection

moduleandthelearningprocedureasdescribed

above, are indeed aimedat optimizingthis

pa-rameter. Additionally, we have further

exper-imented with our selection module in order to

accommodateforotherQoSparametersaswell.

Analogously to QoS in Open Distributed

Pro-gramming (ODP), we can distinguish between

the following main categories: timeliness,

vol-ume,andreliability. Inthetimelinesscategory,

we refer to the delay fromthe beginningof the

acoustic inputtillthebeginningoftheacoustic

output, which is highly dependent on the

sys-tem's incrementality. The algorithm described

sofarrequiresthepresenceofalltranslated

seg-ments withina given dialogue turn, before the

selection itself can take place. This implies a

relatively long delay, because the biggest

pos-sible increment unit, i.e. the whole turn, is

being used. The maximal incrementality, and

therefore theminimaldelay,areachieved when

the rst readysegment is being chosen at each

point. Thisimplies,however,apossible

deterio-rationintranslationquality,and increasingthe

riskthatdueto segmentationdierencesacross

modules, noappropriatecontinuationwouldbe

found forthe rst segment that had been

cho-sen. The latter is referred to as'loss rate', and

belongs to the reliability category of QoS

di-mensions. The trade-o between loss rate and

incrementality is parameterized by the

selec-tion module, by selecting a segment as soon as

n translation modules have delivered segments

withsimilarsegmentations(1n4). Within

processing time (from the beginning of

acous-tic input till the end of acoustic output) and

the overall speaking time (beginning of

acous-tic input till the end of acoustic input). In

order to support conformance to RTF

speci-cation for the translation service, the selection

modulesupportsaQoSsignalinterface. AQoS

management module monitors the runtime

be-havior of the translation modules, and signals

the selection process if the estimated RTF is

expected to exceed the specication. Upon

re-ceiving such a signal, the selection module

at-tempts to complete its output without waiting

for furthertranslatedsegments.

7 Conclusion

Wehave describedcertaindiÆcultiesthatarise

from theattempt to integratemultiple

alterna-tive translation paths and to choose their

op-timal combination into one `best' translation.

Usingcondencevaluesthatoriginatefrom

dif-ferent translation modules as our basic

selec-tion criterion, we have introduced a learning

method which enables us to select in maximal

accord with decisions taken by human

annota-tors. Alongthe way,we havealso tackled some

problematicaspectsoftranslationevaluationas

such, described some additional sources of

in-formation that are used by our selection

mod-ule, and briey sketched the way in which it

supports quality of service specications. The

extent to which thismodule succeeds in

creat-ing higher quality compound translations is of

course highlydependenton theappropriate

as-signmentofcondencevalues,performedbythe

translationmodulesthemselves. Asarough

cri-terion for evaluating our success, we compared

the selection module's output to the best

re-sults achieved bya singletranslationpath.

Re-cent Verbmobil evaluation results demonstrate

animprovementof27.8%achievedbythe

selec-tion module, measured by the number of

dia-logue turns that were marked 'good' by

anno-tators who were presentedwithve alternative

translations for each turn, namely,those

(7)

S.Agmon. The relaxation method for linear

in-equalities CanadianJournal of Mathematics,

6:382-392, 1954.

J.Alexandersson, B.Buschbeck-Wolf,

T.Fujinami, M.Kipp, S.Koch, E.Maier,

N.Reithinger, B.Schmitz, M.Siegel. Dialogue

Acts in VERBMOBIL-2 Second Edition,

DFKI Saarbrucken, Universitat Stuttgart,

Technische Universitat Berlin, Universitat

des Saarlandes, Verbmobil Report 226, Mai

1997.

E.Amaldi, M.Mattavelli. A combinatorical

op-timization approach to extract piecewise

lin-ear structure fromnonlinear data and an

ap-plication to optical ow segmentation, TR

97-12, Cornell Computational Optimization

Project,CornellUniversity,IthacaNY,USA.

J.Buckow, A.Batliner, F.Gallwitz, R.Huber,

E.Noth, V.Warnke, and H.Niemann.

Dove-tailing of Acoustics and Prosody in

Sponta-neous Speech Recognition In Proc. Int. Conf.

on Spoken Language Processing, volume 3,

pages 571-574, Sydney, Australia, December

1998.

B.Buschbeck-Wolf. Resolution on Demand.

UniversitatStuttgart.VerbmobilReport196.

May1997.

T.Cormen, C.Leiserson, L.Rivet. Introduction

to Algorithms MIT Press, Cambridge,

Mas-sachusetts, 1989.

M.Emele, M.Dorna. EÆcient Implementation

of a Semantic-based Transfer Approach In

Proceedingsofthe12thEuropeanConference

on Articial Intelligence (ECAI-96). August

1996.

M.Emele, M.Dorna. Ambiguity Preserving

Ma-chine Translation using Packed

Representa-tions. In Proceedings of the 17th

Interna-tionalConferenceonComputational

Linguis-tics (COLING-ACL '98), Montreal, Canada.

August 1998.

R.Frederking, S.Nirenburg. Three Heads are

Better than One,ANLP94P,p 95-100, 1994.

A.Kilger, W.Finkler. Incremental Generation

for Real-Time Applications, DFKI Report

RR-95-11, German Research Center for

Ar-ticialIntelligence - DFKIGmbH,1995.

M.Kipp, J.Alexandersson, N.Reithinger.

Un-derstanding Spontaneous Negotiation

Dia-logue Systems, Stockholm, Sweden, August

1999.

S.Koch, U.Kussner, M.Stede. Contextual

Dis-ambiguation, In W.Wahlster, Ed. Verbmobil:

Foundations of Speech to Speech Translation

SpringerVerlag, 2000.

U.Kussner.ApplyingDLinAutomaticDialogue

Interpreting, ProceedingsoftheInternational

Workshop on DescriptionLogics- DL-97, pp

54-58, Gif surYvette,France, 1997.

U.Kussner. Description Logic Unplugged

Pro-ceedings of the International Workshop on

Description Logics - DL-98, pp 142-146,

Trento,Italy,1998.

T.S.Motzkin, I.J.Schoenberg. The relaxation

methodfor linearinequalitiesCanadian

Jour-nalof Mathematics,6:393-404, 1954.

F.J.Och, C.Tillmann, H.Ney. Impr oved

Align-ment modelsfor Statistical MachineT

ransla-tion, In Proc.of theJoint SIGDATConf. on

EmpiricalMethodsinNaturalLanguage

Pro-cessing and Very Large Corpora, University

ofMaryland, 1999.

T.Ruland, C.J.Rupp, J.Spilker, H.Weber,

C.Worm.Making the Most of Multiplicity: A

Multi-Parser Multi-Strategy Architecture for

the Robust Processing of Spoken Language.

Proceedings ofICSLP 1998.

M.Schielen, J.Bos, M.Dorna Verbmobil

Inter-face Terms (VITs), In W.Wahlster, Ed.

Verbmobil: Foundations of Speech to Speech

Translation SpringerVerlag, 2000.

B.Schmitz. Pragmatikbasiertes Maschinelles

Dolmetschen. Dissertation, FB Informatik,

TUBerlin, 1997.

B.Schmitz, J.J.Quantz. Dialogue Acts in

Auto-maticDialogueInterpretinginProceedingsof

the SixthInternational Conference on

Theo-reticalandMethodologicalIssuesinMachine

Translation (TMI-95), Leuven,1995.

W.Wahlster, Ed. Verbmobil: Foundations of

SpeechtoSpeechTranslation SpringerVerlag,

2000.

C.Worm, C.J.Rupp. Towards Robust

Under-standingof Speechby Combination of Partial