Post-editing7 of MT output [Somers, 2003; Balling and Carl, 2014] is an idea as old as MT itself, going back to the earliest steps in MT [Koponen, 2016], and it is still a necessary step if machine translations are to be used for more than just gisting. A number of studies have shown that PE has a positive effect on the productivity of translators.

Early studies on the effects of post-editing on translators found at best mixed results in terms of efficiency and quality when comparing PE to translation from scratch: Orr and Small [1967] report significantly lower quality of post-edited machine translations, Pierce and Carroll [1966] conclude that post-editing takes longer and is thus more expensive, and more recently Krings [2001, 1997] also could not find generally applicable gains in efficiency through PE. These results can, however, possibly be attributed at least in part to insufficient baseline translation quality, since it is directly linked to the effectiveness8 of PE [Koehn and Germann, 2014; De Sousa et al., 2011]. Krings [2001, 1997] also employs an arguably unrealistic evaluation method, think-aloud protocols [Jääskeläinen, 2010; Bernardini, 2001], which give insights into cognitive processes but do not represent a realistic scenario. For low-resource languages, PE has been shown to increase productivity, although at the cost of overall translation quality [Skadiņš et al., 2011].

Garcia [2011, 2010] shows the inverse for the Chinese-English language pair, with no or only marginal improvements in productivity, but a loss in the resulting quality of the translations.

More recent studies, mostly employing SMT, have reported encouraging results for PE compared to translation from scratch: Guerberof [2009], Arenas [2008], Flournoy and Duran [2009], Federico et al. [2012], Zhechev [2012], Plitt and Masselot [2010], Koglin [2015] and De Sousa et al. [2011] report large gains in efficiency for constrained, well-known domains. Casanellas and Marg [2014] report on a study with positive results for PE, with some constraints regarding the general applicability of the results. An early result by Baker et al. [1994] also shows possible gains from PE when used with a controlled language in a corporate environment. Koehn [2009] and Koehn and Haddow [2009] show that translators can in general improve by using CAT aids, and that PE gave the best results in terms of efficiency and also the resulting quality of translations (as measured by coarse human judgments). Plitt and Masselot [2010] also report improved quality when they use PE. Aziz et al. [2012] and De Sousa et al. [2011] provide further evidence for the effectiveness of PE but report no effect on quality, whereas Läubli et al. [2013] report gains in both quality and translation speed. Prominently, Beaton and Contreras [2010] report an average total cost reduction of 30% from post-editing. Green et al. [2013a] provide a statistical analysis showing improvements in both quality and speed when translators post-edit. Gaspari et al. [2014] also report that faster translation speeds are possible with PE, but point out discrepancies in perceived productivity and overall mixed results when comparing PE and fully human translation.

7 So to say, the inverse view of pre-translation.

8

Concerning the comparison between PE and more advanced interactive approaches to CAT, Sturgeon and Lee [2015] cannot provide a conclusive result, since the outcome appears to depend on the individual translator. In contrast, Green et al. [2014b] report that PE is faster than an interactive approach but receives lower overall quality assessments.

Due to these positive results, interest in PE is strong in both research and the translation industry [Tatsumi, 2010; Koponen, 2016].

The use of MT in the translation industry, especially in the form of PE, is not without controversy: because of the superficially impressive gains in translation productivity reported in some user studies, translators may be confronted with exaggerated expectations of their throughput and, in turn, have to accept dramatic pay cuts. In addition, it is unclear whether these findings can even be generalized to other types of translation jobs, i.e. to different domains with possibly worse baseline translation quality, which affects productivity. Translators may nevertheless be forced to accept significant cuts in pay per word, which is one reason why a negative perception of PE prevails, as discussed e.g. by Guerberof [2013]: the author reports that some translators simply do not like the task of PE as such, seeing it as inherently different from translation from scratch, in addition to its implications for their pay. O’Brien and Moorkens [2014] reach the same conclusion. However, Lagoudaki [2009], for example, observes that translation environments are diverse, that experiences are thus very subjective, and that general conclusions may again be difficult9. Moorkens and O’Brien [2015] report on a study in which expert translators (in contrast to novices) showed a negative attitude towards PE. Kim and Kankanhalli [2009] present a meta-study regarding the general attitude towards change in this context.

Nevertheless, adoption of CAT, and especially PE, has increased in the translation industry, and the process of PE is also an acknowledged and officially defined standard10 in addition to regular11 translation.

9 These difficulties also apply to translation memories and their different modes of application [Wallis, 2006].

For our work, we conclude that it is imperative to improve baseline translation performance, since it directly affects PE productivity [Koehn and Germann, 2014; De Sousa et al., 2011; Sanchez-Torron and Koehn, 2016; Lacruz et al., 2014], and may also improve translators’ perception of the task of PE if they experience it as interactive [Wallis, 2006], with the MT system evidently reacting to user input through continuous learning and adaptation.