Measuring Machine Translation Errors in New Domains

Table 3: WADE (%): percent correct, percent SEEN errors, percent SENSE errors, and percent SCORE errors.
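The caption above refers to the WADE error taxonomy, which labels each translated word as correct or as a SEEN, SENSE, or SCORE error. As a hedged illustration only (not the paper's implementation), here is a minimal sketch of tallying such per-word labels into the percentages a table like this reports; the label names and input format are assumptions:

```python
from collections import Counter

# Hypothetical per-word WADE labels; the label set mirrors the caption
# (correct, SEEN, SENSE, SCORE), but this data structure is an assumption,
# not the paper's actual format.
LABELS = ("correct", "SEEN", "SENSE", "SCORE")

def wade_percentages(word_labels):
    """Turn a list of per-word labels into percent correct / SEEN / SENSE / SCORE."""
    counts = Counter(word_labels)
    total = sum(counts[label] for label in LABELS)
    if total == 0:
        raise ValueError("no labeled words")
    return {label: 100.0 * counts[label] / total for label in LABELS}

# Example: 7 of 10 words translated correctly, 1 unseen word,
# 1 wrong sense, 1 mis-scored choice.
labels = ["correct"] * 7 + ["SEEN", "SENSE", "SCORE"]
print(wade_percentages(labels))
# {'correct': 70.0, 'SEEN': 10.0, 'SENSE': 10.0, 'SCORE': 10.0}
```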

Related documents

MT metrics compute a score based on the output of an MT system, here called the “candidate”, and a “reference” sentence, which is provided. The reference is a valid translation of … (a minimal scoring sketch follows these excerpts).

The most commonly used automatic evaluation metrics, BLEU (Papineni et al., 2002) and NIST (Doddington, 2002), are based on the assumption that “the closer a machine translation is to a professional human translation, the better it is”.

This is essentially equivalent to the standard domain-adaptation problem in machine learning, and in the context of MT there have been methods proposed to perform Bayesian adaptation …

Domain Adaptation for Statistical Machine Translation with Domain Dictionary and Monolingual Corpora. Proceedings of the 22nd International Conference on Computational Linguistics.

We employ 8 different MT metrics for identifying paraphrases across two different datasets: the well-known Microsoft Research Paraphrase Corpus (MSRP) (Dolan et al., 2004) and …

Traditional machine translation evaluation metrics such as BLEU and WER have been widely used, but these metrics correlate poorly with human judgements because they badly …

After introducing a gist consistency score into traditional MT metrics, the Kendall correlation between the hybrid BLEU (HBLEU(s_topic)) and human judgements rises from 42.56% to …

We report translation results for two metrics, BLEU (Papineni et al., 2002) and NIST (Doddington, 2002), and significance testing is performed using approximate randomization.
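Several of the excerpts above describe reference-based metrics that score a candidate translation against a provided reference. The following is a minimal, unsmoothed sentence-level BLEU sketch in the spirit of Papineni et al. (2002); it is a simplified illustration under assumed single-reference input, not the scoring code of any cited system:

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """All n-grams of the token list as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def sentence_bleu(candidate, reference, max_n=4):
    """Unsmoothed sentence-level BLEU against a single reference.

    Real implementations smooth the counts and support multiple
    references; this sketch only illustrates candidate-vs-reference
    scoring as described in the excerpts above.
    """
    cand, ref = candidate.split(), reference.split()
    log_precision = 0.0
    for n in range(1, max_n + 1):
        cand_counts = Counter(ngrams(cand, n))
        ref_counts = Counter(ngrams(ref, n))
        # Modified precision: clip each candidate n-gram count by its
        # count in the reference.
        overlap = sum(min(c, ref_counts[g]) for g, c in cand_counts.items())
        total = max(sum(cand_counts.values()), 1)
        if overlap == 0:
            return 0.0  # one empty match zeroes the geometric mean
        log_precision += math.log(overlap / total) / max_n
    # Brevity penalty discourages overly short candidates.
    bp = min(1.0, math.exp(1 - len(ref) / max(len(cand), 1)))
    return bp * math.exp(log_precision)

print(sentence_bleu("the cat sat on the mat", "the cat sat on the mat"))  # 1.0
```

An identical candidate and reference score 1.0, since every n-gram precision is 1 and the brevity penalty does not apply.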