Top PDF evaluation metric

Which Coreference Evaluation Metric Do You Trust? A Proposal for a Link based Entity Aware Metric

... only metric which is insensitive to this flaw is MUC, which, however, is known to be the least discriminative ...Entity-Aware evaluation metric that is designed to overcome the shortcomings of the ...

11

RED: A Reference Dependency Based MT Evaluation Metric

... The sentence level correlations on WMT 2012 and WMT 2013 are shown in Table 4. RED is better than BLEU and HWCM on all the language pairs, which reflects the effectiveness of syntactic information and only parsing the ...

10

EMMA: A novel Evaluation Metric for Morphological Analysis

... novel Evaluation Metric for Morphological Analysis (EMMA) that is both linguistically appealing and empirically ...the metric currently used by the Morpho Challenge (MC) competition ...MC ...

9

Improving AMBER, an MT Evaluation Metric

... A recent paper described a new machine translation evaluation metric, AMBER. This paper describes two changes to AMBER. The first one is incorporation of a new ordering penalty; the second one is the use of ...

5

ParaMetric: An Automatic Evaluation Metric for Paraphrasing

... Paraphrasing is useful in a variety of natural language processing applications including natural language generation, question answering, multi-document summarization and machine translation evaluation. These ...

8

Choosing an Evaluation Metric for Parser Design

... This paper seeks to quantitatively evaluate the degree to which a number of popular metrics provide overlapping information to parser designers. Two routine tasks are considered: optimizing a machine learning ...

6

LEPOR: A Robust Evaluation Metric for Machine Translation with Augmented Factors

... automatic metric for Machine Translation (MT) evaluation, many other methods have been proposed to revise or improve ...the evaluation metric. Other evaluation metrics, such as TER ...

10

MT Tuning on RED: A Dependency Based Evaluation Metric

... • When BLEU is used as the evaluation metric, the best score is obtained by using BLEU as the optimization objective in tuning as well. This follows the findings in Cer et al. (2010). • The best METEOR ...

6

PEM: A Paraphrase Evaluation Metric Exploiting Parallel Texts

... automatic evaluation metric for paraphrases that we are aware of is ParaMetric (Callison-Burch et ...ParaMetric does not attempt to propose a single metric to correlate well with human ...

10

Designing a Frame Semantic Machine Translation Evaluation Metric

... the evaluation. In the first round of evaluation, around 30 sentence pairs from the human translated TED talk were scored from 4 (acceptable without changes) to 1 (unusable) by 7 annotators for each ...

8

The Feasibility of HMEANT as a Human MT Evaluation Metric

... of evaluation is of some use when one is using MT to aid human translation (although the relationship between number of edits and actual effort is not straightforward (Koponen, 2012)), but it is not so ...

10

Recall is the Proper Evaluation Metric for Word Segmentation

... ployed evaluation metrics for word seg- ...standard evaluation metrics and that the evaluation score obtained by using only recall is sufficient and better correlated with the performance of word ...

5

A Critique and Improvement of an Evaluation Metric for Text Segmentation

... the metric, a Type C error occurrence in which a false positive and a false negative are some distance e < k from each other incurs a penalty of 2e, where e is assigned for the false positive and another e ...

18

Smatch: an Evaluation Metric for Semantic Feature Structures

... this metric and provide several practical and replicable computing methods by using Integer Linear Programming (ILP) and hill-climbing ...our metric can be used for measuring the annotator agreement in ...

5

SemBleu: A Robust Metric for AMR Parsing Evaluation

... major evaluation metric, SMATCH (Cai and Knight, 2013), searches for one-to-one mappings between the nodes of two AMRs with a greedy hill-climbing algorithm, which leads to search ...

6

LAYERED: Metric for Machine Translation Evaluation

... translation evaluation has always remained as the most popular measure to judge the quality of a system output compared to the reference ...MT evaluation metric which is based on the n-gram ...

7

Edit Distance: A Metric for Machine Translation Evaluation

... The evaluation of MT technology received a boost in 2001 when IBM introduced “BLEU” (Papineni, et ...n-gram evaluation metric that demonstrated high correlation with human judgements of system ...

6

MAXSIM: A Maximum Similarity Metric for Machine Translation Evaluation

... (MT) evaluation metric that calculates a similarity score (based on precision and recall) of a pair of ...proposed metric achieves higher correlation with human judgements than all 11 automatic ...

8

Traversal Free Word Vector Evaluation in Analogy Space

... or taking multiple word pairs into consideration (Drozd et al., 2016), but the goal is still to find word D from the vocabulary. Besides, Linzen (2016) made a thorough assessment of word analogy test, and the most ...

5

The Parameter Optimized ATEC Metric for MT Evaluation

... It is a general consensus that the performance of an evaluation metric can be improved by matching more words between MT outputs and human references. Linguistic resources like stemmer and WordNet are ...

5
