[PDF] Top 20 NICT’s Corpus Filtering Systems for the WMT18 Parallel Corpus Filtering Task

A search for "NICT’s Corpus Filtering Systems for the WMT18 Parallel Corpus Filtering Task" found 10000 documents on our website. The top 20 results are listed below.

NICT’s Corpus Filtering Systems for the WMT18 Parallel Corpus Filtering Task

... noisy parallel data. We adopt the clean data of WMT18 News Translation Task to train a classifier and compute informative ... the task and ... noisy corpus are scored by this classi- ...

SYSTRAN Participation to the WMT2018 Shared Task on Parallel Corpus Filtering

... the WMT18 shared task on parallel corpus ... identify parallel sentences using a flexible method that relies on deep neural ...

Measuring sentence parallelism using Mahalanobis distances: The NRC unsupervised submissions to the WMT18 Parallel Corpus Filtering shared task

... The WMT18 shared task on parallel corpus filtering (Koehn et ... web-scraped parallel corpus (Koehn et ... “clean” corpus looks ... target corpus of the language is the ...

Parallel Corpus Filtering Based on Fuzzy String Matching

... SMT systems as described in Section 4, we translate the Nepali (or Sinhala) sentences from partially filtered parallel corpora into English, and apply fuzzy string matching to score each pair of ... SMT ...

UTFPR at WMT 2018: Minimalistic Supervised Corpora Filtering for Machine Translation

... We present the UTFPR systems at the WMT 2018 parallel corpus filtering task. Our supervised approach discerns between good and bad translations by training classic binary ...

An Unsupervised System for Parallel Corpus Filtering

... A German-English dataset was released containing 1 billion (English) tokens. The corpus was crawled from the web as part of the ParaCrawl project. After extracting texts from web pages with BiTextor ...

Tilde’s Parallel Corpus Filtering Methods for WMT 2018

... data filtering for statistical machine translation (SMT) has shown to be a challenging ... Stricter filtering does not always yield positive results (Zariņa et ... SMT systems, ... data ...

STACC, OOV Density and N-gram Saturation: Vicomtech’s Participation in the WMT 2018 Shared Task on Parallel Corpus Filtering

... points below the top performing systems on average. The n-gram saturation variant did not provide significant improvements and actually performed significantly worse in one scenario, while also consuming more ...

MAJE Submission to the WMT2018 Shared Task on Parallel Corpus Filtering

... shared task have to submit a file with quality scores, one per line, corresponding to the sentence pairs on the 1 billion word German-English Paracrawl ... MT systems with these corpora, and assessing ...

Coverage and Cynicism: The AFRL Submission to the WMT 2018 Parallel Corpus Filtering Task

... MT systems trained on smaller sets and rarely detrimental for the systems trained on larger ... The filtering that includes a translation score, cvg-mix-meteor, is our top submission by mean BLEU ...

The ILSP/ARC submission to the WMT 2018 Parallel Corpus Filtering Shared Task

... 2018 Parallel Corpus Filtering shared ... the task with the purpose of clustering sentence pairs according to their appropriateness in training MT ... the task) that were eval- ...

Findings of the WMT 2019 Shared Task on Parallel Corpus Filtering for Low Resource Conditions

... Shared Task on Parallel Corpus Filtering at the Conference for Machine Translation (WMT 2019) was organized to promote research to learning from noisy data more viable for low-resource ...

NRC Parallel Corpus Filtering System for WMT 2019

... shared task on parallel corpus filtering was essentially the same as last year’s edition (Koehn et ... noisy corpus crawled from the web using ParaCrawl (Koehn et ... (MT) systems on ...

Findings of the WMT 2018 Shared Task on Parallel Corpus Filtering

... these systems to evaluate different weight ... quality parallel corpora, while bad sentence pairs are either synthesized by scrambling good sentence pairs or by using the raw crawled ...

Webinterpret Submission to the WMT2019 Shared Task on Parallel Corpus Filtering

... After discussing the performance of our submissions, we will compare our best submission on each condition to the rest of participants. Figure 1 summarizes the results of the shared task as reported by the ...

The RWTH Aachen University Filtering System for the WMT 2018 Parallel Corpus Filtering Task

... a filtering approach, we train a transformer model on the top 10M respectively top 100M subwords of the scored training ... translation systems from the Marian framework (Junczys-Dowmunt et ...

The Speechmatics Parallel Corpus Filtering System for WMT18

... the parallel corpus filtering task uses a two-step ... effective corpus size down from the initial 1 billion to 160 million ... further filtering down to 100 or 10 million ... exact ...

Alibaba Submission to the WMT18 Parallel Corpus Filtering Task

... this task, we can divide the corpus cleaning task into three ... this task, we attempt to quantify the translation quality (also called bilingual score) and accuracy of the sentence ... the ...

uniblock: Scoring and Filtering Corpus with Unicode Block Information

... MT systems to show the simplicity and effectiveness of our method in a more challenging ... same parallel corpora as in Section 4.2 and train systems in four directions: zh-en, en-zh, ru-en and ...

Generating High Quality Proposition Banks for Multilingual Semantic Role Labeling

... the corpus level, resulting in a significantly better SRL corpus for ... SRL systems, treating arguments as already given; (2) it generates rules for the argument classification step preferably from ...
