[PDF] Top 20 Webinterpret Submission to the WMT2019 Shared Task on Parallel Corpus Filtering

10000 documents matching "Webinterpret Submission to the WMT2019 Shared Task on Parallel Corpus Filtering" were found on our website. The top 20 results are listed below.

Webinterpret Submission to the WMT2019 Shared Task on Parallel Corpus Filtering

... consideration. The initial filtering partially alleviates this cost by drastically reducing the amount of sentences to rank. However, it is still a slow process that took about one second per iteration with our ...

Findings of the WMT 2019 Shared Task on Parallel Corpus Filtering for Low Resource Conditions

... of filtering rules based on sentence length, sentences with long words (over 40 characters), sentences with XML or HTML tags, and sentences in the wrong script (Latin, Devanagari, or ...mismatch. ...

SYSTRAN Participation to the WMT2018 Shared Task on Parallel Corpus Filtering

... our submission to the WMT18 shared task on parallel corpus ...identify parallel sentences using a flexible method that relies on deep neural ...

Findings of the WMT 2018 Shared Task on Parallel Corpus Filtering

... The scoring on additional subset sizes was not announced before the submission deadline for the shared task, so none of the participants optimized for these. In fact, some participants assigned the ...

Measuring sentence parallelism using Mahalanobis distances: The NRC unsupervised submissions to the WMT18 Parallel Corpus Filtering shared task

... WMT18 shared task on parallel corpus filtering (Koehn et ...web-scraped parallel corpus (Koehn et ...“clean” corpus looks ...target corpus of the language is the only ...

JU Saarland Submission to the WMT2019 English–Gujarati Translation Shared Task

... preprocessed parallel corpus. Moreover, after adding preprocessed parallel corpus, the BLEU score dropped ...preprocessed parallel corpus for our final ...

An Unsupervised System for Parallel Corpus Filtering

... Munich’s submission for the WMT 2018 Parallel Corpus Filtering shared task which addresses the problem of cleaning noisy parallel ...The task of mining and cleaning ...

The University of Helsinki Submission to the WMT19 Parallel Corpus Filtering Task

... In this paper, we presented our rescoring system for the WMT 2019 Shared Task on Parallel Corpus Filtering. Our system is based on contrastive scoring models using features extracted from ...

Tilde’s Parallel Corpus Filtering Methods for WMT 2018

... describes parallel corpus filtering methods that allow reducing noise of noisy “parallel” corpora from a level where the corpora are not usable for neural machine translation training ...

The ILSP/ARC submission to the WMT 2018 Parallel Corpus Filtering Shared Task

... By comparing the results of the two alternative ranking schemes, we conclude that their performances are similar for the 100M corpora. This is explained by the fact that their intersection is extremely high: 5.2M ...

Prompsit’s submission to WMT 2018 Parallel Corpus Filtering shared task

... lel corpus filtering shared ...crafted filtering rules and an automatic classifier that selects those sentences that are mutual trans- ...

MAJE Submission to the WMT2018 Shared Task on Parallel Corpus Filtering

... We also conducted some initial experiments using the Common Crawl corpus, under the rationale that it would be closer to the domain of the noisy data from the Paracrawl corpus. However, Common Crawl ...

AX Semantics’ Submission to the SIGMORPHON 2019 Shared Task

... baseline from the organizers (Wu and Cotterell, 2019) are the unmodified results from the submission to the task. We found minor tooling mistakes on the surprise languages after the submission deadline which ...

CogALex V Shared Task: LexNET Integrated Path based and Distributional Method for the Identification of Semantic Relations

... We sampled 25 (from the 184) true positive pairs in LexNET+Cos that were false negatives in Cos, and found that they were all connected via paths in the corpus, suggesting that LexNET’s contribution comes also ...

UHH Submission to the WMT17 Metrics Shared Task

... The evaluation of Machine Translation (MT) represents a very important domain of research, as providing meaningful, automatic and accurate methods for determining the quality of machine-translated output is a key ...

NLI Shared Task 2013: MQ Submission

... to corpus error annotation detection (Dickinson and Meurers, ...the shared task, there seemed to be not much mileage in trying new features that were likely to be more peripheral to the ...

MorphoLogic’s Submission for the WMT 2009 Shared Task

... training corpus into morphemes did not in itself solve the word alignment quality problem: the alignments look even worse than those achieved on the plain text version of the ...

TweetMT : a parallel microblog corpus

... as parallel sentences are often very close to each other, because both vocabulary and word order are ...for filtering out wrong candidates, but it is not enough to find the correct parallel tweet, so ...

UTFPR at WMT 2018: Minimalistic Supervised Corpora Filtering for Machine Translation

... In this contribution, we presented the UTFPR systems submitted to the WMT 2018 parallel corpus filtering task. Our supervised systems discern between good and bad translations using ...

Data Cleaning for Word Alignment

... Based on these propositions, we could assume that if we measure the literalness score under a word-based MT M_WB we will be able to determine the degree of outlier-ness whatever the measure we use for it. Hence, ...
