Top 20 results for "Alibaba Submission to the WMT18 Parallel Corpus Filtering Task"
Alibaba Submission to the WMT18 Parallel Corpus Filtering Task
... The parallel corpus is an essential resource for machine translation and multilingual natural language ... of parallel corpus is also very important in MT system training (Koehn and Knowles, ...
Coverage and Cynicism: The AFRL Submission to the WMT 2018 Parallel Corpus Filtering Task
... The preceding processes and metrics were designed to remove many sources of error mentioned in the introduction of this paper. However, we have not yet dealt with the case of having both English and German lines ...
Measuring sentence parallelism using Mahalanobis distances: The NRC unsupervised submissions to the WMT18 Parallel Corpus Filtering shared task
... In a system such as this, which is looking for “strange” sentence pairs, training on additional monolingual data beyond the target corpus carries some risks. If the additional monolingual data were to have very ...
The ILSP/ARC submission to the WMT 2018 Parallel Corpus Filtering Shared Task
... Zariņa et al. (2015) exploit already available parallel corpora in order to get word alignments, which are then used to identify mistranslations. Denkowski et al. (2012) use N-gram language models built from ...
The MeMAD Submission to the WMT18 Multimodal Translation Task
... We also studied the effect of additional training data. Our initial experiments showed that movie subtitles and their translations work rather well to augment the given training data. Therefore, we included ...
Accurate semantic textual similarity for cleaning noisy parallel corpora using semantic machine translation evaluation metric: The NRC supervised submissions to the Parallel Corpus Filtering task
... redundancy filtering after scoring sentence pairs: going down the re-ranked corpus, we filtered out any sentence pair that did not contain at least one “new” source-language word bigram, ... re-ranked ...
Prompsit’s submission to WMT 2018 Parallel Corpus Filtering shared task
... allel corpus filtering shared ... training corpus with diverse vocabulary and fluent sentences: language model scoring, an active-learning-inspired data selection algorithm and n-gram ...
Findings of the WMT 2018 Shared Task on Parallel Corpus Filtering
... AFRL afrl-cvg-large 21.9 26.1 18.9 18.3 26.0 20.0 22.1 25.2 29.9 22.2 21.5 29.7 22.6 25.5 AFRL afrl-cvg-mix-meteor 23.4 27.7 20.0 20.6 26.8 21.1 24.0 25.3 29.9 22.3 21.5 29.9 22.7 25.6 AFRL afrl-cvg-mix 22.5 26.5 19.4 ...
SYSTRAN Participation to the WMT2018 Shared Task on Parallel Corpus Filtering
... our submission to the WMT18 shared task on parallel corpus ... identify parallel sentences using a flexible method that relies on deep neural ...
CUNI System for the WMT18 Multimodal Translation Task
... Second, for Czech and German, we selected pseudo in-domain data by filtering the available general domain corpora. For both languages, we trained a character-level RNN language model on the corresponding language ...
The RWTH Aachen University Filtering System for the WMT 2018 Parallel Corpus Filtering Task
... of parallel sentences by applying basic rule-based heuristics each of whom can reject a sentence as described in Section ... final submission consists of three different systems on top of rule-based ...
Findings of the WMT 2019 Shared Task on Parallel Corpus Filtering for Low Resource Conditions
... We received submissions from 11 different organizations. See Table 8 for the complete list of participants. The participant’s organizations are quite diverse, with 4 participants from the United States, 2 ...
An Unsupervised System for Parallel Corpus Filtering
... Munich’s submission for the WMT 2018 Parallel Corpus Filtering shared task which addresses the problem of cleaning noisy parallel ... The task of mining and cleaning ...
The MLLP-UPV German-English Machine Translation System for WMT18
... data filtering method based on independent language models for each side of a noisy parallel corpus has the caveat of not being able to detect sentence pairs where the source and the target are ...
NRC Parallel Corpus Filtering System for WMT 2019
... In this paper, we presented the NRC’s submissions to the WMT19 parallel corpus filtering task. Official results indicate our best systems were ranked 3rd or 4th out of over 20 submissions in ...
Tilde’s Parallel Corpus Filtering Methods for WMT 2018
... describes parallel corpus filtering methods that allow reducing noise of noisy “parallel” corpora from a level where the corpora are not usable for neural machine translation training ...
The University of Helsinki Submission to the WMT19 Parallel Corpus Filtering Task
... In this paper, we presented our rescoring system for the WMT 2019 Shared Task on Parallel Corpus Filtering. Our system is based on contrastive scoring models using features extracted from dif- ...
NICT’s Corpus Filtering Systems for the WMT18 Parallel Corpus Filtering Task
... aggressive filtering since it appeared that many of the sentence pairs are obviously too noisy to be used to train MT ... the corpus are made of long sequences of numbers or punctuation ... the ...
Webinterpret Submission to the WMT2019 Shared Task on Parallel Corpus Filtering
... German-English task last year, the organizers now pose the problem under more challenging low-resource conditions including Nepali and Sinhala ... the task addresses the challenge of data quality and not ...
Alibaba Submission for WMT18 Quality Estimation Task
... QE Task We filtered all the corpora except src-pe pairs with basic rules to guarantee the ... qualifying parallel corpora roughly include 13 million for WMT17 QE tasks and 29 million for WMT18 QE ...