[PDF] Top 20 Extracting In domain Training Corpora for Neural Machine Translation Using Data Selection Methods
Our website has 10000 documents matching "Extracting In domain Training Corpora for Neural Machine Translation Using Data Selection Methods". Below are the top 20 results.
Extracting In domain Training Corpora for Neural Machine Translation Using Data Selection Methods
... all data from the previous ...all data has access to both the generic and domain vocabulary, the fine-tuned models are built on top of the generic vocabulary ...vant domain words, while in ...
Dynamically Composing Domain Data Selection with Clean Data Selection by “Co Curricular Learning” for Neural Machine Translation
... dynamic data selection function, D_λ^φ(t, D), to return the top λ(t) of examples in a dataset D sorted by a scoring function φ at a training step ...During training, D_λ^φ(t, D) ...
Adaptation Data Selection using Neural Language Models: Experiments in Machine Translation
... chine Translation (SMT) systems is the dearth of high-quality bitext in the domain of ...adaptation data selection: the idea is to use language models (LMs) trained on in-domain text to ...
Improving Statistical Machine Translation Performance by Training Data Selection and Optimization
... test domain, for example, when building a specific domain SMT system or when participating the NIST MT evaluation 1 ...testing data. This paper presents two methods to exploit full potential ...
Target Conditioned Sampling: Optimizing Data Selection for Multilingual Neural Machine Translation
... examined data selection from the view of domain adaptation, selecting good training data from out-of-domain text to improve in-domain ...these methods select ...
Incremental Decoding and Training Methods for Simultaneous Translation in Neural Machine Translation
... tained using models trained on the TED corpus ...larger data-sets (See the supplementary section again for data sizes) to see if our findings are ...in-domain data to avoid ...
Dynamic Data Selection for Neural Machine Translation
... Regarding data selection for SMT, previous work has targeted two goals; to reduce model sizes and training times, or to adapt to new ...domains. Data selection methods for ...
Submodularity for Data Selection in Machine Translation
... English translation task, using the NIST 2006 set for development and the NIST 2009 set for eval- ...The training data consists of all Modern Standard Arabic-English parallel LDC ...
Handling Rare Word Problem using Synthetic Training Data for Sinhala and Tamil Neural Machine Translation
... parallel training data influences the rare word problem in Neural Machine Translation (NMT) systems, particularly for under-resourced ...languages. Using synthetic parallel ...
An Empirical Comparison of Domain Adaptation Methods for Neural Machine Translation
... of training, and thus we early stopped training soon after 1 ...of training, fine-tuning overfitted very quickly, while mixed fine-tuning did not ...“Multi domain” ...the data used for ...
Three phase training to address data sparsity in Neural Machine Translation
... of data sparsity in NMT, using only little amount of parallel ...results using this approach on five Indian language pairs and showed a substantial improvement in translation ...monolingual ...
Using Comparable Corpora to Adapt MT Models to New Domains
... on machine translation domain adaptation has focused on either the language modeling component or the translation modeling component of an SMT ...explored methods for subselecting ...
Denoising Neural Machine Translation Training with Trusted Data and Online Data Selection
... is data selection for machine ...the domain adaptation con- ...sure domain relevance of data is based on cross entropy difference (CED) between an in-domain and an ...
Communications between Deaf and Hearing Children Using Statistical Machine Translation
... automatic translation and recognition systems are developed in this ...statistical machine translation (SMT) it is necessary to present a set of appropriate ...available corpora are too ...
Six Challenges for Neural Machine Translation
... of training data are only available out of domain, but we still seek to have robust ...different corpora obtained from OPUS (Tiedemann, ...
Domain Control for Neural Machine Translation
... though domain is wrongly predicted in some cases, translation accuracy is still improved when compared to the Join ...that domain classification at sentence level is a challenging task as short ...
Domain Adaptive Inference for Neural Machine Translation
... test data domain to match with the best adapted model, let alone optimal weights for an ensemble on that do- ...on data without domain labelling using our adaptive decoding schemes ...
Enhancement of Encoder and Attention Using Target Monolingual Corpora in Neural Machine Translation
... the training, we used different learning rates for the base and synthetic parallel corpora to avoid overfitting to the synthetic ...the translation quality was improved by increasing the number ...
Domain Differential Adaptation for Neural Machine Translation
... different data requirements, namely LMs and NMT models, exhibit similar behavior when trained on the same domain, but there is little correlation between models trained on data from different domains ...
Transductive Data Selection Algorithms for Fine Tuning Neural Machine Translation
... In-Domain Data As we have seen in previous sections, applying fine-tuning with subsets of data can perform better than using the complete ...on data retrieved from a mixture of the ...