Top 20 documents related to "Regularized Training Objective for Continued Training for Domain Adaptation in Neural Machine Translation"
Regularized Training Objective for Continued Training for Domain Adaptation in Neural Machine Translation
... speaker adaptation. Their work adapts a context-dependent deep neural network hidden Markov model (CD-DNN-HMM) using the KL-divergence between the softmax outputs (modeling tied-triphone states) of a ...
Overcoming Catastrophic Forgetting During Domain Adaptation of Neural Machine Translation
... Continued training is an effective method for domain adaptation in neural machine transla- ...(EWC)—a machine learning method for learning a new task without forgetting ...
Curriculum Learning for Domain Adaptation in Neural Machine Translation
... ued training model by up to ...the training data with samples that have different levels of similarity to the in-domain ...the training set, the previously visited shards are still used, so ...
Freezing Subnetworks to Analyze Domain Adaptation in Neural Machine Translation
... that continued training does not move the model very far from the initial out-of-domain model, in the sense that random perturbations of the same magnitude cause only small performance drops on ...
An Empirical Comparison of Domain Adaptation Methods for Neural Machine Translation
... (Cromieres et al., 2016). The NMT settings were the same as (Cromieres et al., 2016) except that we used a vocabulary size of 32k for all the experiments, and did not ensemble independently trained parameters. The ...
Multi-Domain Neural Machine Translation through Unsupervised Adaptation
... the training data, and set the number of merge rules to 89,500, resulting in vocabularies of size 78K and 86K tokens respectively for English and ...the training set at each epoch, and are evaluated every ...
Extracting In-domain Training Corpora for Neural Machine Translation Using Data Selection Methods
... The results presented in Table 3 seem to support some of the previous conclusions that data selection does not yield as much gain for the NMT as it did for SMT. The best results are mostly data selection of 2M or 4M. ...
Dynamic Sentence Sampling for Efficient Training of Neural Machine Translation
... NMT training by resampling a smaller subset of the data that makes a relatively high contribution, to improve the training efficiency of ...for domain adaptation using the internal sentence ...
Unsupervised Domain Adaptation for Neural Machine Translation with Domain-Aware Feature Embeddings
... use domain tags to control the output domain, but it still needs a in-domain parallel corpus and our architecture allows more flexible modifications than just adding additional ...Unsupervised ...
Learning Hidden Unit Contribution for Adapting Neural Machine Translation Models
... the domain adaptation step, new, unseen training data is added to ...the domain adaptation techniques can efficiently include new information into the original ...
Training Neural Machine Translation to Apply Terminology Constraints
... by neural machine translation (NMT), its output is often still not adequate for many specific domains handled daily by the translation ...learn domain specific terms (Farajian et ...
Incremental Domain Adaptation for Neural Machine Translation in Low-Resource Settings
... of training samples without loss in translation quality compared to the commonly used fine-tuning with random ...reduces translation cost and time by ...1 training setting, none of the scores is ...
Sentence Embedding for Neural Machine Translation Domain Adaptation
... Recently, Neural Machine Translation (NMT) has set new state-of-the-art benchmarks on many translation tasks (Cho et ...NMT training. However, only the in-domain or related- ...
Domain Adaptation of Neural Machine Translation by Lexicon Induction
... word-for-word translation for these individual words without any other context, and the results turn out to be extremely bad, indicating that the model does not actually find the correspondence of these word ...
Neural Lattice Search for Domain Adaptation in Machine Translation
... Domain adaptation is a major challenge for neural machine translation ...of training data and thus perform poorly relative to phrase-based machine translation (PBMT) ...
Instance Weighting for Neural Machine Translation Domain Adaptation
... NMT domain adaptation baselines, “ensemble” indicates in and out models were ensembled in decoding and “sampler” indicates that we sampled duplicated in-domain data into training data, to make ...
Iterative Dual Domain Adaptation for Neural Machine Translation
... out-of-domain translation knowledge into the in-domain NMT ...out-of-domain training corpus, and then fine-tuned on the in-domain training ...NMT domain ...
Domain Differential Adaptation for Neural Machine Translation
... deep neural networks rely on the availability of high quality and labeled training data (He et ...ticular, neural machine translation (NMT) models tend to perform poorly if they are not ...
Cost Weighting for Neural Machine Translation Domain Adaptation
... NMT objective in Equation 7, and the classifier’s cross-entropy ...objective. Training the two concurrently allows the classifier to benefit from and adjust to improvements in the encoder ...
The AFRL WMT17 Neural Machine Translation Training Task Submission
... seems to have good support for factors. The teacher systems are provided the lowercased general-domain training dataset, along with its domain, case, and subword location factors. Vectors for the factor ...