Incorporating Word and Subword Units in Unsupervised Machine Translation Using Language Model Rescoring
Related documents
In our method, we initialize the weights of the encoder and decoder with two language models that are trained with monolingual data, and then fine-tune the model on parallel data
An effective and practical solution is adaptation data selection: the idea is to use language models (LMs) trained on in-domain text to select similar sentences from
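The data-selection idea in this snippet can be sketched with a small, self-contained example. This is only an illustration of scoring candidate sentences by the difference between an in-domain and a general-domain LM (in the style of cross-entropy difference selection); the toy unigram models, the `select_similar` function, and the threshold are assumptions for the sketch, not the cited paper's implementation.

```python
import math
from collections import Counter


def unigram_logprob(sentence, counts, total, vocab_size):
    # Per-token add-one smoothed unigram log-probability
    # (length-normalized so long and short sentences compare fairly).
    tokens = sentence.split()
    lp = sum(math.log((counts[t] + 1) / (total + vocab_size)) for t in tokens)
    return lp / max(len(tokens), 1)


def select_similar(candidates, in_domain, general, threshold=0.0):
    """Keep candidates whose in-domain LM score exceeds their
    general-domain LM score by `threshold` (hypothetical helper)."""
    def build(corpus):
        counts = Counter(t for s in corpus for t in s.split())
        # +1 in vocab size reserves mass for unseen tokens.
        return counts, sum(counts.values()), len(counts) + 1

    ic, it, iv = build(in_domain)
    gc, gt, gv = build(general)
    selected = []
    for s in candidates:
        score = (unigram_logprob(s, ic, it, iv)
                 - unigram_logprob(s, gc, gt, gv))
        if score > threshold:
            selected.append(s)
    return selected


# Toy corpora: medical "in-domain" text vs. financial "general" text.
in_domain = ["the patient received treatment",
             "the doctor examined the patient"]
general = ["the stock market fell", "shares rose today"]
picked = select_similar(["the patient was examined", "the market rose"],
                        in_domain, general)
print(picked)  # only the medical-sounding sentence survives
```

Real systems would use smoothed n-gram or neural LMs rather than unigram counts, but the selection criterion, a score difference between two LMs, is the same.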
Because of memory constraints, we were not able to train our models on quite as many sentence translation pairs as for the previous experiment: we trained our translation model
Unsupervised Japanese-Chinese Opinion Word Translation using Dependency Distance and Feature-Opinion Association Weight. Proceedings of COLING 2012 Technical Papers, pages 1503–1518.
For each English training sentence, we use this confusion grammar to generate a simulated confusion set, from which we train a discriminative language model that will prefer
Splitting Input Sentence for Machine Translation Using Language Model with Sentence Similarity. Takao Doi, Eiichiro Sumita, ATR Spoken Language Translation Research Laboratories.
First, unsupervised WA is performed on the SMT training corpus where the Chinese sentences are treated as sequences of characters; then, the Chinese sentences are segmented by
Our distortion model was trained as follows: We used 0.2 million sentence pairs and their word alignments from the data used to build the translation model as the training data for