Incorporating Word and Subword Units in Unsupervised Machine Translation Using Language Model Rescoring


Academic year: 2020

Figure 1: The illustration of our system. The translation procedure can be divided into five steps: (a) pre-training, (b) translation candidates generated from the word-level NMT model, (c) translation candidate rescoring, (d) construction of an ensemble of models, and (e) fine-tuning of word-level and subword-level models.

Figure 2: The illustration of the unknown word replacement (UWR) procedure for word-level NMT.

Table 2: Unsupervised translation results. We report the scores of several evaluation methods for every step of our approach.
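The candidate-rescoring step named in the Figure 1 caption can be sketched as follows. This is a minimal illustration, not the paper's implementation: the unigram table, the `alpha` interpolation weight, and all function names are assumptions; in the actual system the language model would be a neural LM trained on monolingual target-side data.

```python
# Hypothetical log-probabilities standing in for a trained language model.
# The vocabulary and values here are purely illustrative.
TOY_LM_LOGPROBS = {
    "the": -1.0, "cat": -3.0, "sat": -3.5, "mat": -4.0,
    "on": -2.0, "chair": -4.5, "<unk>": -8.0,
}

def lm_score(sentence):
    """Sum of unigram log-probabilities; out-of-vocabulary words
    fall back to the <unk> score."""
    return sum(TOY_LM_LOGPROBS.get(w, TOY_LM_LOGPROBS["<unk>"])
               for w in sentence.split())

def rescore(candidates, alpha=0.5):
    """Rerank (translation, nmt_logprob) pairs by a weighted
    combination of the NMT model score and the LM score, and
    return the highest-scoring pair."""
    def combined(pair):
        text, nmt_logprob = pair
        return alpha * nmt_logprob + (1 - alpha) * lm_score(text)
    return max(candidates, key=combined)

candidates = [
    ("the cat sat on the mat", -6.0),
    ("the cat sat on the zzz", -5.5),  # NMT prefers this; the LM does not
]
best, _ = rescore(candidates)
print(best)  # the cat sat on the mat
```

The interpolation weight `alpha` trades off fluency (LM) against adequacy (NMT score); in practice it would be tuned on a development set.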
