Top 20 results for "Mining Very Non Parallel Corpora: Parallel Sentence and Lexicon Extraction via Bootstrapping and E"
Mining Very Non Parallel Corpora: Parallel Sentence and Lexicon Extraction via Bootstrapping and E
... extract parallel sentences from the matched English and Chinese ... Each sentence is again represented as word ... Chinese-English sentence pairs. Sentence pairs above a set threshold are ...
7
Agreement based Learning of Parallel Lexicons and Phrases from Non Parallel Corpora
... Parallel corpora, which are large collections of parallel texts, serve as an important resource for inducing translation correspondences, either at the level of words (Brown et ... wide-coverage ...
10
Bootstrapping Translation Detection and Sentence Extraction from Comparable Corpora
... of parallel data used to build ... comparable corpora, a collection of multilingual documents that are only topically aligned but not necessarily translations of each other (Fung and Cheung, ... for ...
6
Multimodal Comparable Corpora as Resources for Extracting Parallel Data: Parallel Phrases Extraction
... ble corpora look for parallel data at the sentence level (Zhao and Vogel, 2002; Utiyama and Isahara, 2003; Munteanu and Marcu, 2005; Abdul-Rauf and Schwenk, ... noisy parallel texts, to ...
7
Parallel Sentence Extraction from Comparable Corpora with Neural Network Features
... this sentence pair as non- ... are very deterministic to judge this sentence pair as ... well sentence pairs (4733 → 4869) belong to this type, which significantly improved the ... for ...
5
Measuring Comparability of Documents in Non Parallel Corpora for Efficient Extraction of (Semi-)Parallel Translation Equivalents
... comparable corpora is closer to our ... of parallel sentence and sub-sentence ex- ... of parallel sentence ... original corpora for bilingual lexicon ...
10
Chinese–Japanese Parallel Sentence Extraction from Quasi–Comparable Corpora
... not sentence–level parallel. Instead, they contain many parallel subsentential ... of sentence pairs extracted by “+Rank (Proposed)”, where parallel subsentential fragments are in ...
9
Inversion Transduction Grammar Constraints for Mining Parallel Sentences from Quasi Comparable Corpora
... than mining applications of the kind discussed in this ... that sentence translation between any two natural languages can be accomplished within ITG expressiveness (subject to certain ...
12
Accurate Parallel Fragment Extraction from Quasi–Comparable Corpora using Alignment Model and Translation Lexicon
... SampLEX lexicon are more accurate than IBM Model 1 ... external parallel data, except for IBM Model 1 ... external parallel data may have a bad effect on the precision of alignment for the ...
7
Extraction of Multi-word Expressions from Small Parallel Corpora
... Multi-word Expressions (MWEs) are lexical items that consist of multiple orthographic words (e.g., ad hoc, by and large, New York, kick the bucket). MWEs are numerous and constitute a significant portion of the ...
9
Bilingual Lexicon Extraction from Comparable Corpora Enhanced with Parallel Corpora
... Parallel sentence extraction from comparable corpora has been studied by a number of researchers (Ma and Liberman, 1999; Chen and Nie, 2000; Resnik and Smith, 2003; Yang and Li, 2003; Fung and ...
8
Extracting Parallel Sub Sentential Fragments from Non Parallel Corpora
... each sentence that have a translation in the other sentence, according to our lexicon ... are parallel, and discard the ... the lexicon entry (achievements, ... target sentence as ...
8
Bilingual Word Embeddings from Parallel and Non parallel Corpora for Cross Language Text Classification
... acquire sentence-aligned parallel corpora for many ... of non-parallel corpora such as topic-aligned ... topic extraction models (Vulić and Moens, 2014) showed promising ...
11
Unsupervised Parallel Sentence Extraction with Parallel Segment Detection Helps Machine Translation
... for parallel sentence extrac- ... 2017 parallel sentence extraction task compared to previous ... comparable corpora and showed better translation performance when using these ...
11
The Challenges of Multi-dimensional Sentiment Analysis Across Languages
... Related work does not provide a full picture of sentiment preservation in translation and we are interested in additional investigations with other data sets and setups. In particular, we would like to understand more ...
5
Inducing Sentence Structure from Parallel Corpora for Reordering
... STIR predicts a pre-ordering via two pipelined models: (1) parsing and (2) tree reordering. The first model induces a binary parse, which defines the space of possible reorderings. In particular, only trees that ...
11
Statistical Machine Translation with Word and Sentence Aligned Parallel Corpora
... the sentence pairs have been ... 16,000 sentence pairs, which had an alignment error rate of ... 16,000 sentence pairs (where all the sentence pairs are word-aligned) with an alignment error ...
8
Mining Parallel Corpora from Sina Weibo and Twitter
... a sentence pair with the term diaosi, which is a popular buzzword among Chinese communities and is used to describe a class of underprivileged men, lacking certain desirable features (looks, social status, ...
37
Automatic Building and Using Parallel Resources for SMT from Comparable Corpora
... Building parallel resources for corpus based machine translation, especially Statistical Machine Translation (SMT), from comparable corpora has recently received wide attention in the field Machine ...
10
The LIGA (LIG/LIA) Machine Translation System for WMT 2011
... available corpora were pre-processed using an in-house script that normalizes quotes, dashes, spaces and ... crawled parallel giga corpus, keeping ... M sentence pairs. For example, sentence ...
7