[PDF] Top 20 results for "Building a Web Based Parallel Corpus and Filtering Out Machine Translated Text"

10000 documents matching "Building a Web Based Parallel Corpus and Filtering Out Machine Translated Text" were found on our website. Below are the top 20 results.

Building a Web Based Parallel Corpus and Filtering Out Machine Translated Text

... as parallel with very low precision, from 20% to ...actually parallel or not. For example, if we need to get 100 000 really parallel documents we should check from 500 thousand to 100 million ...is ...

Building a Monolingual Parallel Corpus for Text Simplification Using Sentence Similarity Based on Alignment between Word Embeddings

... These text simplification corpora built from English Wikipedia and Simple English Wikipedia received some ...point out that Zhu et al. (2010)’s corpus has 17% of sentence pairs unaligned (two ...

Automatic Detection of Machine Translated Text and Translation Quality Estimation

... of machine translated text from different MT systems, in an environment containing both human generated and machine translated ...Hansard corpus (Germann, 2001), containing ...

Building a German/Simple German Parallel Corpus for Automatic Text Simplification

... to text simplification have used English/Simple English Wikipedia as their ...our machine translation system designated to translate from German to Simple ...German parallel data is slowly becoming ...

The Web as a Parallel Corpus

... the Web in order to extract the parallel text it ...the Web for bilingual text (STRAND) (Resnik 1998, 1999), incorporating new work on content-based detection of translations ...

Building a multilingual parallel corpus for human users

... be based on morphological or syntactic priorities or represent a parochial view, the format of tags may be very different and confusing to the eye of a ...glish text ends up in failure, because the ...

Mining Parallel Text from the Web based on Sentence Alignment

... enough parallel corpora with high quality. Parallel corpora made up with text in parallel translation are the foundational resource in data-driven natural language processing, which has a ...

Parallel Corpus Filtering Based on Fuzzy String Matching

... Building machine translation (MT) systems, specifically NMT (Kalchbrenner and Blunsom, 2013; Cho et ...high-quality parallel training ...any, parallel data. However, getting parallel ...

A Corpus-based Study of Human-translated vs. Machine-translated Texts: The Case of Ellipsis in English-Persian Translation

... and machine translation (MT), the present study focused on the ellipsis based on Halliday and Hasan’s model ...bilingual parallel corpus called "Mizan corpus", including more ...

NRC Parallel Corpus Filtering System for WMT 2019

... on parallel corpus filtering was essentially the same as last year’s edition (Koehn et ...noisy corpus crawled from the web using ParaCrawl (Koehn et ...train machine translation ...

Noisy Parallel Corpus Filtering through Projected Word Embeddings

... of web-scale parallel text mining, quality estimation and filtering is becoming an increasingly important step in multilingual ...of parallel text available (Schwenk, 2018; ...

Alibaba Submission to the WMT18 Parallel Corpus Filtering Task

... In this task, we can divide the corpus cleaning task into three parts. Firstly, a high-quality parallel sentence pair should have the property that its target sentence precisely translates the source sentence, ...

Web Text Corpus for Natural Language Processing

... utilised web data have accessed it through search engines, using only the hit counts or examining a limited number of results ...actual text for ...

Building the Croatian-English Parallel Corpus

... in parallel corpora started more than 30 years ago as prof. Rudolf Filipović launched the Yugoslav Serbo-Croatian–English Contrastive Project 1 in ...Brown corpus was acquired, cut in half ...and translated ...

WN Toolkit: Automatic generation of WordNets following the expand model

... a parallel corpus between English and the target language and perform an automatic sense tagging of the English ...corpora based on the sense information of all the languages (Shahid and Kazakov, ...

Compiling and Filtering ParIce: An English Icelandic Parallel Corpus

... years machine translation (MT) systems have achieved near human-level performance in a few ...(NMT) machine translation systems, parallel data quality is important and may weaken performance if ...

Tilde’s Parallel Corpus Filtering Methods for WMT 2018

... data filtering for statistical machine translation (SMT) has shown to be a challenging ...Stricter filtering does not always yield positive results (Zariņa et ...data filtering allows ...

Building The Sense-Tagged Multilingual Parallel Corpus

... multilingual corpus, which is the first such corpus for multiple Asian ...annotated corpus will be freely available at ...the corpus and utilize them in NLP tasks to test their ...

The RWTH Aachen University Filtering System for the WMT 2018 Parallel Corpus Filtering Task

... To check the quality of a filtering approach, we train a transformer model on the top 10M respectively top 100M subwords of the scored training data. We mainly focus on the 10M-subsampling results, as this ...

Building Large Scale Text Corpus for Tibetan Natural Language Processing by Extracting Text from Web Pages

... two web sites are using the same computer management system of news gathering and editing, which is a product of Beijing Founder Electronics company, to manage their articles and web ...some web ...
