[PDF] Top 20 Feature Based Method for Document Alignment in Comparable News Corpora

10000 documents matching "Feature Based Method for Document Alignment in Comparable News Corpora" were found on our website. Below are the top 20.

Feature Based Method for Document Alignment in Comparable News Corpora

... any document would imply fewer possible document alignment pairs for the ...each document, we use the term extraction model from Vu et ...per document are 556/37, 410/28 and 384/28 for ...

Set Theoretic Alignment for Comparable Corpora

... Sophisticated feature-based approaches have been developed in recent years in order to provide a method that may apply to larger sets of language pairs and ...a feature-based sentence ...

Extracting bilingual terminologies from comparable corpora

... Our method works by first pairing each term extracted from a source language document S with each term extracted from a target language document T aligned with S in the comparable cor- ...term ...

Combining String and Context Similarity for Bilingual Term Alignment from Comparable Corpora

... are based on corpus-independent features, ...context-based method was affected by corpus comparability ...hybrid method that combines compositional and contextual similarity scores as features ...

Extracting Parallel Sentences from Comparable Corpora using Document Level Alignment

... automatic document alignment in news and web corpora has been explored by a number of researchers, including Resnik and Smith (2003), Munteanu and Marcu (2005), Tillmann and Xu (2009), and ...

ACCURAT Toolkit for Multi Level Alignment and Information Extraction from Comparable Corpora

... requires comparable corpora aligned in the document level as ...cognate based method fails, therefore, allowing increasing the recall of the ...

Mining Large scale Comparable Corpora from Chinese English News Collections

... Existing approaches for keyword extraction could be distinguished into two main categories: supervised or unsupervised methods. Supervised machine learning algorithms were widely used in keyword extraction such as Naïve ...

Sentence Alignment for Monolingual Comparable Corpora

... sentence alignment. Our method emphasizes the search for an overall alignment, while relying on a simple local similarity ...cal alignment within mapping fragments to find sentence ...the ...

Building Comparable Corpora Based on Bilingual LDA Model

... Table 2: Existing Methods Comparison The table shows CS outperforms other algorithms, which indicates that bilingual LDA is valid to construct comparable corpora. Thuy et al. (2009) matches similar ...

A Portable Method for Parallel and Comparable Document Alignment

... fast method for parallel document alignment based on hapax legomena, ...module based on hapaxes and numerical entities; a classifier that includes three features based on ...

Accurate Parallel Fragment Extraction from Quasi–Comparable Corpora using Alignment Model and Translation Lexicon

... We then apply an averaging filter to the initial scores to obtain filtered scores in both directions. The averaging filter sets the score of one word to the average score of several words around it. We think the words ...

STACC, OOV Density and N gram Saturation: Vicomtech’s Participation in the WMT 2018 Shared Task on Parallel Corpus Filtering

... We based our approach on STACC, an efficient and portable method for parallel sentence identification in comparable ...core method was expanded with a penalty based on the amount of ...

Phrase based Parallel Fragments Extraction from Comparable Corpora

... phrase-based method to extract parallel fragments from the comparable ...decoder based on the hierarchical phrase-based (HPB) translation model to detect the alignments in ...

Automatic Building and Using Parallel Resources for SMT from Comparable Corpora

... corpus based machine translation, especially Statistical Machine Translation (SMT), from comparable corpora has recently received wide attention in the field Machine Translation ...from ...

Bootstrapping Entity Translation on Weakly Comparable Corpora

... this feature also depends on the comparability of entity occurrences in time-stamped corpora, which may not hold as shown in Figure ...our method can find and compare articles, on different dates, ...

Exploiting Comparable Corpora and Bilingual Dictionaries for Cross Language Text Categorization

... Using only comparable corpora. Figure 2 reports the performance without any use of bilingual dictionaries. Each graph show the learning curves respectively using a BoW kernel (that is considered here as ...

Parallel and comparable corpora: What are they up to?

... and comparable corpora are of use to translation, it is difficult to generate ‘possible hypotheses as to translations’ with such data (Aston, ...Parallel corpora, in contrast, provide ‘[g]reater ...

Identifying Comparable Corpora Using LDA

... applied based on the proportion of source language document’s NEs found in the target document (we do not expect all the NEs to be present in the target language: NEs could be mis-translated, and not all ...

Named Entity Transliteration with Comparable Corpora

... works as follows: We pool all documents in a single day to form a large pseudo-document. Then, for each transliteration candidate (both Chinese and English), we compute its frequency in each of those ...

A Factory of Comparable Corpora from Wikipedia

... Given a pair of articles related by an interlanguage link, we estimate the similarity between all their pairs of cross-language sentences with different text similarity measures. We repeat the process for all the ...