[PDF] Top 20 Enhancing Chinese Word Segmentation Using Unlabeled Data

Our website has 10,000 documents matching "Enhancing Chinese Word Segmentation Using Unlabeled Data". The top 20 are listed below.

Enhancing Chinese Word Segmentation Using Unlabeled Data

... Penn Chinese Treebank (CTB) ... the word segmentation application. In general, the use of unlabeled data can be motivated by two concerns: First, given a fixed amount of labeled ...

Adaptive Chinese Word Segmentation

... allowable word segmentation for each ... million Chinese characters from various domains of text such as newspapers, novels, magazines ... held-out data for model parameter training as shown ...

Chinese Word Segmentation by Mining Maximized Substrings

... We sketch the process of maximized substring retrieval in Pseudocode 1. From the beginning of the document D, we scan each position and register maximized substrings into the data structure H. If an incoming ...

Labeled Morphological Segmentation with Semi-Markov Models

... (e.g., Chinese and English) are morphologically impov- ... bilingual word alignment (Eyigöz et ... morphological segmentation by describing a high-performance, data-driven tool for handling ...

A Double Hidden HMM and a CRF for Segmentation Tasks with Pinyin’s Finals

... our Chinese word segmentation based on the proposed conditional support vector Markov models for sequential labeling tasks, especially Chinese word segmen- ... well-known data, ...

Active Learning for Chinese Word Segmentation

... for Chinese word segmentation (CWS) are extremely resource intensive in terms of annotation data ... of data acquisition is active learning, which aims to actively select the most ...

Chinese Word Segmentation without Using Lexicon and Hand-crafted Training Data

... Definition 4: Given a Chinese character string 'vxyw', dtsx:y is said to be a local maximum if ... And the height of the local maximum dtsx:y is defined as: ...

Chinese word segmentation model using bootstrapping

... training data is that all Arabic numbers, Latin letters and punctuations in the data are double-byte ... in Chinese texts, there are actually two versions of codes for Arabic numbers, Latin let- ...

A Chinese Word Segmentation System Based on Structured Support Vector Machine Utilization of Unlabeled Text Corpus

... In our work, Chinese phonology information is used as basic features of Chinese characters in all models. For open tracks, we propose a new double hidden layers HMM in which a new phonology information ...

Synthetic Word Parsing Improves Chinese Word Segmentation

... the word segmentation results on PKU and MSR ... CRF word segmenter on the original PKU and MSR data sets with the same ... the word segmentation performance on ...

Word Boundary Decision with CRF for Chinese Word Segmentation

... the segmentation speed is also very important in some applications, such as information retrieval and online machine translation ... long segmentation time would make the applications’ whole running time ...

Multi-Grained Chinese Word Segmentation

... In order to fully investigate the MWS problem, we have manually created a true MWS data of 1,500 sentences for final evaluation. From each test dataset in Table 5, we randomly sample 500 sentences with converted ...

Neural Word Segmentation Learning for Chinese

... is word centered, our proposed scoring model covers all three processing levels from character, word until sen- ... of word segmentation, the n-gram data sparsity issue makes it ...

Unsupervised phonemic Chinese word segmentation using Adaptor Grammars

... vised word segmentation from phonemic representations of child-directed unsegmented English ... unsupervised word segmentation of ... different segmentation models, and show that the ...

Word Context Character Embeddings for Chinese Word Segmentation

... for Chinese word segmentation: PKU and MSR from the second SIGHAN bakeoff shared task, and Chinese Treebank ... training data are randomly selected as development ... on Chinese ...

Enhancement of Feature Engineering for Conditional Random Field Learning in Chinese Word Segmentation Using Unlabeled Data

... In this work, unsupervised feature selection for CWS is based on frequent strings that are extracted automatically from unlabeled corpora. For convenience, these features are referred to as unsupervised features ...

Exploiting Unlabeled Text to Extract New Words of Different Semantic Transparency for Chinese Word Segmentation

... exploits unlabeled text data to improve new word identification and Chinese word segmentation ... on unlabeled data and encode this information as ... by using ...

Semi-Supervised Sequential Labeling and Segmentation Using Giga-Word Scale Unlabeled Data

... the unlabeled data incor- ... over unlabeled data can effectively incorporate the ‘labeled’ training data information via a ‘bias’ since λ included in A(x, y) is estimated from la- ...

Chinese Word Segmentation without Using Lexicon and Hand-crafted Training Data

... Chinese Word Segmentation without Using Lexicon and Hand-crafted Training Data. Sun Maosong, Shen Dayang, Benjamin K Tsou ...

Exploring Representations from Unlabeled Data with Co-training for Chinese Word Segmentation

... large unlabeled corpus. They are proved useful in the Chinese word segmentation ... the Chinese word segmentation task, ...
