Chinese Word Segmentation

Improving Cross Domain Chinese Word Segmentation with Word Embeddings

... Cross-domain Chinese Word Segmentation (CWS) remains a challenge despite recent progress in neural-based ... semi-supervised word-based approach to improving cross-domain CWS given a baseline ...

10

Active Learning for Chinese Word Segmentation

... for Chinese word segmentation (CWS) are extremely resource intensive in terms of annotation data ... a Word Boundary Annotation (WBA) model to make effective active learning on CWS ...

10

Semi supervised Chinese Word Segmentation for CLP2012

... Chinese word segmentation (CWS) lays the essential foundation for Mandarin Chinese analysis. However, its performance is always limited by the identification of unknown words, especially for ...

6

Word Boundary Decision with CRF for Chinese Word Segmentation

... In this work, we analyze the relationship between WBD (Huang et al., 2007) and 4-tag character tagging approach for Chinese word segmentation. There are two main differences between them: One is ...

7

Word Context Character Embeddings for Chinese Word Segmentation

... for Chinese word segmentation: PKU and MSR from the second SIGHAN bakeoff shared task, and Chinese Treebank ...on Chinese novel ...

7

The Fourth International Chinese Language Processing Bakeoff: Chinese Word Segmentation, Named Entity Recognition and Chinese POS Tagging

... CIPS Chinese Language Processing Evaluation in the summer of 2007, and co-organized by SIGHAN, Chinese LDC, and the Verifying Center of Chinese Language and Character Standards of the State Lan- ...

13

Chinese word segmentation model using bootstrapping

... Like SVMs, parameter vector w is learned with maximum margin principle using training data. To control the complexity of the training problem, cutting plane method is proposed to solve the resulted constrained ...

5

Improving Word Alignment by Adjusting Chinese Word Segmentation

... adjust word segmentation so as to decrease the effect of lexicalization differences to improve word alignment ...adjust Chinese word segmentation according to their translation ...

8

Which Is Essential for Chinese Word Segmentation: Character versus Word

... common segmentation standards of Bakeoffs, the comparison problem on word-based method and character-based method are still ...effective Chinese word segmentation techniques are turned ...

12

Multiple Character Embeddings for Chinese Word Segmentation

... Chinese word segmentation (CWS) is often regarded as a character-based sequence labeling task in most current works which have achieved great success with the help of powerful neural ... clue: ...

7

Synthetic Word Parsing Improves Chinese Word Segmentation

... inside Chinese synthetic words on a fine-grained word segmentation ... the Chinese word segmentation performance (especially on pseudo-OOVs) without introducing any new feature ...

6

Chinese Word Segmentation by Classification of Characters

... of Chinese word ...International Chinese Word Segmentation Bakeoff [Sproat and Emerson 2003] intended to evaluate the accuracy of different segmenters by standardizing the training and ...

16

Chinese Word Segmentation as Character Tagging

... As a representative of purely statistical approaches, [Sproat and Shih, 1990] relies on the mutual information of two adjacent characters to decide whether they form a two-character word. Given a string of ...

20

Punctuation as Implicit Annotations for Chinese Word Segmentation

... Chinese word segmentation based on position tagging was initiated by Xue ...in word segmentation (Peng, Feng, and McCallum 2004; Low, Ng, and Guo 2005; Zhao, Huang, and Li ...

8

Chinese Word Segmentation by Mining Maximized Substrings

... a word-character hybrid model as our baseline Chinese word segmentation system (Nakagawa and Uchimoto, 2007; Kruengkrai et ... of word-level and character-level nodes from a given in- ...

9

Chinese Word Segmentation without Using Lexicon and Hand crafted Training Data

... Sun Maosong, Shen Dayang*, Benjamin K Tsou ...

7

A Pragmatic Approach for Classical Chinese Word Segmentation

... Word segmentation, a fundamental technology for lots of downstream applications, plays a significant role in Natural Language Processing, especially for those languages without explicit delimiters, like ...

8

The Second International Chinese Word Segmentation Bakeoff

... finding word-boundaries is an essential first step in many natural language processing applications including mono- and cross-lingual information retrieval and text-to-speech ... This word ...

11

Chinese Word Segmentation And its Effects on Chinese Information Retrieval

... Each word can further be weighted according to its discrimination power (the word "the", which appears so frequently in documents, has lower discrimination power than the word "uranium", thus plays a less ...

46
