unlabeled text

Exploiting Unlabeled Text to Extract New Words of Different Semantic Transparency for Chinese Word Segmentation

... exploits unlabeled text data to improve new word identification and Chinese word segmentation ...on unlabeled data and encode this information as ...the unlabeled training and test set to ...

6

Discriminative Learning of Selectional Preference from Unlabeled Text

... We present a discriminative method for learning selectional preferences from unlabeled text. Positive examples are taken from observed predicate-argument pairs, while negatives are constructed from ...

10

A Chinese Word Segmentation System Based on Structured Support Vector Machine Utilization of Unlabeled Text Corpus

... Hanyu Pinyin is the form of sound for Chinese text and the Chinese phonology information is explicit expressed by Pinyin. It is currently the most commonly used romanization system for Standard Mandarin. ...

5

Chinese Word Segmentation with Conditional Support Vector Inspired Markov Models

... enormous unlabeled corpus for CWS, such as some statistics information on co-occurrence of subsequences in the whole text has been extracted from unlabeled data and been employed as input features ...

7

Bootstrapping Semantic Analyzers from Non Contradictory Texts

... predicting text-meaning alignments is interesting in itself, as the extracted alignments can be used in training of a statistical generation system or information extractors, but we also believe that evaluation ...

10

Teaching a Weaker Classifier: Named Entity Recognition on Upper Case Text

... case text can be improved by using a mixed case NER and some unlabeled ...some unlabeled mixed case text, which are then used as additional training material for the upper case ...case ...

8

Semi Supervised QA with Generative Domain Adaptive Nets

... obtain unlabeled text ...leverage unlabeled text to boost the performance of question answering models, especially when only a small amount of labeled data is available? The problem is ...

11

Design Challenges and Misconceptions in Named Entity Recognition

... sions: text chunks representation, inference algorithm, using non-local features and external knowl- ...on unlabeled text can be an alternative to the traditional semi-supervised learning ...

9

Improved Pattern Learning for Bootstrapped Entity Extraction

... data, unlabeled entities are either assumed to be negative or are ignored by the existing pattern scoring ...general text, and exploiting distributional similarity and edit distances to learned ...

11

Natural Language Processing (Almost) from Scratch

... We have described how we induced useful word embeddings by applying our architecture to a language modeling task trained using a large amount of unlabeled text data. These embeddings improve the ...

45

Distributional Representations for Handling Sparsity in Supervised Sequence Labeling

... For our experiment on domain adaptation, we focus on NP chunking and POS tagging, and we use the labeled training data from the CoNLL 2000 shared task as before. For NP chunking, we use 198 sentences from the ...

9

Learning Script Participants from Unlabeled Data

... Chambers and Jurafsky (2008; 2009) exploit coreference chains and co-occurrence frequency of verbs in text corpora to extract narrative schemas describing sequences of events and their participants. Because this ...

8

Positive Unlabeled Learning for Deceptive Reviews Detection

... Another family of methods learned the final classifier by using positive examples dataset and all examples of the unlabeled dataset. Li et al. (Li et al., 2009) studied PU learning in the data stream environment, ...

11

Boosting Entity Linking Performance by Leveraging Unlabeled Documents

... The weakly- or semi-supervised set-up, which we use, is not common for entity linking. The only other approach which uses a combination of Wikipedia and unlabeled data, as far as we are aware of, is by Lazic et ...

11

Enhancing Chinese Word Segmentation Using Unlabeled Data

... Table 3 summarizes the segmentation results on the development data with different configurations, representing a few choices between baseline, statistics-based and document-based feature sets. In this table, the ...

10

Using Unlabeled Data to Improve Author Identification

... of unlabeled examples (those that considerably augment the dissimilarity among classes) than incorporate a lot of doubtable-quality ...ten unlabeled examples by ...other text categorization ...

5

Supervised Keyphrase Extraction as Positive Unlabeled Learning

... weighting unlabeled training data as in (Elkan and Noto, 2008), and self-training in which only confident initial predictions are used as positive and negative ...

6

Pretraining Sentiment Classifiers with Unlabeled Dialog Data

... The sentiment dataset includes about 100K tweets with manually annotated three-class sentiment labels: positive, negative, and neutral. The breakdown of positive, negative, and neutral in the training set was 15.0, ...

7

Attacking Parsing Bottlenecks with Unlabeled Data and Relevant Factorizations

... The unlabeled data features improved the already state-of-the-art dpo3 parser in UAS, complete sentence accuracy, conjunctions, and ...of unlabeled data improves parser performance, increasing the size of ...

9

Extensions to Metric Based Model Selection

... Table 4: Comparing cross-validation (XVT), ADJ, and xADJ on the UCI Boston Housing dataset. The table reports the average Test MSE and standard errors, within the Forward Feature Selection setting described in the ...

19
