word tokens

Clustering of word types and unification of word tokens into grammatical word-classes

... Proceedings Paper: Atwell, ES 2004 Clustering of word types and unification of word tokens into grammatical word-classes.. Proceedings of TALN04: XI Conference sur le Traitement Automati[r] ...

7

Lexicon Infused Phrase Embeddings for Named Entity Resolution

... actual word types in the bigram, and (3) each word type has non-zero probability only on a single ...of word types to classes, then, and a corpus of text, it is easy to estimate these probabilities ...

9

Building a Large Annotated Corpus of Learner English: The NUS Corpus of Learner English

... categories? The basic statistics of the NUCLE corpus are shown in Table 5. In these statistics, we treat multiple alternative annotations for the same error as separate errors, although it could be argued that these ...

10

Spelling Aware Construction of Macaronic Texts for Teaching Foreign Language Vocabulary

... some word tokens with glosses in a foreign language (L2), in such a way that the student can acquire L2 vocabulary simply by reading the resulting macaronic ...the word guessing and learning ability ...

6

A Workbench for Finding Structure in Texts

... For instance, we can index documents by word-tokens in their sentences, or by sentence length, or we can index sentences themselves by their tokens, or by tokens together with their part[r] ...

8

PP 2009 50: A syllable frequency list for Dutch

... CGN’s word frequency table has 143850 entries (representing counts of a total of 6153974 word tokens), of which 22623 entries ...108889 tokens or ...

9

Hindi language as a graphical user interface to relational database for transport system

... into tokens and all tokens are separated by a space gap and these tokens are stored into ...called tokens. These Hindi tokens are stored in the lexicon (system dictionary) with their ...

5

Expectations of Word Sense in Parallel Corpora

... Gale et al. (1992) used French translations as English sense indicators in the task of WSD. For instance, for the English word duty, the French translation droit was taken to signal its tax sense and devoir ...

5

Combining multiple information types in Bayesian word segmentation

... each word length. When a word is generated in the Dirichlet process, the generative model would decide whether to assign stress according to one of these rules or to assign lexical stress from a default ...

10

A Unified Morpho Syntactic Scheme of Stanford Dependencies

... The SD scheme defines a core set of labels and principles which are assumed to be useful for different languages. However, a close examination of the SD label-set and inheritance hierarchy reveals that some of its ...

7

From Thought to Words to Print: Early Literacy Development in Grade 2

... This study examines the relationship of the underlying skills of printing, spelling and vocabulary choices as they influence the quality of writing at the end of Grade 2. Four classes of Grade 2 (N=85) writing in ...

22

Handling Named Entities and Compound Verbs in Phrase Based Statistical Machine Translation

... Venkatapathy and Joshi (2006) reported a discriminative approach of using the compositionality information about verb-based multi-word expressions to improve word alignment quality. (Ren et al., 2009) ...

9

On the role of context and prosody in the interpretation of ‘okay’

... calculated for each individual label vs. the other two labels and for all three labels, in both study conditions. From this table we see that, while there is very little overall agreement among subjects about how to ...

8

Mining Themes and Interests in the Asperger’s and Autism Community

... Among the identified topics, there are three popular topics discussed in the Aspies Central forum: topic 4, topic 19 and topic 31. From the top word list, we identified that topic 4 is composed of keywords ...

10

Natural Language Inference with Definition Embedding Considering Context On the Fly

... uses word dictionaries as external knowledge. Word dictionaries are useful for domain adaptation, where we need to understand rare or novel words in which we do not have good embedding ...a word is ...

6

Terminology Extraction with Term Variant Detection

... Since the first TermSuite release (Rocheteau and Daille, 2011), several enhancements about TET have been made. We developed UIMA Tokens Regex, a tool to define term and variant patterns using word annotations ...

6

Reducing OCR Errors in Gothic Script Documents

... on word frequencies and the distribution of similarly spelled ...correct word, which is found only once in the whole corpus, whereas Liegenschast is not correct (in fact, it is misrecognized for ...

7

An Empirical Evaluation of Stop Word Removal in Statistical Machine Translation

... stop word removal in Information Retrieval, and later motivated by the finding that text will become less confusing after ...the word relaxation strategy, at least in the case of the specific ...

8

Linguistica 5: Unsupervised Learning of Linguistic Structure

... With induced knowledge analogous to word categories in natural language, results of unsupervised morphological learning could be improved. For instance, morphophonology could be learned. Induced ...

5

The Karlsruhe Institute of Technology Translation Systems for the WMT 2015

... on word classes learned by clustering the words of the corpus using the MKCLS algorithm (Och, ...each word token of the target language corpus by its corresponding POS tag or cluster ...

6
