Dutch corpus

The Spoken Dutch Corpus. Overview and First Evaluation

... Spoken Dutch Corpus project obtains data through other projects, as in the case of the private interviews that have been recorded within the project The pronunciation of Standard ...

Experiences from the Spoken Dutch Corpus Project

... Spoken Dutch Corpus (Corpus Gesproken Nederlands; CGN) project aims to develop a corpus of 1,000 hours of speech originating from adult speakers of standard ...The corpus is to serve ...

Using the Spoken Dutch Corpus for type-logical grammar induction

... The dependency-based annotation format employed within the Spoken Dutch Corpus (CGN) project (van der Wouden et al., 2002) has been designed in such a way as to enable a transparent mapping to the ...

Orthographic Transcription of the Spoken Dutch Corpus

... Spoken Dutch Corpus, the problems encountered in making that specification and the evaluation experiments that were carried out to assess the transcription efficiency and the inter-transcriber ...Spoken ...

CLiPS Stylometry Investigation (CSI) corpus: A Dutch corpus for the detection of age, gender, personality, sentiment and deception in text

... (CSI) corpus, a new Dutch corpus containing reviews and essays written by university ...The corpus currently contains about 305,000 tokens spread over 749 ...The corpus will be made ...

Word Segmentation in the Spoken Dutch Corpus

... This paper describes the aims of the word segmentation in the Spoken Dutch Corpus (Corpus Gesproken Nederlands, CGN), and the procedures to create it. For one million words, a manually verified ...

Syntactic Analysis in the Spoken Dutch Corpus (CGN)

... the corpus) and Utrecht (for the Dutch part), the syntactic annotation tool Annotate, developed by DFKI Saarbruecken, is used (Plaehn, 1998; Brants, ...

Part of Speech Tagging and Lemmatisation for the Spoken Dutch Corpus

... the Dutch prepositions, for instance, are not only used to introduce an NP or some other complement, but can also be used without adjacent complement, as a consequence of stranding or intransitive ...

Cross linguistic differences and similarities in image descriptions

... trilingual corpus of described ...new corpus of Dutch descriptions (Section ...across Dutch, US English, and German (Section ...the Dutch corpus available online and we also ...

The D-TUNA Corpus: A Dutch Dataset for the Evaluation of Referring Expression Generation Algorithms

... D-TUNA corpus, which is the first semantically annotated corpus of referring expressions in ...the corpus addresses several other research goals. Firstly, the corpus contains both written and ...

From D-Coi to SoNaR: a reference corpus for Dutch

... reference corpus of written ...reference corpus of written Dutch, a pilot project was ...The Dutch Corpus Initiative project or D-Coi was highly successful in that it not only realized ...

Linguistic Problems Based on Text Corpora

... The genre of self-contained linguistic problems appeared long before the onset of corpus linguistics. The authors of most problems either constructed phrases or sentences on their own, or (much less commonly) ...

Grammar Driven versus Data Driven: Which Parsing System Is More Affected by Domain Shifts?

... For parsing, most previous work on domain adaptation has focused on data-driven systems (Gildea, 2001; McClosky et al., 2006; Dredze et al., 2007), i.e. systems employing (constituent or dependency based) treebank ...

Collection of a corpus of Dutch SMS

... available corpus of Dutch text messages containing data originating from the Netherlands and ...This corpus has been collected in the framework of the SoNaR project and constitutes a viable part of ...

Integrating Linguistic Knowledge in Passage Retrieval for Question Answering

... The passage retrieval component in Joost includes an interface to seven off-the-shelf IR systems. One of the systems supported is Lucene from the Apache Jakarta project (Jakarta, 2004). Lucene is a widely-used ...

The Multilingual Affective Soccer Corpus (MASC): Compiling a biased parallel corpus on soccer reportage in English, German and Dutch

... The reports are saved as plain text files in UTF-8 coding in separate folders according to which subcorpus and category (WIN, LOSS, TIE) they belong to. The metadata for the three main subcorpora is split into ...

Data Collection and IPR in Multilingual Parallel Corpora. Dutch Parallel Corpus

... the corpus, (ii) he/she contacts the legitimate author and asks his/her permission (iii), the author agrees and (iv) both parties sign an ...parallel corpus compilation, since more parties are involved ...

Testing the Processing Hypothesis of word order variation using a probabilistic language model

... same corpus that was used by Bloem et ...Large corpus (van Noord, 2009). This corpus consists of a 145 million word dump of the Dutch-language Wikipedia in August 2011, and among these words, ...

DIPLOMATIC CORPUS: BETWEEN THE DUTCH IN TANAH MELAYU AND THE NORTHERN MALAY COURTS, 1641-1699

... of Dutch power in Tanah Melayu introduced a new concept of diplomatic corpus to the indigenous states in order to gain political and commercial influence in a particular ...diplomatic corpus of the ...

On task effects in NLG corpus elicitation: a replication study using mixed effects modeling

... NLG corpus elicitation recently started to receive more attention, but are usually not modeled ...written Dutch descriptions to supplement the spoken data from the DIDEC corpus, and analyzed the ...
