Morphological Analysis

Top PDF Morphological Analysis:

Morphological Analysis

Martin Kay. A computer program that is intended to carry out nontrivial operations on texts in an ordinary language must start by recognizing[.]

Morphological Analysis of the Spontaneous Speech Corpus

This paper describes a project tagging a spontaneous speech corpus with morphological information such as word segmentation and parts-of-speech. We use a morphological analysis system based on a maximum entropy model, which is independent of the domain of corpora. In this paper we show the tagging accuracy achieved by using the model and discuss problems in tagging the spontaneous speech corpus. We also show that a dictionary developed for a corpus on a certain domain is helpful for improving accuracy in analyzing a corpus on another domain.

Finite state morphological analysis for Gagauz

This paper describes a finite-state approach to morphological analysis and generation of Gagauz, a Turkic language spoken in the Republic of Moldova. Finite-state approaches are commonly used in morphological modelling, but one of the novelties of our approach is that we explicitly handle orthographic errors and variance, in addition to loan words. The resulting model has a reasonable coverage (above 90%) over a range of freely-available corpora.

Probabilistic Models for Korean Morphological Analysis

Most previous systems use the morpheme as the processing unit for morphological analysis. We would like to examine the effectiveness of the proposed models based on Eojeol and syllable. First, compare the models that use the Eojeol-unit analysis with others (“M” vs. “EM”, “S” vs. “ES”, and “MS” vs. “EMS”). When applying the Eojeol-unit analysis, AA is decreased, and AIS and 1A are increased. Then, compare the models that use the syllable-unit analysis with others (“E” vs. “ES”, “M” vs. “MS”, and “EM” vs. “EMS”). When applying the syllable-unit analysis, AIR and 1A are increased, and FR is decreased. Therefore, both models are very useful when compared with the morpheme-unit model alone.

A Two Level Morphological Analysis of Korean

Deok Bong Kim, Sung Jin Lee, Key Sun Choi, and Gil C[.]

Disambiguation of morphological analysis in Bantu languages

Arvi Hurskainen, Department of Asian and Africa[.]

Morphological Analysis as a Step in Automated Syntactic Analysis of a Text

Gustav Leunbach. Introduction T[.]

Morphological Analysis of the Dravidian Language Family

We focus on the computational processing of Dravidian morphology, a critical issue since the family exhibits rich agglutinative inflectional morphology as well as highly-productive compounding. For example, Dravidian nouns are typically inflected with gender, number and case in addition to various postpositions. E.g., consider the word agniparvvatattinṟeyeāppam ( അഗ്നിപർവ്വതത്തിന്റെയോപ്പം ) in Malayalam, which is composed of the compound noun stem agni+paṟavvatam (fire+mountain) and the following suffixes: tta (inflectional increment), inṟe (genitive case marker), ye (inflectional increment) and oppam (postposition). These combine to give the meaning of the English phrase "with a volcano." This complexity makes morphological analysis obligatory for the Dravidian languages.

Morphological Analysis without Expert Annotation

The task of morphological analysis is to produce a complete list of lemma+tag analyses for a given word-form. We propose a discriminative string transduction approach which exploits plain inflection tables and raw text corpora, thus obviating the need for expert annotation. Experiments on four languages demonstrate that our system has much higher coverage than a hand-engineered FST analyzer, and is more accurate than a state-of-the-art morphological tagger.

Fast Yet Rich Morphological Analysis

(Morphological Analysis and GEneration for Arabic and its Dialects) system (Habash et al., 2005; Habash and Rambow, 2006). This system, which we use as starting point in this paper, compiles abstract high-level linguistic information of different types to finite state machinery. The second type is typically not implemented in finite-state technology. Examples include the Buckwalter Arabic Morphological Analyzer (BAMA) (Buckwalter, 2004) and its extension ALMORGEANA (Habash, 2007). These

Design and Application of a Gold Standard for Morphological Analysis: SMOR as an Example of Morphological Evaluation

There are usually two perspectives to be considered when NLP tools are evaluated: the developer’s and the users’ view. Developers validate their tool by comparing the input/output pairs to what they expect, but they also check e.g. for the processing speed or other system parameters. Such validation of specific targets by the developer is dependent on the system’s knowledge base (e.g. lexicon contents and processing rules), in other words, developers validate and report on the performance of their system on the basis of what they expect it to be capable of doing. From the users’ perspective, system performance has to satisfy their requirements. We refer to Underwood (1998) who states – for NLP lexicons – that users’ requirements may significantly differ when being compared to what a system has to offer; this ranges from needing far less information than what the system has to offer to needing to extend or modify even the best output. Additionally, in the light of an increasing number of web services offering linguistic analysis (including morphological analysis), the user should have the possibility to compare between different tools on offer.

Data Driven Morphological Analysis for Uralic Languages

work exploring morphological tagging for Finnish include Kanerva et al. (2018) and Silfverberg et al. (2015). However, work on full data-driven morphological analysis, where the task is to return all and only the valid analyses for each token irrespective of sentence context, is almost non-existent for Uralic languages. The only system known to the authors is the recent neural analyzer for Finnish presented by Silfverberg and Hulden (2018). The system first encodes an input word form into a vector representation using an LSTM encoder. It then applies one binary logistic classifier conditioned on this vector representation for each morphological tag (for example NOUN|Number=Sg|Case=Nom). The classifier is used to determine if the tag is a valid analysis for the given input word form. Similarly to Silfverberg and Hulden (2018), our system is also a neural morphological analyzer but unlike Silfverberg and Hulden (2018) we incorporate lemmatization. Moreover, the design of our system considerably differs from their system as explained below in Section 3.
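
The per-tag decision described in this excerpt can be sketched roughly as follows. This is an illustration only, not the cited system: the fixed `encoding` vector stands in for the LSTM encoder output, the weights, biases and the 0.5 threshold are hypothetical, and every tag other than NOUN|Number=Sg|Case=Nom is made up for the example.

```python
import math
from typing import Dict, List, Tuple


def sigmoid(x: float) -> float:
    """Logistic function mapping a raw score to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def valid_tags(
    encoding: List[float],
    classifiers: Dict[str, Tuple[List[float], float]],
    threshold: float = 0.5,  # assumed decision threshold
) -> List[str]:
    """Run one independent binary logistic classifier per tag."""
    accepted = []
    for tag, (weights, bias) in classifiers.items():
        score = sum(w * h for w, h in zip(weights, encoding)) + bias
        if sigmoid(score) >= threshold:
            accepted.append(tag)
    return accepted


# Hypothetical stand-in for the encoder output of one word form.
encoding = [1.0, -0.5, 0.25]

# Hypothetical (weights, bias) pairs, one classifier per tag.
classifiers = {
    "NOUN|Number=Sg|Case=Nom": ([2.0, 0.0, 1.0], -0.5),
    "NOUN|Number=Sg|Case=Gen": ([1.0, 1.0, 0.0], 0.2),
    "VERB|Mood=Ind|Tense=Pres": ([-1.0, 2.0, 0.0], 0.0),
}

print(valid_tags(encoding, classifiers))
```

Because each tag has its own classifier, a word form can receive zero, one or several analyses, which matches the task of returning all valid analyses for a token.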

Improving the Morphological Analysis of Classical Sanskrit

in compound formation. Finite verbal forms are marked for number and (4) person (1st, 2nd, 3rd), and (5) by a complex system of tenses and modes. The rule based morphological analyzer produces fine-grained annotations that cover these five morphological categories. Because the classification methods used in this paper require a single output variable from a nominal scale, an obvious approach would use the Cartesian product of the five morphological categories as target variable. However, this approach unnecessarily complicates the learning process, because most feature combinations cannot cooccur in the morphological analysis of a single word. As a consequence, Hellwig (2015) reduced the tag set used
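
A toy illustration of why a Cartesian-product target variable grows quickly while most combinations stay unused. Everything here is an assumption made for the example: the category names for the first three slots, the value inventories (deliberately tiny placeholders, not the paper's tag set) and the simplified plausibility rule.

```python
from itertools import product

NA = "NA"  # marks a category that does not apply to the word

# Hypothetical, heavily reduced value inventories for five categories.
CATEGORIES = {
    "case": ["nom", "acc", "ins", NA],
    "number": ["sg", "du", "pl", NA],
    "gender": ["m", "f", "n", NA],
    "person": ["1", "2", "3", NA],
    "tense_mode": ["pres", "impf", "opt", NA],
}

# Every combined label is one tuple from the Cartesian product.
combined = list(product(*CATEGORIES.values()))


def is_plausible(case, number, gender, person, tense_mode):
    """Simplified rule: a word is either nominal or a finite verb."""
    nominal = case != NA and gender != NA and person == NA and tense_mode == NA
    finite_verb = case == NA and gender == NA and person != NA and tense_mode != NA
    return number != NA and (nominal or finite_verb)


plausible = [combo for combo in combined if is_plausible(*combo)]

print(len(combined), len(plausible))  # prints 1024 54
```

Even with these small inventories, only a small fraction of the combined labels can occur, which is the motivation the excerpt gives for reducing the tag set.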

English Morphological Analysis with Machine-learned Rules

The morphological analyzer illustrated in this paper falls into the first class of the Gold (2001) classification. The system aims at high accuracy in the morphological analysis of English, with morphological rules obtained through unsupervised machine learning. The analyzer applies the letter transitional probability proposed in Keshava & Pitler (2005) both in morphological rule learning and in the disambiguation of morphological analysis. An initial evaluation of the analyzer shows a promising result with 88.42% precision, 78.46% recall and 83.14% F-score, which exceeds the best results for English reported in Unsupervised Segmentation of Words into Morphemes – Challenge 2005.
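
The three reported figures are mutually consistent if the F-score is taken to be the balanced harmonic mean of precision and recall, which is an assumption here since the excerpt does not name the variant. A quick check:

```python
def f_score(precision: float, recall: float) -> float:
    """Balanced F-score: harmonic mean of precision and recall (same units)."""
    return 2 * precision * recall / (precision + recall)


# Precision and recall in percent, as reported in the abstract.
print(round(f_score(88.42, 78.46), 2))  # prints 83.14
```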

Juman++: A Morphological Analysis Toolkit for Scriptio Continua

KyTea (Neubig et al., 2011) is a similar tool that can perform morphological analysis for languages written in continuous script. It can also be trained using partial annotation data and can output point-wise confidence scores for the analysis result, which were used for creating partially annotated data in an active learning scenario. Still, by using a point-wise approach and estimating auxiliary tags (like POS) after computing segmentation, KyTea trades off accuracy for simplicity. Juman++ is faster, has better accuracy, does tag estimation jointly with segmentation, uses an online learning approach and can use longer contexts in the form of RNNLM and trigram features.

Associative Model of Morphological Analysis: An Empirical Inquiry

Harri Jäppinen and Matti Ylilammi. Examples of suppletive allomorphs and zero morphs.[r]

A Fast and Accurate Partially Deterministic Morphological Analysis

In order to alleviate the performance loss of maximum matching, we propose the concept of Context Independent Strings (CISs), which are strings having no ambiguity in terms of morphological analysis. We also propose an algorithm for building the CIS dictionary from a large amount of automatically analysed text. The dictionary maps CISs to the results of morphological analysis (sequence of words and POS tags).
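
One way to picture how such a dictionary could be used is a greedy longest-match pass that resolves the spans it recognises and leaves the rest for a full analyser. This is only a sketch under assumptions: the excerpt does not describe the actual lookup procedure, and the toy dictionary entries, the function name `partial_analyse` and the segment labels are hypothetical.

```python
from typing import Dict, List, Tuple, Union

Analysis = List[Tuple[str, str]]  # sequence of (word, POS tag)
Segment = Tuple[str, Union[Analysis, str]]

# Hypothetical toy dictionary; a real one would be built from analysed text.
CIS_DICT: Dict[str, Analysis] = {
    "thecat": [("the", "DET"), ("cat", "NOUN")],
    "sat": [("sat", "VERB")],
}


def partial_analyse(text: str, cis_dict: Dict[str, Analysis]) -> List[Segment]:
    """Resolve dictionary strings by longest match; keep the rest pending."""
    max_len = max((len(key) for key in cis_dict), default=0)
    segments: List[Segment] = []
    pending = ""
    i = 0
    while i < len(text):
        match = None
        # Try the longest candidate first, then shorter ones.
        for length in range(min(max_len, len(text) - i), 0, -1):
            if text[i:i + length] in cis_dict:
                match = text[i:i + length]
                break
        if match is None:
            pending += text[i]  # no entry starts here; defer this character
            i += 1
            continue
        if pending:
            segments.append(("pending", pending))
            pending = ""
        segments.append(("analysed", cis_dict[match]))
        i += len(match)
    if pending:
        segments.append(("pending", pending))
    return segments


print(partial_analyse("thecatsatdown", CIS_DICT))
```

In this sketch the "pending" spans would be handed to a conventional analyser, so the dictionary lookup only covers the part of the input it can resolve without ambiguity.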

Morphological Analysis with Limited Resources: Latvian Example

However, this morphological knowledge can be exploited by adding as training features the results from the rule based morphological analysis described in section 4. That gives a reasonably accurate list (containing the correct form in 98% of cases) of what tags seem possible for each word. So in addition to the classifier training features commonly used for other languages, we also supply a list of possible part-of-speech and tag options for the selected word and its closest neighbours. We also provide a ‘recommended’ POS and tag, calculated as described in section 5.1, which gives a ~1% additional boost in accuracy. This change augments the machine learning of ending (letter n-gram) relations with morphological features with the linguistic rules in the analyser, and makes it possible to achieve good results with rather small training corpora.

3D Facial Morphological Analysis for Genetic Association

This study attempts to explore the discriminative phenotype features of the common facial morphological variations between the Mainland Japanese and the Ryukyuan; the difference in phenotype features between these two populations is expected to indicate different gene base sequences. In order to explore the phenotype features of facial morphology between the Mainland Japanese and the Ryukyuan, we propose a general framework of 3D facial morphological analysis, which is shown in Figure 1. Our framework mainly includes two steps: (1) registration and landmark correspondence (procedures 1-4 in Fig. 1), (2) statistical analysis of 3D facial morphological variation and population classification using facial morphological features (procedures 5 and 6 in Fig. 1). Both principal component analysis (PCA) and Mean Hyperplane are used for exploring the facial morphological variations [7]. Experiments show that our proposed strategy can give promising identification performance between the Mainland Japanese and the Ryukyuan.

Morphological analysis of a non standard language variety

From the viewpoint of automatic processing, the non-standard word-forms described in Section 4 should be divided into two groups: those that have to be included in the user lexicon manually, and those that can be normalized using some kind of rewriting rules prior to the morphological analysis, or added to the user lexicon automatically. This division corresponds roughly to that of frequent, irregular, non-productive on one side, and infrequent, regular, productive morphological or orthographic changes, on the other side.