
Arabic and Quranic computational linguistics projects at the University of Leeds المشاريع الحاسوبية على اللغة العربية والقرآن بجامعة ليدز


This is a repository copy of Arabic and Quranic computational linguistics projects at the University of Leeds المشاريع الحاسوبية على اللغة العربية والقرآن بجامعة ليدز.

White Rose Research Online URL for this paper: http://eprints.whiterose.ac.uk/81629/

Proceedings Paper:

Sharaf, A, Atwell, ES, Dukes, K et al. (8 more authors) (2010) Arabic and Quranic computational linguistics projects at the University of Leeds المشاريع الحاسوبية على اللغة العربية والقرآن بجامعة ليدز. In: Proceedings of the workshop of Increasing Arabic Contents on the Web, organized by Arab League Educational, Cultural and Scientific Organization (ALECSO). Workshop of Increasing Arabic Contents on the Web, 16 Oct 2010, Damascus, Syria. (Unpublished)

https://eprints.whiterose.ac.uk/

Reuse

Unless indicated otherwise, fulltext items are protected by copyright with all rights reserved. The copyright exception in section 29 of the Copyright, Designs and Patents Act 1988 allows the making of a single copy solely for the purpose of non-commercial research or private study within the limits of fair dealing. The publisher or other rights-holder may allow further reproduction and re-use of this version - refer to the White Rose Research Online record for this item. Where records identify the publisher as the copyright holder, users can verify any specific terms of use on the publisher’s website.

Takedown

If you consider content in White Rose Research Online to be in breach of UK law, please notify us by


Abdul-Baquee Sharaf, Eric Atwell, Kais Dukes, Majdi Sawalha, Amal Al-Saif, Serge Sharoff and Katja Markert*

*School of Computing, Leeds University, Leeds, England. http://www.comp.leeds.ac.uk/arabic

Latifa Al-Sulaiti, Bayan Abu Shawar, Nora Abbas and Andy Roberts**

**Alumni of School of Computing, Leeds University

ﺒ ﺒ ﺖ ز ب ﺒ ﺒ ﺒ ﺖ أ ﺌ أ

.

ﺜﺚأ و ،تﺒوﺚ ﺒ ًﺒ ً ًﺒﺚ و ﺒ ﺒ ً ﺒ تﺒوﺚ ﺒو ﺒﺜﺚ

أ

ﺒ ﺒ ت

(machine learning)

ﺒ ﺒ ﺛ و م ﺒ

.

ﺠ لوأ و ،ت ﺒ ﺜ ﺚ و ﺒ ﺒ و

(corpus)

ﺒ و ﺜ ﺒ و

ﺒ ض ﺒو ﺜ و

.

ﺒ و ى أ ﺒ ﺜ ﺒو ﺒ ﺒ ﺒ ﺌﺒ ﺒو ة ﺒ

.

ﺒ ﺒ ﺒ ﺒ ﺒ تﺒوﺚأ أ

ﺒ ﺒ ﺛو ﺒو ﺒ ،

ﺒ ت و ﺒ

(discourse relations)

ت ﺒ ﺜﺒ ﺒ

ةﺜﺚ ﺒ ﺒ ﺒ

.

ﺒ ة ﺒ ﺒ ﺒت ﺒ و

ﺚﺒ ﺐ و ﺒ نآ ﺒ ﺜ ز

ﺒ ﺒ ﺒ ﺒ

.

و آ ةﺜو ﺖ ﺒ و

"

آ

"

ﺒ ي ﺒو

ﺒ ﺒ ﺜ ﺐو،ً ة ى

ﺒ ﺒو آ

.

ًﺒ و

"

نآ ة ﺛ

(Quranic Arabic Corpus) [http://corpus.quran.com]

و ﺐ ﺚﺜ و

آ م ﺒ م أو ﺧ ﺒى ت و ﺒ

.

و

ل ﺒ ﺨ ﺒ

و ، ﺒ نآ ﺒ ﺒ ﺒ و ت ﺒ ﺖ أ ﺌﺒ ً ﺒوًﺒﺜ ﺒو ﺒو

أ و ك نآ ﺒ ﺒ ﺒﺜ لو ﺒ ﺒ

50

ً ﺒز أ

.

ﺘ ﺒ و

"

نآ ﺒ

"

م ﺒ ﺌ ﺒو ب ﺒ م

2010 و

.

أ ﺤو ﺒ ﺒ م ً ﺜ و

آ ﺒ ﺒ تﺒ ﺚﺒ ﺐ ﺜ ن ﺒ

(Quranic Arabic Dependency Treebank)

ﺌآ و ﺒ نآ ﺒ بﺒ ﺐ ة

ﺒ ة

.

ﺒوﺜ ﺌ و آ ﺠ م أ

ﺒ نآ ﺒ ةﺚ و ﺒ

.

نآ ﺒ تﺒﺚ ة ﺛ ﺌ آ ﺒﺤو ﺒ ﺒ ﺒ ﺒ و

ﺤو ﺜﺒ WordNet

ﺒﺚ ﺚتﺒﺜ ﺐ ة ﺛ و

ﺤو ﺜﺒ نآ ﺒ ت FrameNet

آ ﺒ ﺒ ك و ، ﺒ ت نآ و

ﺒ آ ﺒ ﺒل ﺛو ﺒو ﺧ ﺒو ﺒو ب ﺒو تﺒﺚ ﺒي

.

ﺜ ل و

ﺒ ﺠ ﺒ ﺒ ﺒ ة ﺒ ﺒ نآ ﺒ

ﺖﺒ ﺒ و ي ﺒ ﺒ

ة ﺒ ﺒ ﺒ و

.

ﺒو ﺒ ضﺒ أ ل ﺒ

ﺒ ل ﺒ

ﺠ ﺒ ﺒ ﺒ ﺒ ﺒ ل ﺛو ﺒ ﺚ

.

و

ت ﺒ و ﺒو ةﺒﺜ ﺒ ى ﺒ ن ﺚ ﺒ ﺒ أ ض ُ


، ﺒ ﺒ ﺒﺚ و ﺒو ة ﺒ و ، ﺜ و ﺜ ﺒ ﺚﺜﺒ نﺐ

ى أ ﺨ ةﺜ ﺒ ﺒ ﺚﺜﺒ ﺒ ﺒ .

نآ ﺒ ة ﺒ ﺤو نﺐ

http://corpus.quran.com/

ﺜ ﺒ ﺚﺜﺒ ﺒ

:

تﺒوﺚأ ،ﺜﺒ ﺒ ت ﺒ ﺟ ﺒ ، ﺒ ﺒ

و ، ﺒ ﺒ ﺒ وﺜﺒو ﺒ ﺒ ﺛ

.

1

.

أ ﺒو ت و ﺒ ﺌ ﺒ ﺒ ت ﺒ ب ﺒ ﺒ ت ﺒ نﺐ

ﺒ ﺒو ﺒ ﺒ ، ﺒ ﺒ ، ﺒ ﺒ ﺒ ، ﺒ ﺒ، ﺒ ةﺜو ﺒ

و ﺠ ﺒ

ت ﺒ ﺛ .

ة ت و ﺒ ﺖ ﺒ ًﺒ ًﺒ ﺒ ﺒ ﺒ ﺒ ﺜو و

ة ت

.

أ ﺒو ﺒ ﺒت ﺒ و ﺒ ﺒ ﺠ و وﺜو ﺒت ﺒ نأ ﺒو

ت ﺒ

ﺚ أ ﺨ ﺚ و و ﺛو ل و أ ﺒ ﺒ ﺒ ل ً

ل ﺒﺒ ت و

.

ﺒ ﺒو ﺒ ت ﺒ ل ﺒ ﺒنأ ﺐ، ﺒﺚ و ﺜ ﺒ أ ﺒ ﺒو

وﺜو ﺒ ت ﺒ ﺒ أ و ﺒ أو

.

ﺐ ز ﺒ و

.

ﺒ ً ﺒ ﺒ ً ة ﺒ ﺒو تﺒوﺚ ﺒ نأ ﺚﺜأ ﺌ يﺛ ﺑﺚ و

ﺠ ﺒ ض تﺒوﺚأ ﺒ

(concordance)

م ﺒ ﺌﺒ أ و ﺒ ﺒو

(part of speech)

ﺒ ةﺌﺒ ﺒ و ﺒ ض تﺒوﺚأو

.

ﺒﺜ ﺒ ل و

[Atwell et al 04]

ﺒ و تﺒوﺚ ﺒ ﺜ أ

ت ﺒ ًﺌ و ﺒﺜ ﺒ م وى ﺒ ت ة

ﺨ ﺒو ﺒ ﺒ ة ﺛ ﺜ ﺒ و ﺧ ةﺚ ﺒ ﺒ ﺨ ﺒ نأ ﺐ ﺒ

ً ﺛًﺒﺜ ة ﺒ ز ﺒ ﺒ

(gold standard)

ﺒ ﺒﺨ ﺒﺌ

ﺜ ﺒ ﺒ ﺒ أو

.

ُـ نأ ﺒو ﺴ

ُوًﺒ وً ً ﺒ ﺜ ﺒ ﺒ ً ﺛ ﺒ نآ ﺒ

ﺴ ﺷ ً

.

نآ ﺒو ﺒ ﺒ ﺒ ز ة ﺖ أ ﺒ ﺒ و

.

ﺧ ﺜ ﺒ و

ﺒ تﺒ ﺒ ﺌ ﺒ يؤ ﺒو ﺧﺒ ﺒ ض أ ،ﺖ ﺒ ﺗﺛ ، ﺒ ﺌ

ﺒ ﺒ ت و ت نو ﺒ ﺒ ول ﺒﺒ

.

2

.

ر

2.1

ضﺒ ﺒ

تﺒوﺚأو

ة ﺛ

ﺒ ﺒ

(corpus)

ﺒﺜﺚو ةﺚﺒ ﺒ ﺒ و ﺒ ﺒ ت ﺒ ًﺒزﺜ ًﺒﺜوﺚ .

ُﺒ ﺒ ﺒو ﺒ ل ً

ﺴ ﺷ ُﺒو ﺴـ ﺴ ﺷ ﺌﺒ ﺒ ﺒ و

ﺤ ﺒو ﺨ ﺒ ًﺒﺚ ﺒ ﺧ ﺒو ﺒ

.

ت ﺒ ًﺌ و

ﺒ ﺚﺒ ﺐ

ت ﺒ

.

ﺒﺜﺚ وأ ل ﺒ

[van Mol 00]

ي ﺜ ﺐ ة ﺛ أ

240

ت ﺒﺛﺐ أ

ةﺜ م ﺒ ﺒﺛ ﺒ نأ ﺚ و ﺒ و

"

"

ةﺜ أ "

"

يو ﺜ ﺒم ﺚ ﺒ ﺒﺛ ﺒ أ، ﺒ ﺒ ﺒﺛ ﺒ ﺒ


ﺒ ﺒ ﺌﺒ ﺐ ﺒ ﺒ ﺒ ت ﺒ و

تﺒﺚ ﺒ

و

ﺒ ت ﺒ ﺒ ﺧ ﺒ

.

ﺒﺜﺚ وأ ل ﺒ

[Ghazali & Braham 00]

نأ و ن ي ة ﺛ ًﺌ

"

ﺴأ ﺴ ﺴ

"

و ﺜ ﺒ ﺒ ن :

ﺜ ﺒ ﺒ و ﺒ ﺤو ﺒ .

ﺒ ﺒ ﺒ و 11

%

ﺒ ﺒ ﺒ ﺒﺒ ﺚ وم ﺒ ت ﺒ ﺒ

.

ت ﺒ تﺒوﺚأﺌﺒ ﺐ أ ﺒ أً أ أ ي ﺒ ﺒ أ

و ﺒ ﺒو ﺒ ﺒو ﺒ

ت ﺒ ﺛ

.

ت ﺒ ﺒ و

ُ ﺒ ﺒ نأ ﺐ ﺒً و

ﺷ ﺒةو ﺒ ً .

ﺒ ت ﺒو ﺒ ﺒ ً ﺐو

-ُ ن ﺛ ﺐ أ

ً ة ﺒ ﺒ

ُـ ﺚ ﺐو ﺐ ﺚﺒ و ﺒ ﺠ ﺒ ة ﺛ ز

ﺴ ﺷ ض

ﺒ ﺒ ﺒ ﺒ ﺒ و ت ﺒ

.

ﺌ ﺐ ﺒ

"

ة ﺛ

(Corpus of Contemporary Arabic) [Al-Sulaiti and Atwell 06]

ي ﺒو

843

ةﺚ ﺒ و ﺒ ﺚﺒ ﺒ و ﺚﺒ و ة ﺜﺚ ةﺚ أ

:

و ت ، ﺜ، ﺒو ﺒ، ﺚﺒ ، ﺒﺛ ،ﺚ ﺐ، ﺒ،ة

.

ُ ة ﺒ ضﺒ ﺒ ﺒ نأ و أ ة ﺒ ﺚﺒ ضﺒ ﺒو ﺒو ﺚﺒ و

ﺷ ﺴ

ﺒ ﺠ ﺒ ﺒ و ﺒ ﺠ ﺒ

ﺒ ﺒﺌﺒ ﺛ ﺐ أ،

و ﺒ ت ﺒ

ﺴ ﺛ ،ﺜ

ي ﺒ ﺒ ﺒ ﺤ

(aConCorde) [Roberts et al. 06]

و ة ﺒ ت ﺒ ﺌ ﺐ ﺜ ﺠ ض ي ﺒو

ﺒ ﺒ

ﺒﺒ ﺗﺛ

.
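The concordance display that a tool such as aConCorde provides can be illustrated with a minimal keyword-in-context (KWIC) sketch. This is not aConCorde's code: the function name `kwic`, the whitespace tokenisation, the `width` parameter and the English sample sentence are all assumptions made for the illustration.

```python
from typing import List, Tuple


def kwic(tokens: List[str], keyword: str, width: int = 3) -> List[Tuple[str, str, str]]:
    """Return (left context, keyword, right context) for every exact match."""
    rows = []
    for i, tok in enumerate(tokens):
        if tok == keyword:
            # Take up to `width` tokens on each side of the hit.
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            rows.append((left, tok, right))
    return rows


# Hypothetical sample text; a real concordancer would read a corpus file.
sample = "the corpus is large and the corpus is annotated"
for left, kw, right in kwic(sample.split(), "corpus", width=2):
    # Right-align the left context so the keyword lines up in a column.
    print(f"{left:>12} [{kw}] {right}")
```

A concordancer for Arabic would additionally have to handle right-to-left display and match on roots or stems rather than exact strings, which this sketch does not attempt.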

1

:

ض

ﺒو


2.2

ر

ﺒ ﺒ ﺒو ﺒ

(Part-of-Speech tagging and morphological analysis)

ة ت

.

و ﺒ ﺒ ت

.

و نﺐ ﺒ ﺤ

ز ﺒ ﺒ ِ ﺜ ﺲﺚﺒ ﺒ ﺒ ﺒ ﺠ ﺒ ﺒو ﺒ

.

ﺒ ﺒ ﺧ ﺒو

ل أو ﺤﺒ أو وأ ت ﺒ ﺠ ﺒ ﺒو ﺒ ﺒو ﺒ ﺒ ل ﺤو ﺒ

ﺒ و ﺒ ﺒﺠ ﺒ

.

ً ﺛ ًﺒﺜ مﺒ ﺒ و ،ً ةﺜ ﺒ ﺒ ت ﺒ ﺒﺜ ﺒ ﺒو ﺒ ﺒ ﺤو أ

(Gold Standard for Evaluation)

و ، ﺒ ت ﺒ ﺒ

،ب ﺒ ة ﺒ ت ﺜ نأ ُ ، ﺜ ﺒ ﺌ ﺒ ﺒﺜ ﺒ

ﺒ ﺒ ة ﺌ و ﺒو ﺒ ﺒ ﺒ ت ﺒ ﺚ ﺒ

(Part-of-speech tagging and parsing)

ت ﺒ ﺜ ﺧ ﺒ ﺒ ﺒ ﺌ ﺒ ن ،

ة ﺒ ) Sawalha & Atwell 2008 .(

ﺒ ﺒ ﺒ ﺒ ت ﺒ أ ﺒ ﺒ ﺠ

(Sawalha & Atwell 2009b; Sawalha & Atwell 2009a)

ﺒو ﺜ ً أ و ،

ﺒ (Prior-knowledge Broad-Coverage Lexical Resource)

ﺒ ﺒ ت

و ، ﺒ ﺒ ﺒ ﺒ ﺒ ت ﺒ ﺗﺒ ﺒ ل ﺒ ﺜ ﺒ ﺌ ، ﺒ ﺒو

ﺒ ﺒو ﺳ ﺒو ﺳﺜ ت ﺒ ُ ) Sawalha & Atwell 2010 a .(

ﺚ و زو و و ﺒ ﺜ و ، ﺒ ﺒت ﺒ ﺷ ﺒ ﺒ ت

ﺒ ﺌﺒ أ ﺒ

)

ﺒ ﺒ ﺒو ﺒو ﺒو ﺒ ﺒ

و

(

ﺒ نﺒ ﺒ ﺒ ﺌﺒ أ ﺳﺌ ﺚ و ،

ﺒ ﺚ ﺒ و نﺒ ﺒ ﺌﺒ ﺒ و ﺒ ُو ،ﺌ ﺒ ﺒ

(Sawalha & Atwell 2010b).

ﺒ ﺒ ﺒ

ﺒ ﺒ ﺒ و ﺒ مﺒ

1

(Morphological Features Tag Set for Arabic)

ﺒ ﺒ ﺒ ﺒ ﺒ ﺒ ﺒو ،

ﺒ نﺒ ﺒ ن ، و ﺒو ﺒ

(Tag)

ًﺒ ﺜ و ﺐ ن ﺜ

ﺜ ،ً

، ﺒ ً نﺒ ﺒ ﺒ و ، ﺒ وأ ﺒ ﺒ ى ﺐ ﺐ وأ

ﺒ ،ة ﺒ ﺒ ﺒﺧو ﺒو تﺒ ﺒوأ ﺒ و

(v)

نﺒ ﺒ لو ﺒ ﺒ

ﺒو ، ﺒ

(n)

ﺧ ﺒ نﺒ ﺒ ﺒ ﺒ ُ ﺒ ُ ﺴُو ، ﺒ ﺒ ﺐ ﺒ ﺒ

(m)

ﺧ ﺒو ﺒ

(f)

ﺒ ﺒ ﺒ ﺒ ﺒﺛﺐو ، ﺒ ﺒ ) -) ( ﺒ ( م ، ﺒ ) ﺋ ) ( لﺒ ﺒ (

ةﺚ و ﺒ ﺒ ﺒ نأ ﺐ

.


2.3

ت

ر

ﺒ ت ﺒ

(discourse relations)

ﺒ و ب ﺌ زﺜ ﺜوﺚ

.

ةﺜ نأ ً "

ةﺜ ﺒ

ةﺜ

ﺒ وًﺒ

"

تﺚ

(Contrast)

(argument)

ﺒ ﺜ

" ."

ﺒ ت ﺒ ى أ ت ك ﺒ و

(Causal)

(Exemplification)

(Conditional)

(Background)

(Temporal)

و

.

ة ﺒ ﺒ ت ة ﺛ لوأ و

[Al-Saif and Markert 10]

ﺛو

ﺒ ﺒو ﺒ و ﺒ ﺒ ﺌﺒ ﺒ ت ﺒ

.

آ أ ة ﺒ و

يﺜ ﺐ أ

.

ﺒ ل ﺒ

(similarity)

ب ﺒ

(arg1 and arg2)

ةﺒﺚ و

(discourse connective DC) "

."

2

:

ﺒ ﺒ

ةﺒﺚأو

ةﺜ

و

يو ﺒ ةﺒﺚأ ة ﺒ ﺚﺒ ﺐ و

ﺴُوض ﺒﺒ ً ﺷ

ﺜ ﺒو ﺒ ﺒ

.

ﺒ ﺒ ل ﺒ ﺒو

.

3

:

ت

يو ﺒ

ﺒو

ﺒﺌﺒ ل ﺒ نأ و ﺒ و ﺒ ة ﺒ نﺐ

ة ت ﺒ ﺒ

.


3

.

نآ ﺒ

ر

أ ب أ ة ﺛو ﺒ ﺒ نآ ﺒ أ ﺜﺚأ

:

و ،ً و ﺘ نآ ﺒ ن

ُ ﺴ ﺷ

ل وﺛو

.

ﺒ ن ﺛ ﺐ أ

ت ً ً ﺒ و لوﺒ و ﺜ نآ

.

ﺒ أ و ﺒو ﺒ نآ ﺒ ﺒو ﺒ ت ﺒ ت ﺚ و و

أ ﺒو تﺒﺚ ﺒ و

.

ﺒ ﺒ وأ كﺜ ﺒ ﺒ و

ﺒ ﺒ نآ ﺒ .

م ﺚ ب نآ ﺒ ن

ﺒ ﺒ و ﺒ ﺒ ﺒ

.

ن نآ ﺒ ن و

80 أ

ﺛ ﺒ ﺒ ﺒ ﺒ ت ﺒ لﺒوﺚ

و ،ﺨ ن تﺒﺛ

ﺒ ﺖﺒ ﺒ و ي ﺒ ﺒ ى أ ﺠ ﺨ نآ ﺒ

.

ى أ ت ً ﺛًﺒﺜ نآ ﺒ

.

ﺒ ﺜ ﺒوﺖ ﺒ و

ﺜ ﺒوة

ﺒنآ ﺒ ز

.

3.1

"

آ

"

آ ﺒ

ﺒ ﺒ

ﺒ نآ ﺒ ﺤ ﺒ ﺒ ﺒ ن

.

ﺒ ﺚ و

ة ﺒ ﺒ ﺤ ﺜﺒ

. آ ﺒ ﺜ ﺒ ةﺜ و

ﺞ ن ﺜ ﺒ ي ﺒ ﺒ ﺤو ز

و لوﺒ ﺐ ت

"

" [Mushaf at-Tajweed 1420H]

نآ ﺒ ﺒ ﺒ ﺚ ﺒو ﺒ ﺒ ﺒ ﺒ ﺒ و

.

ﺒ ﺒو

ت ﺒ ى ﺗﺛ

.

4

:

ة

"

آ

"

آ ﺒ

ﺒ ﺒ


3.2

ﺒو

ةرو ﺒ

ت

ﺒةﺜو ﺒ ﺒ

(dialogue systems)

ى ب ي ﺒو آ ﺐﺌ ﺐ ض

ﺒ ﺜ

ﺴ و ﺜﺒ ﺒ ﺜﺒ ﺒو

ﺷن ﺴﺜو ﺒ ﺸ ﺴ ﺴ . ﺴ ﺷ ﺌﺒ ًﺒ ًﺒ ﺒ ﺒ ﺤ ﺒ ﺒ

ﺒ ﺢ أ ﺚ ةﺚ ﺒل ﺒ و ﺒﺌ ﺒ

(Question patterns) وﺒ ﺒ أو . ﺒ و

نآ ﺒ أﺠ ﺒ ﺚ ﺒ و ﺒنآ ﺒ ل .

و

ﺚﺒ ﺐ [Shawar and Atwell 04]

أ، ﺒ ت ﺒ ي آ ت آ ﺜﺒ ﺑ

أ ﺒ ﺒ آ آ ﺒن ﺒﺛﺐ

لﺒ آ آ ن ﺒ أ

آ ﺚﺜﺒو .

ﺒ ﺒ ﺒ

.
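The cited chatbot [Shawar and Atwell 04] answers user input with passages from the Quran. A rough sketch of keyword-based retrieval of that general kind follows. It is not the authors' system: the passage identifiers and texts are placeholders, and the rule "pick the question word that occurs in the fewest passages" is an assumption chosen only to make the example concrete.

```python
import re
from typing import Dict, List, Optional


def build_index(passages: Dict[str, str]) -> Dict[str, List[str]]:
    """Map each word to the ids of the passages that contain it."""
    index: Dict[str, List[str]] = {}
    for ref, text in passages.items():
        for word in set(re.findall(r"\w+", text.lower())):
            index.setdefault(word, []).append(ref)
    return index


def answer(question: str, passages: Dict[str, str],
           index: Dict[str, List[str]]) -> Optional[str]:
    """Reply with a passage that shares the rarest known word of the question."""
    words = re.findall(r"\w+", question.lower())
    known = [w for w in words if w in index]
    if not known:
        return None  # nothing in the question matches any passage
    # Assumed heuristic: the word found in the fewest passages is the most specific.
    key = min(known, key=lambda w: (len(index[w]), words.index(w)))
    ref = sorted(index[key])[0]
    return f"{ref}: {passages[ref]}"


# Placeholder passages, not real verse text.
toy = {"p1": "patience is rewarded", "p2": "charity is rewarded"}
toy_index = build_index(toy)
print(answer("what about charity", toy, toy_index))
```

Arabic input with diacritics would need normalisation before matching, since combining marks are not matched by `\w`.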

5

:

آ

ﺜﺒ

ل

ﺒو ﺒ ت ﺒ ة ت ب ﺒ ﺒ ﺌﺒ ﺐ ﺒ ةﺌ أ ل ﺒ أ

ﺒ ﺛ ﺒ

.

ت ﺐ ي ﺒو ﺒ آ ﺒ ة ﺒ نآ

آ ت و ﺒﺚ

ةﺌ أﺜﺒ ﺒ ن ﺒ ة

.

ً وأ ن نأ ﺚﺜأ ﺛ أ و

ة ت ﺌ ﺒ ﺒنآ ﺒ ﺒوي ﺒ ﺒ ت ﺒ

.

3.3

نآ ﺒ

ل

تﺒر إ

ﺒ تﺒﺜ ﺒ

(Semantic Frames)

ﺒ وأ ﺒ نﺒ ﺒ ﺒ ﺒ ﺜﺒوﺚ ﺒ

. ً " " ُـ نأ ﺴ ﺷ ﺒ ﺜﺒوﺚ ﺒ ةﺚﺜﺒ ﺒ ةﺜ ﺒ "

ب ﺒ ﺒ

" و " ﺒ " و "

ﺒ ﺠ ﺒ

"

ﺛ و

.

ﺜ ﺒ ﺒ و

[Fillmore 76] ﺒ ن ﺒ ت ﺒ

ﺤو أ

FrameNet [Ruppenhofer et al 2005]

ت ة ي ﺒو

تﺒﺜ ﺒ ﺒ ت ﺒ ز و ﺒ تﺒﺜ ﺒ ي ﺒ

.

ﺒ ﺒ ً

ً ﺚﺜ

"

ﺒﺌ ﺐ

" )

text creation

(


6

:

ﺜﺒوﺚ ﺒ

ﺜ ﺐ

ﺒ ن و

900

ل ﺐ ﺚ ﺜ ﺐ

10

و ﺧ آ

و ﺐ

.

ﺨ ﺐ و

ى أ ﺨ م ،ة ت تﺒﺜ ﺒ ﺒ ﺒو ﺒ ﺒ ﺌﺒ ﺒ ﺒ ﺤو ﺒ

و ﺒو ﺒو ﺒ ى أت ﺜ

.

ﺌﺒ ﺐ ﺐ ﺒﺜﺚ ﺌﺒ و

ﺒ نآ ﺒل ت ة

[Sharaf & Atwell 09]

تﺒﺜ ﺐ كﺒ ﺐك ن نﺐو أ وو FrameNet

ت ﺒ ﺒ ن ﺒ آ ﺒ ت ﺒ نأ ﺐ

ة

.

ﺒ نأ ً "

"

نآ ﺒ

"

ﺴ ﺜﺷ

"

ت ة ﺚﺜﺒو FrameNet

"

ن ﺒ

."

ﺴ ﺒ و ﺸ ﺴ ﺸ

م ت ﺒ ﺌﺒ و ﺌﺒ ﺤو

ﺒ نآ ﺚ تﺒﺜ ﺐ ﺛ ًﺌ و آ ﺒ ت ﺒ ﺒ ﺒ و آ ﺒ تﺒﺚ ﺒ ﺒﺜ

.

م آ ﺚﺜ ﺐ و ﺒ ﺒن ﺒﺒ و

"

"

آ ﺒ ل

.

7

:

م

آ

ﺜ ﺐ

و


3.4

"

"

ﺒو

ر ﺒ

"

ﺒ ﺒ

" (Machine Learning)

ﺚﺒ ﺒ ﺗﺛ ﺒ ت ﺒ

ﺷ ُ ﺒ ﺒ

ﺒ ﺒ لﺒوﺚ ﺒ

" "

ﺠﺒ ﺒ ﺗﺛ ﺒ ﺒ

ً آ

ة ﺗﺛ

.

تﺒوﺚأ ك و

بﺜ ﺒ ﺌﺒ ة WEKA

ﺒ ﺜ ﺒ ﺘ

.

ﺚ لو ﺚﺒ ﺒو ﺌ ﺒ ت ﺒو ﺒ ﺜ ﺒ ﺤ و

ﺒ ﺌﺒزﺐةﺜ

.

ﺐت ﺒ ل ﺚ ﺛ و

Weka

ﺒ ﺒ لﺒوﺚ و .

ﺒ ﺒ ﺜ ﺒ ً

ﺜ ﺒو ﺒو ﺒ م لﺒ أو ة ﺒ ﺛ ﺜ .

ًﺌ و

ﺒ ﺒ ﺒ ت ﺒو تﺒ ﺒ

)

بﺒ و ، ، ، ،ﺜ ﺒ ، ﺒ

(

ﺜﺒ ﺚ ﺚﺒ و

ةﺜ و ت ﺒ .

ي ﺚ ﺒ و

13

ﺒو ﺒ ﺜ

.

و

Weka

ً ﺒت ﺒ ة ت ﺜ ﺌ

8

ﺒ ﺠﺒ ﺒ ﺐ

ﺒﺜ ﺒ ز )

ﺨﺜز ﺒ ن ﺒ

(

ﺒﺜ ﺒو

)

ﺒ ن ﺒ

.(

ﺜ ت

أوأ

.

تﺒﺜﺒ ة ﺗﺒ ﺒ لو و ﺒ ت ﺒ ﺠﺒ ﺒ ي ﺒ ﺒ ﺒ لﺒوﺚ ًﺌ ﺒ ﺒ

(decision tree)

9

.

ﺒ ﺜ ﺒ ز ﺒ تﺒﺜﺒ ﺒة ﺒ

(K)

ﺒو

(D).

ةﺜ ةﺜ ﺒ ﺒﺛ ً "

ﺒ آ ﺒ أ

"

ت ﺒ ﺒ تﺒ ﺒ و

ةﺜ و ﺒ

.

بﺜ ﺒ أ ﺒ و

93

لﺒو ﺒ ى ﺒ ،ﺌ ﺒ ﺒ ﺒ ﺜ ﺒ

ـ ﺒﺜ ﺒ

21

ﺜ ﺧ ﺒ وو ﺒ ﺜ ﺒ ﺛ ﺜ و ﺒ

ـ ﺒﺜ ﺒ

21

.
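The decision-tree experiment above was run in WEKA. As a stand-in for readers without WEKA, the sketch below shows the core step of an ID3-style tree learner: computing the information gain of each feature and choosing the best first split. It is plain Python, not WEKA's implementation. The boolean features `f1` and `f2` and the four training rows are invented for the illustration; the class labels `K` and `D` simply reuse the two symbols that appear in the text.

```python
import math
from collections import Counter
from typing import Dict, List, Tuple


def entropy(labels: List[str]) -> float:
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def best_split(rows: List[Dict[str, bool]], labels: List[str]) -> Tuple[str, float]:
    """Return the boolean feature with the highest information gain."""
    base = entropy(labels)
    best = ("", 0.0)
    for feat in rows[0]:
        yes = [lab for row, lab in zip(rows, labels) if row[feat]]
        no = [lab for row, lab in zip(rows, labels) if not row[feat]]
        # Weighted entropy of the two branches after splitting on `feat`.
        remainder = sum(len(part) / len(labels) * entropy(part)
                        for part in (yes, no) if part)
        gain = base - remainder
        if gain > best[1]:
            best = (feat, gain)
    return best


# Invented toy data: four items, two boolean features, two classes.
rows = [
    {"f1": True, "f2": True},
    {"f1": True, "f2": False},
    {"f1": False, "f2": True},
    {"f1": False, "f2": False},
]
labels = ["K", "K", "D", "D"]
print(best_split(rows, labels))
```

In this toy set `f1` separates the two classes perfectly while `f2` carries no information, so the split falls on `f1`. A full tree learner repeats this choice recursively on each branch.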

ت

(clustering)

نأ ﺒ لﺒوﺚ ت ﺒً ﺠﺒ ﺒ ًﺌ ً آ ﺜ ﺒ

ﺒو ة ﺒﺜ

ﺴ نﺒ ﺧﺒ

ﺷ ة ﺒو

.

ﺒ و ﺜ

10

ت ﺒ


8

:

ة

ت ﺜ

weka

ﺜ ﺒ

:

ﺒو

ن

ﺜ ﺒ

ن

9

:

تﺒﺜﺒ

ة

(decision tree)


10

:

ت

آ ﺒ

ﺜ ﺒ

(clustering)

ﺜ ﺒ

و

ةﺜ ﺒ

ﺒو

(K)

ﺒو

(D)

آ ﺒ ﺒ ت ﺜﺒو ﺧ ﺒ ﺒ ﺒت ﺐ بﺜ ﺒ نﺐ

.

3.5

آ ﺒ

ة

(Quranic Arabic Corpus)

ز

"

ﺒ آ ﺒة ﺒ

"

أو ة ﺒ ن و ﺒ ﺒ

ﺒ ﺒ ﺒو ﺒ نآ ﺒ ﺒ ﺒ ﺒ

نآ ﺒ

.

أ أ ﺒﺚ و

ً ﺒز

.

ﺒ نآ ﺒ و ة ﺒ

.

آ ً ة ﺒ و

ﺒ ﺒ

(Buckwalter Arabic Morphological Analyzer)

ﺌﺒ ﺐ ﺛ يو

ﺒ [Dukes and Habash 10].


م ﺒ ﺜو ﺒو ﺒ ﺜﺚ ة ﺒ ﺒ ﺒ و

.

ي ﺒو ﺒ ﺗﺛ ﺒ ﺒ

آ

.

11

:

ﺚ و

و

آ

و

أ و

ﺒ تﺒ ﺒ

(dependency treebank)

ﺒ و ، آ ﺒ ت ﺒ بﺒ ﺐ

ﺒ و ن ﺛ ي ﺒ بﺒ ﺒ

[Dukes and Buckwalter 10] [Dukes et al. 10].

آ آبﺒ ﺗﺛ ﺒ ﺒ


12

:

ة

(dependency treebank)

آ

آ

بﺒ ﺐ

ﺒ بﺜ ﺒ ﺚﺜ أ و ، ﺒ و ﺜ ﺒ ة ﺒ ز

ة ﺒ ﺒ ت ﺒ ﺒو ﺒو ت ﺒ ﺒ ز و ، ﺒﺌ أ

.

3.6

آ ﺒ

ت

ة

ة

ﺒ نآ ﺒ م ﺒ نأ

30

%

ت ﺒ ﺤ

.

و نأ ًﺒ

ت ﺒ ﺒﺚ ًﺒ ت ﺒ ل ﺒ ﺒ و ﺒ نآ ﺒ

.

ﺒ ﺒأ ً

)

ﺜ ﺒ أ ﺐ (

و ﺒنآ ﺚ ﺒنأ

)

ﺜ أ ﺐ .(

ًﺌ و

ﺴ ﺤو أ ﺒ

ﺸـ ﺴ

ﺛ آ ﺒ ﺒ وونآ ﺒ ةزﺜ ﺒ ﺒ

ة ﺒ ﺒ

.

ُو ة ت ﺒ ﺒ ن نأ و أ ﺐ ﺚ ﺒ ﺒ و

ِ

نآ ﺒ ت ﺠ ﺒ

ة ﺒ نو ﺐ ﺒ

.

ﺒ نآ ﺒ ﺒ و .

و ﺒ

ﺒ ة ﺒ آ آ ﺗﺛ ﺒ

.

13

:

نآ ﺒ

ةزﺜ ﺒ

ة ﺒ

ت ﺒ

ﺗﺛ

3.7

ة

ت

ت

ل ى أ آ آ ﺒ ﺒو ﺜ ﺒو ﺒو ﺣ ﺒ نآ ﺒت آ نأ

)

ً ً ﺒ أل ي ﺒ

.(


.

ﺒ ﺒ ت ﺒ ﺠ وأ أ و

آ ﺜ و ﺐ

ﺒ ﺒ ت ﺒ ﺒ

ﺒنآ ﺒ

.

ﺒ ﺒت ﺒ ﺗﺛ ﺒ ﺒو .

14

:

ﺒ ﺒ

ت ﺒ

ى أ ى أ ﺒ ت ﺠ ﺒ ﺒ و

.

ة ﺒ ﺒ ت ة ﺒ و

ة ﺒ آ ﺒ ﺒ ﺒ ًﺌ ت

.

ة ﺒ و

ﺒ ﺒ ﺧ ﺒ ﺒ ﺒﺖ أ ًﺒ ﺒﺜًﺒﺜوﺚ

ً آ

.

ً آت ﺜ ﺒﺧ ﺒ ﺒ ﺒلﺒوﺚ ة نأ ﺒ ﺧ و

ﺒنآ ﺒ

.

4

.

ن ﺚ

ﺒ ﺒ

ى

ﺤو

ء إ

ﺐ 2010

ﺒت ﺒ ﺟ ً ًﺒ ﺚﺐ ﺒ

(Grand Challenges in Computing Research for 2010 and beyond).

ن ﺘﺒ ﺒ ﺒ ﺜ ن و "

نآ ﺒ

ب ﺒ ﺒ ﺠ ﺒو

"

ب ﺒ ﺌ ﺚ ﺒ ت ﺒ

[Atwell et al 2010].

ﺘﺒ ﺒ ﺒ ن و

ﺒ ن

ت ﺒو ﺚﺒ ﺒ ة ﺒ ﺠ

.

ﺒ ﺒ ﺒ ﺠ ﺒ ن و

ﺠ ﺒ م تﺒوﺚأوت و ة و ة و نآ ﺒو ﺒوةﺒﺜ ﺒ

.

ﺌﺒ ﺘ ً ﺒو

أ نآ ﺒو ﺒﺠ ﺒ ة ﺒﺜ ﺖ أو بﺜ

ى أ ﺨ ﺜ و ت م ﺤو ﺜ ةﺌ ﺒو ةﺜ ﺒ

.

نأز

ﺒ ﺒ ﺒ ﺒ ﺒ

ل تﺒوﺚأو ت ﺜ ﺧ و

.

ت ﺒ ة ت ﺛﺌ ﺐ ةﺚ و ةﺌ تﺒﺛ ت أ و

و ﺒو ةﺒﺜ ﺒو ﺒو نآ ﺒ ب ﺒ ﺒ ﺒ ﺒو ﺒ

.

ﺨ ﺒ و

ﺠ ﺒ ﺒ ة ﺒ ة ﺒ وﺠ ﺒ ﺌﺒ

.

ﺒ و نآ ﺒى ﺒ ﺒ ة ﺚﺜﺒ و تﺒوﺚأ ﺒ ﺤو ﺒ ﺒ

.

تﺒوﺚ ﺒ و

ة ﺒ ﺒ ﺒ ﺠ ﺚﺜﺒ ﺒو

.

ﺒ ﺒ ة ﺒ ﺚﺜﺒ و

WordNet

و ،تﺒﺚ ﺒ ت ﺒ FrameNet

و ﺚ تﺒﺜ ﺐ

ﺤو

PropBank .

تﺒ أ

(Treebank)

ﺠ ﺒ بﺒ ﺒو ﺒ

ﺒ نآ ﺒ بﺒ ﺐ أ ﺜﺒ .

ﺌ ﺐ و


(Ontology)

ﺠ ﺒ

.

ًﺒﺜوﺚ و ة ﺒ ﺠ ﺒ ت ﺒوﺜ ﺚﺜﺒ ﺒ و

ًﺒ ﺒﺜ

ﺒ ﺒو ﺒت

(Text Mining).

5

.

ﺜ ﺒ

ة

ﺒ نآ ﺒو ﺒ ﺒ ز ﺒ ﺜ ﺒ

.

ل

ﺒ ﺒ ﺒ ة ﺒ ز أ ﺒ

ةﺒﺜ ﺚو ﺖ أ ﺜ ﺒ و

و ﺐ ﺒ و ﺛو تﺒوﺚأ و

.

ب نأ ﺒ وً و ﺜ ﺒ أ و

ﺒ ﺒ نأو تﺒوﺚ ﺒو ﺒ ﺌﺒ ﺒ

ت ﺚﺜﺒ ﺒ

.

ﺒ نأ و

نو ﺒ م ﺒ ﺒ

وأ ﺨ ﺒ ت ﺒﺜ ﺒو ﺒ ﺒ ﺌ ﺒ و ﺒ ﺒ ﺌﺒ

ز

.

ﺜﺒ ﺒ آ ﺒ ة ﺒ ﺒ نﺐ

ﺘ ﺒ ﺒ ﺚ ﺐو

ﺜ ﺒ ﺒ آ

ﺚ ﺒ

:

ﺒ ﺒ ﺒ و و ﺒ ﺒ ة ﺒ ﺚﺜﺒ ﺒو تﺒوﺚ ﺒ ﺒ ﺒ و .

وأً و ﺚﺜ ﺒ

ﺌﺒ ﺒ ﺒ و ﺐ ﺒ و ،ﺤ ﺒ ﺜ ً آ

ﺚﺜ ﺒ ﺜﺒو ﺒو

.

ﺒ ﺒ ﺒ ﺒ ﺜ ﺒ ﺜ ﺒ ﺒ ة ﺒ أكﺜ ﺐ

.

ﺘ ﺜ نأ و

ﺐو ﺒ ﺒ ﺒ ﺒ ﺚ ﺒو ﺒ ﺒ أ

ﺒﺚأو ﺛ

ﺨ ﺒ ﺒ و ﺒ ﺚ ﺒ ﺖ ﺒ

.

Abu Shawar, Bayan; Atwell, Eric. An Arabic chatbot giving answers from the Qur'an in: Bel, B & Marlien, I (editors) Proceedings of TALN04: XI Conference sur le Traitement Automatique des Langues Naturelles, Volume 2, pp. 197-202 ATALA. 2004.

Al-Saif, A; Markert, K. 2010. The Leeds Arabic Discourse Treebank: Annotating discourse connectives for Arabic. In: Proc. of the conference on Language Resources and Evaluation. Malta, 2010.

Al-Sulaiti, Latifa; Atwell, Eric. The design of a corpus of contemporary Arabic. International Journal of Corpus Linguistics, vol. 11, pp. 135-171. 2006.

Eric Atwell, Kais Dukes, Abdul-Baquee Sharaf, Nizar Habash, et al. (2010). Understanding the Quran: A new Grand Challenge for Computer Science and Artificial Intelligence. Grand Challenges for Computing Research (2010). British Computer Society Workshop. Edinburgh.

Fillmore, C. (1976). “Frame Semantics and the nature of language.” Annals of the New York Academy of Science.

Ghazali, S. & Braham, A. (2001). Dictionary Definitions and Corpus-Based Evidence in Modern Standard Arabic. Arabic NLP Workshop at ACL/EACL. Toulouse, France.


Kais Dukes, Eric Atwell and Abdul-Baquee M. Sharaf. Syntactic Annotation Guidelines for the Quranic Arabic Treebank. The seventh international conference on Language Resources and Evaluation (LREC-2010). Valletta, Malta, 2010

Kais Dukes and Tim Buckwalter. A Dependency Treebank of the Quran using Traditional Arabic Grammar. Submitted to the 7th international conference on Informatics and Systems. Cairo, Egypt, 2010

Mushaf at-Tajweed 1420H.

Roberts, Andrew; Al-Sulaiti, Latifa; Atwell, Eric. aConCorde: Towards an open-source, extendable concordancer for Arabic. Corpora journal, vol. 1, pp. 39-57. 2006.

Ruppenhofer, J., M. Ellsworth, M. Petruck, and C. Johnson (2005). “FrameNet: Theory and Practice.”

Sawalha, Majdi and Atwell, Eric (2008). Comparative evaluation of Arabic language morphological analysers and stemmers. Proceedings of COLING 2008 22nd International Conference on Computational Linguistics.

Sawalha, Majdi and Atwell, Eric (2009a). Linguistically Informed and Corpus Informed Morphological Analysis of Arabic. Proceedings of the 5th International Corpus Linguistics Conference CL2009 Liverpool, UK.

Sawalha, Majdi and Atwell, Eric (2009b). (Adapting Language Grammar Rules for Building Morphological Analyzer for Arabic Language). Proceedings of the workshop of morphological analyzer experts for Arabic language, organized by Arab League Educational, Cultural and Scientific Organization (ALECSO), King Abdul-Aziz City of Technology (KACT) and Arabic Language Academy. Damascus, Syria.

Sawalha, Majdi and Atwell, Eric (2010a). Constructing and Using Broad-Coverage Lexical Resource for Enhancing Morphological Analysis of Arabic. Language Resource and Evaluation Conference LREC 2010 Valletta, Malta.

Sawalha, Majdi and Atwell, Eric (2010b). Fine-Grain Morphological Analyzer and Part-of-Speech Tagger for Arabic Text. Language Resource and Evaluation Conference LREC 2010 Valletta, Malta.

Sharaf, A. and Atwell, E. (2009). A Corpus-based computational model for knowledge representation of the Qur'an. 5th Corpus Linguistics Conference, Liverpool.
