Proceedings of the Third SIGHAN Workshop on Chinese Language Processing

Proceedings of the

Third SIGHAN Workshop

on Chinese Language Processing

Held in cooperation with ACL-2004

Order copies of this and other ACL proceedings from:

Association for Computational Linguistics (ACL)

73 Landmark Center

East Stroudsburg, PA 18301

USA

ORGANIZERS:

Chairs: Qin Lu & Oliver Streiter

Proceedings: Qin Lu & Oliver Streiter

PROGRAM COMMITTEE:

Andi Wu - Microsoft, USA

Changning Huang - Microsoft, China

Chu-ren Huang - Academia Sinica, Taiwan

Joyce Chai - Michigan State Univ, USA

Keh-Jiann Chen - Academia Sinica, Taiwan

Li Wenjie - The Hong Kong Polytechnic University, Hong Kong

Martha Palmer - Univ. of Pennsylvania, USA

Nianwen Xue - Univ. of Pennsylvania, USA

Oliver Streiter - EURAC, Italy

Qiang Zhou - Tsinghua University, China

Qing Ma - Ryukoku University, Japan

Qin Lu - The Hong Kong Polytechnic University, Hong Kong

Sui Zhifang - Peking University, China

Sun Maosong - Tsinghua University, China

Tom Emerson - Basis Technology Corp., USA

FURTHER INFORMATION:

Dr. Qin Lu

Department of Computing,

The Hong Kong Polytechnic University,

Hung Hom, Kowloon,

Technical Program Schedule

Sunday, July 25

8:45-8:50

Welcome

8:50-9:10

Segmentation of Chinese Long Sentences Using Commas

Meixun Jin, Mi-Young Kim, Dongil Kim and Jong-Hyeok Lee

9:15-9:35

A Preliminary Study on Probabilistic Models for Chinese Abbreviations

Jing-Shin Chang and Yu-Tso Lai

9:40-10:00

Document Re-ranking based on Global and Local Terms

Lingpeng Yang, Donghong Ji and Li Tang

Coffee Break

10:30-10:50

Chinese Chunking with Another Type of Spec

Hongqiao Li, Changning Huang, Jianfeng Gao and Xiaozhong Fan

10:55-11:15

Chinese Word Segmentation by Classification of Characters

Chooi-Ling GOH, Masayuki Asahara and Yuji Matsumoto

11:20-11:40

Automated Alignment and Extraction of Bilingual Domain Ontology for

Medical Domain Web Search

Jui-Feng Yeh, Chung-Hsien Wu, Ming-Jun Chen and Liang-Chih Yu

11:45-12:05

Using Synonym Relations in Chinese Collocation Extraction

Wanyin Li, Qin Lu and Ruifeng Xu

Lunch Break

13:50-14:10

Combining Prosodic and Text Features for Segmentation of Mandarin

Broadcast News

Gina-Anne Levow

14:15-14:35

Automatic Semantic Role Assignment for a Tree Structure

Jia-Ming You and Keh-Jiann Chen

14:40-15:00

A Large-Scale Semantic Structure for Chinese Sentences

Li Tang, Donghong Ji and Lingpeng Yang

15:05-15:25

Aligning Bilingual Corpora Using Sentences Location Information

Weigang Li, Ting Liu, Zhen Wang and Sheng Li

Poster Session

15:50-17:10

An Integrated Method for Chinese Unknown Word Extraction

Zhiyong Luo and Rou Song

15:50-17:10

Adaptive Compression-based Approach for Chinese Pinyin Input

JinHu Huang and David Powers

15:50-17:10

Character-Sense Association and Compounding Template Similarity:

Automatic Semantic Classification

Chao-Jan Chen

15:50-17:10

Combining Neural Networks and Statistics for Chinese Word Sense

Disambiguation

Zhimao Lu, Ting Liu and Sheng Li

15:50-17:10

A Statistical Model for Hangeul-Hanja Conversion in Terminology Domain

Jin-Xia Huang, Sun-Mee Bae and Key-sun Choi

15:50-17:10

Chinese Term Extraction from Web Pages Based on Compound Term

Productivity

Hiroshi Nakagawa, Hiroyuki Kojima and Akira Maeda

15:50-17:10

The Construction of A Chinese Shallow Treebank

Ruifeng Xu, Qin Lu, Yin Li and Wanyin Li

15:50-17:10

Do We Need Chinese Word Segmentation for Statistical Machine

Translation?

Jia Xu, Richard Zens and Hermann Ney

15:50-17:10

A New Chinese Natural Language Understanding Architecture Based on

Multilayer Search Mechanism

Wanxiang Che, Ting Liu and Sheng Li

15:50-17:10

A Semi-Supervised Approach to Build Annotated Corpus for Chinese

Named Entity Recognition

Xiaoshan Fang, Jianfeng Gao and Huanye Sheng

15:50-17:10

An Enhanced Model for Chinese Word Segmentation and Part-of-Speech

Tagging

Feng Jiang, Hui Liu, Yuquan Chen and Ruzhan Lu

SIGHAN Meeting

17:20-18:20

Organizational Meeting
