• No results found

Study of Practical Effectiveness for Machine Translation Using Recursive Chain link type Learning

N/A
N/A
Protected

Academic year: 2020

Share "Study of Practical Effectiveness for Machine Translation Using Recursive Chain link type Learning"

Copied!
7
0
0

Loading.... (view fulltext now)

Full text

(1)

Study of Practical Effectiveness for Machine Translation Using

Recursive Chain-link-type Learning

Hiroshi Echizen-ya

Dept. of Electronics and Information Hokkai-Gakuen University S 26-Jo, W 11-Chome, Chuo-ku

Sapporo, 064-0926 Japan

[email protected]

Kenji Araki

Division of Electronics and Information Hokkaido University

N 13-Jo, W 8-Chome, Kita-ku Sapporo, 060-8628 Japan

[email protected]

Yoshio Momouchi

Dept. of Electronics and Information Hokkai-Gakuen University S 26-Jo, W 11-Chome, Chuo-ku

Sapporo, 064-0926 Japan

[email protected]

Koji Tochinai

Division of Business Administration Hokkai-Gakuen University 4-Chome, Asahi-machi, Toyohira-ku

Sapporo, 060-8790 Japan

[email protected]

Abstract

A number of machine translation systems based on the learning algorithms are presented. These methods acquire translation rules from pairs of similar sentences in a bilingual text cor-pora. This means that it is difficult for the systems to acquire the translation rules from sparse data. As a result, these methods require large amounts of training data in order to ac-quire high-quality translation rules. To over-come this problem, we propose a method of ma-chine translation using a Recursive Chain-link-type Learning. In our new method, the system can acquire many new high-quality translation rules from sparse translation examples based on already acquired translation rules. Therefore, acquisition of new translation rules results in the generation of more new translation rules. Such a process of acquisition of translation rules is like a linked chain. From the results of evalua-tion experiments, we confirmed the effectiveness of Recursive Chain-link-type Learning.

1 Introduction

Rule-Based Machine Translation(MT)(Hutchins and Somers, 1992) requires large-scale knowl-edge to analyze both source language(SL) sentences and target language(TL) sentences. Moreover, it is difficult for a developer to com-pletely describe large-scale knowledge that can analyze various linguistic phenomena. There-fore, Rule-Based MT is time-consuming and expensive. Statistical MT and Example-Based

MT have been proposed to overcome the dif-ficulties of Rule-Based MT. These approaches correspond to Based approach. Corpus-Based approach uses translation examples that keep including linguistic knowledge. This means that the system can improve the quality of its translation only by adding new translation examples. However, in Statistical MT(Brown et al., 1990), large amounts of translation examples are required in order to obtain high-quality translation. Moreover, Example-Based MT(Sato and Nagao, 1990; Watanabe and Takeda, 1998; Brown, 2001; Carl, 2001) which relies on various knowledge resources results in the same difficulties as Rule-Based MT. Therefore, Example-Based MT, which automatically acquires the translation rules from only bilingual text corpora, is very effec-tive. However, existing Example-Based MT systems using the learning algorithms require large amounts of translation pairs to acquire high-quality translation rules.

(2)

SL sentences and the TL sentences. As a re-sult, many translation rules cannot be acquired. (McTait, 2001) generalizes both the different parts and the common parts as shown in Fig-ure 1(2). This means that (McTait, 2001) al-lows m:nmappings in the number of the differ-ent parts, or the number of the common parts. However, it is difficult to acquire the translation rules that correspond to the lexicon level. On the other hand, we have proposed a method of

MachineTranslation usingInductiveLearning with Genetic A lgorithms(GA-ILMT)(Echizen-ya et al., 1996). This method automatically generates the similar translation examples from only given translation examples by applying ge-netic algorithms(Goldberg, 1989) as shown in (3a) of Figure 1. Moreover, the system per-forms Inductive Learning. By using Inductive Learning, the abstract translation rules are ac-quired by performing phased extraction of dif-ferent parts as shown in Figure 1(3b) and (3c). In all methods shown in Figure 1, the condi-tion of acquisicondi-tion of translacondi-tion rules is that two similar translation examples must exist. As a result, the systems require large amounts of translation examples.

We propose a method of MT using Recur-sive Chain-link-type Learning as a method to overcome the above problem. In our method, the system acquires new translation rules from sparse data using other already acquired trans-lation rules. For example, first, translation rule B is acquired by using translation rule A when the translation rule A exists in the dictio-nary. Moreover, translation rule C is acquired by using the translation rule B. Such a pro-cess of acquisition of translation rules is like a chain where each ring is linked. Therefore, we call this mechanism Recursive Chain-link-type Learning(RCL). This method can effectively ac-quire many translation rules from sparse data without depending on the different parts of sim-ilar translation pairs. In this paper, we describe the effectiveness of RCL through evaluation ex-periments.

2 Basic Idea

RCL is a method with an ability that automat-ically acquires translation knowledge in a com-puter without any analytical knowledge, such as GA-ILMT. This is the ability to extract

corre-㩿㪈㪀㩷㪞㪼㫅㪼㫉㪸㫃㫀㫑㪸㫋㫀㫆㫅㩷㫆㪽㩷㪛㫀㪽㪽㪼㫉㪼㫅㫋㩷㪧㪸㫉㫋㫊㩷㩿㪞㬲㫍㪼㫅㫀㫉 㪸㫅㪻㩷㪚㫀㪺㪼㫂㫃㫀㪃㩷㪈㪐㪐㪏㪀 㩿㪈㪸㪀㪘㫃㫀㪾㫅㫄㪼㫅㫋㫊㪑

㪠㪠㪠㪠㪾㪸㫍㪼㩷㫋㪿㪼㩷㪹㫆㫆㫂㪹㫆㫆㫂㪹㫆㫆㫂㪹㫆㫆㫂㹤㫂㫀㫋㪸㪹㫂㫀㫋㪸㪹㫂㫀㫋㪸㪹㫂㫀㫋㪸㪹㪂㫐㪟 㫍㪼㫉㪂㪛㪟㪂㫄㪂㫄㪂㫄㪂㫄 㪰㫆㫌

㪰㫆㫌㪰㫆㫌

㪰㫆㫌㪾㪸㫍㪼㩷㫋㪿㪼㩷㫇㪼㫅㫇㪼㫅㫇㪼㫅㫇㪼㫅㹤㫂㪸㫃㪼㫄㫂㪸㫃㪼㫄㫂㪸㫃㪼㫄㫂㪸㫃㪼㫄㪂㫐㪟 㫍㪼㫉㪂㪛㪟㪂㫅㪂㫅㪂㫅㪂㫅

㩿㪈㪹㪀㪞㪼㫅㪼㫉㪸㫃㫀㫑㪸㫋㫀㫆㫅㩷㫆㪽㩷㪛㫀㪽㪽㪼㫉㪼㫅㫋㩷㪧㪸㫉㫋㫊㪑 㫀㪽㩷㫀㹤㪂㫄 㪸㫅㪻㩷㫐㫆㫌㩷㹤㪂㫅㩷㫋㪿㪼㫅

㪾㪸㫍㪼㩷㫋㪿㪼㩷㪯㹤㪯㪂㫐㪟㩷㫍㪼㫉㪂㪛㪟㪯

㩿㪉㪀㪞㪼㫅㪼㫉㪸㫃㫀㫑㪸㫋㫀㫆㫅㩷㫆㪽㩷㪛㫀㪽㪽㪼㫉㪼㫅㫋㩷㪧㪸㫉㫋㫊㩷㪸㫅㪻㩷㪚㫆㫄㫄㫆㫅㩷㪧㪸㫉㫋㫊㩷㩿㪤㪺㪫㪸㫀㫋㪃㩷㪉㪇㪇㪈㪀 㩿㪉㪸㪀㪘㫃㫀㪾㫅㫄㪼㫅㫋㫊㪑

㪫㪿㪼㩷㪺㫆㫄㫄㫀㫊㫊㫀㫆㫅 㪫㪿㪼㩷㪺㫆㫄㫄㫀㫊㫊㫀㫆㫅㪫㪿㪼㩷㪺㫆㫄㫄㫀㫊㫊㫀㫆㫅

㪫㪿㪼㩷㪺㫆㫄㫄㫀㫊㫊㫀㫆㫅㪾㪸㫍㪼㩷㫋㪿㪼㩷㫇㫃㪸㫅㩷㫋㪿㪼㩷㫇㫃㪸㫅㩷㫋㪿㪼㩷㫇㫃㪸㫅㩷㫋㪿㪼㩷㫇㫃㪸㫅㩷㫌㫇

㪦㫌㫉㩷㪾㫆㫍㪼㫉㫅㫄㪼㫅㫋 㪦㫌㫉㩷㪾㫆㫍㪼㫉㫅㫄㪼㫅㫋㪦㫌㫉㩷㪾㫆㫍㪼㫉㫅㫄㪼㫅㫋

㪦㫌㫉㩷㪾㫆㫍㪼㫉㫅㫄㪼㫅㫋㪾㪸㫍㪼㩷㪸㫃㫃㩷㫃㪸㫎㫊㩷㪸㫃㫃㩷㫃㪸㫎㫊㩷㪸㫃㫃㩷㫃㪸㫎㫊㩷㪸㫃㫃㩷㫃㪸㫎㫊㩷㫌㫇

㩿㪉㪹㪀㪞㪼㫅㪼㫉㪸㫃㫀㫑㪸㫋㫀㫆㫅㩷㫆㪽㩷㪛㫀㪽㪽㪼㫉㪼㫅㫋㩷㪧㪸㫉㫋㫊㪑 㩿㵺㪀㩷㪾㪸㫍㪼㩷㩿㵺㪀㩷㫌㫇㩷㹤 㩿㵺㪀㩷㪸㪹㪸㫅㪻㫆㫅㫅㪸 㩿㵺㪀 㩿㪉㪺㪀㪞㪼㫅㪼㫉㪸㫃㫀㫑㪸㫋㫀㫆㫅㩷㫆㪽㩷㪚㫆㫄㫄㫆㫅㩷㪧㪸㫉㫋㫊㪑

㪫㪿㪼㩷㪺㫆㫄㫄㫀㫊㫊㫀㫆㫅㩷㩿㵺㪀㩷㫋㪿㪼㩷㫇㫃㪸㫅㩷㩿㵺㪀㩷㹤 㪣㪸㩷㪺㫆㫄㫄㫀㫊㫊㫀㫆㫅㩷㩿㵺㪀㩷㫃㪼㩷㫇㫃㪸㫅 㪦㫌㫉㩷㪾㫆㫍㪼㫉㫅㫄㪼㫅㫋㩷㩿㵺㪀㩷㪸㫃㫃㩷㫃㪸㫎㫊㩷㩿㵺㪀

㩿㪊㪀㪞㪘㪄㪠㪣㪤㪫㩷㩿㪜㪺㪿㫀㫑㪼㫅㪄㫐㪸 㪼㫋㩷㪸㫃㪅㪃㩷㪈㪐㪐㪍㪀

㹤㪥㫆㫋㫉㪼㩷㪥㫆㫋㫉㪼㩷㪥㫆㫋㫉㪼㩷㪾㫆㫌㫍㪼㫉㫅㪼㫄㪼㫅㫋㪥㫆㫋㫉㪼㩷㪾㫆㫌㫍㪼㫉㫅㪼㫄㪼㫅㫋㪾㫆㫌㫍㪼㫉㫅㪼㫄㪼㫅㫋㪾㫆㫌㫍㪼㫉㫅㪼㫄㪼㫅㫋㪸㪹㪸㫅㪻㫆㫅㫅㪸㫋㫆㫌㫋㪼㫊 㫃㪼㫊㩷㫋㫆㫌㫋㪼㫊㫋㫆㫌㫋㪼㫊㫋㫆㫌㫋㪼㫊㫃㪼㫊㩷㫃㪼㫊㩷㫃㪼㫊㩷㫃㫆㫀㫊㫃㫆㫀㫊㫃㫆㫀㫊㫃㫆㫀㫊

㹤㪣㪸㩷㪺㫆㫄㫄㫀㫊㫊㫀㫆㫅㪣㪸㩷㪺㫆㫄㫄㫀㫊㫊㫀㫆㫅㪣㪸㩷㪺㫆㫄㫄㫀㫊㫊㫀㫆㫅㪣㪸㩷㪺㫆㫄㫄㫀㫊㫊㫀㫆㫅㪸㪹㪸㫅㪻㫆㫅㫅㪸㫃㪼㩷㫇㫃㪸㫅㫃㪼㩷㫇㫃㪸㫅㫃㪼㩷㫇㫃㪸㫅㫃㪼㩷㫇㫃㪸㫅

㹤 㪥㫆㫋㫉㪼㩷㪾㫆㫌㫍㪼㫉㫅㪼㫄㪼㫅㫋 㩿㵺㪀㩷㫋㫆㫌㫋㪼㫊 㫃㪼㫊㩷㫃㫆㫀㫊

䋨㪟㪼㩷㫃㫀㫂㪼㫊 㫋㪼㪸㪅㩷㩷㩷㩷㩷䋻ᓐ㪆䈲 㪆䈍⨥㪆䈏㪆ᅢ䈐㪆䈪䈜䋮㪀 䋨㪪㪿㪼㫃㫀㫂㪼㫊 㫋㪼㫅㫅㫀㫊㪅䋻ᓐᅚ㪆䈲 㪆䊁䊆䉴㪆䈏㪆ᅢ䈐㪆䈪䈜䋮㪀

㩿㪊㪸㪀㪞㪼㫅㪼㫉㪸㫋㫀㫆㫅㩷㫆㪽㩷㪫㫉㪸㫅㫊㫃㪸㫋㫀㫆㫅㩷㪜㫏㪸㫄㫇㫃㪼㫊㩷㪹㫐㩷㪘㫇㫇㫃㫐㫀㫅㪾㩷㪞㪼㫅㪼㫋㫀㪺㩷㪘㫃㪾㫆㫉㫀㫋㪿㫄㫊

䋨㪟㪼㩷㫃㫀㫂㪼㫊㩷㫋㪼㫅㫅㫀㫊㪅䋻ᓐ㪆䈲㪆䊁䊆䉴㪆䈏㪆ᅢ䈐㪆䈪䈜䋮㪀 䋨㪪㪿㪼㩷㫃㫀㫂㪼㫊㩷㫋㪼㪸㪅䋻ᓐᅚ㪆䈲㪆䈍⨥㪆䈏㪆ᅢ䈐㪆䈪䈜䋮㪀 㩿㪊㪹㪀㪞㪼㫅㪼㫉㪸㫃㫀㫑㪸㫋㫀㫆㫅㩷㪸㫅㪻㩷㪜㫏㫋㫉㪸㪺㫋㫀㫆㫅㩷㫆㪽㩷㪛㫀㪽㪽㪼㫉㪼㫅㫋㩷㪧㪸㫉㫋㫊㩷㪹㫐㩷㪠㫅㪻㫌㪺㫋㫀㫍㪼㩷㪣㪼㪸㫉㫅㫀㫅㪾

䋨㪟㪼㩷㫃㫀㫂㪼㫊㩷㫋㪼㫅㫅㫀㫊㫋㪼㫅㫅㫀㫊㫋㪼㫅㫅㫀㫊㫋㪼㫅㫅㫀㫊㪅䋻ᓐ㪆䈲㪆䊁䊆䉴䊁䊆䉴䊁䊆䉴䊁䊆䉴㪆䈏㪆ᅢ䈐㪆䈪䈜䋮㪀

䋨㪟㪼㩷㫃㫀㫂㪼㫊㩷㫋㪼㪸㫋㪼㪸㫋㪼㪸㫋㪼㪸㪅䋻ᓐ㪆䈲㪆䈍䈍䈍䈍⨥⨥⨥⨥㪆䈏㪆ᅢ䈐㪆䈪䈜䋮㪀

䋨㪟㪼㩷㫃㫀㫂㪼㫊㩷㪗㪇㪅䋻ᓐ㪆䈲㪆㪗㪇㪆䈏㪆ᅢ䈐㪆䈪䈜䋮㪀

㩿㪊㪺㪀㪤㫆㫉㪼㩷㪞㪼㫅㪼㫉㪸㫃㫀㫑㪸㫋㫀㫆㫅㩷㪸㫅㪻㩷㪜㫏㫋㫉㪸㪺㫋㫀㫆㫅㩷㫆㪽㩷㪛㫀㪽㪽㪼㫉㪼㫅㫋㩷㪧㪸㫉㫋㫊㩷㪹㫐

䋨㪟㪼㪟㪼㪟㪼㪟㪼㫃㫀㫂㪼㫊㩷㪗㪇㪅䋻ᓐᓐᓐᓐ㪆䈲㪆㪗㪇㪆䈏㪆ᅢ䈐㪆䈪䈜䋮㪀 䋨㪪㪿㪼㪪㪿㪼㪪㪿㪼㪪㪿㪼㫃㫀㫂㪼㫊㩷㪗㪇㪅䋻ᓐᅚᓐᅚᓐᅚᓐᅚ㪆䈲㪆㪗㪇㪆䈏㪆ᅢ䈐㪆䈪䈜䋮㪀

䋨㪗㪈㩷㫃㫀㫂㪼㫊㩷㪗㪇㪅䋻㪗㪈㪆䈲㪆㪗㪇㪆䈏㪆ᅢ䈐㪆䈪䈜䋮㪀 㪞㫀㫍㪼㫅㩷㪫㫉㪸㫅㫊㫃㪸㫋㫀㫆㫅㩷㪜㫏㪸㫄㫇㫃㪼㫊

㪞㪼㫅㪼㫉㪸㫋㪼㪻㩷㪫㫉㪸㫅㫊㫃㪸㫋㫀㫆㫅㩷㪜㫏㪸㫄㫇㫃㪼㫊

䋨㫋㪼㫅㫅㫀㫊䋻䊁䊆䉴䋩䋬䋨㫋㪼㪸䋻䈍⨥䋩

[image:2.612.313.554.33.405.2]

䋨㪟㪼䋻ᓐ䋩䋬䋨㪪㪿㪼䋻ᓐᅚ䋩 㪠㫅㪻㫌㪺㫋㫀㫍㪼㩷㪣㪼㪸㫉㫅㫀㫅㪾

Figure 1: Previous works.

sponding parts from pairs of objects with which it corresponds. In this paper, we apply this abil-ity to a translation example that consists of SL and TL sentences. A system with RCL can ac-quire translation rules from sparse translation examples. Figure 2 shows how translation rules are acquired using this method1.

Figure 2 shows the process where translation rules B, C and D are acquired one after another using RCL. In this paper, source parts are those parts that are extracted from the SL sentences of translation examples, and target parts are those parts that are extracted from the TL sen-tences of translation examples. Moreover, part translation rules are pairs of source parts and

1In Figure 2, the use of a Greek character means that

(3)

䋨䋨䋨䋨

㰳㰳㰳㰳

䋻䋻䋻䋻

㱒㱒㱒㱒

䋩䋩䋩䋩

䋨㰮㰯

㰰㰱㰲

㰴㪅

䋻㱍㱔㱒

㱑㱐㱕

䋨䋨䋨䋨

㰮㰯

㰰㰱㰲

㰮㰯

㰰㰱㰲

㰮㰯

㰰㰱㰲

㰮㰯

㰰㰱㰲

㪗㪗㪗㪗䋪䋪䋪䋪

㰴㰴㰴㰴㪅

㪅㪅㪅

䋻䋻䋻䋻

㱍㱔㱍㱔㱍㱔㱍㱔

㪗㪗㪗㪗䋪䋪䋪䋪

㱓㱑

㱐㱕

㱓㱑

㱐㱕

㱓㱑

㱐㱕

㱓㱑

㱐㱕

䋮䋮䋮䋮

䋩䋩䋩䋩

䋨㰷㰸

㰹㰲㰺

㰴㪅

䋻㱖㱔㱙㱥

㱓㱑㱘

䋨䋨䋨䋨

㰺㱅㰺㱅㰺㱅㰺㱅

䋻䋻䋻䋻

㱙㱥㱙㱥㱙㱥㱙㱥

䋩䋩䋩䋩

䋨㰻㰼

㱄㰺㱅

䌈㪅䋻

㱔㱙㱥

㱤㱛㱜

䋨䋨䋨䋨

㰻㰼

㰻㰼

㰻㰼

㰻㰼

㪗㪗㪗㪗䋪䋪䋪䋪

䌈䌈䌈䌈㪅

㪅㪅㪅

䋻䋻䋻䋻

㱚㱔㱚㱔㱚㱔㱚㱔

㪗㪗㪗㪗䋪䋪䋪䋪

㱓㱤㱛

㱓㱤㱛

㱓㱤㱛

㱓㱤㱛

䋮䋮䋮䋮

䋩䋩䋩䋩

㪫㫉㪸

㫅㫊㫃

㪸㫋㫀㫆㫅

㩷㪼㫏㪸

㫇㫃㪼

㩷㪥㫆㪅㪈

Pr

ocess

㪫㫉㪸

㫅㫊㫃

㪸㫋㫀㫆㫅

㩷㪼㫏㪸

㫇㫃㪼

㩷㪥㫆㪅㪉

㪫㫉㪸

㫅㫊㫃

㪸㫋㫀㫆㫅

㩷㪼㫏㪸

㫇㫃㪼

㩷㪥㫆㪅㪊

㪥㫆㪅㪈

Pr

ocess

㪥㫆㪅㪉

Pr

ocess

㪥㫆㪅㪊

㪧㪸㫉㫋

㩷㫋㫉㪸㫅㫊㫃㪸㫋㫀㫆㫅㩷㫉㫌㫃

㪼㩷㪘

㪧㪸㫉㫋

㩷㫋㫉㪸㫅㫊㫃㪸㫋㫀㫆㫅㩷㫉㫌㫃

㪼㩷㪘

㪧㪸㫉㫋

㩷㫋㫉㪸㫅㫊㫃㪸㫋㫀㫆㫅㩷㫉㫌㫃

㪼㩷㪘

㪧㪸㫉㫋

㩷㫋㫉㪸㫅㫊㫃㪸㫋㫀㫆㫅㩷㫉㫌㫃

㪼㩷㪘

㪪㪼㫅㫋㪼㫅

㪺㪼

㩷㫋㫉㪸㫅㫊㫃㪸㫋㫀㫆㫅㩷

㩷㩷㩷㫉㫌㫃

㪼㩷

㪪㪼㫅㫋㪼㫅

㪺㪼

㩷㫋㫉㪸㫅㫊㫃㪸㫋㫀㫆㫅㩷

㩷㩷㩷㫉㫌㫃

㪼㩷

㪪㪼㫅㫋㪼㫅

㪺㪼

㩷㫋㫉㪸㫅㫊㫃㪸㫋㫀㫆㫅㩷

㩷㩷㩷㫉㫌㫃

㪼㩷

㪪㪼㫅㫋㪼㫅

㪺㪼

㩷㫋㫉㪸㫅㫊㫃㪸㫋㫀㫆㫅㩷

㩷㩷㩷㫉㫌㫃

㪼㩷

㪧㪸㫉㫋

㩷㫋㫉㪸㫅㫊㫃㪸㫋㫀㫆㫅㩷㫉㫌㫃

㪼㩷㪚

㪧㪸㫉㫋

㩷㫋㫉㪸㫅㫊㫃㪸㫋㫀㫆㫅㩷㫉㫌㫃

㪼㩷㪚

㪧㪸㫉㫋

㩷㫋㫉㪸㫅㫊㫃㪸㫋㫀㫆㫅㩷㫉㫌㫃

㪼㩷㪚

㪧㪸㫉㫋

㩷㫋㫉㪸㫅㫊㫃㪸㫋㫀㫆㫅㩷㫉㫌㫃

㪼㩷㪚

㪪㪼㫅㫋㪼㫅

㪺㪼

㩷㫋㫉㪸㫅㫊㫃㪸㫋㫀㫆㫅㩷㫉㫌㫃㪼㩷㪛

㪪㪼㫅㫋㪼㫅

㪺㪼

㩷㫋㫉㪸㫅㫊㫃㪸㫋㫀㫆㫅㩷㫉㫌㫃㪼㩷㪛

㪪㪼㫅㫋㪼㫅

㪺㪼

㩷㫋㫉㪸㫅㫊㫃㪸㫋㫀㫆㫅㩷㫉㫌㫃㪼㩷㪛

㪪㪼㫅㫋㪼㫅

㪺㪼

㩷㫋㫉㪸㫅㫊㫃㪸㫋㫀㫆㫅㩷㫉㫌㫃㪼㩷㪛

䋨䋨䋨䋨

㪪㫆

㫌㫉

㪺㪼

㩷㫇㪸

㫉㫋

㪪㫆

㫌㫉

㪺㪼

㩷㫇㪸

㫉㫋

㪪㫆

㫌㫉

㪺㪼

㩷㫇㪸

㫉㫋

㪪㫆

㫌㫉

㪺㪼

㩷㫇㪸

㫉㫋

䋻䋻䋻䋻

㪫㪸㫉㪾㪼㫋㩷

㫇㪸㫉㫋

㪫㪸㫉㪾㪼㫋㩷

㫇㪸㫉㫋

㪫㪸㫉㪾㪼㫋㩷

㫇㪸㫉㫋

㪫㪸㫉㪾㪼㫋㩷

㫇㪸㫉㫋

䋩䋩䋩䋩

㪫㫉

㪸㫅㫊㫃㪸㫋㫀㫆㫅㩷㫉㫌㫃㪼

㪫㫉

㪸㫅㫊㫃㪸㫋㫀㫆㫅㩷㫉㫌㫃㪼

㪫㫉

㪸㫅㫊㫃㪸㫋㫀㫆㫅㩷㫉㫌㫃㪼

㪫㫉

㪸㫅㫊㫃㪸㫋㫀㫆㫅㩷㫉㫌㫃㪼

Figure 2: Sc hema in pro cess of acquisition of translation rules using R CL. target parts, extracted as parts like translation rules A and C in Figure 2. Sen tence translation rules are pairs of source and target parts ex-tracted as sen tences like translation rules B and D in Figure 2. On the other hand, translation rules that are used as starting p oin ts in the ac-quisition pro cess of translation rules, like trans-lation rule A in Figure 2, are acquired b y using GA-ILMT. The reason b eing that the system can p erform translation based on only learning abilit y without an y analytical kno wledge, b y us-ing GA-ILMT and R CL. A system with R C L acquires part translation rules and sen tence translation rules together. As a result, a chain reaction causes the ac-quisition of translation rules. In the pro cess No.1 of Figure 2, translation rule A has infor-mation that the system can extract “Z” from the SL sen tences of translation examples, or the source parts of translation rules, and can extract “ ζ ” from the TL sen tences or the target parts. Therefore, the system can acquire the sen tence translation rule B b y extracting “Z” from the SL sen tence of translation example No.1 and “ ζ ” from the TL sen tence of translation exam-ple No.1. The acquired translation rule B has information that the system can extract from the righ t o f “E” to the left of “H” in the SL sen tences of translation examples, or the source parts of translation rules, and can extract from the righ t o f “ θ ” to the left of “ η ” in the TL sen tences or the target parts. By using this in-formation, in pro cess 2, the system can acquire part translation rule C ( NΩ ; νω ) b y extract-ing “NΩ” from the SL sen tence of translation example No.2 and “ νω ” from the TL sen tence. Moreo v er, in pro cess 3, translation rule D is ac-quired based on translation rule C. Suc h pro cess is p erformed b y deciding the common and dif-feren t parts in character strings of translation examples (Araki et al., 1995). Therefore, the system p ossesses an abilit y to decide common and differen t parts b et w een tw o ob jects. 3 Outline

㪪㫆㫌㫉

㪺㪼

㩷㪪

㪼㫅

㫋㪼

㫅㪺

㪫㫉㪸

㫅㫊㫃

㪸㫋㫀㫆㫅

㩷㪧㫉㫆

㪺㪼㫊㫊

㪫㫉㪸

㫅㫊㫃

㪸㫋㫀㫆㫅

㩷㪩

㪼㫊㫌㫃

㪧㫉㫆㫆㪽㫉㪼

㪸㪻㫀

㫅㪾

㫆㫉㫉

㪼㪺

㫋㩷

㪫㫉㪸㫅

㫊㫃㪸㫋

㫀㫆㫅㩷㪩

㪼㫊

㫌㫃㫋

㪝㪼㪼㪻㪹㪸

㪺㫂㩷㪧㫉㫆

㪺㪼㫊㫊

㪣㪼

㪸㫉

㫅㫀

㫅㪾㩷㪧

㫉㫆㪺

㪼㫊㫊

㪛㫀㪺

㫋㫀㫆

㫅㪸

㫉㫐㩷

㪽㫆

㫉㩷

㪫㫉㪸

㫅㫊㫃

㪸㫋㫀㫆㫅

㩷㪩

㫌㫃㪼㫊

[image:3.612.124.585.69.544.2]
(4)

4 Process

4.1 Translation process

In the translation process, the system gener-ates translation results using acquired transla-tion rules. First, the system selects the sen-tence translation rules that can be applied to the SL sentence. Second, the system generates the translation results by replacing the variables in the sentence translation rules with the part translation rules.

4.2 Feedback process

In the feedbackprocess, the system evaluates the translation rules used. First, the system evaluates the translation rules without variables by using the results of combinations between the translation rules with variables and the translation rules without variables(Echizen-ya et al., 1996). Next, the system evaluates trans-lation rules with variables by using the processes of combinations between the translations rules with variables and the translation rules without variables(Echizen-ya et al., 2000). As a result, the system increases the correct translation fre-quencies, or the erroneous translation frequen-cies, of the translation rules by using these eval-uation methods for the translation rules.

4.3 Learning process 4.3.1 GA-ILMT

In this paper, by using the process of acquisi-tion of translaacquisi-tion rules in GA-ILMT, the sys-tem acquires both sentence and part transla-tion rules. These rules are then used as starting points when the system performs RCL.

4.3.2 Recursive Chain-link-type Learning(RCL)

In this section, we describe the process of acqui-sition of translation rules using RCL. The de-tails of the process of acquisition of part trans-lation rules are as follows.

(1)The system selects translation examples that have common parts with the sentence translation rules.

(2)The system extracts the parts that corre-spond to the variables in the source parts and in the target parts of the sentence translation rules from the SL sentences, and the TL sentences of the translation ex-amples.

(3)The system registers pairs, of the parts ex-tracted from the SL sentences and the parts extracted from the TL sentences, as the part translation rules.

(4)The system gives the correct and erroneous frequencies of sentence translation rules to the acquired part translation rules.

Figure 42 shows an example of the acquisi-tion of a part translaacquisi-tion rule using the sentence translation rule. In Figure 4, (thirty;30[sanju]) as the part translation rule is acquired be-cause “thirty” corresponds to the variable in the source part of sentence translation rule and “30[sanju]” corresponds to the variable in the target part of sentence translation rule.

㩿㪦㫃㪻㩷㪝㪸㫀㫋㪿㪽㫌㫃㩷㫎㫀㫃㫃㩷㪹㫃㫆㫎㩷㫀㫅㫋㪿㫀㫉㫋㫐㫋㪿㫀㫉㫋㫐㫋㪿㫀㫉㫋㫐㫋㪿㫀㫉㫋㫐㫄㫀㫅㫌㫋㪼㫊㪅䋻

䋨㪫㪿㪼㫐㩷㪸㫉㪼㩷㪾㫆㫀㫅㪾㩷㫋㫆㩷㫃㪼㪸㫍㪼㩷㫀㫅㪗㪇㩷㩷㩷㩷㩷㩷㫄㫀㫅㫌㫋㪼㫊㪅䋻

㪘㪺㫈㫌㫀㫊㫀㫋㫀㫆㫅㩷㫆㪽㩷㫋㪿㪼㩷㫇㪸㫉㫋㩷㫋㫉㪸㫅㫊㫃㪸㫋㫀㫆㫅㩷㫉㫌㫃㪼㩷 㫌㫊㫀㫅㪾㩷㫋㪿㪼㩷㫊㪼㫅㫋㪼㫅㪺㪼㩷㫋㫉㪸㫅㫊㫃㪸㫋㫀㫆㫅㩷㫉㫌㫃㪼

䋨㫋㪿㫀㫉㫋㫐䋻䋳䋰㪲㫊㪸㫅㫁㫌㪴䋩

㪫㫉㪸㫅㫊㫃㪸㫋㫀㫆㫅㩷㪼㫏㪸㫄㫇㫃㪼

㪧㪸㫉㫋㩷㫋㫉㪸㫅㫊㫃㪸㫋㫀㫆㫅㩷㫉㫌㫃㪼

䉥䊷䊦䊄㪆䊶㪆䊐䉢䉟䉴䊐䊦 㪆䈲㪆䋳䋰䋳䋰䋳䋰䋳䋰㪆ಽ㪆䈪㪆็䈐㪆਄䈏䉎㪆䈪䈚䉊䈉䋮䋩

ᓐ䉌㪆䈲㪆㪗㪇㪆ಽ㪆䈚㪆䈢䉌㪆಴⊒㪆䈜䉎㪆䈪䈚䉊䈉䋮䋩

㪪㪼㫅㫋㪼㫅㪺㪼㩷㫋㫉㪸㫅㫊㫃㪸㫋㫀㫆㫅㩷㫉㫌㫃㪼㪲㪦㫌㫉㫌㪻㫆 㪽㪼㫀㫊㫌㪽㫌㫉㫌 㫎㪸㫊㪸㫅㫁㫌㫇㫊㪸㫅㫁㫌㫇㫊㪸㫅㫁㫌㫇㫊㪸㫅㫁㫌㫇㫇㫌㫅 㪻㪼㩷㪽㫌㫂㫀 㪸㪾㪸㫉㫌 㪻㪼㫊㪿㫆㪅㪴

㪲㪢㪸㫉㪼 㫎㪸 㪗㪇㩷㫇㫌㫅 㫊㪿㫀㩷㫋㪸㫉㪸 㫊㪿㫌㫋㫇㪸㫋㫊㫌 㫊㫌㫉㫌 㪻㪼㫊㪿㫆㪅㪴

Figure 4: Example of the acquisition of a part translation rule using the sentence translation rule.

The details of the process of acquisition of sentence translation rules are as follows:

(1)The system selects the part translation rules in which the source parts are included in the SL sentences of the translation example or in the source parts of sentence transla-tion rules, and in which the target parts are included in the TL sentences of the trans-lation examples or in the target parts of sentence translation rules.

(2)The system acquires new sentence transla-tion rules by replacing the parts which are same as the part translation rules with the variables to the translation examples or the sentence translation rules.

(3)The system gives the correct and erroneous frequencies of the part translation rules to the acquired sentence translation rules.

(5)

Figure 5 shows examples of the acquisition of the sentence translation rules using the part translation rules. In Figure 5, the system acquires(It starts in @0 minutes.;それ/は

/@0/分/たて/ば/始まり/ます.[Sore wa @0 pun

tate ba hajimari masu.])as a sentence trans-lation rule by using the part transtrans-lation rule (thirty;30[sanju]) acquired in Figure 4, and(@1 starts in @0 minutes.;@1/は/@0/分/たて/ば/ 始まり/ます.[@1 wa @0 pun tate ba hajimari masu.])as the sentence translation rule, that is more abstracted, is acquired by using the part translation rule (it;それ[sore]).

䋨㪠㫋㩷㫊㫋㪸㫉㫋㫊㩷㫀㫅㩷㫋㪿㫀㫉㫋㫐 㫄㫀㫅㫌㫋㪼㫊㪅

㪘㪺㫈㫌㫀㫊㫀㫋㫀㫆㫅㩷㫆㪽㩷㫋㪿㪼㩷㫊㪼㫅㫋㪼㫅㪺㪼㩷㫋㫉㪸㫅㫊㫃㪸㫋㫀㫆㫅㩷 㫉㫌㫃㪼㩷㫌㫊㫀㫅㪾㩷㫋㪿㪼㩷㫇㪸㫉㫋㩷㫋㫉㪸㫅㫊㫃㪸㫋㫀㫆㫅㩷㫉㫌㫃㪼 䋨㫋㪿㫀㫉㫋㫐䋻䋳䋰 㪲㫊㪸㫅㫁㫌㪴䋩

㪞㫀㫍㪼㫅㩷㫋㫉㪸㫅㫊㫃㪸㫋㫀㫆㫅㩷 㪼㫏㪸㫄㫇㫃㪼 㪧㪸㫉㫋㩷㫋㫉㪸㫅㫊㫃㪸㫋㫀㫆㫅㩷㫉㫌㫃㪼

䋨㪠㫋㩷㫊㫋㪸㫉㫋㫊㩷㫀㫅㩷㪗㪇㩷㫄㫀㫅㫌㫋㪼㫊㪅

䋻䈠䉏㪆䈲㪆䋳䋰㪆ಽ㪆䈢䈩㪆䈳㪆ᆎ䉁䉍㪆䉁䈜䋮

䋻䈠䉏㪆䈲㪆㪗㪇㪆ಽ㪆䈢䈩㪆䈳㪆ᆎ䉁䉍㪆䉁䈜䋮

䋨㫀㫋䋻䈠䉏㪲㫊㫆㫉㪼㪴䋩 㪧㪸㫉㫋㩷㫋㫉㪸㫅㫊㫃㪸㫋㫀㫆㫅㩷㫉㫌㫃㪼

㪩㪼㪺㫌㫉㫊㫀㫍㪼㩷㪘㪺㫈㫌㫀㫊㫀㫋㫀㫆㫅㩷㫆㪽㩷㫋㪿㪼㩷㫊㪼㫅㫋㪼㫅㪺㪼㩷 㫋㫉㪸㫅㫊㫃㪸㫋㫀㫆㫅㩷㫉㫌㫃㪼㩷㫌㫊㫀㫅㪾㩷㫋㪿㪼㩷㫇㪸㫉㫋㩷㫋㫉㪸㫅㫊㫃㪸㫋㫀㫆㫅㩷㫉㫌㫃㪼 䋨㪗㪈㩷㫊㫋㪸㫉㫋㫊㩷㫀㫅㩷㪗㪇㩷㫄㫀㫅㫌㫋㪼㫊㪅䋻㪗㪈㪆䈲㪆㪗㪇㪆ಽ㪆䈢䈩㪆䈳㪆ᆎ䉁䉍㪆䉁䈜䋮

䋨㪠㫋㫊㫋㪸㫉㫋㫊㩷㫀㫅㩷㪗㪇㩷㫄㫀㫅㫌㫋㪼㫊㪅 䋻䈠䉏㪆䈲㪆㪗㪇㪆ಽ㪆䈢䈩㪆䈳㪆ᆎ䉁䉍㪆䉁䈜䋮 㪘㪺㫈㫌㫀㫉㪼㪻㩷㫊㪼㫅㫋㪼㫅㪺㪼㩷㩷㫋㫉㪸㫅㫊㫃㪸㫋㫀㫆㫅㩷㩷㫉㫌㫃㪼

㪲㪪㫆㫉㪼㩷㫎㪸 㪗㪇㩷㫇㫌㫅㩷㫋㪸㫋㪼 㪹㪸 㪿㪸㫁㫀㫄㪸㫉㫀 㫄㪸㫊㫌㪅㪴㪀 㪲㪪㫆㫉㪼㩷㫎㪸 㫊㪸㫅㫁㫌㫇 㫇㫌㫅㩷㫋㪸㫋㪼 㪹㪸 㪿㪸㫁㫀㫄㪸㫉㫀 㫄㪸㫊㫌㪅㪴㪀

㪲㪗㪈㩷㫎㪸 㪗㪇㩷㫇㫌㫅㩷㫋㪸㫋㪼 㪹㪸 㪿㪸㫁㫀㫄㪸㫉㫀 㫄㪸㫊㫌㪅㪴㪀 㪲㪪㫆㫉㪼 㫎㪸 㪗㪇㩷㫇㫌㫅㩷㫋㪸㫋㪼 㪹㪸 㪿㪸㫁㫀㫄㪸㫉㫀 㫄㪸㫊㫌㪅㪴㪀

Figure 5: Examples of the acquisition of a sen-tence translation rule using the part translation rule.

5 Experiments for performance

evaluation

5.1 Experimental procedure

There are two kinds of data as experimental data. One is learning data and the other is evaluation data. In these experiments, 1,759 translation examples were used as learning data. These translation examples were taken from textbooks(Nihon Kyozai(1), 2001; Nihon Ky-ozai(2), 2001; Hoyu Shuppan, 2001) for second-grade junior high school students. As well, 1,097 translation examples were used as eval-uation data. These translation examples were taken from textbooks(Bunri, 2001; Sinko Shup-pan, 2001) for second-grade junior high school students. All of these translation examples were processed by the method outlined in Figure 3. The initial condition of the dictionary is empty. Moreover, we used three other commercial Rule-Based MT systems, comparing our system with

those systems. We call these three MT systems A, B and C respectively.

5.2 Evaluation standards

The correct translation results are grouped into two categories:

(1) The correct translation

This means that the translation results cor-respond to the correct translation results taken from textbooks respectively(Bunri, 2001; Sinko Shuppan, 2001).

(2) A correct translation which includes un-known words

This means that the translation results with substituted nouns or adjectives as variables correspond to the correct trans-lation results taken from textbooks respec-tively(Bunri, 2001; Sinko Shuppan, 2001).

In this paper, the effective translation results are the translation results that correspond to (1) and (2), and the ineffective translation re-sults are the translation rere-sults that do not cor-respond to (1) and (2). Moreover, the effec-tive translation rate is the rate of the effeceffec-tive translation results in all the evaluation data. The translation results are ranked when several translation results are generated. The transla-tion results using the translatransla-tion rules whose rate of correct translation frequency is high, are ranked at the top. We evaluated the translation results that are ranked from No.1 to No.3.

5.3 Experimental results and discussion

(6)

Table 1: Examples of effective translation results.

Examples of the correct translation results

SL sentences TL sentences

This bag was made in France. このバッグはフランス製です.[Kono baggu wa furansu sei desu.]

We went there to play わたしたちは野球をするためそこへ行きました.

baseball. [Watashi tachi wa yakyu wo suru tame soko e iki mashi ta.]

Examples of the correct translation results which includes the unknown words

SL sentences TL sentences

@0へ連れていってあげまし ょうか?

Shall I take you to the [@0 e tsure te itte age masho ka?]

amusement park?  @0 requires “遊園地[yuen chi]” which is equivalent for

 the noun “the amusement park”.

@0から広島までど のくらいの距離がありますか? [@0 kara

How far is it from Kyoto to hiroshima made dono kurai no kyori ga ari masu ka?]

Hiroshima?  @0 requires “京都[kyoto]” which is equivalent for

 the noun “Kyoto”.

[image:6.612.71.304.376.457.2]

described in section 5.2 were evaluated as the correct translation results. The effective trans-lation rate in the system with only GA-ILMT was 45.1%. In Table 2, the effective translation rate of system with RCL is almost the same as the effective translation rates of system A, but is higher than systems B and C.

Table 2: Results of comparative experiments.

Effective trans- Details

System lation rates (1) (2)

Our system 85.0% 41.6% 58.4%

system A 85.8% 84.0% 16.0%

system B 81.7% 83.7% 16.3%

system C 76.9% 82.7% 17.3%

Table 3: Comparison of effective translation rates based on quality.

Effective trans- Details

System lation rates (1) (2)

Our system 73.7% 7.5% 52.5%

system A 70.3% 84.2% 15.8%

system B 63.8% 85.0% 15.0%

system C 58.7% 82.8% 17.2%

Moreover, we evaluated translation results more strictly in terms of the quality of trans-lation. Meaning that only translation results that had almost the same character strings as the correct translation results taken from the textbooks(Bunri, 2001; Sinko Shuppan, 2001) were effective translation results. For

exam-ple, “それは約10分かかります[Sore wa yaku

juppun kakari masu.]” is an ineffective trans-lation result because of the correct transla-tion results for “It takes about ten minutes.”

is “約 10 分かか ります [Yaku juppun kakari

masu.]” in textbook(Bunri, 2001; Sinko Shup-pan, 2001). In this Japanese sentence, phrase “それは[sore wa]” results in needlessly long. Therefore, we evaluate the translation results that have different phrases to the correct trans-lation results as the ineffective transtrans-lation re-sults in terms of the quality of translation. Ta-ble 3 shows a comparison of effective translation rates based on quality. In Table 3, we confirmed that the system with RCL can generate more high-quality translation results than the three other Rule-Based MT systems.

[image:6.612.71.306.497.579.2]
(7)

6 Conclusion

In existing Example-Based MT systems based on learning algorithms, similar translation pairs must exist to acquire high-quality translation rules. This means that the systems require large amounts of translation examples to ac-quire high-quality translation rules. On the other hand, a system with RCL can acquire many new translation rules from sparse trans-lation examples because it uses other already acquired translation rules based on the learn-ing algorithms described in section 2. As a re-sult, the quality of the translation and the ef-fective translation rate of our system is higher than other Rule-Based MT systems. However, our system still does not reach the level of a practical MT system and requires more transla-tion rules to realize the goal of a practical MT system. Although our system is not a practical enough MT system, the system can effectively acquire the translation rules from sparse data by using RCL. Therefore, we consider that the quality of translation improves only by adding new translation examples without the difficulty of Rule-Based MT systems in which a developer must completely describe large-scale knowledge. In the future, we plan to add a mechanism that effectively combines the acquired transla-tion rules so that the system realizes the trans-lation of practical SL sentences.

7 Acknowledgements

This workwas partially supported by the Grants from the High-Tech Research Cen-ter of Hokkai-Gakuen University and a Gov-ernment subsidy for aiding scientific research (No.14658097) of the Ministry of Education, Culture, Sports, Science and Technology of Japan.

References

Hutchins, W. J and H. L. Somers. 1992. An Introduction to Machine Translation. ACA-DEMIC PRESS.

Sato, S and M. Nagao. 1990. Toward Memory-based Translation. In proceedings of the Col-ing’90.

Brown, P., J. Cocke, S. Della Pietra, V. J. Della Pietra, F. Jelinek, J. D. Lafferty, R. L. Mer-cer and P. S. Roossin. 1990. A Statistical Approach to Machine Translation. Computa-tional Linguistics Vol.16, No.2.

Watanabe, H and K. Takeda. 1998. A Pattern-based Machine Translation System Extended by Example-based Processing. In proceedings of the Coling-ACL’98.

Brown, R.D. 2001. Transfer-Rule Induction for Example-Based Translation. In proceed-ings of the Workshop on EBMT, MT Summit VIII.

Carl, M. 2001. Inducing Translation Grammars from Bracketed Alignments. In proceedings of the Workshop on EBMT, MT Summit VIII. Malavazos, C and S. Piperidis. 2000.

Appli-cation of Analogical Modelling to Example Based Machine Translation. In proceedings of the Coling2000.

G¨uvenir, H.A and I. Cicekli. 1998. Learning Translation Templates from Examples. Infor-mation Systems Vol.23, No.6.

McTait, K. 2001. Linguistic Knowledge and Complexity in an EBMT System Based on Translation Patterns. In proceedings of the Workshop on EBMT, MT Summit VIII. Echizen-ya, H., K. Araki, Y. Momouchi and K.

Tochinai. 1996. Machine Translation Method using Inductive Learning with Genetic Algo-rithms. In proceedings of the Coling’96. Goldberg, D. E. 1989. Genetic Algorithms in

Search, Optimization, and Machine Learning. Addison-Wesley.

Araki, K., Y. Momouchi and K. Tochinai. 1995. Evaluation for Adaptability of Kana-kanji Translation of Non-segmented Japanese Kana Sentences using Inductive Learning. In pro-ceedings of the PACLING’95.

Echizen-ya, H., K. Araki, Y. Momouchi and K. Tochinai. 2000 Effectiveness of Layering Translation Rules based on Transition Net-works in Machine Translation using Inductive Learning with Genetic Algorithms. In pro-ceedings of the MT and Multilingual Applica-tions in the New Millennium.

Nihon-Kyozai(1). 2001. One World English Course 1 new edition. Tokyo.

Nihon-Kyozai(2). 2001. One World English Course 2 new edition. Tokyo.

Hoyu Shuppan. 2001. System English Course 2 new edition. Tokyo.

Bunri. 2001. WorkEnglish Course 2 new edi-tion. Tokyo.

Figure

Figure 1: Previous works.
Figure 3: Process flow.
Table 2: Results of comparative experiments.

References

Related documents

It was decided that with the presence of such significant red flag signs that she should undergo advanced imaging, in this case an MRI, that revealed an underlying malignancy, which

diagnosis of heart disease in children. : The role of the pulmonary vascular. bed in congenital heart disease.. natal structural changes in intrapulmon- ary arteries and arterioles.

Field experiments were conducted at Ebonyi State University Research Farm during 2009 and 2010 farming seasons to evaluate the effect of intercropping maize with

Materials and Methods: The specificity of the assay was tested using reference strains of vancomycin-resistant and susceptible enterococci.. In total, 193

Before beginning our content analysis we describe a case study to illustrate the development of an M-health solution app developed using Design Science Research to

Highly Efficient, Solar Active, and Reusable Photocatalyst: Zr-Loaded Ag-ZnO for Reactive Red 120 Dye Degradation with Synergistic Effect and Dye- Sensitized

Considering the wide range of materials and properties of existing FDM systems, and the influence of geometry and process parameters on the mechanical properties of FDM parts,

Using a stepped- wedge trial design to randomly assign HIV care clinics to scaled initiation of PrEP integrated into HIV care for cou- ples, the RE-AIM (Reach, Effectiveness,