

In document MAKE IT SIMPLE WITH PARAPHRASES. (Page 158-163)

8.5. CONTROLLED LANGUAGE

8.5.2. Applications to Technical Language

The linguistic resources developed in this research apply beyond controlled language in general domains. We believe that paraphrasing tools will also be useful in specific semantic domains. Technical texts are more controlled and are carefully written by professionals who know the terminology and writing conventions of their field; even so, high-quality editing tools with good paraphrasing capabilities can help speed up the writing task.

Specialized and scientific domains already use controlled language to create clearer, more precise and less ambiguous texts. Ambiguity and complexity are reduced by "controlling" grammar and vocabulary, which results in better-quality technical documentation.

In technical texts, many predicates are nouns. We experimented with paraphrasing a few support verb constructions that use domain-specific nouns or terms (most technical nouns are predicative), for example from the financial/auditing field (viz. fazer uma auditoria financeira – to perform a financial audit), from the legal field (viz. dar o poder paternal – to give parental rights) or from the biomedical field (viz. fazer uma operação – to do an operation). Support verb constructions that use non-technical terms very commonly have a corresponding verb that can replace them, viz. tomar uma decisão (to make a decision) and decidir (to decide). We verified that, in contrast, scientific and technical fields often have no corresponding verbs for the terms that name relations, procedures, etc. These expressions can nevertheless be paraphrased through a different linguistic strategy: replacing the elementary support verb with more sophisticated stylistic variants.

NEW RESOURCES AND APPLICATIONS

As mentioned in the preview presented in Chapter 1, support verb constructions such as fazer uma operação (to do an operation) can often be disambiguated by replacing them with the appropriate paraphrase. The paraphrases can be verbs (operar > to operate on) or lexical-syntactic extensions (realizar uma operação > to perform an operation or submeter-se a uma operação > to undergo/have an operation = to be operated on), depending on the context and on the semantic classes of the arguments. These lexical-syntactic extensions represent stylistic variants of support verb constructions. For example, [Chacoto, 2005] presents a sample list of about 200 health-related Portuguese predicate nouns that co-occur with the support verb fazer (to do). These nouns refer to clinical exams and medical treatments or surgical interventions, such as fazer uma radiografia (to perform/have a diagnostic X-ray examination) or fazer uma lobotomia (to perform/have a lobotomy). Doctors and other medical professionals use these expressions in their daily language when communicating with their colleagues or with their patients. For example, obstetricians order their patients to have medical exams using expressions such as fazer uma ecografia (to have an ultrasonography), fazer uma amniocentese (to have an amniocentesis or amniotic fluid test), fazer análises à urina (to have a urine test or urinalysis), fazer análises ao sangue (to have blood tests), fazer o parto (to deliver a baby), etc. These expressions appear in Portuguese or Brazilian blogs, including doctors' blogs, in e-mails, on websites describing medical exams or surgical procedures, in online newspapers, etc.

Our linguistic resources and the monolingual paraphraser ReWriter can be used to rewrite these support verb constructions. Figure 29 shows a concordance in which Portuguese biomedical support verb constructions are recognized and paraphrased as lexical strong verbs or as stylistic variants. The stylistic variants sujeitar-se a and submeter-se a (Ŧ to be submitted to) are only allowed when the subject is a patient. Some lexical strong verbs are only allowed with agentive subjects. The choice of a particular stylistic variant is thus strongly connected to knowledge of predicate-argument structure.
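The subject constraint described above can be illustrated with a minimal sketch. It is not ReWriter's implementation: the mini-lexicon, the `Paraphrase` class, the `paraphrases` function and the "agent"/"patient" role labels are hypothetical names introduced here, and the role assigned to each variant is an assumption based on the glosses given in the text.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Paraphrase:
    text: str          # verb or stylistic variant
    subject_role: str  # assumed label: "agent" or "patient"


# Hypothetical mini-lexicon built from the examples in the text.
# The role assignments are illustrative assumptions.
LEXICON: Dict[str, List[Paraphrase]] = {
    "fazer uma operação": [
        Paraphrase("operar", "agent"),
        Paraphrase("realizar uma operação", "agent"),
        Paraphrase("submeter-se a uma operação", "patient"),
    ],
}


def paraphrases(construction: str, subject_role: str) -> List[str]:
    """Return the paraphrases allowed for the given subject role."""
    candidates = LEXICON.get(construction, [])
    return [p.text for p in candidates if p.subject_role == subject_role]


if __name__ == "__main__":
    print(paraphrases("fazer uma operação", "agent"))
    print(paraphrases("fazer uma operação", "patient"))
```

With a patient subject only the submeter-se a variant is returned, while an agentive subject yields the strong verb and the realizar variant. A construction missing from the lexicon returns an empty list.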


Figure 29: Recognition and monolingual paraphrasing of biomedical-related support verb constructions (support verb construction / corresponding verb or stylistic variant)

Technical controlled languages restrict language so that it is easier to translate. We foresee an increasing ability to create controlled languages in specific domains, which will make writing clearer, more precise and more meaningful within the user's domain, whether that is linguistics, computer science, medicine or sports. In the business world, how a company conducts its business is also determined by how its documents are written.

To sum up, in the particular case of support verb constructions, the most important aspects of paraphrasing are word reduction (in particular of nouns, but also of prepositions, determiners and sometimes even adjectives and other grammatical words) and the replacement of a semantically weak verb by a strong verb. General language, and especially scientific and technical language, is rich in nouns, which correspond to the subject matter or domain terminology. So the strategy of reducing the number of support verbs, and consequently the number of nominalizations, and increasing the number of morpho-syntactically related verbs by paraphrasing balances the text style. The ideas expressed in the support verb construction can in most cases be expressed by the verb, which is a strong and simple way of moving ideas along, saving on sentence length and complexity and making the text easier to understand. This is also useful for people with fewer linguistic skills. In addition, reducing grammatical words such as determiners and prepositions is a machine-friendly technique, since grammatical words create noise and give rise to bad output in many machine translation systems.

It is important to note that reducing support verb constructions to single verbs is a way of tightening style in written language and making it more accessible to machine translation, but spoken language would often sound too formal if it were constructed this way. Support verb constructions form part of the complex web of less formal, deliberately vague elements that belong to the interpersonal or ‘politeness’ strategies that we use in spoken communication. In many cases the support verb constructions are used in spoken or colloquial situations, with the simpler verb version being more formal. Naturally, machine translation cannot take the textual element into account, and because support verb constructions appear in less formal use, they share the general difficulty that informal, loosely structured language is harder for both humans and machines to translate than more formal texts using carefully constructed sentences.
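The word reduction discussed above can be made concrete with a small sketch that counts how many words a paraphrase saves. The function name `words_saved` is hypothetical, and splitting on whitespace is a simplifying assumption rather than the tokenization used in this research.

```python
def words_saved(construction: str, paraphrase: str) -> int:
    """Number of whitespace-separated words removed by the paraphrase."""
    return len(construction.split()) - len(paraphrase.split())


# Pairs taken from the examples in the text.
PAIRS = [
    ("tomar uma decisão", "decidir"),
    ("fazer uma operação", "operar"),
]

if __name__ == "__main__":
    for construction, verb in PAIRS:
        saved = words_saved(construction, verb)
        print(f"{construction} > {verb}: {saved} words saved")
```

In both pairs the support verb and the determiner disappear, so a three-word construction is reduced to a single verb.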

Chapter 9

Evaluation

*

Chapter Nine is dedicated to evaluation. It refers to some methods for evaluating the quality of text that has been translated using machine translation. It briefly presents the pioneering work performed by Linguateca. Then it focuses on the machine translation problems posed by support verb constructions and presents two experiments that show evidence of correction of these problems. A paraphrase suitability index and other evaluation measures are suggested.

*
