Data selection - Money can't buy expertise: An exploratory study into translators processing np

4. Methodology

4.7 Data selection

In order to investigate whether the npaparp structure poses a translation problem for untrained and trained students and for the professional translator, the following characteristics of the translation process are studied in detail:

1. the translation duration of the paparp/npaparp sentence and the paparp/npaparp clause

2. the number of pauses per paparp/npaparp sentence and per paparp/npaparp clause

3. the total pause duration per paparp/npaparp sentence and per paparp/npaparp clause

4. the average single pause duration per paparp/npaparp sentence and per paparp/npaparp clause

Furthermore, this study explores whether any relation could be observed between a participant's LexTALE score and his/her average translation time, average number of pauses and average total/single pause duration, regardless of the (n)paparp structure. As a result, the following was also studied:

5. the LexTALE score in relation to translation duration and pause data

4.7.1 Calculations and measurements

For every session, a general and a linear analysis were downloaded from Inputlog to serve as a basis for calculating the translation duration, the (total/single) pause duration and the number of pauses. Firstly, all critical and control sentences and clauses (npaparp/paparp) were located in the general analysis and their start and end times were recorded.

The end time of the last keystroke belonging to the translation of the sentence preceding the critical or control sentence was taken as the start time of the paparp/npaparp sentence.

That final keystroke of the previous sentence could be (1) the full stop (.), (2) any adjustment made to the translation of that sentence before moving on to the paparp/npaparp sentence, or (3) the space preceding the paparp/npaparp sentence (which was often typed immediately after the full stop and before the pause between sentences). The pause between two sentences was considered processing time for the paparp/npaparp sentence and was thus included in the translation duration. The end time of the paparp/npaparp sentence was marked by the last keystroke of the translation of that sentence. Finally, the translation duration was obtained by subtracting the start time from the end time.
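The start and end rules above come down to a single subtraction. The following is a minimal sketch in Python, assuming the relevant keystroke times have already been copied from the Inputlog general analysis as plain numbers in milliseconds. The function name and the sample values are hypothetical and are not taken from the study's data.

```python
def translation_duration(prev_sentence_last_key_end, target_last_key_end):
    """Return the translation duration of a paparp/npaparp sentence.

    The start time is the end time of the last keystroke that belongs to
    the previous sentence, so the pause between the two sentences counts
    as processing time for the target sentence.
    """
    if target_last_key_end < prev_sentence_last_key_end:
        raise ValueError("end time precedes start time")
    return target_last_key_end - prev_sentence_last_key_end


# Hypothetical times in milliseconds, for illustration only.
start_ms = 125_400  # last keystroke of the previous sentence ends here
end_ms = 171_900    # last keystroke of the target sentence
duration_ms = translation_duration(start_ms, end_ms)
print(duration_ms / 1000, "seconds")
```

The same function applies to clauses once the clause boundaries have been located in the log.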

A similar calculation was made for the translation duration of the paparp/npaparp clauses. As described in section 3.1, the beginning of the critical/control clause could be either (1) at the beginning of the critical/control sentence or (2) further on in the sentence, at a conjunction and/or after a comma. The end of the critical/control clause could correspond either to the end of the critical/control sentence or to a conjunction and/or comma within the sentence.

The number of pauses per paparp/npaparp sentence and paparp/npaparp clause was counted manually in the Inputlog files. To avoid mistakes, the number of pauses was counted twice for every sentence and clause. The same start and end times were used as described above. As defined in chapter , in the present study a pause is defined as a break of one second or more in which no movements are observed.
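In the study the count was made by hand. Purely as an illustration of the definition, the sketch below shows how the one-second threshold could be applied programmatically, assuming the gaps between successive logged events within a sentence or clause are available as a list of seconds. The names `PAUSE_THRESHOLD_S` and `count_pauses` and the sample gaps are hypothetical.

```python
PAUSE_THRESHOLD_S = 1.0  # a pause is a break of one second or more


def count_pauses(gaps_s, threshold_s=PAUSE_THRESHOLD_S):
    """Count the gaps (in seconds) that qualify as pauses."""
    return sum(1 for gap in gaps_s if gap >= threshold_s)


# Hypothetical gaps between logged events, for illustration only.
gaps = [0.2, 1.0, 3.4, 0.9, 2.1]
n_pauses = count_pauses(gaps)
print(n_pauses)
```

Note that the comparison is inclusive (`>=`), so a gap of exactly one second counts as a pause, in line with the definition above.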

The total pause duration per paparp/npaparp sentence and paparp/npaparp clause was calculated by adding up the durations of all pauses of one second or more within the critical/control sentence or clause. Depending on the number of pauses, this was done either manually with a calculator or with the Excel function SUMIF.

The average single pause duration per paparp/npaparp sentence and paparp/npaparp clause was calculated automatically in Excel by dividing the total pause duration by the number of pauses within the critical/control sentence or clause.
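The two measures can be expressed compactly. The sketch below is an illustrative Python equivalent of the spreadsheet calculation, not the procedure used in the study, which relied on a calculator and Excel. It assumes a list of gaps in seconds for one sentence or clause; the function name and the sample values are hypothetical, and returning `None` for the average when no pause occurs is a choice made here to avoid dividing by zero.

```python
def pause_summary(gaps_s, threshold_s=1.0):
    """Return (number of pauses, total pause duration, average single pause).

    Only gaps of `threshold_s` seconds or more count as pauses. The average
    is None when there are no pauses, because the division is undefined.
    """
    pauses = [gap for gap in gaps_s if gap >= threshold_s]
    total = sum(pauses)
    average = total / len(pauses) if pauses else None
    return len(pauses), total, average


# Hypothetical gaps in seconds, for illustration only.
count, total, average = pause_summary([0.2, 1.0, 3.5, 0.9, 2.5])
print(count, total, average)
```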

To see whether any link could be observed between a participant's LexTALE score and their translation behaviour, an Excel file was drawn up with the following information:

✓ LexTALE score

✓ average translation time per session

✓ average translation time per paragraph

✓ average translation time per sentence

✓ average number of pauses per sentence

✓ average total pause duration per sentence

✓ average translation time per clause

✓ average number of pauses per clause

✓ average total pause duration per clause

This file was examined qualitatively, guided by a working hypothesis: participants with the highest LexTALE scores were expected to translate faster and to pause less often, because they have greater vocabulary knowledge and would therefore have a high level of proficiency in English.

During the course of the experiment, several participants accidentally deleted the paragraph they were working on with the key combination CTRL+Z. On most computers this combination undoes the last action, but in the CASMACAT programme it deletes the entire translation segment that is open at that moment. Whenever this accidental deletion occurred, only the first translation of a passage was studied, even if a participant's second attempt (after the deletion) resulted in a slightly different translation. This precaution was taken to ensure that the validity of the data and calculations was not affected, as the participants might try to retrieve their original translation rather than translate the paparp/npaparp sentence anew. Moreover, translation time and pause behaviour could never be identical when participants were translating a paragraph, sentence or clause for the second time.

4.7.2 Data restriction

In May, it became clear that it would not be possible to process the data of all twenty-four participants who took part in the experiment within the time frame of this dissertation. Therefore, the number of participants to be analysed was restricted to fifteen: fourteen students and the professional translator. To ensure an equal representation of both groups of students, seven trained and seven untrained students were selected on the basis of two criteria. The first criterion was the completeness of the participants' data sets, as incomplete data files would hamper the data processing and statistical testing. Although almost all students completed five translation sessions, some Inputlog files were incomplete or could not be retrieved. This was the case for seven participants: P02, P03, P05, P08, P13, P16 and P22. Moreover, one participant (P21) decided to withdraw from the

Table 1. Overview of participants with missing data

Secondly, the researcher wished to include the most recently recorded data in the analyses (participants P19-P23, three untrained and two trained), since those participants had not yet been studied in previous dissertations (among others De Graeve, 2015; Vogeleere, 2018). However, P21 (untrained) was excluded because she completed only one translation session.

Given that only nine untrained students took part in the experiment, only one more untrained participant could be left out. It was decided to discard the data of P08, since two data files (12 paragraphs) were missing. With respect to the trained students, P13 and P16 were discarded because of missing paragraphs. As the researcher had already processed the data of participants P01 to P07, five of whom were trained, before it became clear that not all participants could be studied, participants P02 and P05 were retained in the analyses in spite of their missing paragraphs. As indicated above, participants P20 and P23 were also selected for data analysis, resulting in a total of seven trained students.


Table 2. Overview of participants selected for analysis
