5.3 Machine learning experiments
5.3.5 Contribution to accentuation
In this final experiment we assessed the added value of using information about argument versus condition in PROS-3 (Dirksen, 1994). We compared the accentuation of the sentence final verb phrase according to PROS-3, and according to PROS-3 complemented with information about the identity of the preceding nominal constituent, to the reference transcription (mentioned in section 4.2). We did this for a subset (38 phrases; 27 with an argument and 11 with a condition) of the held-out corpus, because the reference transcription was not available for all 61 phrases of the newspaper and e-mail data.
Table 5.5 shows the performance measures for this comparison, indicating that the accentuation of the sentence final verb improves slightly when information about the status of the nominal constituent (ARG vs. COND) is used. The results for PROS-3 complemented with this information derived with MBL are better than those for PROS-3 alone.
This improvement is mainly in the precision. The results for PROS-3 complemented with the information derived with RIPPER are better only in precision. Indeed, blocking certain incorrect placements of accents improves the precision on accentuation.
Table 5.5: Performance in percentages on accentuation of the sentence final verb by PROS-3 (for 38 instances), complemented with information about argument versus condition from MBL and RIPPER for all features, and with a “golden standard”.

                             accuracy   precision   recall   Fβ=1
PROS-3                          80         63         84      72
PROS-3 + MBL (all)              81         64         83      72
PROS-3 + RIPPER (all)           80         64         81      72
PROS-3 + golden standard        81         65         85      74
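The Fβ=1 column can be related to the precision and recall columns with a short sketch. It assumes the standard definition of the F-measure (the weighted harmonic mean of precision and recall); the function name `f_beta` and the dictionary layout are illustrative only, and the inputs are the rounded percentages from Table 5.5, so small rounding differences would not be surprising.

```python
def f_beta(precision, recall, beta=1.0):
    """F-measure: weighted harmonic mean of precision and recall.

    Assumes the standard definition; with beta = 1 both are weighted equally.
    """
    if precision == 0 and recall == 0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# (precision, recall) in percentages, copied from Table 5.5.
table_5_5 = {
    "PROS-3": (63, 84),
    "PROS-3 + MBL (all)": (64, 83),
    "PROS-3 + RIPPER (all)": (64, 81),
    "PROS-3 + golden standard": (65, 85),
}

# Round to whole percentages, as in the table.
f_scores = {name: round(f_beta(p, r)) for name, (p, r) in table_5_5.items()}

for name, score in f_scores.items():
    print(f"{name}: F(beta=1) = {score}")
```

Under this assumption the recomputed values are 72, 72, 72 and 74, which agrees with the last column of the table.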
MBL attains the best improvement. Although it incorrectly prevents three intended accents (when compared to the classification mentioned in section 4.2), it does in fact correctly prevent unintended accents in six other cases. Two instances of the latter are given in example 5.34, where - indicates the prevented accent.
(5.34) (a) . . .
MBL incorrectly inserts two accents where no accent was intended (when compared to the classification mentioned in section 4.2), while it also correctly inserts intended accents in two other cases. The latter two instances are given in example 5.35, where + indicates the location of the inserted accent.
(5.35) (a) . . .
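The four kinds of change discussed above (accents correctly or incorrectly prevented, and correctly or incorrectly inserted) can be tallied as in the following sketch. It is not the procedure used in the experiment: the function name `tally_changes` and the boolean encoding (True = the sentence final verb is accented) are assumptions, and the toy data are hypothetical and do not reproduce the 38 instances.

```python
from collections import Counter


def tally_changes(baseline, complemented, reference):
    """Classify every decision on which the complemented system
    differs from the baseline, judged against the reference.

    Each argument is a list of booleans (True = verb accented).
    """
    counts = Counter()
    for base, comp, ref in zip(baseline, complemented, reference):
        if base == comp:
            continue  # the added information changed nothing here
        action = "inserted" if comp else "prevented"
        verdict = "correctly" if comp == ref else "incorrectly"
        counts[f"{verdict} {action}"] += 1
    return counts


# Hypothetical toy data, for illustration only.
baseline = [True, True, False, False, True]
complemented = [False, False, True, True, True]
reference = [False, True, True, False, True]

changes = tally_changes(baseline, complemented, reference)
print(dict(changes))
```

In the toy data each of the four categories occurs once, and the last instance is left unchanged.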
Table 5.5 also shows the performance measures for PROS-3 complemented with the “golden standard”. These results indicate the maximal attainable improvement of accentuation when using information on whether the nominal constituent preceding the sentence final verb is an argument or a condition. The results that we obtained for PROS-3 complemented with the information from MBL and RIPPER are worse than those for PROS-3 complemented with the “golden standard”.
5.4 Discussion and conclusion
We discussed the criteria for a nominal constituent preceding a sentence final verb phrase to be an argument or a condition. This classification is a factor in the accentuation status of the verb (based on SAAR (Gussenhoven, 1982, 1984)). Overall, two rules apply: (i) the sentence final verb is not accented if it is preceded by an argument (including locatives that are subcategorized by the verb), and (ii) the sentence final verb is accented if it is preceded by a condition.
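Rules (i) and (ii) can be written down as a minimal sketch. The labels "ARG" and "COND" follow the notation used with Table 5.5; the function names and the fallback to a baseline decision when no status is available are assumptions made for illustration, not a description of how PROS-3 was actually complemented.

```python
def verb_accented_by_rule(status):
    """Apply rules (i) and (ii) to the status of the preceding
    nominal constituent: "ARG" (argument) or "COND" (condition)."""
    if status == "ARG":
        return False  # rule (i): verb not accented after an argument
    if status == "COND":
        return True   # rule (ii): verb accented after a condition
    raise ValueError(f"unknown status: {status!r}")


def complemented_accent(baseline_accent, status=None):
    """Hypothetical combination: use the rule when a status is known,
    otherwise keep the baseline decision (an assumption)."""
    if status is None:
        return baseline_accent
    return verb_accented_by_rule(status)


print(complemented_accent(True, "ARG"))    # accent prevented
print(complemented_accent(False, "COND"))  # accent inserted
print(complemented_accent(True))           # baseline kept
```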
The experimental design as a whole can be seen as somewhat circular, because we start by defining what we consider to be arguments and conditions, after which we investigate to what extent we can predict these classifications by means of machine learning experiments. However, we think we have sufficiently argued why we classify certain instances of nominal constituents as arguments or conditions, and we consistently use the specified classification rules. Therefore, we consider the experimental design to be legitimate.
The results of testing on the Spoken Dutch Corpus (CGN) data showed that machine learning experiments using lexical features and a co-occurrence feature (computed from WWW counts) are useful for predicting the status of the nominal constituent which precedes a sentence final verb phrase. The results on the held-out newspaper and e-mail data also showed that this method is useful; however, the results are far less successful than for the CGN data. This can be due to the small number of instances in the held-out corpus (only 38 sentences), and to the fact that we trained on spoken data and tested on written data (although we did not find such an effect for the experiments on PP attachment prediction). In addition, the deviations in the accentuation status of the verb discussed in section 5.2 (i.e. marked order of constituents and poor predictability of the constituent to the verb) might also have had an influence.
The main conclusion from the machine learning experiments is that, for the disambiguation of arguments and conditions, the noun feature is the most important. The status of the nominal constituent preceding the sentence final verb can be predicted on the basis of the identity of the noun alone. It would also be interesting to perform machine learning experiments for directly predicting the accentuation status of the sentence final verb.
From the discussion about the validity of SAAR (section 5.2.2) we might expect that for directly predicting the accentuation status, the verb feature, instead of the noun feature, would be the most important.
6 Evaluation of the new prosody module ECLIPSE
In this chapter we describe the evaluation of the new prosody module (ECLIPSE) that resulted from the studies described in the previous chapters. The evaluation is twofold, consisting of an objective evaluation through comparison with the reference transcription and a subjective evaluation by means of a perception experiment in which listeners had to indicate the acceptability of different realizations of the same sentence. The results of the objective evaluation show that ECLIPSE performs considerably better than PROS-3. The results of the subjective evaluation show that ECLIPSE is preferred by the listeners and the experts over PROS-3 and that there is no significant difference between ECLIPSE and the reference transcription.
6.1 Introduction
The research described in the previous chapters was used to create a new prosody module (ECLIPSE). This module uses syntactic and lexical information for the allocation of phrase boundaries and accents. In this chapter we describe the investigation of the merit of using the revised algorithm for prosodic phrasing, using information about PP attachment and using information about the status of the nominal constituent that precedes a sentence final verb. First, we conduct an objective evaluation comparing the output of the prosody module with the reference transcription and the previously evaluated algorithm PROS-3. Next, we perform a perception experiment in which listeners have to indicate the acceptability of the prosodic structure (i.e. subjective evaluation).
Finally, as a cross-check we compare the results from the perception experiment to quality judgements from three experts.
We hypothesize that ECLIPSE assigns a better and more acceptable prosodic structure than PROS-3. To test this hypothesis we need to make a fair comparison between the two algorithms. PROS-3 operates in tandem with a robust syntactic parser, whose syntactic analysis is often incorrect. ECLIPSE uses syntactic information based on the Amazon parser, which delivers a more accurate syntactic analysis and is considered state of the art. Comparing the performance of ECLIPSE with that of PROS-3 as such would therefore give a distorted image of the differences. For this reason, we will also compare the performance of ECLIPSE to a version of PROS-3 that is based on syntactic information provided by Amazon. Henceforth, we will refer to this version of PROS-3 as PROS-3+.