5.2 Global evaluation
5.2.2 Utterance coverage
Secondly, how much of each utterance is covered by the parses at the various times? Figure 5.2a gives the results over time. After approximately 1500 input items, the model reaches a state in which it is able to process almost the full utterance. It should be kept in mind that this rapid peak may also be due to the fact that the model applies bootstrapping relatively eagerly.
When we split the values for utterance coverage over the correctly and incorrectly identified situations (figures 5.2b and 5.2c), we can observe that throughout the simulation, the analyses with incorrectly identified situations have lower utterance coverage scores. This is due to two things. First of all, there are (especially initially) several cases in which the model simply ignores all words. Secondly, the model, in several cases, misidentifies the situation based on a partial understanding of the utterance. Given the continuity between subsequent situations, it is likely that the event and/or some participants of one situation are present in the next situation as well. When such a string of situations constitutes S, it is easy to see how the model, having understood one or two words, maps the analysis to the wrong situation.
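The split described above can be sketched as a small aggregation step. This is only an illustration, not the procedure used to produce figure 5.2: the record layout `(time, coverage, correct)`, the function name `binned_coverage`, the bin size of 500 input items, and the toy records are all assumptions made for the example.

```python
from collections import defaultdict
from statistics import mean


def binned_coverage(records, bin_size=500):
    """Average utterance coverage per time bin, split by whether the
    target situation was identified correctly.

    `records` is an iterable of (time, coverage, correct) tuples; this
    layout is a hypothetical one chosen for the sketch.
    """
    bins = defaultdict(lambda: {True: [], False: []})
    for time, coverage, correct in records:
        bins[time // bin_size][correct].append(coverage)

    summary = {}
    for index in sorted(bins):
        correct_scores = bins[index][True]
        incorrect_scores = bins[index][False]
        summary[index * bin_size] = {
            # None marks a bin without any items of that kind.
            "correct": mean(correct_scores) if correct_scores else None,
            "incorrect": mean(incorrect_scores) if incorrect_scores else None,
        }
    return summary


# Made-up records, for illustration only (not simulation output).
records = [
    (10, 0.0, False),   # all words ignored, wrong situation
    (20, 1 / 3, False), # one word understood, wrong situation
    (30, 2 / 3, True),
    (510, 1.0, True),
    (520, 0.5, False),
]
summary = binned_coverage(records)
```

Keeping the two groups in separate lists per bin makes it easy to compare where each curve peaks, which is the comparison the following paragraph relies on.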
Interestingly, in all simulations, the model reaches a peak in the coverage of the utterance before suffering from a slight dip in the utterance coverage, from which it recovers afterwards. When we look at the scores split over correctly and incorrectly identified target situations, we can see that the peak is found slightly earlier for the incorrectly identified ones (around 1100–1200) than for the correctly identified ones (around 1300–1400). The dip in the utterance coverage mostly occurs at the time when the model is reaching convergence in the correct identification of the situation. This means that just before the convergence, the model is applying representations that cover more of the utterance, but do so with less success. In the next stage, the model uses slightly shorter representations to analyze the utterances, covering slightly less of the utterance, but making more accurate analyses. Finally, the model starts using the longer representations again, but now in an accurate way.
[Figure 5.2: (a) Utterance coverage for all input items. (b) Utterance coverage for correctly identified target situations. (c) Utterance coverage for incorrectly identified target situations. Axes in each panel: time (0 to 10000) against utterance coverage (0.00 to 1.00).]
What the model does here is reminiscent of a phase of syntactic creativity that is only later constrained by more ‘fitting’ representations. As we will see in section 5.3 below, and in the closer inspection of the learning mechanisms in the next chapter, the period around 2500 input items is also the moment when the model has just acquired abstract representations and has ceased to apply the syntagmatization operation frequently. This means that by then the potential for generalization, in the form of abstract constructions (constructions with few semantic constraints, obtained through paradigmatization), is present, and that afterwards the model ‘recovers’ from applying these abstractions too frequently by building up an inventory of more concrete constructions that ‘pre-empt’ the use of the abstract constructions in the analysis. The continuing accrual of relatively concrete constructions allows the model to overcome overgeneralization. As such, this robustness provides an argument for the apparent redundancy of storage, as many within the usage-based approach have argued (Langacker 1988; Beekhuizen, Bod & Verhagen 2014).
Let us have a look at an example that illustrates this. In one of the simulations, the model encounters, after some 200 input items, the utterance in example (29). The utterance illustrates a construction which is relatively rare (compared to other kinds of three-word utterances that are formed on the basis of a transitive construction). The optimal analysis the model assigns to this utterance is given in example (30). It involves an abstract intransitive construction and the bootstrapping of go. Some 300 input items later, the model encounters the same utterance, but now uses the analysis in example (31). This is a regular transitive construction, in which the action of a person on an object is expressed. With this construction, SPL erroneously takes the utterance to refer to a caused-motion event. Nonetheless, it covers the full utterance, as opposed to the analysis with the intransitive construction.
Finally, after another 300 input items, the model has an intransitive motion construction, as shown in example (32), which is combined with the known meanings of go and out. From this example, we can glean that the model eagerly applies abstract patterns to situations in which they lead to misinterpretations. These errors are overcome once a larger inventory of constructions is built up.
(29) you go out
(30) [ [ENTITY]→[HEARER/you] [EVENT]→(GO/go) ]
(31) [ [PERSON]→[HEARER/you] [CAUSE]→(CAUSE-MOVE/go) [OBJECT]→(ARTEFACT/out) ]
(32) [ [ PERSON ]→[ HEARER / you ] [ EVENT ]→[ CAUSE-MOVE / go ]
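The coverage difference between analyses (30) and (31) can be made concrete with a small calculation. This is a minimal sketch, assuming that utterance coverage is the proportion of word tokens in the utterance that the analysis accounts for; the function name `utterance_coverage` and the representation of an analysis as a set of covered words are hypothetical simplifications, not the model's own data structures.

```python
def utterance_coverage(utterance, covered_words):
    """Proportion of word tokens in `utterance` that appear in
    `covered_words`, the set of words an analysis accounts for.

    Assumed definition for illustration; repeated words are not
    distinguished by position in this simplified version.
    """
    words = utterance.split()
    if not words:
        return 0.0
    covered = sum(1 for word in words if word in covered_words)
    return covered / len(words)


utterance = "you go out"

# Analysis (30): the intransitive construction covers "you" and "go".
coverage_30 = utterance_coverage(utterance, {"you", "go"})

# Analysis (31): the transitive construction covers all three words,
# although it misinterprets the utterance as a caused-motion event.
coverage_31 = utterance_coverage(utterance, {"you", "go", "out"})
```

Under this assumed definition, analysis (30) covers two of the three words and analysis (31) covers all three, which shows why a higher coverage score alone does not imply a correct interpretation.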
[Figure 5.3: Situation coverage for 10 simulations over time. (a) Situation coverage for all input items. (b) Situation coverage for correctly identified target situations. (c) Situation coverage for incorrectly identified target situations. Axes in each panel: time (0 to 10000) against situation coverage (0.00 to 1.00).]