
2.4 Generalization from Positive Evidence

2.4.2 The Lack of Overgeneralization

We have seen so far how expletive subjects allow the learner to distinguish a raising predicate

from a purely control one; however, the question of how children learn that some predi-

cates can be purely control without being optionally raising remains open. This question

of how a learner knows that some predicates do not have an optional raising counterpart

is what led us to the learnability puzzle in the first place. We begin to answer this

question by turning to the Sufficiency Principle (Yang 2005, 2016), which provides a formal

model of rule learning and generalization. Using the Sufficiency Principle,

I show that the problem of the overgeneralization of ambiguous predicates never arises.

The Sufficiency Principle (Yang 2016), first stated in Chapter 1 and repeated in (26) below,

allows us to calculate the number of positive members required to generalize a rule.

(26) “The Sufficiency Principle: Let R be a generalization over N items, of which M items are attested to follow R. R extends to all N items if and only if:

N − M < θ_N where θ_N := N / ln N”
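The condition in (26) can be sketched in a few lines of Python. This is only an illustration of the inequality as stated; the function names `sufficiency_threshold` and `is_productive` are hypothetical and do not come from Yang (2016), and the toy class in the usage line is invented for the example.

```python
import math


def sufficiency_threshold(n: int) -> float:
    """Return theta_N = N / ln N, the threshold in (26)."""
    if n < 2:
        raise ValueError("theta_N is only computed here for N >= 2")
    return n / math.log(n)


def is_productive(n: int, m: int) -> bool:
    """True if a rule attested on m of n items satisfies N - M < theta_N."""
    if not 0 <= m <= n:
        raise ValueError("m must lie between 0 and n")
    return (n - m) < sufficiency_threshold(n)


if __name__ == "__main__":
    # Hypothetical toy class: 10 items, 7 of which are attested to follow the rule.
    # The 3 unattested items are compared against theta_10, which is about 4.34.
    print(is_productive(10, 7))
```

The check is stated in terms of the unattested items (N − M) rather than a proportion, which is why the amount of evidence required grows with the size of the class.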

The Sufficiency Principle captures the intuition that a large amount of evidence is

needed in order for a rule to be productive. It also captures the idea of accessing informa-

tion as efficiently as possible by minimizing the retrieval times of the stored items. Build-

ing on the fact that language follows Zipf ’s law, the Sufficiency Principle determines the

point at which it is more efficient to list each item and its associated properties as opposed

to forming a rule and listing exceptions in accordance with the Elsewhere Principle.

The Sufficiency Principle is a psychologically real model of rule learning, and therefore,

using it to determine when a rule can be generalized is desirable.

Yang 2016 provides a linguistic application of the Sufficiency Principle in addressing

Baker’s paradox in the acquisition of the dative alternation (Baker 1979). Yang 2016 shows

that the problem of overgeneralization does arise in the acquisition of the dative alternation.

Children overgeneralize the double object construction to produce sentences like

"she said me no", from which they must eventually retreat. In this case, the earlier stages of acquisition

reveal that the number of verbs that show the dative alternation is sufficient for

this rule to be generalizable. Retreat from this overgeneralization then takes place when

the number of verbs that show the dative alternation falls below the threshold required by

the Sufficiency Principle; i.e., the data are insufficient for the rule to be generalizable.

Similarly, out of the 67 predicates considered here, there exists evidence for only 10

as raising. The Sufficiency Principle (N/lnN) requires 51 predicates following the rule to

generalize over a class of 67. Given the data, it cannot be concluded that the 11 members

will be generalized to the entire class of raising and control predicates. Moreover, a non-

referential subject was observed for only 4 of the ambiguous predicates. Consequently,

we find that children never reach a stage where the number of ambiguous or raising

predicates in the input is large enough to generalize to the entire set of verbs. If 51 of the

predicates had non-referential subjects, the child would be tempted to assume the same of the

remaining predicates. However, we are nowhere close to the level of positive evidence required to

generalize so broadly.
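The arithmetic behind this paragraph can be reproduced with a short sketch. The counts (67 predicates, 10 with raising evidence, 4 ambiguous predicates with a non-referential subject) are taken from the text; the variable names are illustrative. Note that N − θ_N comes to roughly 51.07, which corresponds to the figure of 51 given above, whereas a strict reading of the inequality in (26) makes 52 the smallest integer M. Either way, the attested counts fall far short.

```python
import math

N = 67  # predicates considered in this section
theta = N / math.log(N)  # theta_N = N / ln N, about 15.93

# Smallest integer M that satisfies the strict inequality N - M < theta.
min_attested = math.floor(N - theta) + 1

# Attested counts reported in the text.
attested = {
    "raising": 10,
    "ambiguous with non-referential subject": 4,
}

print(f"theta = {theta:.2f}, N - theta = {N - theta:.2f}, minimum M = {min_attested}")
for label, m in attested.items():
    unattested = N - m
    # The rule generalizes only if the unattested items number fewer than theta.
    print(f"{label}: M = {m}, N - M = {unattested}, productive = {unattested < theta}")
```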

Table 2.5 shows verbs sorted by frequency in the child production data in CHILDES,

and lists the number of raising and ambiguous predicates at the different stages. I exam-

ine the possibility of the learner either generalizing the raising property of verbs, which

includes ambiguous predicates, or the learner generalizing the fact that the verbs are both

raising and control. By arranging the verbs according to frequency in production, the

order in which children learn these raising and control verbs can be approximated. As we

can see, at no point in learning these predicates are there more raising or ambiguously

raising verbs than control verbs.

Table 2.5: Control, raising, and ambiguous verbs according to their frequency in the child production data in CHILDES. The threshold of generalization is calculated using

the Sufficiency Principle. Evidence for the raising or ambiguous predicate rule is never

sufficient for the rule to be generalizable.

Verbs              Raising   Ambiguous   All Raising   Productive?   Ambiguous   Productive?
10 most frequent   2         2           4/10          no            2/10        no
20 most frequent   6         3           9/20          no            3/20        no
30 most frequent   8         4           12/30         no            4/30        no
40 most frequent   10        6           16/40         no            6/40        no
50 most frequent   12        9           21/50         no            9/50        no
60 most frequent   13        11          24/60         no            11/60       no
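The "Productive?" columns of Table 2.5 can be re-derived from the counts in the table. The sketch below assumes, as the table layout suggests, that "All Raising" is the sum of the raising and ambiguous counts and that each stage is evaluated against its own N; the helper name `productive` is illustrative.

```python
import math

# (verbs considered, raising, ambiguous) for each stage listed in Table 2.5
rows = [
    (10, 2, 2),
    (20, 6, 3),
    (30, 8, 4),
    (40, 10, 6),
    (50, 12, 9),
    (60, 13, 11),
]


def productive(n: int, m: int) -> bool:
    """Sufficiency Principle check: N - M < N / ln N."""
    return (n - m) < n / math.log(n)


results = []
for n, raising, ambiguous in rows:
    all_raising = raising + ambiguous  # assumed reading of the "All Raising" column
    results.append((n, productive(n, all_raising), productive(n, ambiguous)))

for n, all_raising_ok, ambiguous_ok in results:
    print(f"{n} most frequent: theta = {n / math.log(n):.2f}, "
          f"all raising productive = {all_raising_ok}, "
          f"ambiguous productive = {ambiguous_ok}")
```

At every stage the number of unattested verbs exceeds the threshold for both rules, which matches the "no" entries in the table.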

As the table above indicates, the learner does not have sufficient evidence to generalize

the raising rule to the entire class of predicates at any point in learning these predicates.

Therefore, the learner will have to lexically learn the predicates that are raising on an

item-by-item basis. The positive evidence also points to a default in the other direction: the

Sufficiency Principle requires at least 16 exceptions to a class of 67 for the majority rule

to be unproductive. In this case, with only evidence for 11 verbs as raising, the learner

overgeneralizes the control structure for the 66 verbs. Since more evidence for the control

structure of verbs is present in the input, the learner has a control-predicate default.
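The control default can be illustrated by running the same threshold in the opposite direction, with control as the majority rule and the verbs attested as raising as its exceptions. This is a sketch under that assumption; the counts (67 predicates, 11 with raising evidence) are the ones given in this paragraph, and the variable names are illustrative.

```python
import math

N = 67                 # size of the predicate class
raising_evidence = 11  # verbs with positive evidence for raising, per the text
theta = N / math.log(N)  # about 15.93, so 16 exceptions would be too many

# Control as the rule: the raising verbs are its exceptions.
control_is_productive = raising_evidence < theta

# Raising as the rule: every verb without raising evidence counts against it.
raising_is_productive = (N - raising_evidence) < theta

print(f"theta = {theta:.2f}")
print(f"control rule productive: {control_is_productive}")
print(f"raising rule productive: {raising_is_productive}")
```

With 11 exceptions against a threshold of about 15.93, the control rule stays productive, while the raising rule leaves 56 verbs unattested and does not.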

Since the number of ambiguous or pure raising predicates is not large enough for

the learner to overgeneralize, the learnability problem does not arise. The learner only

overgeneralizes if there are fewer than N/ln(N) exceptions to a rule, and here, the number

of exceptions well exceeds the number of predicates that show properties of raising. It was a

mistreatment of the problem of generalization that gave rise to the learnability

puzzle in the first place; once generalization is modeled correctly, the puzzle does not exist.

Now that we have discussed distinguishing the raising predicates from control ones,

let us turn to how the learner distinguishes ambiguous predicates from the pure raising ones

in the following section.

In document Learning From Positive Evidence: The Acquisition Of Verb Argument Structure (Page 70-73)