2.4 Generalization from Positive Evidence
2.4.2 The Lack of Overgeneralization
We have seen so far how expletive subjects allow the learner to distinguish a raising predicate
from a purely control one; however, the question of how children learn that some predicates
can be purely control without being optionally raising remains open. This question
of how a learner knows that some predicates do not have an optional raising counterpart
is what led us to the learnability puzzle in the first place. We begin to answer this
question by turning to the Sufficiency Principle (Yang 2005, 2016), which provides a for-
mal model of rule learning and making generalizations. Using the Sufficiency Principle,
I show that the problem of the overgeneralization of ambiguous predicates never arises.
The Sufficiency Principle (Yang 2016), first stated in Chapter 1 and repeated in (26) below,
allows us to calculate the number of positive members required to generalize a rule.
(26) “The Sufficiency Principle: Let R be a generalization over N items, of which M
items are attested to follow R. R extends to all N items if and only if:
N − M < θ_N, where θ_N := N / ln N”
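The condition in (26) can be sketched in a few lines of Python. This is only an illustration of the inequality as stated; the function names `sufficiency_threshold` and `is_productive` are hypothetical and do not come from Yang (2016).

```python
import math


def sufficiency_threshold(n: int) -> float:
    """Return theta_N = N / ln N for a class of n items (n must be at least 2)."""
    if n < 2:
        raise ValueError("the threshold is only defined here for n >= 2")
    return n / math.log(n)


def is_productive(n: int, m: int) -> bool:
    """True if a rule attested for m of n items satisfies N - M < theta_N."""
    if not 0 <= m <= n:
        raise ValueError("m must lie between 0 and n")
    return (n - m) < sufficiency_threshold(n)


if __name__ == "__main__":
    # Toy class of 10 items: theta_10 is about 4.34, so at most 4 items
    # may be unattested for the rule to extend to the whole class.
    print(is_productive(10, 6))  # 4 unattested items
    print(is_productive(10, 5))  # 5 unattested items
```

With ten items, six attested members suffice and five do not, which illustrates how steep the evidence requirement is even for small classes.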
The Sufficiency Principle captures the intuition that a large amount of evidence is
needed in order for a rule to be productive. It also captures the idea of accessing information
as efficiently as possible by minimizing the retrieval times of the stored items. Building
on the fact that language follows Zipf's law, the Sufficiency Principle determines the
point at which it is more efficient to list each item and its associated properties as opposed
to forming a rule and listing exceptions in accordance with the Elsewhere Principle.
The Sufficiency Principle is a psychologically real model of rule learning, and therefore,
using it to determine when a rule can be generalized is desirable.
Yang 2016 provides a linguistic application of the Sufficiency Principle in addressing
Baker’s paradox in the acquisition of the dative alternation (Baker 1979). Yang 2016 shows
that the problem of overgeneralization does arise in the acquisition of the dative alternation.
Children overgeneralize the double object construction to produce sentences like she
said me no, from which they must eventually retreat. In this case, the earlier stages of acquisition
reveal that the number of verbs that show the dative alternation is sufficient for
this rule to be generalizable. Retreat from this overgeneralization then takes place when
the number of verbs that show the dative alternation falls below the threshold required by
the Sufficiency Principle; i.e., the data are insufficient for the rule to be generalizable.
Similarly, out of the 67 predicates considered here, there exists evidence for only 10
as raising. The Sufficiency Principle (N/lnN) requires 51 predicates following the rule to
generalize over a class of 67. Given the data, it cannot be concluded that the 11 members
will be generalized to the entire class of raising and control predicates. Moreover, a non-referential
subject was observed for only 4 of the ambiguous predicates. Consequently,
we find that children never get to the stage where the number of ambiguous or raising
predicates in the input is large enough to be generalized to the entire set of verbs. If 51 of the
predicates had non-referential subjects, then the child would be tempted to assume the
same of the remaining predicates. However, the input comes nowhere close to the level of
positive evidence sufficient to generalize so broadly.
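The arithmetic for the class of 67 predicates can be checked with a short sketch. The helper name `min_attested` is hypothetical, and the counts of 10 and 11 attested raising predicates are simply the two figures mentioned in the text. Under a strict reading of the inequality N − M < θ_N, the minimum comes out at 52, one more than the 51 given above, which corresponds to rounding θ_67 (about 15.93) up to 16; on either reading the attested counts fall far short.

```python
import math


def sufficiency_threshold(n: int) -> float:
    """theta_N = N / ln N."""
    return n / math.log(n)


def min_attested(n: int) -> int:
    """Smallest integer M that satisfies the strict inequality N - M < theta_N."""
    return math.floor(n - sufficiency_threshold(n)) + 1


if __name__ == "__main__":
    n = 67
    theta = sufficiency_threshold(n)
    print(f"theta_{n} = {theta:.2f}")
    print(f"minimum attested members (strict reading): {min_attested(n)}")
    # Counts of attested raising predicates mentioned in the text.
    for m in (10, 11):
        unattested = n - m
        print(f"M = {m}: {unattested} unattested, productive = {unattested < theta}")
```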
Table 2.5 shows verbs sorted by frequency in the child production data in CHILDES,
and lists the number of raising and ambiguous predicates at the different stages. I exam-
ine the possibility of the learner either generalizing the raising property of verbs, which
includes ambiguous predicates, or the learner generalizing the fact that the verbs are both
raising and control. By arranging the verbs according to frequency in production, the
order in which children learn these raising and control verbs can be approximated. As we
can see, at no point in learning these predicates are there more raising or ambiguously
raising verbs than control verbs.
Table 2.5: Control, raising, and ambiguous verbs according to their frequency in the child
production data in CHILDES. The threshold of generalization is calculated using
the Sufficiency Principle. Evidence for the raising or ambiguous predicate rule is never
sufficient for the rule to be generalized.
Verbs              Raising   Ambiguous   All Raising   Productive?   Ambiguous   Productive?
10 most frequent   2         2           4/10          no            2/10        no
20 most frequent   6         3           9/20          no            3/20        no
30 most frequent   8         4           12/30         no            4/30        no
40 most frequent   10        6           16/40         no            6/40        no
50 most frequent   12        9           21/50         no            9/50        no
60 most frequent   13        11          24/60         no            11/60       no
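The "Productive?" columns of the table can be recomputed from the counts with a brief sketch. The row values are copied from Table 2.5; the names `ROWS` and `check_rows` are hypothetical, and "All Raising" is taken here to be the sum of the raising and ambiguous counts, as the fractions in the table suggest.

```python
import math

# (N most frequent verbs, raising count, ambiguous count), copied from Table 2.5.
ROWS = [
    (10, 2, 2),
    (20, 6, 3),
    (30, 8, 4),
    (40, 10, 6),
    (50, 12, 9),
    (60, 13, 11),
]


def is_productive(n: int, m: int) -> bool:
    """Sufficiency Principle: productive iff N - M < N / ln N."""
    return (n - m) < n / math.log(n)


def check_rows(rows):
    """Return (N, all-raising productive?, ambiguous productive?) for each row."""
    results = []
    for n, raising, ambiguous in rows:
        all_raising = raising + ambiguous  # assumed reading of "All Raising"
        results.append(
            (n, is_productive(n, all_raising), is_productive(n, ambiguous))
        )
    return results


if __name__ == "__main__":
    for n, all_ok, amb_ok in check_rows(ROWS):
        print(f"{n} most frequent: all raising = {all_ok}, ambiguous = {amb_ok}")
```

Every row comes out unproductive for both rules, matching the "no" entries in the table.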
As the table above indicates, the learner does not have sufficient evidence to generalize
the raising rule to the entire class of predicates at any point in learning these predicates.
Therefore, the learner will have to lexically learn the predicates that are raising on an
item-by-item basis. We also see this from the fact that there is positive evidence for
raising with only a small minority of the verbs. The
Sufficiency Principle requires at least 16 exceptions to a class of 67 for the majority rule
to be unproductive. In this case, with evidence for only 11 verbs as raising, the learner
overgeneralizes the control structure to the 66 verbs. Since more evidence for the control
structure of verbs is present in the input, the learner has a control-predicate default.
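The arithmetic behind the control default can be written out as follows, treating the 11 verbs with evidence for raising as the exceptions to a control generalization over N = 67. The symbol e for the number of exceptions is introduced here only for illustration.

```latex
\theta_{67} = \frac{67}{\ln 67} \approx 15.93,
\qquad
e = 11 < \theta_{67}
```

Because 11 is below the threshold of roughly 16 exceptions, the control generalization remains productive on this reading.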
Since the number of ambiguous or pure raising predicates is not large enough for
the learner to overgeneralize, the learnability problem does not arise. The learner only
overgeneralizes if there are fewer than N/ln(N) exceptions to a rule, and here, the number
of exceptions well exceeds the number of predicates that show properties of raising. It was
a mistreatment of the problem of generalization that gave rise to the learnability
puzzle in the first place; under the Sufficiency Principle, the puzzle does not arise.
Now that we have discussed distinguishing the raising predicates from control ones,
let us turn, in the following section, to how the learner distinguishes ambiguous
predicates from the pure raising ones.