DESCRIPTIVE EVALUATIONS OF CHILD DEVELOPMENT AND OF DEVELOPMENTAL SETTINGS
Bettye M. Caldwell, Ph.D.
College of Home Economics, Syracuse University, and Department of Pediatrics, Upstate Medical
Center, State University of New York, Syracuse, New York
IF ONE is seriously interested in intervention, then one is obligated to be equally interested in evaluation. The former process is tremendously in vogue now, while the latter is very much in disfavor. This is perhaps because the entirely legitimate scientific process of evaluation has been somehow confused with the highly personal and subjective process of making judgments. The entire evaluation vocabulary sounds judgmental; words such as “less than,” “more than,” “better than,” etc. can seldom be accepted as purely descriptive terms. Yet, ironically, no evaluation process is as judgmental as any intervention process. For when one intervenes, one has already formed a judgment, with or without any sort of formal evaluation, that things should somehow be different than they are. The intervention referred to here is of a fairly specific type: intervention designed to influence a young child’s ability to learn, to profit from experience, to develop at an optimal rate and to a maximal level. That it is possible for intervention programs to make major strides toward this goal has been accepted as an article of faith; yet, at the same time, there is a general wariness about full involvement in evaluation activities.
One possible reason for this wariness is a
suspicion that some evaluation procedures will not be entirely fair to the children who are evaluated, that they will not adequately reflect important changes in functioning
which may occur.
This may well be the case, for evaluation instruments often seem to be selected or designed to assess “basic” or “pervasive” behaviors, whereas the intervention procedure being evaluated might have been designed to accomplish more specific and restricted goals. If one does not include at least a sprinkling of such “basic” or “pervasive” or “generalized” measures, one is sure to be criticized for not having done so. This, of course, is not too surprising, for detection of transfer effects associated with any type of training is important. It is as essential to know something about how new information is internalized and worked into the existing program for processing information as it is to know whether it has been acquired.
For intervention programs concerned with the learning process, this internalization, regardless of the label used for it, is close to what most people mean by the concept of intelligence. That is, people are bound to inquire how a child’s “intelligence” has been affected by some enrichment program to which he has been exposed. Such an inquiry should not be responded to as though it identified one totally unsophisticated about the nature of intelligence and how it can be assessed. For such a question implies the underlying awareness of what Hunt1 has called the epigenetic theory of intelligence, or intelligence as something that emerges rather than exists, and as a level of performance that is decidedly affected by experience.
(Received January 3; accepted for publication January 16, 1967.)
Presented at the Child Development Section, American Academy of Pediatrics, Annual Meeting, Chicago, Illinois, October 23, 1966.
The research referred to in this paper has been supported by Grant No. MH-07649, National Institutes of Health, U.S. Public Health Service, and Grant No. D-156, Children’s Bureau, Welfare Administration, Department of Health, Education, and Welfare.
ADDRESS: Department of Pediatrics, Upstate Medical Center, 766 Irving Avenue, Syracuse, New York 13210.
Yet, one suspects that such a question also reflects some confusion of the phenomenal (that which is observed) with the “real” (that which underlies what is observed). Such a confusion reflects the type of decision making involved in making a diagnosis from observed symptoms. But intelligence does not underlie so much as it overlies; it does a much better job of compressing a child’s past history than it does of predicting his future performance. If that concept of intelligence could earn acceptance, there would probably be no objections to the use of tests parading under the label of “intelligence tests” for the evaluation of intervention programs. Such a concept would do much to erase the negative set that many persons have toward infant tests. Too often they have been asked to “diagnose” rather than “describe.” The former task they do quite poorly; the latter they do quite admirably.
DESCRIBING DEVELOPMENT IN
STRUCTURED SITUATIONS
If one is willing to accept developmental evaluations as descriptive rather than diagnostic, as more indicative of past experience than of future potential, then one has available a number of procedures helpful for evaluating the effects of early intervention programs. The list includes not only the most widely used instruments but also a number of screening procedures deliberately designed to be used by persons with a minimal degree of training and experience in developmental testing.2-5 Regardless of the technique employed, evidence is needed that the observed behavior bears a relationship to the setting in which it occurs and to the intervention experience (or lack of it) that the child has had. Such information is of value in facilitating plans for what a child needs at the next developmental level, regardless of what it enables one to predict about ultimate developmental status.
The Preschool Inventory (which the writer developed for use in Head Start in 1965 when it seemed as though no previously published test could be found which would meet the needs of a mass evaluation program to be carried out largely by volunteers and untrained personnel) perhaps provides as good an example of this point as any. (It provides a better example in that the author can without apology be critical of it.) It was developed and field tested almost overnight; it in no way could be said to have evolved from a systematic theory of early cognitive development; and it was openly and unashamedly intended merely to describe rather than diagnose. Furthermore, it was intended to be sensitive, rather than resistant, to change. The Inventory was compiled after discussions with various Head Start planners about . . . “the need for some type of instrument that would provide an indication of how much a disadvantaged child, prior to his introduction to Head Start, had achieved in areas regarded as necessary foundations for subsequent success in school. A measure of basic intelligence was not in any sense the goal, although it would be naive to assume that any such index of achievements would not be to some extent correlated with performance on intelligence tests. Nor was there any concern with the development of a so-called ‘culture-fair’ test. It was taken for granted at the outset that the culture in the child’s preceding years had not been entirely fair and that what was needed was not a procedure that would attempt to remove this unfairness but rather one that would permit it to show in all its blatancy. Also, it was considered to be extremely important to demonstrate that the child from a less favorable background was actually functioning at a deficit at the time he began school; this deficit had been assumed but not substantiated on a large scale. Then, finally but by no means least, it was considered important to develop a procedure that could be used on a before-after basis and be available as one index of educational achievement associated with Head Start.”
experience in a day care center for disadvantaged children.6 Upon these experiences were grafted points mentioned in some of the early memos and news releases distributed by various persons involved in the planning for Project Head Start. For example, it was mentioned that many of the eligible children were unable to give basic information about themselves and that they had a very negative self-concept. Therefore, items requesting the child to give his first and last name and his age were written. It was stressed that the experiences of deprived children were often so limited that they were unable to interpret simple instructions given them. Therefore, a series of items requesting the children to carry out simple instructions (“Raise your hand,” “Say ‘hello’ very softly,” “Put three cars in the big box,” etc.) was written. Their ability to discriminate colors and to know the labels for the different hues was questioned, suggesting the value of items asking them either to name or to point to specific colored crayons. Their perception of authority figures was described as negative and restrictive, rather than positive and supportive. Accordingly, items were developed which requested the children to tell in their own words what a policeman does, what a mother does, etc. Simple items were designed to assess the child’s knowledge of concepts of number and quantity (“How many wheels does a bicycle have?”). The result was a compilation of items which, it was hoped, would measure the child’s performance in such areas as basic information and vocabulary; number concepts; concepts of size, shape, motion, and color; concepts of time, object class, and social function; visual-motor performance; following instructions; and independence and self-help.
The inventory thus compiled could not hope to do anything but describe. For nothing was known about how well it correlated with the Stanford-Binet or any other “predictive” measures of a child’s developing ability patterns. Nor, for that matter, could anything be said after a protocol had been obtained about how much the administration had been influenced by rapport factors, about whether a child might have obtained a low score because he “would not” respond rather than because he “could not” respond. Nothing could be said, and nothing was asked. Rather, the instrument attempted to summarize for mass data processing the kind of output of which a group of children was capable under a particular set of circumstances. This is, of course, the essence of structured evaluation.
Early information about the ability of the inventory to carry out its assignment of describing the early achievements of large groups of children had to come largely from evaluation efforts of unknown persons* representing unknown levels of training and competence, i.e., from persons who used the instrument to learn something about the initial achievement and progress of children participating in a number of local Head Start programs in the summer of 1965. However, during the academic year of 1965-1966, data collected and analyzed by the author and her colleagues (especially Ann Paullin, Ruth Wynn, and Jordan Tannenbaum) have been examined for important leads about areas of achievement and educational need in both middle- and lower-class children. The available sample consists of 648 children ranging in age from two to seven, with approximately half residing in Syracuse and Onondaga County. The children came from public and private nursery schools, from the waiting rooms of well baby clinics, from kindergartens situated in low-income and upper-income neighborhoods, and from one or another phase of the research being conducted by the author and her colleagues. Figure 1 presents data on this group and demonstrates that, at the very early ages, the curves for the two groups are not so widely separated. However, with increasing age, they diverge more and more until at age 5 the norms for lower-class children on this inventory are almost 13 years behind those for middle-class children.
* The debt to these people is large and can never be adequately paid, as they, like the children
[Figure 1 plot. Curves: middle class, lower class. Horizontal axis: age group.]
FIG. 1. Medians for total score of Preschool Inventory for 196 middle-class and 452 lower-class children ranging in age from 2 to 7 years. Each age interval includes children ranging from the age indicated up to the beginning of the next interval.
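The tabulation behind a figure of this kind can be sketched in a few lines. The following is only an illustration under stated assumptions: the 6-month interval width, the record layout, and every score shown are hypothetical and are not data from the study. It groups children by social class and by the age interval that begins at or below their age, then takes the median total score in each cell.

```python
from collections import defaultdict
from statistics import median


def age_interval(age_months, width=6):
    """Lower bound (in months) of the interval containing age_months.

    The 6-month width is an assumption made for illustration.
    """
    return (age_months // width) * width


def medians_by_group(records, width=6):
    """Median total score per (social_class, age interval).

    records: iterable of (social_class, age_months, total_score) tuples.
    """
    buckets = defaultdict(list)
    for social_class, age_months, score in records:
        buckets[(social_class, age_interval(age_months, width))].append(score)
    return {key: median(scores) for key, scores in sorted(buckets.items())}


# Hypothetical records, invented for this sketch only.
records = [
    ("middle", 36, 40), ("middle", 38, 44), ("middle", 41, 42),
    ("lower", 37, 30), ("lower", 40, 34),
    ("lower", 60, 50), ("lower", 65, 54), ("lower", 62, 52),
]
result = medians_by_group(records)
for key, value in result.items():
    print(key, value)
```

Each child falls in the interval that starts at the age indicated and runs up to the beginning of the next interval, which matches the convention stated in the caption of Figure 1.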
Examination of the percentage of children from the two major groups passing individual items provides many leads for the structuring of intervention programs. The area in which there is greatest disparity between the lower- and middle-class children is in associative vocabulary, a finding which supports data from other studies. This information, while of interest and value in itself, has its greatest importance in identifying and suggesting directions that an intervention program might take. For example, there was no overlap in the curves for the two groups on certain items. Only 45% of the 6-year-old lower-class children were able to report that you could expect to find a lion in a jungle or a zoo, whereas 48% of the 3-year-old and 90% of the 5-year-old middle-class children could do so. Similarly non-overlapping curves were found for several of the other items, and for all items there were big disparities. There were no reversals, i.e., items about which lower-class children were more knowledgeable than their middle-class age counterparts. Granted that it might not be especially adaptive in our culture to know where to look if you wanted to find a lion (it might be more adaptive to know where not to look), and granted that such an item might not correlate at all with scores on an intelligence test, such a finding tells a good deal about a child’s ability to make an association between certain objects and habitats and thus identifies an area in terms of which classroom programs, displays, or demonstrations can be oriented.
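The item comparison described above can be illustrated with a small sketch. It assumes one reading of “no overlap”: the highest pass rate in the lower-class curve falls below the lowest pass rate in the middle-class curve. The function names are hypothetical; the three percentages are the ones reported in the text for the lion item, and no other values are implied.

```python
def percent_passing(responses):
    """Percentage of children passing an item (responses are booleans)."""
    return 100.0 * sum(responses) / len(responses)


def curves_overlap(lower_pct_by_age, middle_pct_by_age):
    """True if any lower-class rate reaches the lowest middle-class rate.

    This definition of overlap is an assumption made for illustration.
    """
    return max(lower_pct_by_age.values()) >= min(middle_pct_by_age.values())


# Pass rates for the lion item as reported in the text (age in years).
lower = {6: 45.0}
middle = {3: 48.0, 5: 90.0}
print(curves_overlap(lower, middle))

# Pass rate from raw responses (hypothetical responses).
print(percent_passing([True, False, True, True]))
```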
DESCRIPTIVE EVALUATIONS OF
DEVELOPMENTAL SETTINGS
No description of the current developmental status of a child is complete without a description of the environment in which that development has occurred. It is quite interesting that far more effort has been devoted to the task of improving evaluation techniques for infant and child development than to techniques for describing environments in which this development occurs. In general we have been content with very gross structural descriptions, such as those implied by the designations, middle class and lower class. Yet, this is not adequate, as it has been shown that some lower-class homes are very stimulating and supportive, often far more so than middle-class homes.
One of the other tasks of the staff of the Syracuse Children’s Center has been the development of a procedure for assessing the stimulation potential of the home, for identifying those subtle aspects of the young child’s environment that might somehow carry the class influence. Many facets of the home experience have been implicated, but proof for the effects of most of the suspected contributors has been lacking. Drawing from developmental theory and empirical data describing environments available to young children, our staff7 has developed an Inventory of Home Stimulation (STIM) based upon certain assumptions about factors conducive to development. Around each of these assumptions a series of items has been constructed, with the following areas covered: (1) frequency and stability of adult contact; (2) amount of developmental and vocal stimulation; (3) need gratification; (4) emotional climate; (5) avoidance of restriction; (6) breadth of experience; (7) aspects of the physical environment; and (8) available play materials. The number of items per category ranges from 6 to 19, with a total of 73 items altogether.†
In its present version, STIM is designed to be appropriate for families of children in the age range of birth to 3 years. It is administered by having a person go to the home at a time when the child is awake and can be observed in interaction with his mother. About two thirds of the items rely purely upon observation; those covering certain areas (primarily those assessing breadth of experience of the family) rely upon information obtained via a brief structured interview. Interobserver agreement on coding the items is very high, averaging 94.6% for the total scale, with agreement within the subscales ranging from 91.8% to 100%. This high reliability is assumed to be, at least in part, a function of the fact that all items are scored in a binary fashion rather than involving any rating of degree within an item. The entire procedure generally takes about an hour.
† Copies of STIM may be obtained by writing to the author.
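For binary items, one common way to express interobserver agreement is simple percent agreement: the share of items that two observers coded identically. The paper does not state which formula was used, so the sketch below is an assumption, and the codings in it are hypothetical.

```python
def percent_agreement(coder_a, coder_b):
    """Percentage of items on which two observers gave the same binary code."""
    if len(coder_a) != len(coder_b):
        raise ValueError("both observers must code the same items")
    if not coder_a:
        raise ValueError("at least one item is required")
    matches = sum(a == b for a, b in zip(coder_a, coder_b))
    return 100.0 * matches / len(coder_a)


# Hypothetical codings of ten items by two observers (1 = yes, 0 = no).
observer_1 = [1, 1, 0, 1, 0, 0, 1, 1, 1, 0]
observer_2 = [1, 1, 0, 1, 0, 1, 1, 1, 1, 0]
agreement = percent_agreement(observer_1, observer_2)
print(agreement)
```

Because each item admits only two codes, a disagreement can arise only from a difference in whether the behavior or condition was judged present, not from differing judgments of degree.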
Even though one reason for developing STIM was the conviction that the simple middle-lower class dichotomy did not do justice to the variability among home environments within either class, it seemed important to examine for social class differences as a preliminary step in determining whether the inventory could be useful in describing developmental settings. Accordingly, scores on the STIM given a group of 51 lower-class and 24 middle-class families were compared. Social class here is determined by a consideration of parental education, income, and occupation. Results of this analysis, which examined for significance of difference by means of the t test, are presented graphically in Figure 2. The mean STIM score for the lower-class homes was 50, and that for the middle-class homes 59, a difference which is statistically significant beyond the .001 level of confidence. But inspection of the graph demonstrates the value of a search for an evaluative technique that will describe differences among homes within any social class. That is, there is obviously a good deal of variability in both social class groups, with the lower-class families being definitely more heterogeneous. It will be noted that, at least in this particular sample, the distribution for the lower-class families is definitely bimodal.
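The comparison of two group means by a t test can be sketched as follows. The paper does not say which form of the test was used; the pooled-variance (Student) form is assumed here. The scores are hypothetical and were chosen only so that the two group means equal the reported values of 50 and 59; they are not the study data, and the resulting statistic says nothing about the reported significance level.

```python
from math import sqrt
from statistics import mean, variance


def student_t(sample_a, sample_b):
    """Pooled-variance t statistic and degrees of freedom for two samples."""
    n_a, n_b = len(sample_a), len(sample_b)
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * variance(sample_a)
                  + (n_b - 1) * variance(sample_b)) / df
    std_err = sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean(sample_a) - mean(sample_b)) / std_err, df


# Hypothetical STIM scores; only the group means match the text.
lower_class = [44, 48, 50, 52, 56]    # mean 50
middle_class = [55, 58, 59, 60, 63]   # mean 59
t_value, df = student_t(middle_class, lower_class)
print(round(t_value, 3), df)
```

A single t value summarizes only the difference between means; it does not convey the spread or shape of each distribution, which is why the graph is inspected as well.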
[Figure 2 plot. Distributions: middle class (N = 24), lower class (N = 51). Horizontal axis: score on Inventory of Home Stimulation.]
FIG. 2. Scores on Inventory of Home Stimulation assigned to 51 lower-class and 24 middle-class homes.
sought evidence that the stimulation level related to the performance of the children. For 23 of the children involved in the foregoing analysis, we had data on the Cattell Infant Intelligence Scale when the infants were 6 months old and again when they were 1 year old. This is a time period during which scores on this particular instrument are quite unstable and when change in obtained quotients is the rule rather than the exception. The actual changes in these 23 children, most of whom were from lower-class families, ranged from -22 to +24 I.Q. points. To each change score was added a constant of 23 (so that all would be positive), and this new distribution of scores was correlated with the average STIM score earned by each child’s family during the first year of life. There was a rank difference correlation of .87, significant beyond the .001 level, between these change scores and the average family STIM score.
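A rank difference (Spearman) correlation of the kind reported here can be sketched as below. The change scores and STIM scores are hypothetical and are not the data for the 23 children. The example also shows that adding a constant to every change score leaves the ranks, and therefore the coefficient, unchanged.

```python
from math import sqrt


def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def rank_correlation(x, y):
    """Spearman coefficient, computed as the Pearson correlation of ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sqrt(sum((a - mx) ** 2 for a in rx))
    sy = sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)


# Hypothetical I.Q. change scores and average family STIM scores.
changes = [-22, -5, 0, 10, 24]
stim = [40, 52, 48, 55, 60]
shifted = [c + 23 for c in changes]   # constant added so all are positive

rho = rank_correlation(changes, stim)
rho_shifted = rank_correlation(shifted, stim)
print(rho, rho_shifted)
```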
Data were also available on change in developmental test scores for 28 children enrolled in an enriching day care program. These data are not directly comparable to those used in the foregoing analysis, since ages of the children ranged from 6 months to almost 4 years, the test-retest interval was not constant, and two different tests were involved (Cattell Infant Intelligence Scale and Stanford-Binet Intelligence Scale), depending on the age of the child examined. Even so, it was felt that an examination of the data for the degree of association between home STIM score and magnitude of change in the children would be worthwhile. For this analysis the obtained rank difference coefficient was -.40,
DESCRIPTIVE EVALUATIONS OF BOTH
CHILD AND ENVIRONMENT
Thus far this paper has discussed the importance of describing (1) the development of young children in areas considered relevant for everyday experience and for the design of enrichment programs and (2) the environment in which development must occur. But these two areas of evaluation must eventually overlap, resulting in descriptions of the behavior of children within the specific situations in which we are interested in predicting their performance. That is, if one is operating an intervention program in the form of a nursery school, there is a definite need for information about the relationship between what is put into the intervention program and how the child responds within that setting. Similarly, there is a need to know if the experience considered to be enriching is truly enriching. This latter obligation is frequently overlooked in planning and evaluating enrichment programs.
In the Children’s Center, several staff members† have devoted considerable effort to the development of a rather elaborate system of making observations of children right in the classroom and also in the home. Called APPROACH (A Procedure for Patterning Responses Of Adults and Children), the technique involves breaking on-going behavior up into short behavioral-grammatical episodes in terms of the subject (who does something), the predicate (what is done), the object (toward whom or what is the behavior directed), and a number of appropriate qualifiers (such as whether the behavioral sentence is simple or compound, silent or oral, etc.). The system simultaneously describes what a child does and what is done toward him, the input he sends to his environment and the input he receives from it. Analyses are based upon behavior segments of 30 minutes duration, with one sample for each child usually taken in the morning and another in the afternoon. In Figure 3 data pertaining to 4 children, each of whom was observed for 1 hour in our nursery school and 1 hour at home, are presented. The records were examined to determine whether the behavior of this small group of deprived children differed as a function of setting. As can be seen in Figure 3, the major areas of difference were in the categories of attention and object manipulation. In the enriched environment, where it was hoped a great deal of stimulation was available, the children were significantly more attentive and were thus out of environmental contact less than was the case in the home. Somewhat surprising was the finding that the children engaged in less object manipulation in the Center than in the home, a fact which may have been related to the fact that the environment of the Center was more active and that there were more things for the child simply to look at and listen to.
† The staff members most involved in the development of this technique are Alice Honig, Barton Kaplan, Ann Paullin, Ruth Wynn, and Norma Graham.
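The subject, predicate, object, and qualifier structure lends itself to a simple record format. The sketch below is not the APPROACH coding scheme itself: the field names, the category labels, and the use of a duration in seconds for each episode are all assumptions made for illustration. It shows how coded episodes might be tallied into the percentage of total response time per behavior category for a given subject.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Episode:
    """One behavioral 'sentence' (hypothetical layout, not the published code)."""
    subject: str            # who does something
    predicate: str          # what is done
    obj: str                # toward whom or what the behavior is directed
    seconds: float          # assumed duration of the episode
    qualifiers: tuple = ()  # e.g. ("oral",) or ("silent", "compound")


def percent_time_by_predicate(episodes, subject):
    """Percentage of a subject's total recorded time spent in each predicate."""
    totals = defaultdict(float)
    for ep in episodes:
        if ep.subject == subject:
            totals[ep.predicate] += ep.seconds
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    return {pred: 100.0 * secs / grand_total for pred, secs in totals.items()}


# Hypothetical episodes from one observation segment.
episodes = [
    Episode("child", "attends", "teacher", 30.0, ("silent",)),
    Episode("adult", "informs", "child", 20.0, ("oral",)),
    Episode("child", "manipulates", "blocks", 10.0, ("silent",)),
]
child_profile = percent_time_by_predicate(episodes, "child")
print(child_profile)
```

Because child and adult episodes share one format, the same tally can describe what the child does and what is done toward him in either setting.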
But this descriptive information is of little value unless it can be shown that the environments in the two settings also somehow differed. That they did so in several significant ways is shown at the bottom of Figure 3. The adults in the school showed fewer episodes of lack of response to the child but also did not show quite as much direct attention to the observed child. Here it should be noted that in the school subgroups there were eight children, whereas in the home setting there was usually only one or possibly two. Also in the school the adults carried out more information processing, less interference, more nurturance, and made fewer direct requests for compliance of the children. Thus, in general, one might be led to describe the environment of this particular school as more emotionally supportive and more concerned with information processing. Such descriptions of intervention environments, together with descriptions of the reactions
[Figure 3 plot. Panels: child behavior (top), adult behavior (bottom). Settings compared: home, school. Horizontal axis: behavior categories.]
FIG. 3. Percentage of total response time spent in certain categories of behavior by four children and their adult caretakers in the home and in the nursery school.
enrichment program is necessarily enriching to the child.
SUMMARY
This paper has described and supported with research data what is essentially a philosophy of evaluation, one which suggests that the most important function of developmental evaluation is descriptive rather than diagnostic, compressing a history rather than predicting a future. Behavioral evaluations are necessary to accomplish these other objectives; but, until assessment techniques are far more elegant than they are today, they should not be expected to accomplish them alone. Further, it has been suggested that naturalistic descriptions are as valuable and as necessary as structured ones. Finally, a plea has been made that more research attention be directed to the task of evaluating environments. A crude attempt to do this on our own part has carried some important empirical fuel to the theoretical fire which describes development as influenced by the milieu in which that development occurs.
REFERENCES
2. Bakwin, R. M.: Office evaluation of intelligence of children. PEDIATRICS, 23:989, 1959.
3. Caldwell, B. M., and Drachman, R. H.: Comparability of three methods of assessing the developmental level of young infants. PEDIATRICS, 34:51, 1964.
4. Frankenburg, W. K., and Dodds, J. B.: Denver Developmental Screening Test. University of Colorado Medical Center, 1966. Unpublished manuscript.
5. Caldwell, B. M., and Soule, D.: The Preschool Inventory. Upstate Medical Center, 1966. Unpublished manuscript.
6. Caldwell, B. M., and Richmond, J. B.: Programmed day care for the very young child: a preliminary report. J. Marriage Fam., 26:481, 1964.
7. Caldwell, B. M., Heider, J., and Kaplan, B.: The inventory of home stimulation. Paper presented at the American Psychological Association Meeting, New York, New York, September 1966.
Acknowledgment
The author is indebted to her colleague, Dr.
Julius B. Richmond, for his support of the research