DESCRIPTIVE EVALUATIONS OF CHILD DEVELOPMENT AND OF DEVELOPMENTAL SETTINGS

Bettye M. Caldwell, Ph.D.

College of Home Economics, Syracuse University, and Department of Pediatrics, Upstate Medical Center, State University of New York, Syracuse, New York

If one is seriously interested in intervention, then one is obligated to be equally interested in evaluation. The former process is tremendously in vogue now, while the latter is very much in disfavor. This is perhaps because the entirely legitimate scientific process of evaluation has somehow been confused with the highly personal and subjective process of making judgments. The entire evaluation vocabulary sounds judgmental; words such as “less than,” “more than,” and “better than” can seldom be accepted as purely descriptive terms. Yet, ironically, no evaluation process is as judgmental as any intervention process. For when one intervenes, one has already formed a judgment, with or without any sort of formal evaluation, that things should somehow be different from what they are.

The intervention referred to here is of a fairly specific type: intervention designed to influence a young child’s ability to learn, to profit from experience, and to develop at an optimal rate and to a maximal level. That it is possible for intervention programs to make major strides toward this goal has been accepted as an article of faith; yet, at the same time, there is a general wariness about full involvement in evaluation activities.

One possible reason for this wariness is a suspicion that some evaluation procedures will not be entirely fair to the children who are evaluated, that they will not adequately reflect important changes in functioning which may occur. This may well be the case, for evaluation instruments often seem to be selected or designed to assess “basic” or “pervasive” behaviors, whereas the intervention procedure being evaluated might have been designed to accomplish more specific and restricted goals. If one does not include at least a sprinkling of such “basic” or “pervasive” or “generalized” measures, one is sure to be criticized for not having done so. This, of course, is not too surprising, for detection of transfer effects associated with any type of training is important. It is as essential to know something about how new information is internalized and worked into the existing program for processing information as it is to know whether it has been acquired.

For intervention programs concerned with the learning process, this internalization, regardless of the label used for it, is close to what most people mean by the concept of intelligence. That is, people are bound to inquire how a child’s “intelligence” has been affected by some enrichment program to which he has been exposed. Such an inquiry should not be answered as though it identified one totally unsophisticated about the nature of intelligence and how it can be assessed. For such a question implies an underlying awareness of what Hunt¹ has called the epigenetic theory of intelligence: intelligence as something that emerges rather than exists, and as a level of performance that is decidedly affected by experience.

(Received January 3; accepted for publication January 16, 1967.)

Presented at the Child Development Section, American Academy of Pediatrics, Annual Meeting, Chicago, Illinois, October 23, 1966.

The research referred to in this paper has been supported by Grant No. MH-07649, National Institutes of Health, U.S. Public Health Service, and Grant No. D-156, Children’s Bureau, Welfare Administration, Department of Health, Education, and Welfare.

ADDRESS: Department of Pediatrics, Upstate Medical Center, 766 Irving Avenue, Syracuse, New York 13210.

Yet one suspects that such a question also reflects some confusion of the phenomenal (that which is observed) with the “real” (that which underlies what is observed). Such a confusion reflects the type of decision making involved in making a diagnosis from observed symptoms. But intelligence does not underlie so much as it overlies; it does a much better job of compressing a child’s past history than it does of predicting his future performance. If that concept of intelligence could earn acceptance, there would probably be no objections to the use of tests parading under the label of “intelligence tests” for the evaluation of intervention programs. Such a concept would do much to erase the negative set that many persons have toward infant tests. Too often these tests have been asked to “diagnose” rather than “describe.” The former task they do quite poorly; the latter they do quite admirably.

DESCRIBING DEVELOPMENT IN STRUCTURED SITUATIONS

If one is willing to accept developmental evaluations as descriptive rather than diagnostic, as more indicative of past experience than of future potential, then one has available a number of procedures helpful for evaluating the effects of early intervention programs. The list includes not only the most widely used instruments but also a number of screening procedures deliberately designed to be used by persons with a minimal degree of training and experience in developmental testing.²⁻⁵ Regardless of the technique employed, evidence is needed that the observed behavior bears a relationship to the setting in which it occurs and to the intervention experience (or lack of it) that the child has had. Such information is of value in facilitating plans for what a child needs at the next developmental level, regardless of what it enables one to predict about ultimate developmental status.

The Preschool Inventory (which the writer developed for use in Head Start in 1965, when it seemed as though no previously published test could be found which would meet the needs of a mass evaluation program to be carried out largely by volunteers and untrained personnel) perhaps provides as good an example of this point as any. (It provides a better example in that the author can without apology be critical of it.) It was developed and field tested almost overnight; it in no way could be said to have evolved from a systematic theory of early cognitive development; and it was openly and unashamedly intended merely to describe rather than diagnose. Furthermore, it was intended to be sensitive, rather than resistant, to change. The Inventory was compiled after discussions with various Head Start planners about . . . “the need for some type of instrument that would provide an indication of how much a disadvantaged child, prior to his introduction to Head Start, had achieved in areas regarded as necessary foundations for subsequent success in school. A measure of basic intelligence was not in any sense the goal, although it would be naive to assume that any such index of achievements would not be to some extent correlated with performance on intelligence tests. Nor was there any concern with the development of a so-called ‘culture-fair’ test. It was taken for granted at the outset that the culture in the child’s preceding years had not been entirely fair and that what was needed was not a procedure that would attempt to remove this unfairness but rather one that would permit it to show in all its blatancy. Also, it was considered to be extremely important to demonstrate that the child from a less favorable background was actually functioning at a deficit at the time he began school; this deficit had been assumed but not substantiated on a large scale. Then, finally but by no means least, it was considered important to develop a procedure that could be used on a before-after basis and be available as one index of educational achievement associated with Head Start.”

[...] experience in a day care center for disadvantaged children.⁶ Upon these experiences were grafted points mentioned in some of the early memos and news releases distributed by various persons involved in the planning for Project Head Start. For example, it was mentioned that many of the eligible children were unable to give basic information about themselves and that they had a very negative self-concept. Therefore, items requesting the child to give his first and last name and his age were written. It was stressed that the experiences of deprived children were often so limited that they were unable to interpret simple instructions given them. Therefore, a series of items requesting the children to carry out simple instructions (“Raise your hand,” “Say ‘hello’ very softly,” “Put three cars in the big box,” etc.) was written. Their ability to discriminate colors and to know the labels for the different hues was questioned, suggesting the value of items asking them either to name or to point to specific colored crayons. Their perception of authority figures was described as negative and restrictive, rather than positive and supportive. Accordingly, items were developed which requested the children to tell in their own words what a policeman does, what a mother does, etc. Simple items were designed to assess the child’s knowledge of concepts of number and quantity (“How many wheels does a bicycle have?”). The result was a compilation of items intended to measure the child’s performance in such areas as basic information and vocabulary; number concepts; concepts of size, shape, motion, and color; concepts of time, object class, and social function; visual-motor performance; following instructions; and independence and self-help.

The inventory thus compiled could not hope to do anything but describe. For nothing was known about how well it correlated with the Stanford-Binet or any other “predictive” measures of a child’s developing ability patterns. Nor, for that matter, could anything be said after a protocol had been obtained about how much the administration had been influenced by rapport factors, about whether a child might have obtained a low score because he “would not” respond rather than because he “could not” respond. Nothing could be said, and nothing was asked. Rather, the instrument attempted to summarize for mass data processing the kind of output of which a group of children was capable under a particular set of circumstances. This is, of course, the essence of structured evaluation.

Early information about the ability of the inventory to carry out its assignment of describing the early achievements of large groups of children had to come largely from evaluation efforts of unknown persons* representing unknown levels of training and competence, i.e., from persons who used the instrument to learn something about the initial achievement and progress of children participating in a number of local Head Start programs in the summer of 1965. However, during the academic year of 1965-1966, data collected and analyzed by the author and her colleagues (especially Ann Paullin, Ruth Wynn, and Jordan Tannenbaum) have been examined for important leads about areas of achievement and educational need in both middle- and lower-class children. The available sample consists of 648 children ranging in age from two to seven, with approximately half residing in Syracuse and Onondaga County. The children came from public and private nursery schools, from the waiting rooms of well baby clinics, from kindergartens situated in low-income and upper-income neighborhoods, and from one or another phase of the research being conducted by the author and her colleagues.

Figure 1 presents data on this group and demonstrates that, at the very early ages, the curves for the two groups are not so widely separated. However, with increasing age, they diverge more and more until at age 5 the norms for lower-class children on this inventory are almost 1½ years behind those for middle-class children.

FIG. 1. Medians for total score of Preschool Inventory for 196 middle-class and 452 lower-class children ranging in age from 2 to 7 years. Each age interval includes children ranging from the age indicated up to the beginning of the next interval. [Plot not reproduced: one curve each for the middle-class and lower-class groups, plotted against age group.]

* The debt to these people is large and can never be adequately paid, as they, like the children [...]

Examination of the percentage of children from the two major groups passing individual items provides many leads for the structuring of intervention programs. The area in which there is greatest disparity between the lower- and middle-class children is associative vocabulary, a finding which supports data from other studies. This information, while of interest and value in itself, has its greatest importance in identifying and suggesting directions that an intervention program might take. For example, there was no overlap in the curves for the two groups on certain items. Only 45% of the 6-year-old lower-class children were able to report that you could expect to find a lion in a jungle or a zoo, whereas 48% of the 3-year-old and 90% of the 5-year-old middle-class children could do so. Similarly non-overlapping curves were found for several of the other items, and for all items there were big disparities. There were no reversals, i.e., items about which lower-class children were more knowledgeable than their middle-class age counterparts. Granted that it might not be especially adaptive in our culture to know where to look if you wanted to find a lion (it might be more adaptive to know where not to look), and granted that such an item might not correlate at all with scores on an intelligence test, such a finding tells a good deal about a child’s ability to make an association between certain objects and habitats and thus identifies an area around which classroom programs, displays, or demonstrations can be oriented.
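The item-level comparison described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the procedure actually used for the Preschool Inventory data: the record layout, the function names `pass_rates` and `reversals`, the item label, and the toy records are all assumptions introduced here.

```python
from collections import defaultdict


def pass_rates(records):
    """Percentage of children passing, keyed by (item, group, age).

    Each record is a hypothetical tuple: (item, group, age_in_years, passed).
    """
    passed = defaultdict(int)
    total = defaultdict(int)
    for item, group, age, ok in records:
        key = (item, group, age)
        total[key] += 1
        if ok:
            passed[key] += 1
    return {key: 100.0 * passed[key] / total[key] for key in total}


def reversals(rates, low="lower", high="middle"):
    """List (item, age) pairs where the `low` group passes more often than `high`."""
    found = []
    for (item, group, age), pct in rates.items():
        other = (item, high, age)
        if group == low and other in rates and pct > rates[other]:
            found.append((item, age))
    return sorted(found)


# Invented toy records for illustration; these are not the study's data.
records = [
    ("lion_habitat", "middle", 5, True),
    ("lion_habitat", "middle", 5, True),
    ("lion_habitat", "lower", 5, True),
    ("lion_habitat", "lower", 5, False),
]
rates = pass_rates(records)
print(rates[("lion_habitat", "middle", 5)], rates[("lion_habitat", "lower", 5)])
print(reversals(rates))
```

An empty result from `reversals` corresponds to the situation reported in the text, in which no item favored the lower-class group at a given age.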

DESCRIPTIVE EVALUATIONS OF DEVELOPMENTAL SETTINGS

No description of the current [...] environment in which that development has occurred. It is quite interesting that far more effort has been devoted to the task of improving evaluation techniques for infant and child development than to techniques for describing environments in which this development occurs. In general we have been content with very gross structural descriptions, such as those implied by the designations middle class and lower class. Yet this is not adequate, as it has been shown that some lower-class homes are very stimulating and supportive, often far more so than middle-class homes.

One of the other tasks of the staff of the Syracuse Children’s Center has been the development of a procedure for assessing the stimulation potential of the home, for identifying those subtle aspects of the young child’s environment that might somehow carry the class influence. Many facets of the home experience have been implicated, but proof for the effects of most of the suspected contributors has been lacking. Drawing from developmental theory and empirical data describing environments available to young children, our staff⁷ has developed an Inventory of Home Stimulation (STIM) based upon certain assumptions about factors conducive to development. Around each of these assumptions a series of items has been constructed, with the following areas covered: (1) frequency and stability of adult contact; (2) amount of developmental and vocal stimulation; (3) need gratification; (4) emotional climate; (5) avoidance of restriction; (6) breadth of experience; (7) aspects of the physical environment; and (8) available play materials. The number of items per category ranges from 6 to 19, with a total of 73 items altogether.†

† Copies of STIM may be obtained by writing to the author.

In its present version, STIM is designed to be appropriate for families of children in the age range of birth to 3 years. It is administered by having a person go to the home at a time when the child is awake and can be observed in interaction with his mother. About two thirds of the items rely purely upon observation; those covering certain areas (primarily those assessing breadth of experience of the family) rely upon information obtained via a brief structured interview. Interobserver agreement on coding the items is very high, averaging 94.6% for the total scale, with agreement within the subscales ranging from 91.8% to 100%. This high reliability is assumed to be, at least in part, a function of the fact that all items are scored in a binary fashion rather than involving any rating of degree within an item. The entire procedure generally takes about an hour.
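The text does not state how interobserver agreement was computed. A minimal sketch, assuming the figure is the simple percentage of binary items on which two observers assign the same code, might look as follows; the function names, subscale labels, and toy codes are hypothetical and are not taken from STIM itself.

```python
from collections import defaultdict


def percent_agreement(codes_a, codes_b):
    """Percentage of items on which two observers gave the same binary code."""
    if len(codes_a) != len(codes_b) or not codes_a:
        raise ValueError("both observers must code the same non-empty item list")
    matches = sum(1 for a, b in zip(codes_a, codes_b) if a == b)
    return 100.0 * matches / len(codes_a)


def agreement_by_subscale(codes_a, codes_b, subscales):
    """Percent agreement within each subscale; subscales[i] labels item i."""
    grouped = defaultdict(lambda: ([], []))
    for a, b, name in zip(codes_a, codes_b, subscales):
        grouped[name][0].append(a)
        grouped[name][1].append(b)
    return {name: percent_agreement(xs, ys) for name, (xs, ys) in grouped.items()}


# Invented codes for eight items (1 = yes, 0 = no); not actual STIM protocols.
observer_a = [1, 1, 0, 1, 0, 0, 1, 1]
observer_b = [1, 1, 0, 0, 0, 0, 1, 1]
labels = ["adult_contact"] * 4 + ["play_materials"] * 4

total_agreement = percent_agreement(observer_a, observer_b)
subscale_agreement = agreement_by_subscale(observer_a, observer_b, labels)
print(total_agreement, subscale_agreement)
```

Because every item is scored yes or no, agreement reduces to counting matching codes, which is consistent with the author's suggestion that binary scoring contributes to the high reliability.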

Even though one reason for developing STIM was the conviction that the simple middle-lower class dichotomy did not do justice to the variability among home environments within either class, it seemed important to examine for social class differences as a preliminary step in determining whether the inventory could be useful in describing developmental settings. Accordingly, STIM scores obtained for a group of 51 lower-class and 24 middle-class families were compared. Social class here is determined by a consideration of parental education, income, and occupation. Results of this analysis, which examined for significance of difference by means of the t test, are presented graphically in Figure 2. The mean STIM score for the lower-class homes was 50, and that for the middle-class homes 59, a difference which is statistically significant beyond the .001 level of confidence. But inspection of the graph demonstrates the value of a search for an evaluative technique that will describe differences among homes within any social class. That is, there is obviously a good deal of variability in both social class groups, with the lower-class families being definitely more heterogeneous. It will be noted that, at least in this particular sample, the distribution for the lower-class families is definitely bimodal.

FIG. 2. Scores on Inventory of Home Stimulation assigned to 51 lower-class and 24 middle-class homes. [Plot not reproduced: score distributions for the middle-class (N = 24) and lower-class (N = 51) groups; horizontal axis: score on Inventory of Home Stimulation.]

[...] sought evidence that the stimulation level related to the performance of the children. For 23 of the children involved in the foregoing analysis, we had data on the Cattell Infant Intelligence Scale when the infants were 6 months old and again when they were 1 year old. This is a time period during which scores on this particular instrument are quite unstable and when change in obtained quotients is the rule rather than the exception. The actual changes in these 23 children, most of whom were from lower-class families, ranged from -22 to +24 I.Q. points. To each change score was added a constant of 23 (so that all would be positive), and this new distribution of scores was correlated with the average STIM score earned by each child’s family during the first year of life. There was a rank difference correlation of .87, significant beyond the .001 level, between these change scores and the average family STIM score.
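The rank difference correlation reported above can be illustrated with a short sketch. It assumes the classical formula 1 - 6Σd²/(n(n² - 1)), which is exact only when there are no tied scores; the paper does not say how ties were treated. The function names and the five pairs of scores are invented for illustration and are not the study's data.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def rank_difference_correlation(x, y):
    """Rank difference coefficient via 1 - 6*sum(d^2) / (n*(n^2 - 1))."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equal-length lists with at least two scores")
    rx, ry = average_ranks(x), average_ranks(y)
    d_squared = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1.0 - 6.0 * d_squared / (n * (n * n - 1))


# Invented change scores and home scores for five children (illustration only).
change = [-22, -5, 0, 10, 24]
stim = [40, 48, 45, 55, 60]
shifted = [c + 23 for c in change]  # the constant of 23 mentioned in the text

rho = rank_difference_correlation(change, stim)
rho_shifted = rank_difference_correlation(shifted, stim)
print(rho, rho_shifted)
```

Since the coefficient depends only on rank order, adding the constant leaves it unchanged; the shift makes the scores positive but does not affect the result.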

Data were also available on change in developmental test scores for 28 children enrolled in an enriching day care program. These data are not directly comparable to those used in the foregoing analysis, since ages of the children ranged from 6 months to almost 4 years, the test-retest interval was not constant, and two different tests were involved (Cattell Infant Intelligence Scale and Stanford-Binet Intelligence Scale), depending on the age of the child examined. Even so, it was felt that an examination of the data for the degree of association between home STIM score and magnitude of change in the children would be worthwhile. For this analysis the obtained rank difference coefficient was -.40, [...]

DESCRIPTIVE EVALUATIONS OF BOTH CHILD AND ENVIRONMENT

Thus far this paper has discussed the importance of describing (1) the development of young children in areas considered relevant for everyday experience and for the design of enrichment programs and (2) the environment in which development must occur. But these two areas of evaluation must eventually overlap, resulting in descriptions of the behavior of children within the specific situations in which we are interested in predicting their performance. That is, if one is operating an intervention program in the form of a nursery school, there is a definite need for information about the relationship between what is put into the intervention program and how the child responds within that setting. Similarly, there is a need to know if the experience considered to be enriching is truly enriching. This latter obligation is frequently overlooked in planning and evaluating enrichment programs.

In the Children’s Center, several staff members‡ have devoted considerable effort to the development of a rather elaborate system of making observations of children right in the classroom and also in the home. Called APPROACH (A Procedure for Patterning Responses Of Adults and Children), the technique involves breaking ongoing behavior up into short behavioral-grammatical episodes in terms of the subject (who does something), the predicate (what is done), the object (toward whom or what the behavior is directed), and a number of appropriate qualifiers (such as whether the behavioral sentence is simple or compound, silent or oral, etc.). The system simultaneously describes what a child does and what is done toward him, the input he sends to his environment and the input he receives from it. Analyses are based upon behavior segments of 30 minutes’ duration, with one sample for each child usually taken in the morning and another in the afternoon.

In Figure 3 data pertaining to 4 children, each of whom was observed for 1 hour in our nursery school and 1 hour at home, are presented. The records were examined to determine whether the behavior of this small group of deprived children differed as a function of setting. As can be seen in Figure 3, the major areas of difference were in the categories of attention and object manipulation. In the enriched environment, where a great deal of stimulation was intended to be available, the children were significantly more attentive and were thus out of environmental contact less than was the case in the home. Somewhat surprising was the finding that the children engaged in less object manipulation in the Center than in the home, which may have been because the environment of the Center was more active and there were more things for the child simply to look at and listen to.

‡ The staff members most involved in the development of this technique are Alice Honig, Barton Kaplan, Ann Paullin, Ruth Wynn, and Norma Graham.
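The behavioral-grammatical episodes described above lend themselves to a simple record structure. The sketch below is one possible representation, not the APPROACH coding scheme itself: the `Episode` fields follow the subject, predicate, object, and qualifier terms used in the text, while the `seconds` field, the category labels, and the function name are assumptions added so that a percentage of response time, like that plotted in Figure 3, can be computed.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Episode:
    subject: str          # who does something, e.g. "child" or "adult"
    predicate: str        # what is done
    obj: str              # toward whom or what the behavior is directed
    qualifiers: tuple = ()  # e.g. ("silent",) or ("oral", "compound")
    seconds: float = 0.0  # assumed duration; not specified in the text


def percent_time_by_predicate(episodes, subject):
    """Share of a subject's total coded time spent in each predicate category."""
    totals = defaultdict(float)
    for ep in episodes:
        if ep.subject == subject:
            totals[ep.predicate] += ep.seconds
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    return {pred: 100.0 * secs / grand_total for pred, secs in totals.items()}


# Invented episodes for illustration; not observations from the study.
episodes = [
    Episode("child", "attends", "adult", ("silent",), 30.0),
    Episode("child", "manipulates", "toy", ("silent",), 10.0),
    Episode("adult", "informs", "child", ("oral",), 20.0),
]
child_profile = percent_time_by_predicate(episodes, "child")
adult_profile = percent_time_by_predicate(episodes, "adult")
print(child_profile, adult_profile)
```

Keeping child and adult episodes in the same list reflects the point that the system describes both what the child does and what is done toward him; running the same summary separately for home and school records would give the kind of setting comparison the text reports.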

But this descriptive information is of little value unless it can be shown that the environments in the two settings also somehow differed. That they did so in several significant ways is shown at the bottom of Figure 3. The adults in the school showed fewer episodes of lack of response to the child but also did not show quite as much direct attention to the observed child. Here it should be noted that in the school subgroups there were eight children, whereas in the home setting there was usually only one or possibly two. Also, in the school the adults carried out more information processing, less interference, and more nurturance, and made fewer direct requests for compliance of the children. Thus, in general, one might be led to describe the environment of this particular school as more emotionally supportive and more concerned with information processing. Such descriptions of intervention environments, together with descriptions of the reactions [...] enrichment program is necessarily enriching to the child.

FIG. 3. Percentage of total response time spent in certain categories of behavior by four children and their adult caretakers in the home and in the nursery school. [Plot not reproduced: panel “Child Behavior” comparing home and school; horizontal axis: behavior categories.]

SUMMARY

This paper has described and supported with research data what is essentially a philosophy of evaluation, one which suggests that the most important function of developmental evaluation is descriptive rather than diagnostic: compressing a history rather than predicting a future. Behavioral evaluations are necessary to accomplish these other objectives; but, until assessment techniques are far more elegant than they are today, they should not be expected to accomplish them alone. Further, it has been suggested that naturalistic descriptions are as valuable and as necessary as structured ones. Finally, a plea has been made that more research attention be directed to the task of evaluating environments. A crude attempt to do this on our own part has carried some important empirical fuel to the theoretical fire which describes development as influenced by the milieu in which that development occurs.

REFERENCES

2. Bakwin, R. M.: Office evaluation of intelligence of children. PEDIATRICS, 23:989, 1959.

3. Caldwell, B. M., and Drachman, R. H.: Comparability of three methods of assessing the developmental level of young infants. PEDIATRICS, 34:51, 1964.

4. Frankenburg, W. K., and Dodds, J. B.: Denver Developmental Screening Test. University of Colorado Medical Center, 1966. Unpublished manuscript.

5. Caldwell, B. M., and Soule, D.: The Preschool Inventory. Upstate Medical Center, 1966. Unpublished manuscript.

6. Caldwell, B. M., and Richmond, J. B.: Programmed day care for the very young child: a preliminary report. J. Marriage Fam., 26:481, 1964.

7. Caldwell, B. M., Heider, J., and Kaplan, B.: The inventory of home stimulation. Paper presented at the American Psychological Association Meeting, New York, New York, September 1966.

Acknowledgment

The author is indebted to her colleague, Dr. Julius B. Richmond, for his support of the research [...]

Caldwell, B. M.: Descriptive evaluations of child development and of developmental settings. PEDIATRICS, 40:46, 1967.