Impact of a Matched Term Control Group on Interpretation of Developmental Performance in Preterm Infants


Steven J. Gross, MD*; Terri A. Slagle, MD*; Diane B. D’Eugenio, MA, OTR*; and Barbara B. Mettelman, MA

ABSTRACT. One hundred twenty-four children who were born at 24 to 31 weeks’ gestation and 124 term children matched in social background underwent serial developmental evaluations. The Bayley Mental Developmental Index at 6, 15, and 24 months and the McCarthy General Cognitive Index at 4 years were used to classify cognitive outcome for preterm children as normal (indices higher than 1 SD below the mean), mild-moderately delayed (indices between 1 and 2 SD below the mean), or severely delayed (indices ≥2 SD below the mean). Classifications based on norms derived from the performance of the term control group were compared with those based on published standardized test scores. The control group had substantially higher mean (±SD) Bayley Mental Developmental Indices at 6 (111 ± 11), 15 (114 ± 13), and 24 months (115 ± 21) than the published test mean (100 ± 16). Consequently, significantly more preterm children were classified as normal when the Bayley test mean was used than when the performance of the control group was used to define the normal range (84% vs 52% at 6 months, 82% vs 49% at 15 months, and 70% vs 47% at 24 months). Severe cognitive delays were infrequent when defined by the test mean (6% to 11%) but were two to three times more frequent when the control group scores were used. In contrast, the control group had a mean McCarthy General Cognitive Index at 4 years (102 ± 14) that was similar to the published test mean (100 ± 16). Thus, while the preterm group demonstrated a small decrease in mean cognitive scores between 2 and 4 years (Mental Developmental Index, 95 ± 19 and General Cognitive Index, 92 ± 15), this represented a significant improvement in performance relative to the 13-point fall in mean scores in the control group over this same period (115 ± 21 to 102 ± 14). These data highlight the importance of a control group to provide normative data for current populations of children and to provide a reference for comparing outcome over time using different testing instruments in the evaluation of high-risk children. Pediatrics 1992;90:681-687; preterm birth, developmental performance.

ABBREVIATIONS. MDI, Mental Developmental Index; PDI, Psychomotor Developmental Index; GCI, General Cognitive Index.

From the Departments of *Pediatrics and Psychiatry, State University of New York, Health Science Center, Syracuse.

Received for publication Jan 17, 1992; accepted Mar 31, 1992.

Reprint requests to (S.J.G.) Dept of Pediatrics, SUNY Health Science Center, 750 E Adams St, Syracuse, NY 13210.

Improvements in perinatal and neonatal care have resulted in significantly increased survival rates for very premature infants. The full impact of these changes in care cannot be evaluated without accurate information about neurodevelopmental outcome for these high-risk survivors. Evaluations of outcome of intellectual performance are based on standardized developmental tests. The most widely used instrument for assessment of developmental performance during the first 2 years of life is the Bayley Scales of Infant Development.1 In later childhood, other instruments are used, including the McCarthy Scales of Children’s Abilities,2 the Stanford-Binet test,3 and the Wechsler Intelligence Scale for Children-Revised.4 Interpretation of outcome of former preterm infants usually is made using the norms of the standardized tests as the reference. Few follow-up studies of preterm infants have undertaken the task of recruiting and evaluating appropriate control infants.5 A concurrent, matched comparison group is important to control for the confounding influence of socioeconomic and environmental factors, to allow for blinding of examiners to infants’ high-risk status, to provide normative data for current populations of children, and to provide a reference for comparing outcome over time using different testing instruments.

We report a prospective 4-year longitudinal follow-up study of a cohort of preterm infants born at 24 to 31 weeks’ gestation along with a group of socioeconomically matched infants born at term. The purpose of our study was to examine the impact a term control group had on interpretation of developmental outcome in the preterm group. Categorization of cognitive outcome of preterm children using published standardized test norms was compared with that using norms derived from performance of our control group.

METHODS

The study population included all liveborn neonates of 24 to 31 weeks’ gestation cared for at Crouse Irving Memorial Hospital between July 1, 1985, and June 30, 1986. This is the sole tertiary care facility for central New York State’s 26 000 annual births. An active maternal referral service has existed for more than 15 years. Neonates born in other hospitals in the region were transported to the tertiary center by an experienced neonatal transport team. A neonatal fellow attended inborn deliveries, and a neonatologist directed clinical care. All neonates of 24 or more weeks’ gestation who had any signs of life (breathing, beating of the heart, or pulsation of the umbilical cord) were included. A lower cutoff point of 24 weeks was chosen because our regional policy is to intervene aggressively on behalf of the fetus and neonate beginning at that gestation. An upper cutoff of 31 weeks was chosen because Crouse Irving Memorial Hospital is the regional perinatal referral center for all such extremely premature neonates. Thus, this represents a follow-up from a large geographic region.


Gestational age for these newborns was determined by a single investigator (T.A.S.) within 48 hours of birth from maternal dates of the last menstrual period. Gestational age was substantiated by first- or second-trimester ultrasound examinations in 83% of the cases. Additionally, gestational age of all newborns 28 weeks was confirmed by Dubowitz examination.6 Dubowitz findings were used to modify the expected date of delivery only if they gave an estimate that differed from the menstrual estimate by more than 14 days; this occurred in two cases. There were two other instances where maternal dates were unavailable and Dubowitz examination alone was used to determine gestational age.

Prenatal, perinatal, and neonatal data collected prospectively included maternal illness, mode of delivery, Apgar scores, and all major neonatal intensive care unit morbidities including respiratory distress, symptomatic patent ductus arteriosus, air leaks, necrotizing enterocolitis, systemic infections, duration of ventilatory therapy, duration of parenteral nutrition, and length of hospitalization. All neonates had serial cranial ultrasonograms beginning at 48 hours of age.7 Retinal examinations were performed by ophthalmologists beginning at 6 weeks of age.8 Socioeconomic data obtained included parental race, age, years of formal education, and marital status.

A control group of appropriately grown, healthy term neonates (38 to 42 weeks’ gestation) who were born during the same period of time also was recruited. Each preterm newborn was matched to one full-term newborn for all the following characteristics: gender, race, and maternal age (<18, 18 to 34, >34 years), years of formal education (<12, 12 through 15, ≥16 years), and marital status.

Follow-up visits were scheduled at 6, 15, 24, and 48 months of age, corrected for prematurity. Weight, supine crown-heel length (<2 years) or standing height (≥2 years), and occipitofrontal head circumference were measured. Physical and neurologic examinations were performed at each visit. At the 6- and 15-month visits, neuromotor status also was assessed with the Infant Neurological International Battery.9 This 20-item instrument quantitates early motor development in the areas of muscle tone, primitive reflexes, automatic reactions, and head and trunk tone. Based on total score, neuromotor development is classified as normal, transiently abnormal, or abnormal.9

Developmental testing was performed using the Bayley Scales of Infant Development1 at 6, 15, and 24 months of age and the McCarthy Scales of Children’s Abilities2 at 4 years of age. The Bayley scales provide a Mental Developmental Index (MDI) and a Psychomotor Developmental Index (PDI). The McCarthy scales provide a General Cognitive Index (GCI) as well as five separate scores on Verbal, Perceptual Performance, Quantitative, Memory, and Motor scales.

At 4 years of age, all children had audiologic evaluations using pure-tone testing for each ear at frequencies of 500, 1000, 2000, and 4000 Hz. Hearing loss (>20 dB at one or more frequencies) was further evaluated with bone conduction and tympanometry and characterized as sensorineural or conductive.10 Children unable to comply with behavioral audiologic testing had brainstem auditory evoked response testing.

The Bayley Scales of Infant Development were administered by a single investigator (D.B.D.). The McCarthy scales were administered by a single child psychologist (B.B.M.) to all children except for twins, who were tested simultaneously by two psychologists. The psychologists and audiologist were blind to subjects’ groups (preterm or term) and were unaware of their neonatal courses or previous developmental test results.

Sensorineural findings were summarized as normal (no cerebral palsy, corrected vision in at least one eye that was satisfactory for normal activities, and no sensorineural hearing loss), mild-moderately abnormal (cerebral palsy that allowed independent walking, unilateral blindness, or mild sensorineural hearing loss), or severely abnormal (cerebral palsy interfering with ability to walk, bilateral blindness, or sensorineural hearing loss ≥60 dB requiring use of a hearing aid).

Developmental test results were used to classify children as having normal cognitive development (Bayley MDI or McCarthy GCI higher than 1 SD below the mean), mild-moderate developmental delay (MDI or GCI between 1 and 2 SD below the mean), or severe developmental delay (indices ≥2 SD below the mean). This classification of developmental outcome for preterm infants was done two ways: by using the published means and standard deviations of the instruments (100 ± 16 for both the Bayley MDI and the McCarthy GCI) and by using the mean and standard deviation of the scores obtained by our full-term control group at each follow-up visit.
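The two-way classification described above can be sketched in a few lines of Python. This is only an illustration, not the authors' software: the function name `classify` and the example score of 95 are hypothetical, and the 6-month control values (111 ± 11) are taken from the abstract.

```python
def classify(score, mean, sd):
    """Classify a developmental index against a reference mean and SD.

    normal:              higher than 1 SD below the mean
    mild-moderate delay: between 1 and 2 SD below the mean
    severe delay:        2 SD or more below the mean
    """
    if score > mean - sd:
        return "normal"
    if score > mean - 2 * sd:
        return "mild-moderate delay"
    return "severe delay"


# Reference norms as (mean, SD).
TEST_NORMS = (100, 16)              # published Bayley MDI / McCarthy GCI norms
CONTROL_NORMS_6_MONTHS = (111, 11)  # term control group MDI at 6 months (abstract)

score = 95  # hypothetical Bayley MDI for one preterm infant at 6 months
by_test = classify(score, *TEST_NORMS)
by_control = classify(score, *CONTROL_NORMS_6_MONTHS)
print(by_test, "|", by_control)
```

With these inputs the same score is labeled normal against the published norms (cutoff 84) but mild-moderately delayed against the control norms (cutoff 100), which is the reclassification effect the study examines.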

Continuous variables were expressed as mean ± SD. Differences between groups were tested for significance by unpaired Student’s t test for two means or by χ² analysis as appropriate for continuous or categorical data, respectively. Additionally, to examine change in performance over time in our preterm group, we converted the Bayley MDI and McCarthy GCI for the preterm children to z scores using the mean and SD of the control group, ie,

$$z\ \text{score} = \frac{\text{individual score} - \text{mean for control group}}{\text{SD for control group}}$$

Changes in z scores between 2 and 4 years were analyzed by paired t test. Probability values less than .01 were considered significant.
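As a worked illustration of the z score conversion and the paired comparison, the following Python sketch may help. It is not the authors' analysis code: the helper names `z_score` and `paired_t` and the example child are hypothetical, and the control means and SDs are those given in the abstract.

```python
from math import sqrt
from statistics import mean, stdev


def z_score(individual_score, control_mean, control_sd):
    """Standardize one child's score against the control group."""
    return (individual_score - control_mean) / control_sd


def paired_t(before, after):
    """Paired t statistic for the mean within-child change (after - before)."""
    diffs = [a - b for b, a in zip(before, after)]
    return mean(diffs) / (stdev(diffs) / sqrt(len(diffs)))


# Control group mean and SD at each age, taken from the abstract.
CONTROL_MDI_2Y = (115, 21)
CONTROL_GCI_4Y = (102, 14)

# Hypothetical child: MDI of 94 at 2 years, GCI of 91 at 4 years.
z_2y = z_score(94, *CONTROL_MDI_2Y)
z_4y = z_score(91, *CONTROL_GCI_4Y)
print(round(z_2y, 2), round(z_4y, 2))
```

For this hypothetical child the raw score falls by 3 points, yet the z score rises from -1.00 to about -0.79, because the control group's mean fell further over the same interval. In a real analysis, `paired_t` would be applied to the 2-year and 4-year z scores of the same children.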

The study was approved by the Hospital Human Research Review Committee, and informed consent was obtained from the parent(s) of all children.

RESULTS

During the study period, there were 156 liveborn neonates of 24 to 31 weeks’ gestation. One hundred thirty (83%) of these neonates were inborn and of these, 69 (53%) followed maternal transports. Overall, 133 (85%) of the 156 infants survived to hospital discharge. Survival improved with increasing gestational age. Seven (58%) of 12 neonates of 24 weeks’ gestation, 39 (81%) of 48 neonates of 25 through 27 weeks’ gestation, and 87 (91%) of 96 neonates of 28 through 31 weeks’ gestation survived to hospital discharge. Causes of hospital deaths included failed resuscitation in the delivery room (n = 5, all 24 weeks’ gestation), respiratory failure (n = 6), sepsis (n = 4), complications of birth asphyxia (n = 2), and congenital anomalies (n = 6). Eight other infants died after hospital discharge including two from complications of chronic lung disease (ages 5 and 6 months), two from pneumonia (ages 4 and 12 months), and one each from acute gastroenteritis (age 13 months), intrahepatic biliary atresia (age 3 months), sudden infant death syndrome (age 7 months), and a chromosomal anomaly (age 36 months). One hundred twenty-four of the 125 preterm children who were alive at 4 years (one blind child omitted) and their matched controls constitute the basis of this report.

Characteristics of Survivors

The mean birth weight of the 124 preterm survivors was 1180 ± 342 g; 43 (35%) of the neonates weighed ≤1000 g. Delivery was vaginal in 55 (44%) cases. Ninety-one infants (73%) received ventilatory support; 33 (27%) received assisted ventilation for more than 30 days. Ninety-eight neonates (79%) received parenteral nutrition for an average of 21 days. Necrotizing enterocolitis was documented in 4 neonates. Hemodynamically significant patent ductus arteriosus required closure with indomethacin in 16 cases and surgical ligation in 14 cases. Sepsis was documented in 26 infants (21%); coagulase-negative Staphylococcus was the most frequently isolated organism. Ultrasonographic evidence of brain insult occurred in 22 neonates (18%): mild hemorrhage (isolated germinal matrix bleeding or small amounts of ventricular blood) in 9 cases, severe hemorrhage (hemorrhage associated with ventriculomegaly or parenchymal extension of hemorrhage) in 10 cases, and periventricular leukomalacia in 3 cases. Length of neonatal intensive care was 65 ± 32 days (range, 7 to 184 days).

The control group comprised 124 full-term newborns with a mean gestational age of 40.0 ± 1.1 weeks and a mean birth weight of 3539 ± 435 g. Delivery was vaginal in 86 (69%) cases. All control infants were free from medical problems in the newborn period and were discharged from the hospital with their mothers. The preterm and term groups were comparable for all the socioeconomic variables for which they were matched (Table 1). Both groups were predominantly white, and all children were English speaking.

Follow-up

Among the preterm group, follow-up at 6, 15, 24, and 48 months was accomplished for 100%, 98%, 98%, and 98% of the children, respectively. Among the term control group, follow-up at 6, 15, 24, and 48 months was accomplished for 100%, 99%, 99%, and 100% of the children, respectively. One hundred nineteen preterm children (96%) and 122 term children (98%) were evaluated at all four time periods. More than 95% of the visits through 2 years occurred within 2 weeks of the targeted age; at 4 years, 92% of the visits were within 2 weeks of the 4-year birthday.

Sensorineural Outcome

Control Group. Transient abnormalities of tone were identified in three term children during the first 15 months of life. At 2 and 4 years of age, all term children had normal neurologic examinations. The age for independent walking was 11.5 ± 2.0 months, and all term children were ambulatory by 20 months of age. No children were blind in either eye. Three term children had mild unilateral sensorineural hearing loss.

Preterm Group. Mild transient abnormalities of muscle tone were observed in 18% of the preterm infants during the first 15 months of life.11 By age 2 years, neuromotor abnormalities were limited to 11 children with spastic cerebral palsy; in 3 ambulatory children, cerebral palsy was mild, while in the other 8 children it was severe. At 4 years of age, 10 of these 11 children still had clinically evident cerebral palsy, although in 5 it was classified as mild to moderate and only 5 children were not walking independently. One additional preterm child, who previously had been neurologically normal, developed moderate spastic diplegia following head trauma at 30 months of age. The 119 ambulatory preterm children walked at 13.9 ± 4.6 months (P < .001 vs term children).

In addition to one preterm child who was blind in both eyes, six others had poor or no vision in one eye. Bilateral mixed sensorineural-conductive hearing loss of 35 to 40 dB was found in one child. None of these eight children with sensory deficits had cerebral palsy.

TABLE 1. Socioeconomic Data for Preterm and Control Groups*

                          Preterm Group   Term Group
                          (n = 124)       (n = 124)
Gender, male              71 (57)         71 (57)
Race, white               108 (87)†       108 (87)
Maternal age, y           25 ± 6          26 ± 6
  18 y                    16 (13)         14 (11)
Maternal education, y     12 ± 2          12 ± 2
  <12 y                   38 (31)         38 (31)
  High school             47 (38)         39 (32)
  Post secondary school   25 (20)         33 (27)
  College degree          14 (11)         14 (11)
Married                   85 (68)         85 (68)

* Values represent number (percent) or mean ± SD.
† Includes one infant of mixed race.

Growth

Control Group. At 4 years of age, term boys and girls had mean weights (17.7 ± 2.6 and 16.8 ± 2.3 kg, respectively), heights (102.1 ± 3.9 and 101.4 ± 4.0 cm), and head circumferences (52.0 ± 1.3 and 50.8 ± 1.4 cm) that were similar to published norms.12,13

Preterm Group. The preterm boys and girls were significantly lighter than control children (preterm boys 15.9 ± 2.5 kg, preterm girls 14.9 ± 2.5 kg; P < .001); they were significantly shorter (boys 100.4 ± 4.8 cm, girls 99.2 ± 5.3 cm; P < .01); and their mean head circumferences were smaller (boys 51.1 ± 1.7 cm, girls 49.4 ± 1.6 cm; P < .001). The proportions of preterm children ≥2 SD below the mean of the control group for gender at age 4 years were as follows: weight, 11%; height, 14%; and head circumference, 16%.

Results of Developmental Testing

Control Group. Performance on the Bayley Scales of Infant Development for the term children is shown in Table 2. Bayley MDIs averaged 111 ± 11, 114 ± 13, and 115 ± 21 at 6, 15, and 24 months of age, respectively; these are all substantially higher than the published mean test score of 100 ± 16. At 6 and 15 months, scores corresponding to 1 SD below the control group mean were at the published test mean of 100, and scores that were 2 SD below the control group mean (signifying severe developmental delay) were still well within the published normal range. Therefore, applying Bayley normative data to our control population would greatly underestimate developmental delay. A Bayley MDI below 84 (1 SD below the test standardization mean) would be expected to occur in approximately 16% of the population. However, mental indices of less than 84 were found in only one of our control children (<1%) at 6 and 15 months and in seven control children (6%) at 24 months (P < .005). Our control children had mean PDIs at 6 and 15 months that also were substantially higher than test norms; however, the control group PDI at 24 months was similar to the Bayley norm (Table 2).

TABLE 2. Performance on the Bayley Scales of Infant Development by the Term Group*

Bayley Scales       Standard Score   6 Months   15 Months   24 Months
MDI (mean ± SD)     100 ± 16         111 ± 11   114 ± 13    115 ± 21
  -1 SD             84               100        101         94
  -2 SD             68               89         88          73
PDI (mean ± SD)     100 ± 16         105 ± 10   109 ± 11    99 ± 13
  -1 SD             84               95         98          86
  -2 SD             68               85         87          73

* MDI, Mental Developmental Index; PDI, Psychomotor Developmental Index.
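The expectation that roughly 16% of a population scores below 84 follows from a normal distribution with mean 100 and SD 16. The Python sketch below checks that figure and, as an added assumption not made in the paper, applies the same normal model to the 6-month control group values (111 ± 11); the variable names are illustrative only.

```python
from statistics import NormalDist

# Published Bayley MDI norms and the 6-month term control group values.
published = NormalDist(mu=100, sigma=16)
control_6m = NormalDist(mu=111, sigma=11)  # normality is assumed here

cutoff = 84  # 1 SD below the published test mean

p_published = published.cdf(cutoff)   # share expected below 84 under test norms
p_control = control_6m.cdf(cutoff)    # share expected below 84 under control norms
expected_n_control = 124 * p_control  # expected count among 124 control children

print(f"{p_published:.1%} vs {p_control:.1%}; "
      f"about {expected_n_control:.1f} of 124 controls")
```

Under this assumption fewer than 1% of control children would be expected to score below 84 at 6 months, which is in line with the single child observed.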

In contrast, the performance of our term group on the GCI of the McCarthy scales at 4 years (102 ± 14) was close to that of the standardized test score (100 ± 16). Scores on the Verbal (51 ± 8), Perceptual Performance (51 ± 9), Quantitative (51 ± 10), and Memory (51 ± 8) scales of the McCarthy were all similar to test norms (50 ± 10). Scores on the Motor scale (46 ± 9) were somewhat lower than the test norm (50 ± 10).

Level of maternal education was directly related to children’s cognitive performance at all ages tested; the relationship became more significant with increasing age (Table 3). However, even for children whose mothers had less than 12 years of education, mean MDIs throughout the first 2 years of life averaged well above the published test mean of 100. Only at 4 years of age did the mean GCI on the McCarthy scale fall below 100 for children whose mothers had the least amount of education.

Preterm Group. The Bayley MDIs averaged 101 ± 16, 99 ± 18, and 95 ± 19 at 6, 15, and 24 months, respectively. The Bayley PDIs averaged 96 ± 14, 98 ± 20, and 89 ± 17 at the three time periods, respectively. At 4 years of age, the GCI of the McCarthy averaged 92 ± 15. Three children were severely impaired and could not be formally tested. There were no differences in the cognitive performance of preterm children by gestational age or gender (Table 4). The effect of maternal education on cognitive outcome for preterm children was less dramatic than that for the term children, reaching statistical significance only at 4 years (Table 4).

Table 5 compares the categorization of developmental outcome for preterm children when the “normal range” was defined by standardized test scores vs the performance of the matched term controls. During the first 2 years of life, normal outcome was overestimated and abnormal outcome was underestimated when standardized test norms were used. At 6 months, when the Bayley MDI norms were used, 84% of the preterm infants were classified as normal and only 6% were classified as severely delayed; the proportion of infants classified as normal decreased to 52% and the proportion classified as severely delayed increased to 21% when control norms were used (P < .001). Similarly, at 15 months, the proportion of preterm children classified as normal decreased from 82% to less than 50% when the Bayley norms were replaced by our more stringent control norms (P < .001). At 24 months of age, when performance was gauged against Bayley standardized test scores, 70% of the preterm children were classified as normal and fewer than 10% showed severe delays; however, when control group test scores were used to define outcome, fewer than half the children were classified as normal and twice as many showed severe developmental delays (P < .005). At 4 years of age, when the McCarthy scales were administered, categorization of outcome for preterm children was similar whether the scale norms or control norms were used. Only on the Memory scale were there significantly more preterm children categorized as normal using the test mean than when using the control mean (Table 5).
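To show how a χ² comparison of two categorizations can be computed, the sketch below applies a standard Pearson χ² test of independence to the 6-month Bayley MDI counts from Table 5 (104/12/8 under test norms vs 65/33/26 under control norms). This is an assumption-laden illustration: the paper does not describe exactly how its P values were derived, the same children appear in both rows (so the independence assumption of the test is not strictly met), and the helper name `chi_square` is hypothetical.

```python
import math


def chi_square(table):
    """Pearson chi-square statistic and degrees of freedom for an r x c table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df


# Rows: test norms, control norms. Columns: normal, mild-moderate, severe.
table_6m = [[104, 12, 8],
            [65, 33, 26]]

stat, df = chi_square(table_6m)
# For df == 2 only, the chi-square upper-tail probability is exp(-stat / 2).
p_value = math.exp(-stat / 2)
print(f"chi-square = {stat:.2f}, df = {df}, P = {p_value:.1e}")
```

The statistic is about 28.3 on 2 degrees of freedom, well beyond the P < .001 threshold reported for this row.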

When changes in cognitive performance of preterm children are examined across time, different conclusions are reached depending on normative data employed (Figure). When gauged against standardized test norms, the proportion of “normal” children was greatest at the earliest assessment periods. More than 80% of children were classified as normal at 6 and 15 months; this decreased to 70% by 2 years. Between 2 and 4 years, the percent of preterm children classified as normal remained constant. In contrast, when compared with our control group, normal outcome for preterm children was greatest at 4 years of age; the proportion of preterm children who were normal increased from 47% at 2 years to 60% at 4 years. This was reflected by significantly improved standardized z scores between 2 and 4 years in the preterm group. The mean standardized MDI at 2 years was -1.01, indicating that the mean MDI was approximately 1 SD below the mean of the control group; in contrast, the mean standardized GCI at 4 years was -0.79 (P < .01).

TABLE 3. Effect of Maternal Education on Cognitive Performance in the Term Group*

Maternal Education   6-Month      15-Month     24-Month     4-Year
                     Bayley MDI   Bayley MDI   Bayley MDI   McCarthy GCI
<12 y (n = 38)       108 ± 12     110 ± 16     107 ± 20     97 ± 13
12 y (n = 39)        111 ± 9      114 ± 11     118 ± 18     102 ± 13
>12 y (n = 47)       114 ± 10     117 ± 11     121 ± 22     107 ± 14
P value              <.01         <.01         <.01         <.005

* MDI, Mental Developmental Index; GCI, General Cognitive Index. Values represent mean ± SD.

TABLE 4. Cognitive Performance in the Preterm Group*

                          6-Month      15-Month     24-Month     4-Year
                          Bayley MDI   Bayley MDI   Bayley MDI   McCarthy GCI
Preterm group (n = 121)   101 ± 16     99 ± 18      95 ± 19      92 ± 15
Gestational age
  24-27 wk (n = 40)       98 ± 19      99 ± 17      94 ± 19      95 ± 15
  28-31 wk (n = 81)       103 ± 15     99 ± 19      96 ± 19      91 ± 15
Gender
  Male (n = 70)           102 ± 16     98 ± 19      93 ± 19      91 ± 13
  Female (n = 51)         100 ± 17     101 ± 17     98 ± 19      94 ± 17
Maternal education
  <12 y (n = 37)          100 ± 17     97 ± 18      94 ± 18      88 ± 15
  12 y (n = 46)           99 ± 17      98 ± 17      93 ± 15      91 ± 13
  >12 y (n = 38)          105 ± 15     103 ± 20     99 ± 23      97 ± 16†

* MDI, Mental Developmental Index; GCI, General Cognitive Index. Values represent mean ± SD.
† P < .01, >12 years vs 12 years.

TABLE 5. Categorization of Developmental Outcome in Preterm Children, Using Test Norms and Control Norms*

                           Test Norms                            Control Norms                         P
                           Normal     Mild-Moderate   Severe     Normal     Mild-Moderate   Severe
                           Outcome    Delay           Delay      Outcome    Delay           Delay
Bayley MDI
  6 Months                 104 (84)   12 (10)         8 (6)      65 (52)    33 (27)         26 (21)    <.001
  15 Months                100 (82)   9 (7)           13 (11)    60 (49)    33 (27)         29 (24)    <.001
  24 Months                83 (70)    26 (22)         10 (8)     56 (47)    45 (38)         18 (15)    <.005
Bayley PDI
  6 Months                 104 (84)   12 (10)         8 (6)      73 (59)    31 (25)         20 (16)    <.001
  15 Months                89 (73)    20 (16)         13 (11)    75 (62)    10 (8)          37 (30)    <.001
  24 Months                75 (63)    27 (23)         16 (14)    75 (63)    20 (17)         23 (20)    NS
McCarthy scales
  GCI                      86 (71)    25 (20)         11 (9)     73 (60)    35 (29)         14 (11)    NS
  Subscales
    Verbal                 83 (69)    30 (25)         7 (6)      72 (60)    36 (30)         12 (10)    NS
    Perceptual Performance 76 (63)    32 (27)         12 (10)    73 (61)    25 (21)         22 (18)    NS
    Quantitative           91 (76)    23 (19)         6 (5)      88 (74)    22 (18)         10 (8)     NS
    Memory                 99 (83)    16 (13)         5 (4)      78 (65)    33 (27)         9 (8)      <.01
    Motor                  44 (37)    51 (43)         24 (20)    57 (48)    38 (32)         24 (20)    NS

* Test results were used to classify developmental outcome as normal (indices higher than 1 SD below the mean), mild-moderately delayed (indices between 1 and 2 SD below the mean), or severely delayed (indices ≥2 SD below the mean). Values represent number (percent). MDI, Mental Developmental Index; PDI, Psychomotor Developmental Index; GCI, General Cognitive Index; NS, not significant.

DISCUSSION

Results of neurodevelopmental follow-up studies of high-risk infants often are summarized to yield estimates of normal and impaired intellectual performance in survivors.14-18 The normal range is typically defined around the mean of the cognitive test administered. Most tests, including the Bayley Scales of Infant Development and the McCarthy Scales of Children’s Abilities, have similar standard scores (mean ± SD; 100 ± 16). Therefore, normal development is defined by a mental or cognitive index greater than 84 (1 SD below the mean); a mild or moderate disability is defined by an index between 69 and 84 (between 1 and 2 SD below the mean); and a severe developmental disability is defined by an index ≤68 (≥2 SD below the mean).14-18 Our data indicate that applying such standards to the early performance of preterm children might not be appropriate because these standards do not accurately reflect the performance of a socioeconomically matched term group. Our term control group scored significantly higher on the Bayley MDI throughout the first 2 years of life than the standard score of the test would have predicted. Therefore, when conventional standard scores were used to classify outcome among preterm children, many more children were considered normal and many fewer were classified as severely delayed than when performance was judged against the more stringent control norms.

Control groups often are not included in longitudinal follow-up studies because of the extra cost of testing and the difficulty in recruiting and maintaining participation from a population that is expected to be normal. A meta-analysis of 80 recently published studies that evaluated the outcome of low birth weight infants revealed that only 31% included a term control group.5 Significant relationships between socioeconomic and environmental factors and cognitive performance of low birth weight and preterm children have been demonstrated.17-19 Therefore, inclusion of an appropriate control population is essential to separate biologic from environmental influences on outcome as well as to compare outcome between populations of high-risk survivors who are cared for in different centers and who differ in social background.

FIGURE. Preterm children with normal cognitive development from 6 months to 4 years (percentage of children classified as normal by age of testing, using control group norms vs instrument norms); significantly more preterm children were categorized as normal at 6, 15, and 24 months when normal was defined by instrument norms than when control group norms were used. *P < .001; **P < .005.

Our data also demonstrate that although both the Bayley and McCarthy scales have similar published norms, scores on the two instruments are not comparable. In contrast to the Bayley MDI, our control group had a mean McCarthy GCI that was similar to that of the test standardization mean. As a result, there was a 13-point decrease in mean cognitive index of the control group from age 2 years (Bayley MDI: 115 ± 21) to age 4 years (McCarthy GCI: 102 ± 14). Therefore, while the preterm group demonstrated a 3-point decrease in mean scores between 2 and 4 years (95 ± 19 to 92 ± 15), this represented a relative improvement in performance compared with the control group. The discrepancy between the mean scores for the term and preterm groups was cut in half between age 2 (20-point discrepancy) and age 4 years (10-point discrepancy). These data highlight the importance of a matched control group for interpreting the effect of age on the development of preterm infants. An apparent change in cognitive function over time may be related to actual deterioration or improvement in function or may be erroneous because of comparisons on the basis of tests that differ in their level of difficulty.20
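The arithmetic behind the halved discrepancy can be laid out explicitly. The Python sketch below uses only the group means and control SDs quoted in the text; the variable names are illustrative, and expressing the gap in control SD units is an added convenience rather than a calculation reported in this form by the authors.

```python
# Group means (and control SDs) reported in the text.
control = {"2 y (Bayley MDI)": (115, 21), "4 y (McCarthy GCI)": (102, 14)}
preterm = {"2 y (Bayley MDI)": 95, "4 y (McCarthy GCI)": 92}

gaps = {}
for age, (ctrl_mean, ctrl_sd) in control.items():
    gap = ctrl_mean - preterm[age]  # control mean minus preterm mean
    gaps[age] = gap
    print(f"{age}: gap = {gap} points ({gap / ctrl_sd:.2f} control SD)")

ages = list(control)
control_change = control[ages[1]][0] - control[ages[0]][0]  # change in control mean
preterm_change = preterm[ages[1]] - preterm[ages[0]]        # change in preterm mean
print(f"control change = {control_change}, preterm change = {preterm_change}")
```

The gap narrows from 20 to 10 points because the control mean falls by 13 points while the preterm mean falls by only 3. The mean z scores reported in the Results (-1.01 and -0.79) were calculated from individual children's scores, so they need not match these group-mean figures exactly.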

Several possible explanations for the discrepancies in performance on the Bayley Scales of Infant Development between our control group and the published normative sample include differences in background characteristics between the populations, differences in examiner techniques, and true changes in infant performance over time (necessitating revision of scale norms). The Bayley scales were standardized more than 30 years ago on a sample of 1262 children selected to be representative of the national population. Stratification variables in the original sample included sex (equal number of boys and girls), race (approximately 85% white), geographic residence (80% to 90% urban), and parental education (60% high school education). Children were recruited in 14 age groups from 2 to 30 months of age. Children were excluded only for obvious physical, mental, or behavioral problems.1 The socioeconomic demographics of our control population are similar to those of the Bayley sample. However, important differences include the recruitment of our control subjects from birth, careful screening for medical complications, and longitudinal follow-up. Thus, our control group included only appropriately grown term infants with uncomplicated perinatal courses. The extent to which our potentially healthier subjects contributed to their higher scores is difficult to quantitate. The McCarthy scales were standardized in much the same way as the Bayley by excluding only obviously mentally or physically handicapped children. The fact that we did not find significant differences between performance by our controls and the published McCarthy norms suggests that our “healthier” control population was not a major factor responsible for the higher scores on the Bayley Scales.

All testing with the Bayley Scales in our study was performed by the same examiner. It is possible that a tester bias resulted in the higher indices in the control group. However, our examiner was blinded to infant group membership as well as prior developmental test scores to minimize biases. The Bayley scales are designed to be administered to obtain the best test performance possible, allowing flexibility in the order of presentation of items, returning to items previously failed, and even crediting relevant behavior directly observed by the examiner outside the examination room.1 Therefore, biases would more likely result in lower scores. Infant testers today have considerably greater knowledge of infant development than did testers 30 years ago when the Bayley Scales were standardized. This improved understanding of infant behavior may play a major role in eliciting cooperation from infants, resulting in higher test scores.

The major factor contributing to the higher test scores by the control group compared with the Bayley standardization sample is probably a true change in performance of young children. For many populations in the United States, restandardization of intelligence tests has required “the toughening-up of the norms” to maintain the average IQ score at 100.20 A regular rise of approximately 0.3 of an IQ point per year on the Stanford-Binet and Wechsler tests over the period 1932 to 1978 has been reported.20,21 Infant tests have undergone similar revisions. The current mean developmental quotient on the Griffiths Scale is 110, a full 10 points higher than the original mean.21 The Gesell Developmental Schedules recently were restandardized, approximately 40 years after the original schedules were published.22 The Revised Developmental Schedules found a 10% to 20% acceleration of performance in adaptive, gross motor, language, and personal-social behavior. This resulted in earlier age placement for many items. For example, building a tower of three cubes was placed at an 18-month level on the original Gesell and at a 15-month level on the revised schedule. The Bayley scales place the same item at a mean age of 16.7 months. Similarly, combining two- to three-word sentences was placed at a 21-month level on the original Gesell and revised to an 18-month level. The Bayley scales place a similar language item at a 20.6-month level. Recently, Campbell and colleagues23 published data suggesting that the Bayley Scales of Infant Development may also require restandardization. The Bayley scales were administered to 305 one-year-old term infants. Despite a relatively low socioeconomic composition of their sample (mostly rural, 50% black, 48% maternal education less than 12 years), mean MDI (111 ± 13) and PDI (110 ± 18) were considerably above the Bayley normative sample mean of 100 ± 16 and similar to values reported for our control group at 6 and 15 months of age.
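As a rough check on the size of such norm drift, the reported rate of about 0.3 IQ point per year can be accumulated over an interval. The sketch below is illustrative only: it assumes the rise is linear and that a rate reported for the Stanford-Binet and Wechsler tests carries over to infant scales, neither of which is established here. The function name `expected_norm_drift` is hypothetical.

```python
def expected_norm_drift(points_per_year: float, years: float) -> float:
    """Cumulative rise in mean score, assuming a constant linear drift."""
    return points_per_year * years


# Reported rate: about 0.3 IQ point per year over 1932 to 1978.
drift_1932_1978 = round(expected_norm_drift(0.3, 1978 - 1932), 1)

# The Bayley norms are described as more than 30 years old,
# so 30 years is used here as a lower bound (an assumption).
drift_30_years = round(expected_norm_drift(0.3, 30), 1)

print(drift_1932_1978, drift_30_years)
```

Under these assumptions, roughly 9 points of drift over 30 years would be of the same order as the 11- to 15-point excess of the control group over the published Bayley mean, although it would not account for all of it.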

It seems likely that our control group’s high scores on the Bayley Scales represent a real change in performance of young children since the scales were derived. A wide variety of intervention activities (parent-child education programs, availability of developmentally appropriate infant toys, public education programs, and knowledge regarding normal infant development) may have promoted improved health and development for young children in the decades since the Bayley norms were developed. It is also possible that knowledge of infant development has improved our ability to test infants, resulting in better test performance. Regardless of cause, these findings have important implications for interpreting results of developmental outcome for high-risk infants.
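To make the consequence of the choice of norms concrete, the sketch below classifies a single Mental Developmental Index under the published Bayley norm (100 ± 16) and under the 24-month control group norm reported in this study (115 ± 21). The category boundaries follow the definitions given in the abstract; the function name `classify`, the treatment of scores falling exactly on a boundary, and the example score of 90 are assumptions made for illustration.

```python
def classify(score: float, mean: float, sd: float) -> str:
    """Classify a score relative to a reference mean and SD.

    normal:              higher than 1 SD below the mean
    mild-moderate delay: between 1 and 2 SD below the mean
    severe delay:        2 SD or more below the mean (boundary handling assumed)
    """
    if score > mean - sd:
        return "normal"
    if score > mean - 2 * sd:
        return "mild-moderate delay"
    return "severe delay"


# Reference values taken from the study: published Bayley norm and
# the term control group at 24 months.
INSTRUMENT_NORM = (100, 16)
CONTROL_NORM_24_MONTHS = (115, 21)

example_score = 90  # hypothetical child, not a value from the study

by_instrument = classify(example_score, *INSTRUMENT_NORM)
by_control = classify(example_score, *CONTROL_NORM_24_MONTHS)
print(by_instrument, "|", by_control)
```

With these reference values the lower limit of normal is 84 under the instrument norm and 94 under the control group norm, so the same score can be read as normal by one standard and as delayed by the other.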

REFERENCES

1. Bayley N. Bayley Scales of Infant Development. New York, NY: Psychological Corporation; 1969

2. McCarthy D. The McCarthy Scales of Children’s Abilities. New York, NY: Psychological Corporation; 1972

3. Terman LM, Merrill MA. Stanford-Binet Intelligence Scales (Rev 3). Stanford, CA: Stanford University Press; 1972

4. Wechsler D. Manual for the Wechsler Intelligence Scale for Children-Revised. New York, NY: Psychological Corporation; 1974

5. Aylward GP, Pfeiffer SI, Wright A, Verhulst SJ. Outcome studies of low birth weight infants published in the last decade: a metaanalysis. J Pediatr. 1989;115:515-520

6. Dubowitz LM, Dubowitz V, Goldberg D. Clinical assessment of gestational age in the newborn infant. J Pediatr. 1970;77:1

7. Slagle TA, Oliphant M, Gross SJ. Cingulate sulcus development in the preterm infant. Pediatr Res. 1989;26:598-602

8. Cryotherapy for Retinopathy Study Group. Multicenter trial of cryotherapy for retinopathy of prematurity: preliminary results. Pediatrics. 1988;81:697-706

9. Ellison PH, Horn J, Browning C. Construction of an infant neurological international battery (INFANIB) for the assessment of neurological integrity in infancy. Phys Ther. 1985;65:1326-1331

10. Gross SJ. Conductive hearing loss at age 4 years in children who were 32 weeks of gestational age. Pediatr Res. 1991;29:119A

11. D’Eugenio D, Flood MM, Gross SJ. Persistent cognitive delays in preterm infants with transient neuromotor abnormalities. Pediatr Res. 1989;25:250A

12. Hamill PVV, Drizd TA, Johnson CL, et al. Physical growth: National Center for Health Statistics percentiles. Am J Clin Nutr. 1979;32:607-629

13. Nellhaus G. Composite international and interracial graphs. Pediatrics. 1968;41:106

14. Kitchen WH, Ford GW, Richards AL, Lissenden JV, Ryan MM. Children of birth weight <1000 g: changing outcome between ages 2 and 5 years. J Pediatr. 1987;110:283-288

15. Kitchen W, Ford G, Orgill A, et al. Outcomes in infants with birth weight 500 to 999 gm: a regional study of 1979 and 1980 births. J Pediatr. 1984;104:921-927

16. Saigal S, Szatmari P, Rosenbaum P, Campbell D, King S. Intellectual and functional status at school entry of children who weighed 1000 grams or less at birth: a regional perspective of births in the 1980s. J Pediatr. 1990;116:409-416

17. Leonard CH, Clyman RI, Piecuch RE, et al. Effect of medical and social risk factors on outcome of prematurity and very low birthweight. J Pediatr. 1990;116:620-626

18. Hoffman EL, Bennett FC. Birth weight less than 800 grams: changing outcomes and influences of gender and gestational number. Pediatrics. 1990;86:27-34

19. Escalona SK. Babies at double hazard: early development of infants at biologic and social risk. Pediatrics. 1982;70:670-676

20. Bill JM, Sykes DH, Hoy EA. Difficulties in comparing outcomes of low-birthweight studies because of obsolescent test norms. Dev Med Child Neurol. 1986;28:244-250

21. Murphy G. Are intelligence tests outmoded? Arch Dis Child. 1987;62:773-775

22. Knobloch H, Stevens F, Malone AF. Manual of Developmental Diagnosis: The Administration and Interpretation of the Revised Gesell and Amatruda Developmental and Neurologic Examination. Hagerstown, MD: Harper and Row; 1987

23. Campbell SK, Siegel E, Parr A, Ramey CT. Evidence for the need to renorm the Bayley Scales of Infant Development based on the performance of a population-based sample of 12-month-old infants. In: Topics in Early Childhood Special Education. 1986;6(2):83-92


