Lead Levels and Intelligence


PEDIATRICS Vol. 68 No. 6 December 1981 903

Letters to the Editor

Statements appearing here are those of the writers and do not represent the official position of the American Academy of Pediatrics, Inc., or its Committees. Comments on any topic, including the contents of PEDIATRICS, are invited from all members of the profession: those accepted for publication will not be subject to major editorial revision, but generally must be no more than 400 words in length. Shorter letters will be published earlier. The editors reserve the right to publish replies, and may solicit responses from authors and others.

Letters should be submitted in duplicate in double-spaced typing on plain white paper. Send them to Jerold F. Lucey, M.D., Editor, Pediatrics Editorial Office, Mary Fletcher Hospital, Colchester Avenue, Burlington, VT 05401.

Lead Levels and Intelligence

To the Editor.-The issue raised in Needleman’s commentary (Pediatrics 68:894, 1981) is the interpretation of two studies in an area in which methodologic concerns are paramount. We appreciate the pioneering efforts in the studies that preceded the Boston study1 and our current paper2; from these published reports and informal exchange of material all of us have learned much that guides more sophisticated endeavor. Two recent objective reviews3,4 describe some, but not all, of the design problems pertaining to the earlier works.

In our discussion of methodologic issues we mentioned some aspects of the Boston study. This was for illustration and to direct the reader toward our conclusion, ie, that the design difficulties are such that there is a strong possibility of error in a conclusion that relatively low levels of lead burden lead to developmental deficit. In a later review by Needleman and Landrigan,5 the caution in interpretation of other studies is not applied in the description of the Boston study. To the contrary, they identify five design difficulties: (a) difficulty in identification of lead exposure; (b) ascertainment bias; (c) lack of sensitive outcome measures; (d) insufficient control for confounding variables; and (e) inadequate sample size. They then state that the Boston study “deals systematically” with each of these difficulties. Although this is one of the most comprehensive studies to date, careful evaluation of these design difficulties and the crucial issues of appropriate statistical test and studywise error leaves many questions.

1. Identification of Lead Exposure. Dentine lead level was the only measure used. Difficulties in reliability of measurement are clear in the report of discordance in determinations from the same or a later tooth. There was a greater loss due to nonconcordance from the group with high lead levels (upper tenth percentile) than from the group with low lead levels (lower tenth percentile). Lack of concordance is consistent with our experience in which several determinations from the same tooth differed appreciably.

In the Boston study, 81 of the children had prior blood lead level determinations but these were not related to outcome measures. (A significant R² increment over control variables for lead for the group with high levels of lead with no such increment for the group with low levels of lead would suggest a threshold.) Multiple measures, with markers of tissue effect (erythrocyte protoporphyrin, as in our study, or inhibition of 5-aminolevulinic acid dehydratase) supplementing blood and/or dentine lead level will give more reliable measurement.

2. Ascertainment Bias. We view the subject loss reported in the Boston study as including those lost because of “infant at home, working parents, etc.” because in our experience parents in these categories who are motivated do come to a research center and less motivated parents do not. Motivation may well be related to lead access and to outcome variables. The arithmetic is less important. Of the 524 provisionally eligible children of the Needleman et al study, 366 were excluded (70%); of these, 170, almost half, were excluded for “high mobility, lack of parent interest, or unwillingness to cooperate within the structures of the study protocol (as requiring children to be brought to the research center).”
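The proportions quoted above can be rechecked from the three counts given in the text. The following is only a small arithmetic sketch; the variable names are illustrative and do not come from either study.

```python
# Counts as quoted in the letter; variable names are illustrative only.
provisionally_eligible = 524
excluded = 366
excluded_mobility_or_interest = 170

share_excluded = excluded / provisionally_eligible
share_of_exclusions = excluded_mobility_or_interest / excluded
retained = provisionally_eligible - excluded

print(f"excluded from the eligible pool: {share_excluded:.1%}")
print(f"exclusions due to mobility/interest/cooperation: {share_of_exclusions:.1%}")
print(f"children remaining after exclusions: {retained}")
```

The first share rounds to the 70% stated in the letter, and the second falls a little under one half, consistent with “almost half.”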

3. Measures of Performance. Needleman et al selected a large and comprehensive battery of outcome measures. Here we will discuss the content; the size of the battery poses problems dealt with below. With the exception of the WISC-R, the sensitive and well validated tests in the battery were not discussed, presumably for failure to reach statistical significance. Academic achievement and perceptual motor skills are particularly important. The Peabody tests of academic achievement (mathematics, reading recognition, and reading comprehension) are reliable and valid measures related to classroom performance. Although impairment of perceptual motor skills is often related to insult to the brain, the Boston report includes no mention of their results with the Visual Motor Integration Test, the Frostig Test, or “Elements of the Halstead-Reitan Battery.”

With the exception of the WISC-R Scale, the only significant findings of the Boston study involve exploratory procedures of limited or inappropriate standardization. The Seashore Rhythm Test,6 which evaluates musical talent of children of at least fourth grade, might serve as a measure of attention for first and second graders. It is not standardized for this purpose, but has been used (with only minimal discriminating power) in a study7 of brain damage in children more than 9 years of age. The Token Test8 has previously been used in a study of receptive language in aphasic adults. The Sentence Completion Test9 appears to have been used before only in a differentiation of 20 normal and 20 dyslexic white males. The Reaction Time measures could not be traced; the article10 cited is an essay on chronic adult schizophrenia which makes some mention of reaction time but includes no pertinent methods or data. The Teacher Rating Scale was apparently devised for this study. It is well known that single dichotomous items tend to be unreliable. (Using factor scores of a somewhat better standardized Teacher Rating Scale, the New York group found no lead effect.)

This leaves only one sensitive well-validated psychological test (and some of its intercorrelated subtests) statistically related to lead level in the analyses reported. For the intelligence measure, there is no indication of whether the reported means, 106.6 for the group with low levels of lead and 102.1 for the group with high levels of lead, are adjusted for the covariates. Failure to adjust means may not matter when the units of measurement are unfamiliar; when it is in IQ units the uncorrected reported difference between groups is spuriously large. Standard deviations are not provided, although they are included elsewhere. The proportions of variance accounted for after covariance analysis are not presented. Given that the obtained effect is small (even if these are adjusted means) and part of that effect may be due to confounding, the results require much more caution in interpretation than they have received. Furthermore, as Rutter4 noted, the effect, given that it is true, is exaggerated by the selection of extreme groups.

4. Confounding Variables. No correction for confounding conditions was made in the Boston study results with items of the Teacher Rating Scales. These ratings form the basis of the conclusion that there is a dose-response effect and that there are classroom performance deficits.

Although the list of control variables in the Boston study is comprehensive, confounding remains a source of concern. Bias cannot be eliminated completely because the relevant attributes are inevitably undermeasured and the nature of the statistical model leads to undercorrection. The appropriate compensation lies not in mechanical dependence on a statistical package but in the scientific awareness of bias and due caution in the interpretation of small effects when bias exists.

5. Multiple Outcome Measures. Needleman and associates again ignored the issue of multiple outcome measures. Two problems are noted: the failure to consider the large number of statistical tests in considering the probability levels, and the treatment of each of these tests as if it were independent, ie, as if outcome measures were uncorrelated with each other.

Although the statistical treatment of many variables in the Boston study is not described, it appears that there were at least 52 independent univariate tests. One method of coping with the studywise error rate was suggested for subscales of a test, but not for the study as a whole. With 52 independent tests and α = .05 the studywise type I error rate, or probability of obtaining one or more significant tests by chance, is .93, not .05. Even if a more stringent level, α = .01, is chosen, the studywise error rate is .41. The error rate increases drastically with the number of variables.11 This is an infrequently recognized consideration when a large study is planned.
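The two studywise figures can be reproduced with the standard formula for independent tests, 1 - (1 - α)^k. The sketch below assumes independence, as the letter does for this calculation; the function name is illustrative and not taken from either paper.

```python
def studywise_error_rate(alpha: float, n_tests: int) -> float:
    """Probability of at least one type I error among n_tests independent
    tests, each run at per-test significance level alpha."""
    return 1.0 - (1.0 - alpha) ** n_tests


# The letter's scenario: 52 independent univariate tests.
for alpha in (0.05, 0.01):
    rate = studywise_error_rate(alpha, 52)
    print(f"alpha = {alpha}: studywise error rate = {rate:.2f}")
# alpha = 0.05: studywise error rate = 0.93
# alpha = 0.01: studywise error rate = 0.41
```

Because the outcome measures in such a battery are intercorrelated, these values describe the independent-test case only and should not be read as the exact error rate of either study.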

The probabilities of obtaining in a field of 52 tests the specific P values reported by Needleman et al (and of obtaining the few reported in our study) could be computed. The probabilities have not been computed, however, because these are based on the assumption of independence. Failure to consider the intercorrelations among the many outcome variables can grossly distort the results. Unlike the biases associated with subject loss or confounding, the directions of distortion are not always predictable.12

There are techniques available to handle inference in studies with large numbers of intercorrelated (and sometimes unreliable) measures. A hierarchical strategy in organizing variables would have provided a means for reducing the number of statistical tests as well as giving a better picture of the performance measures. Data reduction procedures such as the formation of indices or factor scores can be helpful. Once the data are reduced to a manageable number of variables, multivariate procedures, such as T², MANOVA, discriminant function, or canonical correlation, will handle the problem of intercorrelations among variables and, by reducing the number of statistical tests, will considerably reduce the problem of studywise error. Data in a study of this sort should not be analyzed by independent univariate tests.

6. Sample Size. We bypassed the issue of sample size until the major statistical issues were identified. The purpose of increasing sample size is to increase the power of a statistical test.13 Power is the ability to discern an effect given that there is such an effect. The sample size of the Boston study would have been quite adequate had the number of variables been reduced. If the large number of variables (at least 52 plus 6 control variables) had been appropriately analyzed by multivariate methods (as Hotelling’s T²) it would have taken a much larger sample to compensate for this loss of power. “. . . having more variables when fewer are possible increases the risks of both finding things that are not so and failing to find things that are. These are serious costs indeed.”11

In summary, research on this topic is not easy, and if things can go wrong they will. Available specimens may not yield wholly reliable indices of lead effect and multiple indices may be impractical or expensive. Parents can and do refuse participation. We cannot measure every confounding variable and the measures obtained are necessarily less than perfect. Statistical control of confounding undercorrects. We may fail to obtain positive findings with most of our sensitive and reliable performance measures and obtain them with exploratory measures. Our real criticisms of Needleman and associates are that they refuse to recognize these limitations of their study and write as if they have solved these problems, and that they incorrectly used a very large number of univariate statistical tests with little recognition of studywise error and related design problems.

It was with full recognition of all of these issues that we concluded in our report that “if there are, in fact, behavioral and intellectual sequelae of low levels of lead burden independent of other aspects of parental and social influences on development, these effects are minimal.” The reasons are clearly stated in our report.

Several points in the commentary require some response. The contrast of parent-child intelligence correlations for groups with low and moderate levels of lead was replicated and reported in our study.2(p917) The difference between .48 and .33 was not significant. (Incidentally, this kind of analysis was not done by Needleman et al, who had both appropriate data and knowledge of the mode of analysis.)
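One common way to test whether two correlations from independent groups differ is Fisher’s r-to-z transformation. The sketch below is an illustration of that general method, not necessarily the procedure the authors used; the group sizes are hypothetical, since the letter does not report them, and the function name is illustrative.

```python
import math


def compare_independent_correlations(r1, n1, r2, n2):
    """Fisher r-to-z test for the difference between two correlations
    from independent samples. Returns (z statistic, two-sided p value)."""
    z1, z2 = math.atanh(r1), math.atanh(r2)          # Fisher transformation
    se = math.sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))  # standard error of z1 - z2
    z = (z1 - z2) / se
    p = math.erfc(abs(z) / math.sqrt(2.0))           # two-sided normal p value
    return z, p


# HYPOTHETICAL group sizes (40 per group), chosen only for illustration.
z, p = compare_independent_correlations(0.48, 40, 0.33, 40)
print(f"z = {z:.2f}, two-sided p = {p:.2f}")
```

With groups of this assumed size the difference between .48 and .33 falls well short of conventional significance; with much larger groups the same difference could reach it, which is why the actual sample sizes matter.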

We agree that the data from our reading tests are less reliable than is desirable because of unexpected diversity. In no currently reviewed study3-5 or any other study known to us that included academic achievement tests are these tests significantly related to lead level. This is puzzling if there is indeed a true effect on intelligence.

We consider the study of Perino and Ernhart14 to be one of the better earlier studies, but it too suffers from some of these same problems, particularly the use of univariate statistical tests. Needleman and associates suggest that we are committed to the conclusion drawn from the earlier, less completely designed and analyzed study. The conclusions drawn in our present report are not consistent with our prior opinion nor with the hypotheses with which we started this study. A careful evaluation of the pattern of very minimal results and the review of methodologic issues discussed force the inferences drawn. Needleman and associates appear unable to see the difficulties in this area of research and cling tenaciously to a poorly supported conclusion. We can’t.

CLAIRE B. ERNHART, PHD

Case Western Reserve University and

Department of Psychiatry

Cleveland Metropolitan General Hospital

Cleveland, OH 44109

BETH LANDA, PHD

Hillside Research Center

Long Island Jewish/Hillside Medical Center

NORMAN B. SCHELL, MD, MPH

Department of Health

County of Nassau, New York

REFERENCES

1. Needleman HL, Gunnoe CE, Leviton A, et al: Deficits in psychologic and classroom performance of children with elevated dentine lead levels. N Engl J Med 300:689, 1979
2. Ernhart CB, Landa B, Schell NB: Subclinical levels of lead and developmental deficit-A multivariate follow-up reassessment. Pediatrics 67:911, 1981
3. Bornschein R, Pearson D, Reiter L: Behavioral effects of moderate lead exposure in children and animal models. CRC Crit Rev Toxicol 8:43, 1980
4. Rutter M: Raised lead levels and impaired cognitive/behavioural functioning: A review of the evidence. Dev Med Child Neurol 22(suppl 1):25, 1980
5. Needleman HL, Landrigan PJ: The health effects of low level exposure to lead. Annu Rev Public Health 2:227, 1981
6. Seashore C, Lewis D, Saetveit J: Measures of Musical Talents. New York, Psychological Corp, 1956
7. Reitan RM, Davison LA: Clinical Neuropsychology: Current Status and Applications. New York, John Wiley, 1974
8. DeRenzi E, Vignolo LA: The Token Test: A sensitive test to detect receptive disturbances in aphasics. Brain 85:665, 1962
9. Vogel SA: Syntactic Abilities in Normal and Dyslexic Children. Baltimore, University Park Press, 1975
10. Shakow D: Segmental set. Arch Gen Psychiatry 6:1, 1962
11. Cohen J, Cohen P: Applied Multiple Regression/Correlation Analysis for the Behavioral Sciences. Hillsdale, NJ, Lawrence Erlbaum, 1975
12. Tatsuoka MM: Selected Topics in Advanced Statistics. No. 6: Discriminant Analysis. Champaign, IL, Institute for Personality and Ability Testing, 1970
13. Cohen J: Statistical Power Analysis for the Behavioral Sciences. New York, Academic Press, 1977
14. Perino J, Ernhart CB: The relation of subclinical lead level to cognitive and sensorimotor impairment in black preschoolers. J Learn Disabil 7:26, 1974

The Bowel Cocktail

To the Editor.-In a recent issue of Pediatrics, a commentary by Bowie, Mann, and Hill1 addressed the subject of the treatment of persistent diarrhea with a “bowel cocktail.” As commentaries such as this one in a refereed journal are widely read and the recommendations put forth are frequently adopted by physicians in practice, we believe a number of points must be emphasized. First and foremost is that this commentary appears to be based on uncontrolled observations. Second, no mention was made as to how many infants had bacterial or parasitic infections, cases in which the bowel cocktail may have been effective. Third, the definition of persistent is unclear. Before the soy formula part of their scheme was instituted, was the diarrhea part of an acute illness of only two or three days’ duration, or was it of several weeks’ duration following either an acute or gradual onset? This information is extremely important.

In the United States and Canada the majority of acute diarrheal disease in infants is viral, most commonly due to the rotavirus.2,3 We are unaware of any data that indicate that the components of the bowel cocktail are effective antiviral agents. Furthermore, the natural


1981;68;903

Pediatrics

Claire B. Ernhart, Beth Landa and Norman B. Schell