Methodologic Standards for Controlled Clinical Trials of Early Contact and Maternal-Infant Behavior

Mary Ellen Thomson, MSc, and Michael S. Kramer, MD

From the Departments of Epidemiology and Health and Pediatrics, McGill University Faculty of Medicine, Montreal

ABSTRACT. To provide an objective evaluation of published studies on the effect of early contact on subsequent maternal-infant behavior, a set of 11 methodologic standards generally applicable to controlled clinical trials of perinatal care was developed. Sixteen reports of early contact trials were assessed and seven of the 11 standards were found to be satisfactorily fulfilled. The four “problem” standards were: adequate definition of subjects, randomization, subject bias, and treatment contamination (care giver) bias. Of the five best trials fulfilling eight or more of the standards, three reported a beneficial effect of early contact, while two demonstrated no effect. The evidence that early contact improves subsequent maternal-infant behavior thus remains inconclusive. It is urged that for future research in this domain more attention be given to adequate subject definition, strict randomization procedures, and safeguards against bias by the subjects or their care givers. Pediatrics 1984;73:294-300; methodologic standards, controlled clinical trials, perinatal care, maternal behavior.

Does a brief period of contact between a mother and her newborn at childbirth influence maternal and infant behavior days, months, or years later? A number of controlled clinical trials of early contact have been published in the past decade,1-16 most reporting positive results, ie, closer mother-infant ties. These results have contributed to public pressure that has brought about significant changes in obstetric care. Some physicians remain skeptical on intuitive grounds, however, that a few minutes of contact can have so lasting an outcome, and some question the quality of scientific evidence demonstrating benefits of early contact.17,18 Rutter,19 for example, concluded, “The findings are important but the claims concerning a sensitive period for maternal attachment rather outrun the empirical evidence.”

To evaluate the evidence in an objective way, we developed a set of methodologic standards for clinical trials of obstetric and neonatal care and then applied these standards to the published reports of early-contact trials.

Received for publication Dec 28, 1982; accepted April 7, 1983.

This is publication No. 84001 of the McGill University-Montreal Children’s Hospital Research Institute.

Presented in part at the annual meeting of the Ambulatory Pediatric Association in Washington, DC, May 14, 1982.

Mrs Thomson is a recipient of a studentship award of the Medical Research Council of Canada. Her current address is 3775 University St, Montreal, Quebec H3A 2B4, Canada.

Dr Kramer is a National Health research scholar of the National Health Research and Development Program, Health and Welfare Canada.

Reprint requests to (M.S.K.) Department of Epidemiology and Health, McGill University Faculty of Medicine, 3775 University St, Montreal, Quebec H3A 2B4, Canada.

American Academy of Pediatrics.

DEVELOPMENT OF STANDARDS

Based on the principles originally set out by Hill,20 amplified by recent literature,21-29 and modified by our experience, 11 standards were elaborated. We then tested these standards on trials of perinatal care that did not involve early contact and made appropriate modifications to arrive at the final version. These 11 standards are categorized as follows, according to Hill’s four stages of a clinical trial20: definition of the subjects (standard 1); allocation of the subjects to treatment groups (standards 2 to 4); laying down of the treatment schedule (standards 5 to 7); and measurement of the outcome (standards 7 to 11).

Standard 1 applies to the adequate definition of the subjects. Fulfillment of this standard requires the establishment of selection criteria. These criteria should be clearly stated in the report, and data should be provided concerning the proportion of the population meeting those criteria. Furthermore, if the participation rate (the proportion of those meeting the selection criteria who actually participated in the study) is low, the investigators should compare participants and nonparticipants in an attempt to show their comparability.

Standard 2 concerns the randomization or other methods used for allocating subjects to the treatments under study. In order to fulfill this standard, the author should not only mention that treatment was assigned on a random basis, but also describe the method of randomization. That method must, to be truly randomized, lie outside the investigator’s control.

Standard 3 concerns the verification that the allocation of treatments yielded comparison groups that were equivalent in all important respects, other than the treatment under study. In other words, have the researchers ensured that, ignoring possible treatment effects, the groups are equally likely to develop the outcomes under study? For studies involving mothers’ behavior toward their infants, a variety of clinical (parity, neonatal birth weight, gestational age, and health status) and sociodemographic (maternal age, ethnic origin, and socioeconomic status) factors could affect the outcome. If the treatment groups differ to an important extent on one or more of these variables, stratification or statistical adjustment should be performed when the outcomes are analyzed. The adjusted results should then be taken into account in drawing conclusions about the effects of treatment.
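
As a rough illustration of the stratification mentioned under standard 3, the sketch below pools stratum-specific 2 x 2 tables with the Mantel-Haenszel odds ratio. The choice of this estimator, the stratifying variable (parity), and all counts are assumptions made for the example; the reviewed trials did not necessarily use this method.

```python
def mantel_haenszel_odds_ratio(strata):
    """Pooled odds ratio across strata.

    Each stratum is a tuple (a, b, c, d):
      a = treated with outcome,   b = treated without outcome,
      c = control with outcome,   d = control without outcome.
    """
    numerator = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    denominator = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    if denominator == 0:
        raise ValueError("odds ratio is undefined for these tables")
    return numerator / denominator


# Hypothetical counts, stratified by parity (not data from any reviewed trial).
primiparous = (12, 8, 9, 11)
multiparous = (15, 5, 12, 8)

adjusted_or = mantel_haenszel_odds_ratio([primiparous, multiparous])
print(f"Parity-adjusted odds ratio: {adjusted_or:.2f}")
```

Comparing the pooled estimate with the crude (unstratified) odds ratio gives a quick indication of whether the base line imbalance matters for the conclusion.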

Standard 4 pertains to losses of subjects occurring after the subjects are randomized or otherwise allocated to treatment groups. The authors should provide the reader with both the number of subjects initially randomized and the number subsequently lost from all groups. Furthermore, if the losses are substantial, lost subjects should be compared (according to relevant base line variables and treatment group) with those retained in the study. If lost and retained subjects are not equivalent, stratification or statistical adjustment again becomes necessary in analyzing the results.

The fifth standard concerns the definition of the experimental and control treatments. Both should be described in adequate detail, and it should be clear that the treatments were defined before the trial commenced, not “in progress” during the course of the trial.

Standard 6 applies to possible contamination of the treatment stemming from bias on the part of the care givers. Investigators should provide and report adequate safeguards to ensure that the experimental group does not receive any treatment, beyond the maneuver under study, that is not also received by the control group. Differential support and encouragement by postpartum nursing staff is an obvious example of the kind of treatment contamination that could alter the results of a trial concerned with maternal behavior.

Standard 7 applies to subject bias, ie, the possibility that subjects’ awareness of their treatment group assignments could influence the outcome under study. This standard actually pertains to both of Hill’s two latter stages, since subject bias can affect either the treatment itself (a placebo effect) or the measurement of the outcome (ie, the subject behaves in a way she believes the researcher wants or expects her to behave). Fulfillment of this standard requires that adequate safeguards be reported for the design and conduct of the trial. Subjects in the experimental group should be kept unaware of any special status, and all subjects should be blind to the research hypothesis.

The eighth standard relates to the reliability and validity of the outcome measures used in the trial. In particular, the investigators should take steps to ensure that interobserver agreement is high among research staff responsible for measuring the outcome, or they should use single observers or methods not susceptible to subjective judgment. The validity of measures in the domain of maternal-infant interaction is often difficult to assess, but when validated measures are available, they should be used.

Standard 9 concerns the unbiased observation of outcome. It is fulfilled by blinding the observers to the treatment group assignment (experimental v control) of all subjects under study.

Standard 10 pertains to the soundness of statistical inferences made. In particular, the authors should make clear what their hypotheses were prior to the trial. Hypotheses generated by post hoc analyses of the data should be treated as tentative. Multiple testing should be accompanied either by downward adjustment of the α level (the threshold P value for inferring statistical significance) or by cautious interpretation of the findings if the customary α level (P < .05) is retained. Finally, so-called “trends” (usually reflecting differences associated with P > .05) should not be overinterpreted.
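
One common way to make the downward adjustment of the α level described under standard 10 is the Bonferroni correction, which divides α by the number of tests. The sketch below assumes that method (the text does not prescribe one), and the P values are hypothetical.

```python
def bonferroni_alpha(alpha, n_tests):
    """Per-test significance threshold after Bonferroni adjustment."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


def significant_after_adjustment(p_values, alpha=0.05):
    """Return, for each P value, whether it falls below the adjusted threshold."""
    threshold = bonferroni_alpha(alpha, len(p_values))
    return [p < threshold for p in p_values]


# Hypothetical P values for five outcomes measured in a single trial.
p_values = [0.004, 0.03, 0.04, 0.20, 0.009]
flags = significant_after_adjustment(p_values)

for p, flag in zip(p_values, flags):
    print(f"P = {p:.3f}: {'significant' if flag else 'not significant'} after adjustment")
```

With five tests, two of the three results that would pass the customary P < .05 threshold no longer do so once the threshold is lowered, which is the kind of cautious reading the standard asks for.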

Finally, the eleventh standard relates to clinical significance and statistical power. In a large trial, trivial differences may achieve statistical significance without reaching a magnitude that has clinical relevance or importance. Conversely, small trials may fail to produce statistically significant differences merely because they lack the power to detect small, but clinically meaningful, differences. Fulfillment of this final standard requires consideration of these aspects of sample size in [...]
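
The relationship between sample size and power described under standard 11 can be made concrete with a normal-approximation calculation for comparing two proportions. This is a sketch under stated assumptions: the approximation itself, the hypothetical outcome rates (50% v 70%), and the group size of 25 are all chosen for illustration and are not taken from the reviewed trials.

```python
from math import ceil, sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def approximate_power(p_control, p_treated, n_per_group, alpha=0.05):
    """Approximate power of a two-sided test comparing two proportions."""
    z_alpha = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    variance = (p_control * (1 - p_control) + p_treated * (1 - p_treated)) / n_per_group
    return _STD_NORMAL.cdf(abs(p_treated - p_control) / sqrt(variance) - z_alpha)


def subjects_per_group(p_control, p_treated, power=0.80, alpha=0.05):
    """Approximate number of subjects per group needed to reach the given power."""
    if p_control == p_treated:
        raise ValueError("the two proportions must differ")
    z_alpha = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    z_beta = _STD_NORMAL.inv_cdf(power)
    variance_sum = p_control * (1 - p_control) + p_treated * (1 - p_treated)
    return ceil((z_alpha + z_beta) ** 2 * variance_sum / (p_treated - p_control) ** 2)


# Hypothetical example: an outcome seen in 50% of control and 70% of treated mothers.
power_small_trial = approximate_power(0.50, 0.70, n_per_group=25)
needed = subjects_per_group(0.50, 0.70)
print(f"Power with 25 per group: {power_small_trial:.2f}")
print(f"Subjects per group for 80% power: {needed}")
```

Under these assumptions a trial with 25 subjects per group has well under a 50% chance of detecting a 20-point difference, so a negative result from such a trial says little about the absence of an effect.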

METHOD OF EVALUATION

We evaluated only trials published in refereed journals that assessed the effect of extra contact between mothers and term infants in the first few hours and days after birth. Abstracts or accounts given in books were thus excluded. We located 16 reports of 13 different trials. (The three extra reports concerned two trials in which subsequent follow-up studies measured longer-term outcomes in the same subjects as the original trials.) We emphasize that our evaluation concerned the reports of trials, rather than the actual methods used in the conduct of these trials, because it is the reports that are generally available for consideration and that serve as the basis for policy decisions.

Our aim was to evaluate the methodologic quality of these trials, ie, the design, conduct, and reporting of experiments that measured the effect of treatments (early contact) on outcomes (certain maternal-infant behaviors) chosen by the investigators; we did not assess the appropriateness of the chosen treatments and outcomes themselves. We thus did not concern ourselves with such issues as the types of mothers and infants; the nature, duration, and timing of the contact; nor the short-term or long-term importance of the behaviors studied. These biologic-clinical issues have been critically reviewed by Siegel.30

Criteria for fulfillment of the 11 methodologic standards are contained in the “Appendix.” For four of the 11 (standards 4, 6, 9, and 11), the criteria were judged as either fulfilled or nonfulfilled. For the remaining seven standards, partial fulfillment resulted in assignment to an intermediate category.

We independently assessed each report by each of the 11 standards. Agreement between the two authors was high, with a weighted κ score of +0.69. (Weighted kappa is a statistical measure of concordance between two observers that corrects for chance-expected agreement and allows for partial agreement and disagreement. It ranges in value from -1 to +1, with values above +0.5 representing high levels of concordance.31) We then compared and discussed our discrepant ratings, establishing a consensus rating for each of the 11 standards on all 16 reports, and the consensus ratings were then used to judge the fulfillment of the standards.
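
To show how a weighted kappa of this kind can be computed for three ordered rating categories, here is a minimal sketch. It assumes linear agreement weights (full credit for identical ratings, half credit for adjacent ones); the text does not state which weighting scheme was used, and the two rating lists are hypothetical, not the authors' actual ratings.

```python
from collections import Counter


def weighted_kappa(ratings_a, ratings_b, n_categories=3):
    """Weighted kappa with linear weights for ordered categories 0..n_categories-1."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("rating lists must be non-empty and of equal length")
    n = len(ratings_a)

    def weight(i, j):
        # 1 for exact agreement, decreasing linearly with the distance between categories.
        return 1 - abs(i - j) / (n_categories - 1)

    observed = sum(weight(a, b) for a, b in zip(ratings_a, ratings_b)) / n
    counts_a, counts_b = Counter(ratings_a), Counter(ratings_b)
    expected = sum(
        weight(i, j) * counts_a[i] * counts_b[j]
        for i in range(n_categories)
        for j in range(n_categories)
    ) / n ** 2
    if expected == 1:
        raise ValueError("kappa is undefined when chance agreement is complete")
    return (observed - expected) / (1 - expected)


# Hypothetical ratings: 0 = not fulfilled, 1 = partially fulfilled, 2 = fulfilled.
rater_1 = [2, 2, 1, 0, 0, 2, 1, 1, 0, 2]
rater_2 = [2, 1, 1, 0, 1, 2, 1, 0, 0, 2]
kappa = weighted_kappa(rater_1, rater_2)
print(f"Weighted kappa: {kappa:.2f}")
```

The correction for chance matters here: the two hypothetical raters agree exactly on 7 of 10 items, yet part of that agreement would be expected from their marginal rating frequencies alone.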

We calculated summary scores for each standard to quantitate, in a rough way, the overall strengths and weaknesses of the published reports by giving half credit for partial fulfillment and expressing the result as a percentage of the total possible score of 16:

\[
\text{Summary score} = 100 \times \frac{(\text{no. of articles with standard fulfilled}) + \tfrac{1}{2}\,(\text{no. of articles with standard partially fulfilled})}{16 \text{ total articles}}
\]
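
The calculation can be sketched in a few lines of Python. The example uses the Randomization row of the Table and assumes that "+" denotes fulfillment, "±" partial fulfillment, and "-" nonfulfillment; that reading of the symbols is inferred from the scoring rule rather than stated in the text.

```python
def summary_score(ratings):
    """Percentage score: full credit for '+', half credit for '±', none for '-'."""
    credit = {"+": 1.0, "±": 0.5, "-": 0.0}
    return 100 * sum(credit[rating] for rating in ratings) / len(ratings)


# Ratings for standard 2 (Randomization) across the 16 reports, as listed in the Table.
randomization = "- - - - ± ± ± ± ± - ± - ± ± ± -".split()
score = summary_score(randomization)
print(score)  # 28.125, reported as 28 in the Table
```

Nine partial fulfillments and no complete ones give 4.5 of a possible 16 points, which matches the 28% reported for this standard.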

RESULTS AND DISCUSSION

The consensus ratings for each of the 16 published reports on the 11 standards are shown in the Table, along with the summary scores for each standard. In general, the standards were well fulfilled, with seven of the 11 obtaining summary scores of 69% to 100%. These seven standards will receive no further discussion here; readers interested in pursuing the methodologic issues involved are referred to several excellent reviews.20 We shall focus here instead on those standards receiving summary scores of 50% or less: Subject definition (standard 1), Randomization (standard 2), Treatment contamination (standard 6), and Subject bias (standard 7).

The first problem standard was Subject definition. Although a majority of reports did mention some selection criteria, few indicated the proportion excluded. It was thus not clear whether the reported findings apply to all women giving birth, to the majority who give birth with no complications whatever, or to some combination in between. We wonder also whether there were hidden selection criteria, such as weekday, daytime deliveries. In general, the subjects of these reports were poorly defined, as reflected by the summary score of 47%.

TABLE. Consensus Ratings for 16 Reports

Standard                                            Reference No. of Report                   Summary
                                                    1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16    Score (%)
1. Subject definition                               + ÷ + ± - + ± + ± ± ±                       47
2. Randomization                                    - - - - ± ± ± ± ± - ± - ± ± ± -             28
3. Equivalence of groups                            + + + + + - - + + + ± + + ± + +             81
4. Losses after randomization                       + + + + + + + + + + - + - + + +             87
5. Treatment definition                             + + + + + + + + + + + + + + + +            100
6. Treatment contamination                          - + + + + - - - + - - + - + + -             50
7. Subject bias                                     + - - ± - - + +                             22
8. Reliability/validity of outcome measures         ± ± ± - + + + + + + - ± + + + +             75
9. Outcome observer bias                            - - - + + + + - + + + - + + + +             69
10. Statistical inferences                          + ± ± - + - - ± + + + + + + + +             72
11. Clinical significance and statistical power     + + + + + + + + + + + + + + - +             94

The second problem standard was Randomization. The reviewed articles achieved a summary score of only 28% on this standard. Half the articles did not mention random assignment; only one described the procedure used.

The crucial importance of the randomization procedure in safeguarding the comparability of the groups of patients has often been emphasized.20,22,32-34 As Mainland21 explains:

Experience has shown that the human being is an extremely poor instrument for the conduct of a random selection. Whenever there is any scope for personal choice or judgment on the part of the observer, bias is almost certain to creep in. Nor is this a quality that can be removed by conscious effort or training. Nearly every human being has, as part of his psychological make-up, a tendency away from true randomness in his choices.

Thus the method of randomization must be carefully planned and strictly executed so as to be immune from the personal choice or judgment of the investigator. A random treatment order sealed in opaque, consecutively numbered envelopes is a well-known example of such a method.
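
A random treatment order for such envelopes could be generated along the following lines. This is only a sketch of simple (unrestricted) randomization: the function name, the arm labels, and the seed are illustrative assumptions, and in practice the list would be prepared by someone not involved in recruiting or treating subjects.

```python
import random


def allocation_sequence(n_subjects, treatments=("early contact", "control"), seed=None):
    """Return (envelope number, treatment) pairs drawn by simple randomization."""
    rng = random.Random(seed)
    return [(number, rng.choice(treatments)) for number in range(1, n_subjects + 1)]


# Hypothetical list for 20 consecutively numbered envelopes.
envelopes = allocation_sequence(20, seed=1984)
for number, arm in envelopes[:3]:
    print(f"Envelope {number}: {arm}")
```

Because each assignment is fixed before the subject is enrolled and hidden until the envelope is opened, the investigator has no opportunity to influence which treatment a given mother receives.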

The third problem standard was Treatment contamination. In these trials we were especially concerned about the possible effect of care giver bias, ie, the possibility that more interest, encouragement, and support were given by the nurses, physicians, or others to the early-contact group. Few of the reviewed reports mentioned any safeguards against care giver bias, however, as reflected by the summary score of 50% for this standard.

Although the “blinding” of care givers is a common practice in trials of drugs, thereby effectively eliminating care giver bias, such blinding is often impossible in trials of health care. Obstetric care trials are, in fact, particularly susceptible to treatment contamination, because new mothers may be especially sensitive to encouragement and support. While it may be difficult to prevent care giver bias entirely, some safeguards are necessary. For example, the time spent by nurses with mothers in both groups could be equalized, or the members of the hospital staff involved in the early-contact treatment could be kept away from the mothers thereafter.

The fourth problem standard was Subject bias. Few of the reviewed reports mentioned any safeguards against subject bias; the summary score was only 22%. We were particularly concerned about a feeling of “specialness” among the mothers in the experimental group, a feeling that might well affect the outcome. Although the subjects of these trials could not be blinded to their treatment, subject bias (like care giver bias) must be kept to a minimum by thoughtful planning at the design stage of a trial. One possible safeguard is to ensure that study subjects remain unaware that different treatments are being compared. At the least, subjects should be kept unaware of the research hypothesis.

EVIDENCE FOR BENEFIT OF EARLY CONTACT

What can we conclude about the effectiveness of early contact? In the 16 reports evaluated, effectiveness was assessed by a variety of outcomes that could be loosely categorized as mothers’ behaviors toward their infants. The range and variety of these behaviors is extensive and includes maternal affection (the number of times a mother looks at, talks to, smiles at, or touches her infant) at 36 hours after birth, breast-feeding at 6 to 12 weeks post partum, and parenting inadequacy during the first 18 months of life. Although indexes of infant development, such as the Bayley Scales, were also used in several studies, such infant outcomes were not found to be affected. Thus the question under debate is whether early contact affects a mother’s behavior toward her infant.

Of the 16 reports we reviewed, 13 concluded that extra contact in the first few hours and days after birth improved the measured maternal behaviors; three articles concluded that early contact was without effect. None of the reports, however, fulfilled all 11 standards.

It is possible that the deficiencies noted are due to poor reporting, rather than to flaws in the actual conduct of the trials. Evidence for such incomplete reporting was found by Chalmers et al,32 who wrote to 59 authors who had omitted important information in reports of their trials. According to the responses, half the missing items had actually been carried out. Thus it is probably safe to assume that some trials are methodologically better than they appear from their published reports. Nonetheless, evaluation of the evidence and policy decisions are usually based on these reports, and readers should be provided with the information necessary for their evaluations and decisions.

We believe that the quality of trials, and not their quantity, determines the validity of their conclusions, if for no other reason than that positive trials have a better chance of being published than negative ones. Such publication bias is particularly likely with trials involving small numbers of subjects, in which little importance can be attached to negative results (owing to lack of power in the [...] majority reported trials of small sizes; ten of the 16 involved 50 or fewer subjects.

In comparing the quality of the reports, we found that the positive reports tended to fulfill fewer standards (mean 7.0) than the negative reports (mean 8.2), although this difference was not statistically significant. We then decided to focus on the results of the trials of highest quality: the five trials that fulfilled more than eight of the 11 standards. Of these five trials, three concluded that extra contact had a positive effect and two found that it did not.

A number of reasons could be advanced to explain the discrepancy in the results of the five best trials. One possible explanation is a difference in the control treatments, ie, the usual hospital routines. In the three positive trials, control subjects only briefly glimpsed their infants immediately after birth, whereas in the two negative trials, control subjects held their infants (although wrapped) for about five minutes. This difference in type and duration of contact between the control mothers and their infants could theoretically lead to differences in subsequent maternal behavior. This hypothesis, however, remains to be tested.

Differences in the types of subjects studied, or in the types of outcomes measured, reveal no evident reasons for the inconsistent results. Trials in similar types of subjects produced both positive and negative results; trials using the same outcome measure (maternal affectionate behavior at 36 hours post partum) also yielded mixed findings.

We do not know which, if any, of these considerations explains the conflicting results. Although five trials fulfilled more than eight of the standards, none fulfilled all 11. We have argued for quality rather than quantity of research reports. Thus, we would place more credence in the findings of a few studies fulfilling all of the standards rather than base our conclusion on “majority rule.” It should be emphasized, however, that no single study, even if methodologically flawless, can guarantee the discovery of truth. Besides the possibility of some unforeseen source of bias, chance can and will occasionally lead to fallacious results. Thus replication by different investigators in different settings using equally rigorous methods lends further credence to research findings.

The overall methodologic quality of clinical trials of early contact, although far from perfect, appears comparable to that found in many trials of more conventional (“harder”) therapies in medical, pediatric, and obstetrical journals.33,35 Nonetheless, our assessment of the evidence concerning the effect of early contact on mothers’ subsequent behavior toward their infants reveals that the evidence remains inconclusive.

RECOMMENDATIONS FOR FUTURE RESEARCH

We suggest that future clinical trials in this domain give greater attention to the four areas that have been, according to our evaluation, most deficient in previous trials. Study subjects should be adequately defined, so that readers know to whom the results apply and to whom they may be safely extrapolated. Randomization should be strictly executed and well described. Finally, safeguards should be provided against bias by either the subjects or their care givers.

Although the randomized controlled trial is generally regarded as the most potent scientific tool for the evaluation of medical treatments, the mere use of this design does not in itself confer certainty on its conclusions. Some trials are methodologically better than others, and confidence in the findings of a given trial can be expected to rise directly with the care that has gone into its planning and execution. With an outcome as emotionally laden and difficult to measure as the relationship between mothers and their newborn infants, the opportunities for bias are considerable, and the need for methodologic rigor becomes even greater. The challenge is great, but then so are the rewards.

ACKNOWLEDGMENTS

We acknowledge the helpful comments we received from Professor James Hanley and Drs Tom Hutchinson [...]

APPENDIX: Standards for Controlled Clinical Trials of Perinatal Care: Criteria for Fulfillment

1. Subject definition
   Fulfillment:
     a. States selection criteria
     b. Provides proportion of population meeting criteria
     c. If participation rate is low, compares participants and nonparticipants
   Partial fulfillment: a, but not b or c
   Nonfulfillment: Selection criteria not provided

2. Randomization
   Fulfillment:
     a. Mentions random assignment
     b. Describes a method outside control of investigator
   Partial fulfillment: a, but not b
   Nonfulfillment: Random assignment neither mentioned nor described

3. Equivalence of groups
   Fulfillment:
     a. Checks for equivalence by comparing at least 7 of the following 12 variables: age, parity, socioeconomic status, race/ethnicity, marital status, type of delivery, method of feeding, health of the mother, gestational age, birth weight, sex, and health status of the infant
     b. If important differences, performs stratification or other adjustment in analysis
   Partial fulfillment: Compares (and adjusts for) 4-6 of the 12 variables
   Nonfulfillment: <4 of the 12 variables compared or, if different, satisfactory adjustment not performed

4. Losses after randomization
   Fulfillment:
     a. Provides the numbers of subjects randomized and subsequent losses from groups
     b. If losses substantial, compares lost and retained subjects
     c. If lost subjects different from those retained, performs stratification or other adjustment or interprets results cautiously
   Nonfulfillment: Substantial losses neither accounted for nor given cautious interpretation

5. Treatment definition
   Fulfillment:
     a. Describes both experimental and control treatments adequately
     b. Defines treatments prior to trial
   Partial fulfillment: Only experimental treatment adequately described (plus b)
   Nonfulfillment: Experimental treatment not adequately described OR evidence of change in treatment schedule during course of trial

6. Treatment contamination
   Fulfillment: Care giving is identical in study and control groups, except for treatment under investigation (ie, no contamination)
   Nonfulfillment: Safeguards against treatment contamination inadequately described

7. Subject bias
   Fulfillment:
     a. Experimental subjects unaware of special status
     b. All subjects unaware of research hypothesis
   Partial fulfillment: b, but not a
   Nonfulfillment: Safeguards against subject bias inadequately described

8. Reliability/validity of outcome measures
   Fulfillment:
     a. Uses objective measures or single observer; if not, provides good evidence of interobserver agreement
     b. Uses measures that are valid to best of current knowledge
   Partial fulfillment: Reliability demonstrated for some measures, but not all
   Nonfulfillment: Inadequate evidence of reliability or validity of measures

9. Outcome observer bias
   Fulfillment: Observers blind to group assignment
   Nonfulfillment: Blinding of observers not mentioned

10. Statistical inferences
   Fulfillment:
     a. Gives primary importance to prior hypotheses and treats hypotheses generated by the data as tentative (unless P < .001)
     b. Adjusts P values when multiple outcomes are assessed or gives cautious interpretation
     c. Does not overinterpret trends (P > .05)
   Partial fulfillment: a, but not b or c
   Nonfulfillment: Firm inferences stated after post hoc analysis or multiple testing

11. Clinical significance and statistical power
   Fulfillment:
     a. Considers clinical significance if sample size is large and differences are small, yet statistically significant
     [...]
   Nonfulfillment: Clinical significance or power not considered

REFERENCES

1. Greenberg M, Rosenberg I, Lind J: First mothers rooming-in with their newborns: Its impact upon the mother. Am J Orthopsychiatry 1973;43:783
2. Klaus MH, Jerauld R, Kreger NC, et al: Maternal attachment: Importance of the first post-partum days. N Engl J Med 1972;286:460
3. Kennell JH, Jerauld R, Wolfe H, et al: Maternal behavior one year after early and extended post-partum contact. Dev Med Child Neurol 1974;16:172
4. Ringler NM, Kennell JH, Jarvella R, et al: Mother-to-child speech at 2 years - Effects of early postnatal contact. J Pediatr 1975;86:141
5. Hales DJ, Lozoff B, Sosa R, et al: Defining the limits of the maternal sensitive period. Dev Med Child Neurol 1977;19:454
6. de Chateau P, Wiberg B: Long-term effect on mother-infant behavior of extra contact during the first hour postpartum: I. First observations at 36 hours. Acta Paediatr Scand 1977;66:137
7. de Chateau P, Wiberg B: Long-term effect on mother-infant behavior of extra contact during the first hour postpartum: II. Follow-up at three months. Acta Paediatr Scand 1977;66:145
8. Salariya EM, Easton PM, Cater JI: Duration of breast-feeding after early initiation and frequent feeding. Lancet 1978;2:1141
9. Thomson ME, Hartsock TG, Larson C: The importance of immediate postnatal contact: Its effect on breast feeding. Can Fam Phys 1979;25:1374
10. McClellan MS, Cabianca WA: Effects of mother-infant contact following cesarean birth. Obstet Gynecol 1980;56:52
11. Siegel E, Bauman KE, Schaefer ES, et al: Hospital and home support during infancy: Impact on maternal attachment, child abuse and neglect, and health care utilization. Pediatrics 1980;66:183
12. O’Connor S, Vietze PM, Sherrod KB, et al: Reduced incidence of parenting inadequacy following rooming-in. Pediatrics 1980;66:176
13. Carlsson SG, Fagerberg H, Horneman G, et al: Effects of amount of contact between mother and child on the mother’s nursing behavior. Dev Psychobiol 1978;11:143
14. Nelson NM, Enkin MW, Saigal S, et al: A randomized clinical trial of the Leboyer approach to childbirth. N Engl J Med 1980;302:655
15. Svejda MJ, Campos JJ, Emde RN: Mother-infant “bonding”: Failure to generalize. Child Dev 1980;51:775
16. Ali Z, Lowry M: Early maternal-child contact: Effects on later behavior. Dev Med Child Neurol 1981;23:337
17. Chess S, Thomas A: Infant bonding: Mystique and reality. Am J Orthopsychiatry 1982;52:213
18. Eisenberg L: Social context of child development. Pediatrics 1981;68:705
19. Rutter M: Separation experiences: A new look at an old topic. J Pediatr 1979;95:147
20. Hill AB: Statistical Methods in Clinical and Preventive Medicine. New York, Oxford University Press, 1962
21. Mainland D: Elementary Medical Statistics. Philadelphia, WB Saunders, 1964
22. Brown BW: Statistical controversies in the design of clinical trials - Some personal views. Controlled Clin Trials 1980;1:13
23. Mosteller F, Gilbert JP, McPeek B: Reporting standards and research strategies for controlled trials. Controlled Clin Trials 1980;1:37
24. Altman DG: Statistics and ethics in medical research: III. How large a sample? Br Med J 1980;281:1336
25. Altman DG: Statistics and ethics in medical research: V. Analyzing data. Br Med J 1980;281:1473
26. Altman DG: Statistics and ethics in medical research: VII. Interpreting results. Br Med J 1980;281:1612
27. Nunnally JC, Wilson WH: Method and theory for developing measures in evaluation research, in Guttentag H, Streuning EL (eds): Handbook for Evaluation Research. Beverly Hills, CA, Sage Publications, 1975
28. Gore SM: Assessing clinical trials - Trial size. Br Med J 1981;282:1687
29. Nelson RB: Are clinical trials pseudoscience? Forum on Medicine September 1979, p 594
30. Siegel E: Early and extended maternal-infant contact. Am J Dis Child 1982;136:251
31. Kramer MS, Feinstein AR: Biostatistics LIV: The biostatistics of concordance. Clin Pharmacol Ther 1980;28:551
32. Chalmers TC, Smith H Jr, Blackburn B, et al: A method for assessing the quality of a randomized control trial. Controlled Clin Trials 1981;2:31
33. Gilbert JP, McPeek B, Mosteller F: Progress in surgery and anesthesia: Benefits and risks of innovative therapy, in Bunker JP, Barnes BA, Mosteller F (eds): Costs, Risks, and Benefits of Surgery. Oxford, Oxford University Press, 1977, pp 124-169
34. DerSimonian R, Charette LI, McPeek B, et al: Reporting on methods in clinical trials. N Engl J Med 1982;306:1332
35. Tyson JE, Furzan JA, Reisch JS, et al: An evaluation of the quality of therapeutic studies in perinatal medicine. J [...]
