SPECIAL ARTICLE
RELIABILITY OF PEDIATRIC HISTORIES
A Preliminary Study
Katharine E. Goddard, M.D., George Broder, A.B., and Charles Wenar, Ph.D.
Departments of Pediatrics and Psychiatry, University of Pennsylvania School of Medicine
ADDRESS: (K.G.) Hospital of the University of Pennsylvania, Philadelphia 4, Pennsylvania.
PEDIATRICS, December 1961
CONTRIBUTORS’ SECTION
TODAY’S PEDIATRICIAN is confronted with
a dilemma. He relies heavily on the history given by the mother for diagnostic
information, yet he is uncertain of the
accuracy of the material he elicits. Much has
been written on the techniques of
history-taking, but few data are available regarding
the validity of facts obtained. The study
reported here is an initial attempt to furnish
such information. In brief, the histories of
prenatal, birth, and developmental events as
given by 25 mothers are compared with the
facts recorded in hospital clinic records, to
determine the divergence of the two
sources.
Similar studies have been reported by
Macfarlane (1) and Chess et al. (2), who followed
groups of children for a number of years.
The former studied a group of mothers in
Berkeley, California, comparing their recall
of events after the first 21 months of their
infants’ lives with material in the primary
records, and concluded as follows:
[The] retrospective account [of physical
condition during pregnancy] was so unreliable that we
have had to disregard it. . . . Weight at birth was
reliably reported. The use of instruments was
unreliably reported: only two-thirds of the mothers
delivered with instruments reported this fact. The
duration of labor showed an average discrepancy of
3.5 hours, exact agreement occurring in only 10%
of the cases. . . . Illnesses, unless outstanding, were frequently forgotten.
On the several developmental items, different
average amounts of discrepancy and different spreads
of discrepancy were found. In general, more errors
were made in the direction of precocity. Mothers of
first-born children were more apt to err in the
direction of precocity than were mothers of
later-born.
Where retrospective interview data are the only
type available, the above findings should limit the
over-optimistic use of them as factual.
MATERIALS AND METHODS
This study was conducted in the Pediatric
Clinic of the Hospital of the University of
Pennsylvania (HUP). In selecting the
mothers to be included in the sample, two criteria
were utilized: 1) the child must have been
born in either 1955 or 1956 at the HUP, or
2) the child must have been seen in the
clinic often enough to permit a reasonably
complete description of his early
development to be obtained from the Clinic records.
The sample (Table I) consisted of the first
25 mothers to visit the Clinic during the
months of July and August, 1960, who had
children who fulfilled these criteria.
The interview, devised by a pediatrician,
a psychologist and a medical student, was
designed to simulate an ordinary clinic
situation. The actual interviews conducted by
the medical student lasted 30 to 45 minutes.
The questions covered details of pregnancy;
delivery; neonatal, infant and early
childhood development and health; an
assessment of the child’s development and
general status by the mother; and the family’s
present socioeconomic situation. Both
specific and open-ended questions were
employed, and most of the mothers’
spontaneous comments were recorded by the
interviewer. The material obtained was
tabulated statistically according to the outline
in Table II.
TABLE I
CHARACTERISTICS OF THE 25 MOTHERS AND CHILDREN IN THE SAMPLE

Mothers
Age (yr)             Mean: 26.2; Range: 20 to 44
Race                 Negro 25
Education            Grammar school 8; High school 14; College 3
Income ($ per mo.)   200-300: 3; 300-400: 19; 400-500: 2; Public assistance: 1
Clinic visits        Mean: 15.3; Range: 10 to 22

Children
Sex                  Male 10; Female 15
Age (yr)             Mean: 4-5; Range: 3-7 to 5-6
Position             Only child 3; First child 14; Other 8
Health history       Normal illnesses 14; Moderately ill 11; Severe illnesses 0

TABLE II
OUTLINE FOR TABULATION OF MATERIAL FROM INTERVIEWS

Pregnancy and Delivery
(1) Duration of gestation
(2) Duration of labor
(3) Use of forceps at delivery
(4) Weight at birth
(5) Neonatal difficulties at time of delivery
(6) Agreement between mothers’ and physicians’ rating of ease of delivery

Neonatal Period
(7) Did the mother nurse and for how long?
(8) Mother’s recall of child’s first formula
(9) Age introduced food to child
(10) Age at which formula stopped
(11) First injection
(12) Vaccination

Development and Health
(13) Roll-over
(14) Sit with support
(15) Sit alone
(16) Crawl
(17) Stand with support
(18) Stand alone
(19) Walk with support
(20) Walk alone
(21) First tooth
(22) First word
(23) Weight at 1 year
(24) Weight at 2 years
(25) Weight at 3 years
(26) Illnesses

RESULTS
[...] presented. These will enable the reader to see
how accurate and inaccurate the mother’s
information was, and, equally important,
they will enable him to determine for
himself whether the degree of inaccuracy would
be misleading diagnostically. In the latter
section will be considered the question of
whether the departure from fact is serious
or would alter clinical evaluation.
Empiric Findings
Many of the results are presented in Table
III. N, the number of cases, is not the same
from question to question, due to the fact
that certain mothers were unable to give
any answer to some questions and/or the
information was not available in the primary
records for all children on all questions.
Record indicates the mean value obtained
from the records; it furnishes the factual
data against which the mothers’ answers can
be evaluated. % P represents the percentage
of cases of perfect agreement between the
mothers’ response and the primary records.
The mean is the average deviation for the
entire group, + indicating that the mothers’
report was greater on the average than it
was in the record, and - indicating that the
mothers’ report was less. For example, on the
average, the mothers reported labor as 0.95
hours shorter than indicated on the records.
Because the N’s are small and the
distributions, at times, do not approximate the
normal curve, the Sigma and the Quartile and
Full ranges are all presented in the Table.
The next to last column is yet another way
of evaluating the nature of inaccuracy;
namely, the percentage of errors between
+1 and -1 unit. Thus 25% of the cases were
between +1 and -1 month of a perfect
score on Vaccination; 60% were between +1
and -1 lb of a perfect score on Weight at 3
Years. The final column is a means of
comparing the relative accuracy of the items by
ranking the items according to the
percentage of perfect responses to that question.

TABLE III
ACCURACY OF MOTHERS’ RECALL OF INFORMATION*

Item                                  N   Unit      %P  Record        Mean   Sigma  Range 75%             Range 100%              % + or - 1 unit  Rank, %P
(2) Duration of labor                 22  1 hour    14  8.6           -0.95  3.4    -4 to +3              -7 to +7                41               16
(4) Weight at birth                   25  1 ounce   52  6 lb. 8.6 oz  -1.55  4.95   [illegible] to +2     -22 to +2               68               2
(9) Age introduced food to child      25  1 week    56  6.1           +1.2   4.6    -2 to +4              -6 to +16               64               1
(10) Age at which formula stopped     22  1 month   32  5.7           ..     ..     -1 to +3              -2 to +6                59               6
(11) First injection                  22  1 month   23  3.1           -0.5   2.3    -2½ to +1             -7 to +4                50               11
(12) Vaccination                      20  1 month   20  8.9           +2.9   10.0   -5 to +6              -8 to +30               25               13
(13) Roll-over                        21  1 month   19  4.0           +0.1   1.3    -1 to +1              -3½ to +2               86               14
(14) Sit with support                 22  1 month   32  4.1           -0.05  1.8    -2 to +2              -3 to +4                64               6
(15) Sit alone                        24  1 month   21  6.0           +0.1   1.5    -1 to +1              -4 to +4                83               12
(16) Crawl                            11  1 month   9   6.1           +0.7   1.7    -1 to +1              -1 to +4                73               18
(17) Stand with support               22  1 month   45  7.3           +0.6   1.65   -1 to +1              -1 to +6                82               4
(18) Stand alone                      24  1 month   38  9.5           +0.3   1.4    -1 to +1              -2 to +4                75               5
(19) Walk with support                21  1 month   29  10.0          -0.2   2.9    -1 to +1              -4 to +11               76               9
(20) Walk alone                       25  1 month   48  11.5          +0.4   3.4    -1 to +1              -3 to +16               80               3
(21) First tooth                      20  1 month   25  7.8           -0.1   2.5    -3 to +3              -4 to +6                45               10
(22) First word                       20  1 month   15  7.3           -0.5   1.9    -2 to +1              -5 to +4                65               15
(23) Weight at 1 year                 14  1 pound   14  22.2          -1.1   3.1    -4 to +[illegible]    [illegible] to +5       50               16
(24) Weight at 2 years                11  1 pound   0   27.2          -0.3   2.2    -2 to +[illegible]    -3 to +4                36               19
(25) Weight at 3 years                10  1 pound   30  33.2          +0.6   3.25   -2 to +[illegible]    -3 to +8                60               8

* N = number of cases. %P = percentage of agreement between mother’s response and clinic records. Record
= mean value from records. Mean = average deviation, for group, of mother’s report minus clinic record. % +
or - 1 unit = percentage of errors between +1 and -1 unit.
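The summary columns described above can be illustrated with a short sketch. The function name `recall_accuracy` and the sample pairs are hypothetical and are not the study's data. Two points are assumptions, since the text does not settle them: Sigma is computed here as the population standard deviation, and the "+ or - 1 unit" column is taken to include cases of perfect agreement. The 75% (quartile) range is omitted because the text does not say how it was derived.

```python
from statistics import mean, pstdev


def recall_accuracy(pairs):
    """Summarize (mother_report, record) pairs in the manner of Table III.

    Each deviation is the mother's report minus the clinic record,
    expressed in the item's unit (hours, months, pounds, ...).
    """
    deviations = [reported - recorded for reported, recorded in pairs]
    n = len(deviations)
    return {
        "N": n,
        # %P: share of cases in which report and record agree exactly
        "pct_perfect": 100 * sum(d == 0 for d in deviations) / n,
        # Mean: signed average deviation (+ means the mother over-reported)
        "mean_dev": mean(deviations),
        # Sigma: population standard deviation (an assumption, see lead-in)
        "sigma": pstdev(deviations),
        # Range 100%: smallest and largest deviation
        "full_range": (min(deviations), max(deviations)),
        # % + or - 1 unit: share of cases no more than one unit off
        "pct_within_1": 100 * sum(abs(d) <= 1 for d in deviations) / n,
    }


# Hypothetical ages in months as (mother's report, clinic record).
example = [(8, 8), (7, 9), (10, 9), (12, 8)]
summary = recall_accuracy(example)
print(summary)
```

Keeping the sign of each deviation is what lets the Mean column show the direction of error, while %P and the one-unit column describe how often the error is small enough to ignore.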
The following additional results did not
lend themselves to tabular representation:
(1) Duration of Gestation: Of the 25
children in the sample, 16 had a 40-week
gestation. All 16 of the mothers reported this to
the interviewer. For the nine children born
before 40 weeks, two mothers reported a
gestation time of 36 weeks correctly. In the
other seven instances in which the mothers
called their children “full-term” babies, two
periods of gestation were 39 weeks, two
were 38 weeks, two were 36 weeks, and one
was 35 weeks.
(3) Use of Forceps at Delivery: Of the 23
children in the sample on whom agreement
between the mother’s report and the
primary record could be determined, 7 were
delivered with the aid of forceps; four of
the seven mothers reported this fact
correctly. In the 16 instances in which children
were born without the aid of forceps, 15 of
the mothers reported this accurately.
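The forceps counts can be restated as two proportions: how often forceps use was reported when it occurred, and how often its absence was reported when it did not occur. The sketch below uses the terms sensitivity and specificity, which the authors do not use, and the function name `report_accuracy` is hypothetical. The counts are those given in item (3).

```python
def report_accuracy(true_pos, false_neg, true_neg, false_pos):
    """Return (sensitivity, specificity) of mothers' yes/no reports."""
    sensitivity = true_pos / (true_pos + false_neg)
    specificity = true_neg / (true_neg + false_pos)
    return sensitivity, specificity


# Item (3): 7 forceps deliveries, 4 reported as such;
# 16 deliveries without forceps, 15 reported as such.
sens, spec = report_accuracy(true_pos=4, false_neg=3, true_neg=15, false_pos=1)
print(f"forceps reported when used: {sens:.0%}")
print(f"no forceps reported when not used: {spec:.0%}")
```

The two proportions differ sharply, which is why a mother's statement that forceps were not used is the less trustworthy of the two answers.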
(5) Neonatal Difficulties at Time of Delivery: Of the 25 children, 8 had difficulties
immediately following delivery (e.g.,
difficulty in beginning spontaneous respiration).
Only one of the eight mothers reported the
difficulty. All the mothers of the 17 children
with no difficulties reported this fact correctly.
(6) Agreement between Mother’s and Physician’s Rating of Ease of Delivery: The
mothers were asked: “Was this an easy or a
difficult delivery?” The records of labor and
delivery were reviewed by a senior member
of the obstetric department and classified
in the same manner. Labor and delivery
were considered difficult if any of the
following conditions was present: observed
active labor prolonged beyond 24 hours;
failure of normal progression of cervical
dilation; requirement of excessive use of
analgesia; blood pressure variation, either
hypotension or hypertension; abnormal
bleeding; evidences of fetal distress. If none
of these conditions was present and labor
proceeded at a normal rate, it was
considered easy. According to these criteria, 5
of the 25 cases were judged difficult; but
two of the five mothers recalled the cases as
easy. Two of the deliveries judged as easy
by the physician were considered difficult
by the mothers. The Fisher Exact
Probability Test revealed no significant difference
between the physicians’ evaluations and the
mothers’ recall. This indicates that mothers
are generally reliable in their memory of
this event.
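The counts in item (6) form a two-by-two table of physician rating against mother's rating. The sketch below does not reproduce the authors' Fisher Exact Probability Test. As a swapped-in illustration it computes the raw proportion of agreement and Cohen's kappa, a chance-corrected agreement measure that the paper does not use. The function name `agreement_and_kappa` is hypothetical.

```python
def agreement_and_kappa(both_difficult, phys_only, mother_only, both_easy):
    """Raw agreement and Cohen's kappa for two raters and two categories."""
    n = both_difficult + phys_only + mother_only + both_easy
    observed = (both_difficult + both_easy) / n
    phys_difficult = (both_difficult + phys_only) / n
    mother_difficult = (both_difficult + mother_only) / n
    # Agreement expected by chance if the two ratings were independent
    expected = (phys_difficult * mother_difficult
                + (1 - phys_difficult) * (1 - mother_difficult))
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa


# Item (6): 5 deliveries judged difficult by the physician, of which 2 were
# recalled as easy; 20 judged easy, of which 2 were recalled as difficult.
observed, kappa = agreement_and_kappa(
    both_difficult=3, phys_only=2, mother_only=2, both_easy=18
)
print(f"raw agreement: {observed:.2f}, kappa: {kappa:.2f}")
```

Raw agreement is high largely because most deliveries were easy by both ratings; the chance-corrected figure is more modest, which fits the authors' later caution that agreement was not perfect.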
(7) Did the Mother Nurse and for How Long? Of the 25 mothers in the sample, all
were correct in reporting whether or not
they nursed their children. Of the 12
mothers who did nurse their children, 7
correctly reported the length of time nursed,
while 5 had errors in the reports; these
errors ranged from 1 to 5 weeks from the
actual time.
(8) Mothers’ Recall of Child’s First
Formula: Of the sample of 25 mothers, 9 were
unable to make any comment on the
proportions of the child’s first formula.
Interestingly, one mother claimed that her child
never used formula but went directly
instead to whole milk. The primary records
showed that the child did, in fact, use
formula for a period of time.
(26) Illnesses: Each mother was asked to list the illnesses that her child had had
during his life. This information was compared
with the Clinic records. Analysis of this
comparison showed that there were 58
episodes of illness among the children of the
sample; the mothers listed only 26. In no
instance did a mother list a disease that
the child had not had.
These diagnoses were roughly classified
as major or minor illnesses (Table IV).
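The proportion of charted episodes that the mothers listed can be worked out from the totals in Table IV (34 major episodes with 18 listed, 24 minor episodes with 8 listed). The sketch below is only an arithmetic illustration; the names `episodes` and `recall_rate` are hypothetical.

```python
# (charted in clinic records, listed by mothers), totals from Table IV
episodes = {"major": (34, 18), "minor": (24, 8)}


def recall_rate(charted, listed):
    """Fraction of charted episodes that the mothers listed."""
    return listed / charted


rates = {kind: recall_rate(*counts) for kind, counts in episodes.items()}
total_charted = sum(charted for charted, _ in episodes.values())
total_listed = sum(listed for _, listed in episodes.values())
rates["all"] = recall_rate(total_charted, total_listed)

for kind, rate in rates.items():
    print(f"{kind}: {rate:.0%} of charted episodes listed")
```

The totals reproduce the 58 charted and 26 listed episodes given in the text.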
Comments
The following are interpretative
comments on the findings previously described.
(1) Duration of Gestation: The data
indicate that if a mother says her child was
premature, the chances are that she is correct
and also that she will probably report the
time of prematurity correctly. If, on the
other hand, a mother says that her child was
“full-term,” the chances are about one in
three that she is incorrect as to a full
40-week gestation time. However, in only three
of the nine cases in this category was the
error as great as 4 to 5 weeks; this might be
regarded by many pediatricians as
significant enough to alter their clinical evaluation
of maturity. We wish to point out that our
data include only gestation periods of 35 to
40 weeks.
(2) Duration of Labor: First the data on
duration of labor were analyzed to
determine if there was a relationship between
the magnitude of the error and the absolute
length of labor. A correlation coefficient of
0.68 indicated that such a relationship does
exist and is statistically significant.
Next, an experienced pediatrician, who
was not active in this research, was asked to
indicate the cases in which the error would
change his clinical judgment of the child’s
birth history. He reported that in 4 of the
19 cases the error would significantly affect
his evaluation, and that in 2 other cases it
might be significant in light of other
findings. In every instance of change of
evaluation, the mother reported a short labor,
when in fact, it had been of average [...]
TABLE IV
COMPARISON OF ILLNESSES RECORDED ON HOSPITAL CHARTS
WITH THE LISTINGS AS RECALLED BY MOTHERS

Major                                                   On Hospital Charts   Listed by Mothers
(1) “Overwhelming viremia”                              1                    1
(2) Measles                                             10                   5
(3) Chicken pox                                         4                    1
(4) Mumps                                               1                    1
(5) Pneumonia                                           2                    1
(6) Bronchitis                                          2                    1
(7) Bronchiolitis (under 1 year and hospitalized)       2                    1
(8) Tuberculosis (at age 1 year)                        1                    0
(9) Roseola                                             1                    0
(10) Convulsions                                        3                    3
(11) Otitis media                                       2                    2
(12) Tonsillectomy                                      1                    0
(13) Suspected rheumatoid arthritis                     1                    0
(14) Dog bite (hospitalized 6 days)                     1                    1
(15) Corrective shoes and braces                        2                    1
Total                                                   34                   18

Minor                                                   Charted              Listed by Mothers
(1) Conjunctivitis                                      2                    1
(2) Constipation                                        2                    0
(3) Vaginitis                                           1                    1
(4) Skin rashes (eczema and contact dermatitis)         6                    1
(5) Thrush                                              1                    0
(6) Frequent colds                                      7                    4
(7) Positive serologic test for syphilis                1                    0
(8) Speech disorders                                    2                    1
(9) Sleeping disorders                                  2                    0
Total                                                   24                   8
(3) Use of Forceps at Delivery: What
appears to us to be important here is the fact
that only four of seven mothers whose
children were delivered with the aid of forceps
reported this fact. It is quite possible that
these mothers were never told that forceps
were used to assist the delivery.
Nevertheless, the fact remains that the mother’s
information covering this important aspect of
the child’s birth history may be inaccurate
and misleading. For practical clinical
assessment, therefore, the pediatrician should
obtain his information from the primary
obstetric record whenever possible.
(4) Weight at Birth: Stuart’s tables list
norms for weight at birth in percentiles of
25-75, 10-90, and 3-97. We decided that a
significant error in reporting birth weight
would be one in which the classification
changed from beyond the extreme 3-97% to
the middle 25-75%, or vice versa. Only 1
of the 12 errors in birth weight was
significant by this definition. For clinical uses,
therefore, we may conclude that this
information is reliably reported.
(5) Immediate Difficulties following
Delivery: Of the eight mothers in our study
whose children had had immediate
postpartum medical difficulties, only one
reported this fact. This may not be a defect
in the memory of the mother but may
instead be due to the fact that in a large
number of instances the mothers are not
informed of these problems. By the time the
mother first sees her child outside the
delivery room, these medical problems have
usually been corrected. This does not,
however, alter the inaccuracy of her report and
its misleading implications for clinical
judgment by the pediatrician in this critical
period of the child’s life. This finding of our
study again points out the need to ascertain
facts about the immediate neonatal period
from the obstetric record.
(6) Agreement between Mothers’ and Physicians’ Rating of Ease of Delivery:
Although statistically there was no significant
difference between mothers’ recall and
doctors’ ratings of labor and delivery, the
agreement was not perfect. We believe that a
differentiation should be made between
labor, during which the mother is generally
[...] mother is under the effect of drugs. Her
recall, thus, is a combination of events of
which she may have information and events
of which she may not be informed.
A word of caution should also be added
about the findings. Because of the diagnostic
importance of this information, the lack of
perfect reliability on the mother’s part may
be important in individual cases. If other
clinical signs indicate the possibility of
obstetrical injury, it may be prudent for the
physician to check the original record
instead of relying completely on the mother’s
report.
(11) First Injection: Error may appear
here due to the fact that, even though these
children were being followed up in the
hospital clinic monthly, it was still possible for
them to have received their first injection
from an outside physician as treatment for
an illness of short duration. We believe,
however, that this possible error was held
to a minimum.
(16) Crawl: An interesting incidental
finding is that 8 of the 19 mothers included in
this question denied at the time of the
interview that their child had ever crawled. The
clinic records in all eight cases, however,
indicated that the child had gone through a
crawling phase. Our study is not complete
enough to offer an explanation for this
finding.
(13-22) Developmental Series: The
authors wish to point out the high degree of
accuracy with which the mothers were able
to recall the time at which their children
passed most of the developmental
landmarks.
(23-25) Weights at Each of the First 3 Years of Life: Stuart’s tables of norms were
again consulted to get the average weights
of the percentile groupings as described
under (4) of this section. The same criteria
of significance were followed. For 1 year,
the error of 4 of the 14 mothers was
sig-nificant; for 2 years there were no
signifi-cant errors; for 3 years there was only one
significant change.
Interestingly enough, a large proportion
of the mothers (40 to 60%) were unable to
give any estimate of their child’s weight at
a particular year. Thus it seems that
moth-ers often cannot remember weights, but
when tiley do, tiley tend to be accurate in
their recall.
(26) Illnesses: The authors were impressed with the finding that only 18 of the
34 episodes of major illness among the
children in our sample were spontaneously
recalled by their mothers. Omissions of this
sort could certainly prove significant in
diagnostic histories. No analysis of the
factors causing lack of recall has been
attempted here, however.
Remarks: In all of the questions, we
analyzed the errors to determine if there was
any tendency for the mothers to report their
children as more or less precocious. Only
on one item, (17) Stand with Support, was any
statistically significant deviation observed.
On this item the mothers tended to report
the event as having occurred later than the
actual time listed in Clinic records.
Finally, the data were analyzed to see if
any characteristics of the mothers’ and/or
children’s environments could be
demonstrated that would differentiate “reliable”
mothers from those whose recall was poor.
Crude analysis of such factors as age,
education, intelligence, schooling, sibling order,
etc., did not reveal any leads that would
have warranted more extensive analysis. No
evidence was found that the events in the
life of a first and/or only child are recalled
more accurately than those of a later child.
One final methodologic point: although
the mothers’ reports were evaluated in
terms of the clinical records, this does not
mean that the latter are regarded as
infallible. Not only does the human error in
recording enter to some unknown degree,
but there also is the possibility of confusion
due to definition of terms; e.g., the recording
of “talking” may vary from the babbling
phase of speech development, through the
acquisition of one intelligible word, up to a
varied single word vocabulary.
A comparison of these data with the
findings of Macfarlane will be helpful,
especially since the two populations differ in
composition and in the length of time
elapsed before interviewing the mothers.
Many of the results are surprisingly similar.
Macfarlane’s group ranked six items for
which they determined the percentage of
perfect answers in exactly the same order
of accuracy which we obtained (Table V).
Our findings also are in agreement with
Macfarlane’s findings quoted in the
introductory section. We find no evidence,
however, that mothers tend to report their
children as more precocious than they actually
are. This may be a function of the different
socioeconomic groups studied in the two
different reports, the population in our
study tending to fall in a somewhat lower
socioeconomic group than the Berkeley
families.

TABLE V
PERCENTAGES OF AGREEMENT BETWEEN FINDINGS IN THE PRESENT STUDY
AND MACFARLANE’S FINDINGS

                               Perfect Agreement
Item                           Present Study   Macfarlane
(1) Duration of gestation      88              89
(4) Weight at birth            52              59
(20) Walk alone                48              49
(21) First tooth               25              36
(2) Duration of labor          14              10
(23) Weight at 1 year          14              9
COMMENT
As pediatricians become more skilled in
interpreting developmental data in the
histories of their patients, they have become
increasingly adept at the diagnosis of many
pediatric conditions (e.g., the differentiation
of genetically retarded development from
environmentally influenced deviations). Of
necessity, they rely principally on the
mothers’ memory for this vital information.
With the increasing mobility of the
population and its transient character, the
mother’s report of the developmental history of
her children is often the sole source of
important facts. The growth of public health
centers for child care and the expansion of
child-guidance facilities have increased the
need for this type of factual information. It
therefore becomes important to assess the
mother’s reliability in this regard.
We recognize that in medical
history-taking the proper framing of a question is
all-important in obtaining a significant
response, and that the present study can be
criticized as to the reproducibility of the
responses. To meet this criticism, an attempt
was made to simulate, as nearly as possible,
the normal pediatric interview, and to
formulate questions that would bring out
factual answers rather than responses qualified
by feeling. We have endeavored to provide
a statistical basis for the clinical judgment
of reliability.
Although this study is limited, it does
indicate that reliability in historical facts
cannot be taken for granted. Further studies
are clearly indicated to ascertain what
further distortion of reliability is produced by
prolonging the span of time over which a
mother’s memory extends, and to
investigate what effects the stress of illness and
retarded development may produce in
accuracy. Further investigations are
contemplated.
SUMMARY
A study of reliability of elementary facts
in the developmental histories of pediatric
patients in a clinic population is reported.
The results show that mothers do not
report gestation time reliably, that many
mothers are incorrect when they state that
forceps were not used in the delivery, that
few mothers can report accurately the
immediate difficulties at the delivery of their
infant. Feeding histories revealed many
discrepancies in duration of nursing and
knowledge of formula composition. Many
mothers forget or overlook a significant
number of illnesses. On the other hand, the
mother’s evaluation of difficulty of labor
and delivery agrees with that of the
physician; facts concerning weight at birth, and
at subsequent yearly intervals, and details
of motor development are reported with
accuracy.
[...] in interpreting the relevance of factual
developmental data to differential diagnosis
in deviant behavior, perinatal studies, etc.,
and further areas of investigation of
reliability are suggested.
REFERENCES
1. Macfarlane, J. W.: Studies in child guidance: I.
Methodology of data collection and
organization. Monogr. Soc. Res. Child Develop., Vol.
III, No. 6, 1938.
2. Chess, S., et al.: Implications of a longitudinal
study of child development for child
psychiatry. Amer. J. Psychiat., 117:434, 1960.
Acknowledgment
We wish to acknowledge the co-operation of Dr.
David Cornfeld, Director of the Pediatric
Out-Patient Department of the Hospital of the University
of Pennsylvania, in permitting us to use the clinic
population for this study, and for his helpful evaluation of our data.
We also wish to thank Dr. Robert C. McElroy, of