PEDIATRICS Vol. 90 No. 1 July 1992 11
Bias and ‘Overcall’ in Interpreting Chest Radiographs in Young Febrile Children
Michael S. Kramer, MD*; Renée Roberts-Bräuer, MA*; and Robert L. Williams, MD*
ABSTRACT. Few studies have examined the diagnostic validity of the examining physician’s interpretation of chest radiographs in young febrile children, and none (to our knowledge) has examined the extent to which the “official” (ie, the radiologist’s) reading may be biased by access to the examining physician’s reading and to other clinical information. The authors studied 287 consecutive chest radiographs obtained in 286 febrile children 3 to 24 months of age without chronic cardiopulmonary disease or known asthma who presented to a children’s hospital emergency department between March 1989 and August 1990. The readings by treating pediatricians, official pediatric radiologists, and a “blind” pediatric radiologist were compared. Official radiologists had access to the treating pediatricians’ readings and the clinical information provided on the radiography requisition. The blind radiologist knew only that each child was 3 to 24 months of age and febrile, and he was asked to judge the presence or absence of pneumonia. Using the blind radiologist’s reading as the “gold standard” for judging validity of the treating physicians’ and official radiologists’ readings, sensitivity (.677 vs .647), specificity (.828 vs .849), positive predictive value (PPV, .537 vs .571), and kappa index (κ, .462 vs .475) were quite similar. By contrast, agreement by the treating physicians was considerably higher with the official radiologists’ readings as gold standard: sensitivity = .756, specificity = .922, PPV = .795, and κ = .688. When the treating physician’s reading was positive, the official radiologists’ positivity rate was much higher than the blind radiologist’s (74.4% vs 51.8%, P < .005), sensitivity was high (.884) but specificity was low (.436), PPV was .633, and κ was .326. When the treating physician’s reading was negative, however, the pattern was reversed: positivity 8.5% vs 12.8% (P not significant), sensitivity = .240, specificity = .937, PPV = .353, and κ = .205. Surprisingly, none of the three sets of readings appeared to be influenced by the reporting of clinical signs and symptoms on the radiography requisition. These results indicate that official radiologists are strongly biased by the treating physician’s reading. Since such a bias can lead to unnecessary antibiotic treatment and hospital admission, strategies to reduce it should receive high priority. Pediatrics 1992;90:11-13; chest radiograph, pneumonia, fever, diagnostic tests.
From the Departments of Pediatrics, Radiology, and §Epidemiology and Biostatistics, McGill University Faculty of Medicine, Montreal, Quebec, Canada.
Received for publication Aug 16, 1991; accepted Dec 5, 1991.
Presented, in part, at the annual meeting of the Ambulatory Pediatric Association, New Orleans, LA, May 2, 1991.
Reprint requests to (M.S.K.) 1020 Pine Ave West, Montreal, Quebec H3A
1A2, Canada.
PEDIATRICS (ISSN 0031 4005). Copyright © 1992 by the American Academy of Pediatrics.
ABBREVIATIONS. PPV, positive predictive value; CI, confidence interval.
The chest radiograph is one of the diagnostic tests most frequently obtained in the evaluation of young febrile children, particularly those with respiratory signs and symptoms.1-5 Although office-based practitioners often rely on clinical signs, most physicians who work in tertiary care settings depend on the chest radiograph for making decisions concerning diagnosis, antibiotic treatment, and hospital admission.
Whose reading should be used as the basis for these decisions? In many emergency departments and “walk-in” clinics, it is difficult or impossible to obtain a reading by a radiologist, particularly during evenings, nights, and weekends. Thus in these settings, it is the treating physician’s reading that must be used as the basis for diagnostic and therapeutic decision making. The “official” reading by a radiologist is performed later, and procedures vary as to whether discrepancies between the two readings lead to changes in the originally prescribed treatment.
The radiologist who interprets the chest radiograph often has access both to the treating physician’s reading of the film and to the radiography requisition form, which contains clinical information concerning signs and symptoms as reported by the treating physician. It is reasonable to hypothesize that the official reading may be biased, whether consciously or unconsciously, by this information. To our knowledge, however, no previous studies have examined the existence, direction, and magnitude of this bias or evaluated its potential impact on the diagnosis and treatment of pneumonia in young children. This study is an attempt to fill that void.
METHODS
The study comprises 287 chest radiographs obtained in 286 children between the ages of 3 and 24 months with a documented “rectal equivalent” temperature of at least 38°C. (“Rectal equivalent” temperatures were obtained by adding 0.5°C to oral temperatures and 0.8°C to axillary temperatures.) Of the 287 total chest radiographs, 241 (84%) included both frontal (anteroposterior) and lateral supine views, while 46 (16%) were limited to frontal views. All study children were nonreferred and presented for their initial visit to the emergency department of the Montreal Children’s Hospital between March 1989 and August 1990. Children with chronic cardiopulmonary disease were excluded, as were known asthmatics presenting with acute attacks.
We compared the readings of these same chest radiographs by the treating attending staff pediatrician or senior pediatric resident; one of six pediatric radiologists whose subsequent reading, usually the following day, became the “official” one contained in the medical record; and a “blind” reading by a single pediatric radiologist (R.L.W.). (Three additional chest radiographs obtained during the study period had no official reading available either in the medical record or in the Radiology Department file and were therefore not included among the 287 films under study.) The “official” radiologist had access both to the signs and symptoms listed by the treating physician on the radiography requisition form and to the treating physician’s reading of the film. The blind radiologist, on the other hand, knew only the age of the child, that the child was febrile, and that the diagnostic question under study was the presence or absence of pneumonia. The blind radiologist’s diagnosis of pneumonia was based on his observation of one or more focal areas of consolidation or infiltration; bronchial wall thickening or subsegmental atelectasis alone was insufficient. Each of the three readings was interpreted as either positive, negative, or questionable for pneumonia.
We defined the positivity rate for each of the three sets of readings as the proportion of positive readings plus half the questionable readings. Sensitivity is the proportion of true pneumonias (based on a specified “gold standard”) read as positive, while specificity is the proportion of true nonpneumonias read as negative. Positive predictive value (PPV) is the proportion of positive readings with true pneumonia. Each of the above-mentioned indices is reported with its corresponding 95% confidence interval (CI), based on the normal approximation to the binomial distribution. Finally, kappa (κ) is an index of overall agreement corrected for the extent of agreement expected by chance alone, with 95% CIs estimated according to the method of Cohen.6 In calculating concordance indices (sensitivity, specificity, predictive values, and κ) between pairs of readings, we gave the benefit of the doubt to agreement in the relatively few cases when one of the observers interpreted a film as questionable. In other words, two readings were considered concordant if one reader interpreted a given film as positive (or negative) and the other reader interpreted that film as questionable.
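The indices defined above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' analysis code: the function names are hypothetical, the κ standard error uses the large-sample formula from Cohen's 1960 paper, and the 2 × 2 cell counts in the example are illustrative assumptions (the paper reports only the derived indices, not the underlying tables).

```python
from math import sqrt

Z95 = 1.96  # two-sided 95% quantile of the standard normal distribution


def proportion_ci(k, n, z=Z95):
    """Return (estimate, lower, upper) using the normal approximation to the binomial."""
    p = k / n
    half_width = z * sqrt(p * (1 - p) / n)
    return p, p - half_width, p + half_width


def positivity_rate(n_positive, n_questionable, n_total):
    """Proportion of positive readings plus half the questionable readings."""
    return (n_positive + 0.5 * n_questionable) / n_total


def kappa_ci(tp, fp, fn, tn, z=Z95):
    """Cohen's kappa with a CI from the large-sample SE in Cohen (1960)."""
    n = tp + fp + fn + tn
    p_obs = (tp + tn) / n  # observed agreement
    # agreement expected by chance, from the marginal totals
    p_exp = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / n ** 2
    kappa = (p_obs - p_exp) / (1 - p_exp)
    se = sqrt(p_obs * (1 - p_obs) / (n * (1 - p_exp) ** 2))
    return kappa, kappa - z * se, kappa + z * se


def validity_indices(tp, fp, fn, tn):
    """tp/fp/fn/tn are counts of the test reading against the chosen gold standard."""
    return {
        "sensitivity": proportion_ci(tp, tp + fn),
        "specificity": proportion_ci(tn, tn + fp),
        "ppv": proportion_ci(tp, tp + fp),
        "kappa": kappa_ci(tp, fp, fn, tn),
    }


# Hypothetical cell counts, chosen only for illustration.
for name, (est, lo, hi) in validity_indices(tp=38, fp=22, fn=5, tn=17).items():
    print(f"{name}: {est:.3f} ({lo:.3f}-{hi:.3f})")
```

Questionable readings are assumed to have been resolved into the 2 × 2 cells beforehand, following the benefit-of-the-doubt rule described above.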
RESULTS
Table 1 contains a description of the study cohort. The treating physicians read 29.6% of the films as positive for pneumonia, compared with 23.9% for the blind radiologist (P < .05 by McNemar χ² test). The positivity rate for the official radiologists was 27.2%, ie, intermediate between these two extremes but closer to the treating physicians’ rate. This pattern was observed in both admitted and nonadmitted cases.
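For readers unfamiliar with the McNemar χ² test used to compare paired positivity rates, the following is a minimal sketch. The function name and the discordant-pair counts are hypothetical, and the paper does not state whether a continuity correction was applied or how questionable readings entered the test, so both are left as explicit assumptions.

```python
from math import erfc, sqrt


def mcnemar_chi2(b, c, continuity_correction=False):
    """McNemar chi-square test for paired binary readings.

    b: films read positive by reader A only
    c: films read positive by reader B only
    Concordant films do not enter the statistic.
    Returns (chi-square statistic with 1 df, two-sided P value).
    """
    if b + c == 0:
        raise ValueError("no discordant pairs; the test is undefined")
    diff = abs(b - c)
    if continuity_correction:
        diff = max(diff - 1, 0)
    stat = diff ** 2 / (b + c)
    # Upper tail of a chi-square with 1 df: P(Z^2 > x) = erfc(sqrt(x / 2))
    p_value = erfc(sqrt(stat / 2))
    return stat, p_value


# Hypothetical discordant counts, for illustration only.
stat, p = mcnemar_chi2(b=30, c=14)
print(f"chi2 = {stat:.2f}, P = {p:.3f}")
```

Only the discordant films carry information about a difference in positivity rates, which is why the two readers can agree on most films and still differ significantly.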
Using the blind radiologist’s reading as the “gold standard” for judging validity, the sensitivity, specificity, and PPV of the treating physicians’ and official radiologists’ readings were quite similar (Table 2). The κ index of interobserver agreement with the blind radiologist was also similar for the treating physicians and for the official radiologists, indicating only modest overall agreement.

TABLE 1. Description of Study Cohort (n = 287)
Age, mo (mean ± SD)                            10.7 ± 6.0
Temperature, °C (mean ± SD)                    39.1 ± 0.9
WBC* count, cells/mm3 (mean ± SD) (n = 121)    15 152 ± 7728
Male sex, %                                    56.8
Admitted, %                                    13.6
* WBC, white blood cell.

TABLE 2. Validity of Treating Physicians’ and Official Radiologists’ Readings*
Index          Treating Physicians    Official Radiologists
Sensitivity    .677 (.563-.791)       .647 (.533-.761)
Specificity    .828 (.778-.878)       .849 (.802-.897)
PPV            .537 (.429-.645)       .571 (.461-.682)
κ              .462 (.340-.584)       .475 (.353-.597)
* Blind radiologist’s readings considered as “gold standard.” 95% confidence intervals given in parentheses. PPV, positive predictive value; κ, kappa index.
By contrast, agreement by the treating physicians was considerably higher with the official radiologists’ readings as gold standard: sensitivity = .756 (95% CI = .663-.849), specificity = .922 (.885-.959), PPV = .795 (.705-.884), and κ = .688 (.593-.783).
These differences in agreement, coupled with the pattern of positivity rates mentioned earlier, suggested to us either that the blind radiologist had a higher threshold for diagnosis of pneumonia or that the official radiologists’ readings may have been biased by the treating physicians’ readings or by clinical information reported by the treating physicians on the radiography requisition. To explore these possibilities, we first compared the agreement between the readings by the official and blind radiologists when the treating physician’s reading was positive vs negative (Table 3). When the treating physician’s reading of the film was positive, the official radiologists’ positivity rate was much higher than the blind radiologist’s: 74.4% vs 51.8% (P < .005 by McNemar χ² test). Considering the blind radiologist’s reading as the “gold standard,” sensitivity of the official radiologists’ readings was an excellent .884, although specificity was low at .436 (ie, the rate of false-positives exceeded that of false-negatives), PPV was .633, and κ was .326. When the treating physician’s reading was negative, however, the situation changed dramatically. The official radiologists’ positivity rate was actually lower than the blind radiologist’s: 8.5% vs 12.8% (P not significant). Sensitivity was only .240, but specificity was .937, indicating a much higher rate of false-negatives than false-positives; PPV was only .353, and κ was a meager .205.
These findings indicate that the blind radiologist did not necessarily have a nonspecifically higher threshold for diagnosis of pneumonia. He was at least as likely to diagnose pneumonia as the official radiologists in those cases when the treating physician did not, but much less likely to diagnose pneumonia when the treating physician did. Thus the data suggest that official radiologists may be biased toward confirming positive readings by the treating physician, even at the risk of overcall, ie, false-positives.
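The stratified comparison described above amounts to grouping films by the treating physician's reading and computing each radiologist's positivity rate within each group. The sketch below is one way to express that. The record layout, the labels "pos", "neg", and "quest", and the example films are all hypothetical, and how films with a questionable treating-physician reading were stratified is not stated in the paper (here they simply form their own stratum).

```python
from collections import defaultdict


def positivity(readings):
    """Positive readings plus half the questionable readings, as a proportion."""
    n_pos = sum(r == "pos" for r in readings)
    n_quest = sum(r == "quest" for r in readings)
    return (n_pos + 0.5 * n_quest) / len(readings)


def stratified_positivity(films):
    """films: iterable of (treating, official, blind) readings for each film.

    Returns {treating_reading: {"official": rate, "blind": rate}}.
    """
    strata = defaultdict(lambda: {"official": [], "blind": []})
    for treating, official, blind in films:
        strata[treating]["official"].append(official)
        strata[treating]["blind"].append(blind)
    return {
        treating: {reader: positivity(rs) for reader, rs in by_reader.items()}
        for treating, by_reader in strata.items()
    }


# Hypothetical films, for illustration only.
example_films = [
    ("pos", "pos", "pos"), ("pos", "pos", "neg"),
    ("pos", "quest", "neg"), ("pos", "neg", "neg"),
    ("neg", "neg", "neg"), ("neg", "neg", "pos"),
    ("neg", "neg", "neg"), ("neg", "pos", "quest"),
]
print(stratified_positivity(example_films))
```

A large gap between the official and blind rates in the treating-positive stratum, with no such gap (or a reversed one) in the treating-negative stratum, is the pattern the authors interpret as bias.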
We next examined the effect of respiratory signs and symptoms reported on the radiography requisition. The presence of fever was recorded in 61%, cough in 48%, wheezing in 18%, retractions or other forms of respiratory distress in 12%, rales in 9%, tachypnea in 7%, decreased air entry and grunting in 3% each, and vomiting and stridor in 1% each. To our surprise, a positive reading was not significantly associated with the reporting of these clinical signs and symptoms for any of the three sets of readings. (The low prevalence of rales, tachypnea, and grunting, however, provides poor statistical power for detecting associations with these clinical signs, which are frequently cited7-9 as indicators of pneumonia.)
DISCUSSION
Our data suggest that official radiologists, whose readings of chest radiographs may impact on diagnostic and therapeutic decisions, can be strongly biased by the treating physician’s reading. This is demonstrated by the high positivity rate and sensitivity and low specificity of the official radiologists’ reading (as compared to the blind radiologist’s “gold standard” reading) when the treating physician’s reading was positive, but exactly the reverse pattern when the treating physician’s reading was negative.

TABLE 3. Agreement Between Official and “Blind” Radiologists as Function of Treating Physician’s Reading*
Index                      Treating Physician    Treating Physician
                           Reading Positive      Reading Negative
Positivity rate, %
  Official radiologists    74.4 (64.9-83.8)      8.5 (4.7-12.4)
  Blind radiologist        51.8 (41.0-62.6)      12.8 (8.2-17.5)
Sensitivity                .884 (.788-.980)      .240 (.073-.407)
Specificity                .436 (.280-.592)      .937 (.900-.973)
PPV                        .633 (.511-.755)      .353 (.126-.580)
κ                          .326 (.326-.534)      .205 (.076-.334)
* Blind radiologist’s readings considered as “gold standard.” 95% confidence intervals given in parentheses. PPV, positive predictive value; κ, kappa index.
We do not believe that our findings can be explained by an educational impact by the official radiologists on the treating physicians. Few of the attending staff physicians in our emergency department received their formal training in interpreting chest radiographs from the official radiologists included in this study. Moreover, there is no systematic feedback of the official reading to the treating physician. Radiographs initially read as negative by the treating physician but subsequently interpreted as positive by an official radiologist may result in a child’s being recalled for reevaluation or antibiotic treatment, but such recall is arranged by an attending staff pediatrician on duty at the time the reading is revised, not by the original treating physician.
Several factors may limit the generalizability of our findings, however. First, in some (probably rare) settings, official radiologists may be available at all hours, thus obviating the treating physician’s need to make a radiographic diagnosis. Second, in those settings where official radiologists do not have access to the treating physician’s reading (and therefore could not be biased by it), the official reading can probably be considered more of a “gold standard.” Although we are unaware of the relative frequency with which these different practice patterns occur, we suspect that there are a sufficient number of settings like our own to make our results of general interest to pediatricians, radiologists, and other clinicians practicing in such settings.
With these caveats in mind, our results have several important implications for the diagnosis and treatment of young febrile children, particularly those who are neither seriously ill nor have clinical signs strongly suggestive of pneumonia. Overcall either by treating physicians or official radiologists can result in unnecessary antibiotic treatment and even (though probably rarely) some unnecessary hospitalization. Most pneumonias in this age group are viral,10,11 even those associated with a lobar radiographic pattern.12,13 Moreover, the benefit of antibiotic treatment has not been demonstrated.14 Any tendency to overcall pneumonia will therefore further increase the number of children who are treated and hospitalized unnecessarily. Although the relative risks and benefits of undertreatment vs overtreatment of childhood pneumonia have not been adequately assessed, our findings suggest that in settings where radiologists’ readings can be influenced (biased) by knowledge of the treating physician’s reading, overcall of chest radiographs has the potential for doing harm. Strategies for reducing such bias and overcall should receive high priority by clinicians and investigators alike.
ACKNOWLEDGMENTS
This work was supported by a grant from the Fonds de la recherche en santé du Québec (FRSQ) and was carried out while Dr Kramer was a senior career investigator of the FRSQ.
REFERENCES
1. McCarthy PL. Controversies in pediatrics: what tests are indicated for the child under 2 with fever? Pediatr Rev. 1979;1:51-56
2. Soman M. Diagnostic workup of febrile children under 24 months of age: a clinical review. West J Med. 1982;137:1-12
3. Long SS. Approach to the febrile patient with no obvious focus of infection. Pediatr Rev. 1984;5:305-315
4. McCutcheon ML. The febrile infant. J Fam Pract. 1985;20:584-588
5. Patterson RJ, Bisset GS, Kirks DR, Vanness A. Chest radiographs in the evaluation of the febrile infant. AJR. 1990;155:833-835
6. Cohen J. A coefficient of agreement for nominal scales. Educ Psychol Meas. 1960;20:37-46
7. Leventhal JM. Clinical predictors of pneumonia as a guide to ordering chest roentgenograms. Clin Pediatr (Phila). 1982;21:730-734
8. Zukin DD, Hoffman JR, Cleveland RH, Kushner DC, Herman TE. Correlation of pulmonary signs and symptoms with chest radiographs in the pediatric age group. Ann Emerg Med. 1986;15:792-796
9. Grossman LK, Caplan SE. Clinical, laboratory, and radiological information in the diagnosis of pneumonia in children. Ann Emerg Med. 1988;17:43-46
10. Turner RB, Lande AE, Chase F, Hilton N, Weinberg D. Pneumonia in pediatric outpatients: cause and clinical manifestations. J Pediatr. 1987;111:194-200
11. Claesson BA, Trollfors B, Brolin I, et al. Etiology of community-acquired pneumonia in children based on antibody responses to bacterial and viral antigens. Pediatr Infect Dis J. 1989;8:856-862
12. McCarthy PL, Spiesel SZ, Stashwick CA, Ablow RC, Masters SJ, Dolan TF. Radiographic findings and etiologic diagnosis in ambulatory childhood pneumonias. Clin Pediatr (Phila). 1981;20:686-691
13. Bettenay FAL, de Campo JF, McCrossin DB. Differentiating bacterial from viral pneumonias in children. Pediatr Radiol. 1988;18:453-454
14. Friis B, Andersen P, Brenoe E, et al. Antibiotic treatment of pneumonia and bronchiolitis: a prospective randomized study. Arch Dis Child. 1984;59:1038-1045