Reader and Computer-Aided Detection

Original title: A new paradigm for computer aided diagnosis of radiographic image data: using the computer as an independent reader

S. Schalekamp; N. Karssemeijer; C. Schaefer-Prokop; B. van Ginneken Submitted

Abstract

Background: Computer-aided detection (CAD) systems have so far only been used as a prompting tool, pointing radiologists to regions that warrant a second look. We propose a new paradigm, where independent interpretations by a radiologist and the CAD system are combined with a weighting formula.

Methods: We analyzed data from two large observer studies for tumor detection in chest radiography and breast mammography. The chest radiography study used 300 chest radiographs (CXRs), with 111 CT-proven solitary pulmonary nodules. The mammography study used 200 full-field digital mammography exams with 80 biopsy-proven malignancies. In both studies, twelve different radiologists marked and scored suspicious regions, first without and then with marks of state-of-the-art CAD systems. AFROC MRMC analysis was used to measure detection performance, defined as mean sensitivity in the clinically relevant high-specificity range between 80% and 100%. Scores of the radiologists without CAD were then combined with CAD scores at the location of reader findings, using weighted averaging. Performance was compared for CAD standalone, human standalone, human reading with CAD marks, independent combination of human and CAD scores, and human double reading.

Results: For chest and breast, CAD standalone was worse (0.353 / 0.503) than human reading standalone (0.640 / 0.569), but independent combination with CAD significantly improved performance compared to unaided reading (0.686; P=0.003 / 0.635; P<0.001) and 23 of the 24 observers improved. For chest, independent combination was comparable to reading with CAD marks (0.670; P=0.07) and significantly worse than double reading (0.731; P=0.007). For breast, independent combination was significantly superior to reading with CAD marks (0.573; P<0.001) and comparable with double reading (0.645; P=0.28).

Conclusion: Independent combination of a human observer and a computer system has the ability to outperform the traditional way of using CAD marks in both chest radiography and mammography.

Introduction

Breast cancer and lung cancer are the two most frequent (non-skin) cancers and are the two cancer sites with the highest mortality rates worldwide [1]. Despite the advent of advanced imaging techniques, such as low-dose computed tomography (CT) for the lungs and magnetic resonance imaging for the breasts, plain radiographic imaging is still by far the most widely used modality for detection and diagnosis of these cancers, both in clinical and screening practice. Early detection of breast and lung cancer is crucial for survival and thus highly desirable [2,3]. It therefore seems unacceptable that around 20% of retrospectively visible, and thus detectable, cancers are missed in screening mammography [4-6] and chest radiography [7,8]. Such errors result in delayed diagnosis and increased overall mortality due to more advanced disease stages at the time of diagnosis. Moreover, errors in diagnosis related to breast and lung cancer are among the most common causes of medical malpractice suits against radiologists [9].

To reduce miss rates, computer-aided detection (CAD) systems have been developed for mammography and chest radiography. The United States Food and Drug Administration has cleared CAD systems only for use as an assistant to the radiologist. A possible explanation is that standalone CAD systems performed worse than radiologists. In this “second reader paradigm” the CAD system is meant to prevent lesions from being overlooked by alerting the radiologist to suspicious areas in the image after his or her initial interpretation. When the radiologist accepts CAD marks on true lesions that were missed in the initial read, sensitivity increases; however, when CAD marks are accepted erroneously, specificity decreases. Many studies found a modest or nonexistent increase in detection performance when CAD was used as a second reader [10-15]. These disappointing results contrast with the fact that CAD systems marked many (cancerous) lesions that were missed by the radiologist [16-19].

Using CAD as a second reader has the drawback that the workflow of the radiologist has to be adapted, and the reading time of the examinations will inevitably increase. Moreover, showing only CAD marks does not convey to the human reader how suspicious the computer analysis estimates the area to be. We therefore decided to investigate alternative ways to combine computer and human reading that are possibly more effective and impose fewer burdens on workflow. We propose a method whose crucial difference is that the task of combining computer scores and human scores is delegated not to the human reader (which would impose an additional burden on the radiologist) but to a computer, via simple weighted averaging. This paradigm has the advantage that the radiologist does not need to inspect CAD marks: he or she reads the cases as usual, without CAD, but assigns a score indicating the degree of suspicion to each finding. Assigning risk scores is something radiologists are already familiar with, e.g. in mammography (BI-RADS) [20] or, as has recently been suggested, for lung nodules in CT (Lung-RADS) [21].
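
The weighted averaging proposed above can be illustrated with a minimal sketch. It is not the weighting formula used in the study: the function name `combine_scores`, the reader weight of 0.75, the common 0-100 score scale, and the example scores are all assumptions made for illustration. The only element taken from the text is that each reader finding is paired with the CAD score at the same location and the two are averaged with a weight.

```python
def combine_scores(reader_score, cad_score, weight=0.5):
    """Weighted average of a reader score and a CAD score.

    Both scores are assumed to lie on the same scale (here 0-100).
    `weight` is a hypothetical parameter: the share given to the reader;
    the CAD score receives the remaining 1 - weight.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    return weight * reader_score + (1.0 - weight) * cad_score


# Hypothetical findings: (reader score, CAD score at the same location).
findings = [(80.0, 40.0), (30.0, 90.0), (55.0, 55.0)]

# Combine every reader finding with its CAD score, weighting the reader at 0.75.
combined = [combine_scores(r, c, weight=0.75) for r, c in findings]
print(combined)
```

With a weight of 1 the combined score reduces to the unaided reader score, and with a weight of 0 it reduces to the CAD score, so the weight controls how far the computer can move a reader's rating in either direction.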

Methods

Data from two large observer studies, one using mammography and one using chest radiography (Figure 1), were used for this study. Written informed consent was waived by the institutional review board for this retrospective analysis.