Would a CAD-system work as a standalone reader?

Newer generations of machine learning techniques, such as deep learning algorithms, together with hardware with high computational power have great potential to change the field of medical imaging. With deep learning algorithms, many complex visual tasks can be performed fully automatically. Many experts question whether a human reader will remain necessary for certain tasks in the foreseeable future. In chapters 6-8, a commercially available CAD system for ABUS is investigated as an aid to improve the performance of breast radiologists. Figure 9.1 shows the performance of CAD as a free-response receiver operating characteristic (FROC) curve. An FROC curve shows all trade-offs between the number of false-positive CAD marks and the fraction of cancers detected correctly by the software. At the threshold of 1 false-positive CAD mark per ABUS volume, the CAD program has a sensitivity of 72% on the study dataset in chapter 6 and 82% on the dataset in chapter 7. Note that in both studies the cases in the study datasets were excluded from the dataset used to train the CAD algorithms.
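The construction of an FROC curve described above can be illustrated with a minimal sketch. The marks, scores and counts below are hypothetical toy values, not data from chapters 6 or 7, and the function names `froc_curve` and `sensitivity_at` are introduced here for illustration only; how QVCAD scores its marks internally is not assumed.

```python
# Minimal FROC sketch on hypothetical toy data (not the thesis datasets).
# Each CAD mark is (score, lesion_id); lesion_id is None for a false positive.

def froc_curve(marks, n_cancers, n_volumes):
    """Return (false positives per volume, sensitivity) for every score threshold."""
    thresholds = sorted({score for score, _ in marks}, reverse=True)
    points = []
    for t in thresholds:
        kept = [(s, lesion) for s, lesion in marks if s >= t]
        false_positives = sum(1 for _, lesion in kept if lesion is None)
        # A cancer counts once, even if several marks hit it.
        detected = {lesion for _, lesion in kept if lesion is not None}
        points.append((false_positives / n_volumes, len(detected) / n_cancers))
    return points

def sensitivity_at(points, max_fp_per_volume):
    """Highest sensitivity reachable without exceeding the false-positive rate."""
    eligible = [sens for fp, sens in points if fp <= max_fp_per_volume]
    return max(eligible, default=0.0)

# Hypothetical example: 4 ABUS volumes containing 3 cancers in total.
marks = [
    (0.9, "c1"), (0.8, None), (0.7, "c2"), (0.6, None), (0.5, None),
    (0.4, "c1"), (0.3, None), (0.2, "c3"), (0.1, None),
]
points = froc_curve(marks, n_cancers=3, n_volumes=4)
sens_at_1_fp = sensitivity_at(points, max_fp_per_volume=1.0)
print(f"Sensitivity at 1 false positive per volume: {sens_at_1_fp:.2f}")
```

Reading off the curve at 1 false positive per volume corresponds to the operating point quoted in the text; lowering `max_fp_per_volume` shows how sensitivity drops when fewer CAD marks are allowed.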

Summary and general discussion Chapter 9


In figures 9.2 and 9.3 we compare the location-corrected ROC (LROC) curves of CAD as a standalone reader with the pooled curves of the readers on the datasets used in chapters 6 and 7. The data of the readers and of CAD are treated identically in computing the LROC curves and the AUC. For each case the highest score is used in the analysis, with the exception of malignant cases where the cancer is not marked in the ABUS volume (i.e. the mark is given for something else). These cases were treated as if they were classified as normal. In this way, neither the human readers nor the CAD system is rewarded for recalling cases in which they actually missed the cancer. These results indicate that CAD software as a standalone reader is not (yet) on par with human readers reading with or without QVCAD assistance (including the promising intelligent MinIP image). Compared to highly trained human observers, the CAD system needs to generate many false-positive findings in order to achieve acceptable sensitivity in our (enriched) study datasets. In chapters 6 and 7 we used the threshold of 1 false positive per ABUS volume to limit the number of CAD marks to a reasonable proportion. Improving the

Figure 9.1 Free-response receiver operating characteristic (FROC) curve of QVCAD as a standalone reader on the ABUS datasets used in chapters 6 and 7.

Figure 9.2 Location-corrected receiver operating characteristic (LROC) curves showing the performance of QVCAD compared to the pooled performance of the human readers in reading ABUS with and without QVCAD support in chapter 6.
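The location-corrected scoring rule used for the LROC analysis can be sketched as follows. This is one reading of the rule under stated assumptions: a case "classified as normal" is mapped to the lowest possible score (0.0 here), the AUC is estimated with the pairwise (Mann-Whitney) method, and all cases and scores are hypothetical toy values. The names `case_score` and `auc` are illustrative and do not come from the thesis.

```python
# Location-corrected case scoring on hypothetical toy data (not the thesis datasets).
# Each mark is (score, hits_cancer); a case is (is_malignant, marks).

def case_score(marks, is_malignant, normal_score=0.0):
    """Highest score in the case, unless a malignant case has no mark on the cancer."""
    if not marks:
        return normal_score
    if is_malignant and not any(hits for _, hits in marks):
        # Cancer missed: the mark was given for something else,
        # so the case is treated as if it were classified as normal.
        return normal_score
    return max(score for score, _ in marks)

def auc(malignant_scores, other_scores):
    """Pairwise AUC estimate: fraction of correctly ordered pairs, ties count half."""
    wins = sum(
        1.0 if m > o else 0.5 if m == o else 0.0
        for m in malignant_scores
        for o in other_scores
    )
    return wins / (len(malignant_scores) * len(other_scores))

cases = [
    (True, [(0.9, True), (0.4, False)]),  # cancer marked
    (True, [(0.8, False)]),               # mark given for something else
    (True, [(0.6, True)]),                # cancer marked
    (False, [(0.7, False)]),              # false-positive mark
    (False, []),                          # no marks
    (False, [(0.3, False)]),              # false-positive mark
]
malignant = [case_score(m, True) for is_mal, m in cases if is_mal]
others = [case_score(m, False) for is_mal, m in cases if not is_mal]
lroc_auc = auc(malignant, others)
print(f"Location-corrected AUC: {lroc_auc:.3f}")
```

In this toy example the second malignant case carries a high score for a mark that does not hit the cancer; resetting it to the normal score lowers the AUC, which is the intended effect of not rewarding a recall that missed the cancer.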


Future directions

The debate on the role of ABUS in today’s breast cancer diagnostics is not concluded. Most breast imaging experts agree that supplemental imaging modalities are needed and that breast magnetic resonance (MR) imaging is the most accurate technique for detecting breast cancer that is occult on mammography. However, breast MR requires substantial resources and currently still needs intravenous injection of a contrast agent. DBT and whole breast ultrasound are less costly alternatives, but it is unclear which modality to use for specific risk groups of asymptomatic women. Development of personalized multimodal screening strategies using advanced risk modelling might help to determine which screening pathway should be offered to women with varying risk profiles (248). However, risk models based on the currently published literature might not be able to accurately estimate the cost-effectiveness ratio of these supplemental techniques, because the reported numbers of false-positive recalls generated by DBT or whole breast ultrasound vary widely. It appears that, particularly for ultrasound, the number of false-positive recalls declines strongly in subsequent screening rounds. Still, according to the current literature, whole breast ultrasound with ABUS as a supplement to mammography substantially increases the number of false-positive recalls. In the available studies, only grayscale B-mode ultrasound was assessed visually. Functional imaging such as (shear wave) elastography for ABUS has the potential to help avoid false-positive recalls (249–252). Other functional parameters, such as three-dimensional power Doppler, may also have additional value in differentiating benign from malignant breast disease in ABUS. Quantitative breast ultrasound using multiparametric radiofrequency (RF) data derived from the ABUS systems should also be further explored. We also expect that the combination of standard B-mode sonography and multiparametric quantitative (functional) analysis will improve the differentiation of breast lesions; however, such an approach needs validation in diagnostic and screening settings.

ABUS was developed as a supplemental rather than a primary screening modality and will therefore require additional time, personnel and clinical space. Combining mammography or DBT with ABUS in one clinical system would be a very efficient development. Schaefgen et al (253) and Larson et al (254) have shown preliminary results of prototype systems that fuse DBT/mammography and ABUS into one system. The ABUS transducer is mechanically driven over the compression paddle of the DBT system while the breast is compressed for the CC or MLO acquisitions. This improves the spatial correlation between the two modalities. Nevertheless, Schaefgen et al reported an additional 80 seconds for the ABUS acquisition on top of the 25 seconds needed for a DBT view acquisition, which might limit its practicality, since DBT and mammography are mostly experienced by women as very uncomfortable and even painful: the breast is compressed between the compression paddle and the detector plate of the mammography/DBT system. Holländer et al (255) recently showed in a breast phantom study that the acquisition time of ABUS may be reduced by using plane wave compounding instead of spatial compounding without losing vital image quality. This may create a new window of opportunity for a more successful fused ABUS/DBT system, as using a faster acquisition technique would decrease the acquisition time of ABUS to mere seconds and consequently would lessen the discomfort of such an examination.

In this thesis we investigated the implementation of CAD and the reading performance of radiologists who used a commercially developed CAD software package that is primarily designed to be used as a concurrent reading tool. In general, the CAD systems on the market are designed to assist radiologists in detecting breast lesions and in classifying lesions as suspicious for cancer or not. According to our results, the CAD software we used is a valid aid in reading ABUS but is not yet on par with breast radiologists when used as a standalone reader. Nevertheless, artificial intelligence, and in particular machine learning for medical imaging, is a rapidly evolving field. Computers with high computational power can currently run complex algorithms for improved lesion detection and classification. It is expected that with newer technology, such as advanced graphics cards with high computational power, even more complex machine learning software will be developed to automatically perform the more advanced tasks of radiologists. The machine learning algorithms of the future may use all information available from an ABUS scanner, including RF data and functional imaging, and combine it with the information from other modalities such as breast MR and DBT to generate a single diagnosis for a woman, maybe even without the intervention of a radiologist.


In document Automated 3D breast ultrasound. Advances in breast cancer detection, diagnosis and screening (Page 91-94)