
tice

Two scans of the same patient acquired sequentially on two CT scanners are subject to inter-scan variability and to differences in patient position, heart rate, and inspiration level, which have been found to be the primary causes of measurement variability in chest CT [14]. Despite the high reproducibility and inherent accuracy of clinically acquired thoracic CT scans, there is only a moderate level of agreement in intra- and inter-observer measurements of detected pulmonary nodules.

To illustrate the inherent inter-observer (between observers) and intra-observer (within an observer) variability that arises when measurements are based on ill-defined boundaries, we selected the clinical application of assessing part-solid nodule growth rate. However, any boundary tracing in medical imaging is subject to inter- and intra-user variability when the margins are poor. Similarly, our collaborating radiologist confirmed the inter-user variability in manual LV segmentation from the TEE data; therefore, to serve as a gold standard, we needed a consensus established and approved by an experienced cardiologist. For the Cine MRI data, consensus between the different LV segmentation approaches was likewise required to serve as the gold standard for the challenge. Note that the apex and base slices are the most challenging to segment because their margins are very often indistinct and ambiguous, which increases user variability and makes both manual and automated segmentation difficult [30].

Achieving consensus between observers requires precision, which is therefore more important than accuracy in this setting: the true measurement is unknown, so an accuracy study cannot easily be conducted. The segmentation and quantification of volumetric changes of both solid and sub-solid nodules is an active area of research with growing interest in clinical practice, with a greater focus on precision and measurement variability, specifically for pulmonary sub-solid nodules [10, 27, 28, 31].

GGNs and the presence of poorly defined margins, which often misleads the observer, leading to quantification error. For example, intra- and inter-observer variability for whole nodule size and solid portion size ranged from -3.45 mm to +2.91 mm [32], while previous part-solid studies reported that the narrowest inter-observer agreement range was -1.14 mm to +1.72 mm and the widest was -7.7 mm to +1.7 mm, within the 95% confidence interval. Additional studies that analyzed measurement variability [28, 33–38] showed a variety of inter- and intra-observer agreement levels, but mostly a modest inter-observer agreement on solid portion measurements.
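The agreement ranges quoted above can be illustrated with a minimal sketch, assuming they are 95% limits of agreement in the Bland-Altman sense (mean difference ± 1.96 standard deviations), which the text does not state explicitly. The function name `limits_of_agreement` and the reader measurements are hypothetical and are not data from the cited studies.

```python
import numpy as np

def limits_of_agreement(obs_a, obs_b):
    """Return (bias, lower, upper) for paired measurements in mm.

    Assumption: 95% limits of agreement are computed as
    mean difference +/- 1.96 * sample standard deviation of the differences.
    """
    a = np.asarray(obs_a, dtype=float)
    b = np.asarray(obs_b, dtype=float)
    if a.shape != b.shape:
        raise ValueError("both observers must measure the same nodules")
    diff = a - b
    bias = diff.mean()
    sd = diff.std(ddof=1)  # sample standard deviation
    return bias, bias - 1.96 * sd, bias + 1.96 * sd

# Hypothetical nodule diameters (mm) measured by two readers.
reader_1 = [8.0, 10.0, 12.0, 14.0]
reader_2 = [7.0, 11.0, 11.0, 15.0]

bias, lower, upper = limits_of_agreement(reader_1, reader_2)
print(f"bias = {bias:.2f} mm, limits = [{lower:.2f}, {upper:.2f}] mm")
```

A wide interval with a bias near zero, as in this toy case, indicates that the readers do not differ systematically but disagree substantially on individual nodules.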

The traditional method for nodule change assessment is diameter measurement on CT scans; however, these measurements are often inconsistent [36, 39–42]. Consequently, in addition to these measurements, a side-by-side comparison is performed, which requires 3D alignment. Image registration is used to align the scans prior to the nodule change assessment. The ultimate goal is to help the radiologist in this process by emphasizing the true nodule change and reducing variability and ambiguity.

An approach that has been shown to be efficient in achieving this goal was suggested by Staring et al. [43] in a study on the assessment of volume and density change in ground-glass nodules. The study showed the advantage of assessing GGN change when the post-registration subtraction image is available. The use of image subtraction improved the inter-observer agreement and significantly improved the confidence of the observers. The perceived need for nodule diameter measurement varied strongly between observers: some observers needed diameter measurements for 57% of the nodules, while others used them in only 5% of image pairs. The need for diameter measurements dropped substantially when the subtraction image was available. The authors concluded that image subtraction after registration improves the evaluation of subtle changes in sub-solid nodules and decreases inter-observer variability. The subtraction image emphasizes the differences, which appear as black and white edges, over the unchanged areas, which are represented in gray. This can be explained by the well-known fact that edges carry the image content required for interpretation and that the human eye is more sensitive to high frequencies, i.e., edges. Nevertheless, nodule assessment based on image subtraction depends on the accuracy of the registration between the initial and follow-up images.
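The subtraction step described above can be sketched as follows. This is a toy illustration on synthetic 2D data, not the pipeline of Staring et al.; it assumes the follow-up image has already been registered to the baseline. The name `subtraction_image` and the intensity values are hypothetical.

```python
import numpy as np

def subtraction_image(baseline, followup_registered):
    """Voxel-wise difference of a registered follow-up and its baseline.

    Unchanged regions map to 0 (mid-gray when displayed with a symmetric
    window); growth or shrinkage appears as positive or negative values.
    """
    return followup_registered.astype(np.float32) - baseline.astype(np.float32)

# Synthetic slice: lung-like background with a square "nodule" that grows
# from 5x5 to 7x7 pixels between the two time points.
baseline = np.full((32, 32), -800.0)
baseline[14:19, 14:19] = -400.0
followup = np.full((32, 32), -800.0)
followup[13:20, 13:20] = -400.0

diff = subtraction_image(baseline, followup)
changed = int(np.count_nonzero(diff))
print("changed pixels:", changed)
```

Only the one-pixel rim where the nodule grew is non-zero; the nodule core and the background cancel out. Any residual misalignment would add spurious edges of the same kind, which is why the result depends on registration accuracy.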

Although measurement agreement remains challenging, significant research efforts have shown potential for faithful and early detection of malignant pulmonary nodules [21]; hence, the accurate interpretation and management of sub-solid nodules in follow-up CT is critically important. Current criteria for assessing lung nodule growth rely on longitudinal cross-sectional measurements performed on initial and follow-up computed tomography (CT) images. Even today, according to the collaborating radiologist, this assessment is supplemented in clinical practice by diameter measurements on different planes, as mentioned. The measurements rarely take into account 3D reorientation and the effects of background lung deformation. Typical current measurements quantify the solid portion, the whole nodule size, and the visual appearance of the margins, estimated from the axial slices that show the largest lesion diameter. With this typical method, the causes of user variability are found to be: (1) the difference between the slices selected by each radiologist as depicting the maximal area; (2) the variability in the start and end points selected to measure the maximal diameter; and (3) the differences between the angles of the selected maximal diameters [28].

A more powerful measurement, based on three-dimensional (3D) assessment, has shown increased accuracy and agreement; however, volumetric analysis is rarely used in a typical clinical workflow [27]. Volumetric analysis requires segmentation of the nodule, a process that is time consuming and highly subject to observer variability, especially when the margins are indistinct. An alternative approach, mentioned above, is the use of registration and image subtraction. However, segmentation of the lesion is still required in order to enforce constraints that retain the true lesion change. A true alternative would be an accurate registration that retains the true lesion change without the need for segmentation. Both segmentation and registration are very useful tools, not only in medical imaging but also in image processing and computer vision in general. Therefore, many computer-aided systems use image registration and segmentation for a variety of applications [6, 8, 11]. For the assessment of GGN growth rate, registration without the need for segmentation is preferable because of the difficulty of segmenting ill-defined boundaries.

As a result of the user variability associated with tasks performed manually by clinicians, there is often critical bias in clinical decision making. For example, expert echocardiographers reported that the decision to use a ventricular assist device (VAD) to support the heart, usually the LV, is based on an approximation of the ejection fraction (EF). In clinical practice, the LV volumes used to calculate EF are approximated from areas obtained by 2D analysis of the TEE images. This approximation relies on manual tracing of the LV margins, which is subject to user variability, especially when the margins are indistinct. Therefore, automated segmentation, mainly of the LV but also of the RV, in cardiac US and MRI is a very active research field [30, 44, 45]. Lastly, we also experienced ambiguity with manual long-limb X-ray stitching in instances where there are no unique cues, e.g., when the diaphysis (midsection of the bone shaft) of the femur does not feature sufficiently distinguishable cues in the overlapping region.
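As a rough illustration of how EF follows from volumes approximated from traced 2D areas, the sketch below uses the standard definition EF = (EDV − ESV) / EDV together with the single-plane area-length approximation V = 8A² / (3πL). The text does not say which area-based formula is used clinically, so the area-length choice is an assumption, and the traced areas and lengths are hypothetical.

```python
import math

def area_length_volume(area_cm2, length_cm):
    """Single-plane area-length LV volume estimate in mL (assumed method)."""
    if area_cm2 <= 0 or length_cm <= 0:
        raise ValueError("area and length must be positive")
    return 8.0 * area_cm2 ** 2 / (3.0 * math.pi * length_cm)

def ejection_fraction(edv_ml, esv_ml):
    """Ejection fraction in percent from end-diastolic and end-systolic volumes."""
    if edv_ml <= 0 or not 0 <= esv_ml <= edv_ml:
        raise ValueError("expected 0 <= ESV <= EDV and EDV > 0")
    return 100.0 * (edv_ml - esv_ml) / edv_ml

# Hypothetical manual tracings: (area in cm^2, long-axis length in cm).
edv = area_length_volume(30.0, 8.0)
esv = area_length_volume(20.0, 7.0)
ef = ejection_fraction(edv, esv)

# The same end-systolic frame traced 5% larger by a second observer.
ef_overtraced = ejection_fraction(edv, area_length_volume(21.0, 7.0))
print(f"EF = {ef:.1f}%, EF with 5% larger ES tracing = {ef_overtraced:.1f}%")
```

Because the volume depends on the square of the traced area, a small difference in tracing an indistinct margin shifts the EF by several percentage points, which is the kind of variability the paragraph above describes.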

Most medical image processing tools and techniques, such as segmentation, registration, and statistical shape and appearance models, are formulated as an optimization problem, i.e., a search for the ideal parameter values by minimizing or maximizing an energy or objective function. Defining and solving the optimization problem has become much more sophisticated with the development of image registration methods in the last two decades [6–9, 17, 18, 46]. This development was inspired by the computer vision field and vice versa, because both fields use the same tools. The complexity depends mainly on the number of unknowns, i.e., degrees of freedom (DOF), and on the complexity of each term in the objective function. Despite this development, the translation of these methods into clinical practice is still a major problem. Clinical use requires accuracy together with fairly straightforward algorithms, for reliability, and easy-to-use software interfaces. Even though recently developed methods are robust and accurate within their specific applications, the variability of clinical data within those applications is still a great challenge, with variations in scanner type, scanning protocol and, most significantly, patient characteristics.
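The formulation of registration as an optimization problem can be illustrated with a deliberately simple sketch: an exhaustive search over integer 2D translations (2 DOFs) that minimizes a sum-of-squared-differences (SSD) objective. Both the SSD metric and the brute-force search are illustrative assumptions, not the methods used in this work, and the function names are hypothetical.

```python
import numpy as np

def ssd(fixed, moving):
    """Objective function: sum of squared intensity differences."""
    return float(np.sum((fixed - moving) ** 2))

def register_translation(fixed, moving, max_shift=5):
    """Find the integer (dy, dx) shift of `moving` that minimizes the SSD.

    np.roll wraps around the image border, which is acceptable here only
    because the synthetic object stays away from the edges.
    """
    best_shift, best_cost = (0, 0), float("inf")
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            shifted = np.roll(moving, (dy, dx), axis=(0, 1))
            cost = ssd(fixed, shifted)
            if cost < best_cost:
                best_shift, best_cost = (dy, dx), cost
    return best_shift, best_cost

# Synthetic test: the moving image is the fixed image displaced by (3, -2).
fixed = np.zeros((32, 32))
fixed[10:16, 12:18] = 1.0
moving = np.roll(fixed, (3, -2), axis=(0, 1))

shift, cost = register_translation(fixed, moving)
print("recovered shift:", shift, "cost:", cost)
```

With two unknowns, the search space has only 121 candidates here and the optimum is easy to verify. Each additional DOF multiplies the search space, which is why practical methods rely on iterative optimizers and why the number of DOFs drives the complexity.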

As with any software intended for commercial use, testing on a large-scale database is required to prove robustness. Public databases and challenges that cover many of these variations provide an excellent platform for evaluating and comparing methods, so that the methods can be optimized and eventually reach the required robustness. The reliability of a method is related to its number of degrees of freedom. As mentioned above regarding lung deformation, a dense local deformation field that employs a large number of DOFs (10^9 for a 512 × 512 × 600 CT scan) cannot be validated in vivo; the artificial and true deformations therefore cannot be distinguished, which undermines the reliability of the method.
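The order of magnitude quoted above can be checked with simple arithmetic, assuming that a dense deformation field stores one three-component displacement vector per voxel (the text does not specify the parameterization). The rigid and affine counts are the standard 3D values and are included only for comparison.

```python
def dense_field_dofs(shape, components=3):
    """Number of unknowns in a dense displacement field over a voxel grid."""
    n_voxels = 1
    for size in shape:
        n_voxels *= size
    return n_voxels * components

RIGID_3D_DOFS = 6    # 3 rotations + 3 translations
AFFINE_3D_DOFS = 12  # 3x3 linear part + 3 translations

dense = dense_field_dofs((512, 512, 600))
print(f"dense field: {dense:,} DOFs ({dense:.1e})")
print(f"ratio to rigid: {dense / RIGID_3D_DOFS:.1e}")
```

Under this assumption the count is about 4.7 × 10^8, within roughly a factor of two of the 10^9 quoted in the text and about eight orders of magnitude above a rigid transform.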

Localization of an optimal mask around the region of interest (ROI) is crucial for achieving simplicity and reliability. An ideal ROI restricts the region of processing and, combined with a global registration that has a minimum number of DOFs, might achieve accuracy comparable to deformable dense registration while avoiding unnecessary subsequent processing steps. Many CAD systems are semi-automated, requiring minimal user input in order to ensure ideal masking prior to registration or segmentation [11, 16]. To summarize, computer-aided systems based on reliable and accurate image processing methods may be able to automate, or semi-automate with minimal user intervention, many of the manual tasks that are subject to variability and biased decisions and that demand an expert's time, while the radiologist's workload is continuously increasing.
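A minimal sketch of ROI restriction is given below, assuming the user supplies a single seed point near the lesion and that a fixed-size cubic box is an adequate mask. Both are illustrative assumptions: the text does not describe how the mask is defined, and `roi_slices`, the volume size, and the seed location are hypothetical.

```python
import numpy as np

def roi_slices(center, half_size, shape):
    """Slices for a cubic ROI around `center`, clipped to the volume bounds."""
    return tuple(
        slice(max(c - half_size, 0), min(c + half_size + 1, s))
        for c, s in zip(center, shape)
    )

# Hypothetical CT volume (z, y, x) and a user-selected seed near the top.
volume = np.zeros((60, 128, 128), dtype=np.float32)
seed = (5, 64, 64)

roi = volume[roi_slices(seed, 16, volume.shape)]
fraction = roi.size / volume.size
print("ROI shape:", roi.shape, f"({100 * fraction:.1f}% of the volume)")
```

Subsequent registration or segmentation then operates on a few percent of the voxels, which is what allows a transform with few DOFs to be evaluated only where it matters.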