Preparing Training Data - Data-Driven Pixel-Mapping Based on Supervised Learning

6.2 Data-Driven Pixel-Mapping Based on Supervised Learning

6.2.3 Preparing Training Data

For adapting the supervised classification algorithm, a set of training examples $\Gamma = \{(x, y)_i\}$, $i = 1, \ldots, N$, consisting of pairs of input patterns $x_i \in \mathbb{R}^{n_t}$ and class labels $y_i$ is needed. The class label $y_i$ encodes the membership of the corresponding input pattern $x_i$ in one of the $n_\Omega$ classes using a 1-of-$n_\Omega$ scheme, i.e. $y_i = (1, 0, 0)$, $y_i = (0, 1, 0)$ and $y_i = (0, 0, 1)$ for examples of the malignant, normal and benign class, respectively. Samples $\Gamma_{\text{Malignant}}$, $\Gamma_{\text{Benign}}$ and $\Gamma_{\text{Normal}}$ for the three tissue classes have to be selected from labelled DCE-MRI data volumes for constructing a representative set

$$\Gamma = \Gamma_{\text{Normal}} \cup \Gamma_{\text{Benign}} \cup \Gamma_{\text{Malignant}} \qquad (6.3)$$

subsequently used for adapting the classification algorithms.

Labelling of Image Data

The label $y_p$ attributes each input pattern $x_p$ as either a malignant, normal or benign example and is derived from two information sources which were acquired during a standard clinical DCE-MRI evaluation process. The spatial information about the location of lesion voxels is provided by a manual lesion segmentation. Voxels of lesions were manually marked by an experienced radiologist with a cursor on a screening device. To this end, the DCE-MRI data was presented as subtraction images visualising the temporal intensity gradient computed from one of the postcontrast images and the precontrast image. Strongly enhancing structures such as lesions or blood vessels appear with high intensity for a suitable selection of the postcontrast image. Furthermore, the radiologist correlated the DCE-MRI data with X-ray mammography images. Lesion voxels were then marked by either selecting individual voxels or by adjusting the vertices of a polygon enclosing a larger subset of voxels. The lesion segmentation of case $m$ is formally described by the set $P^{\text{Lesion}}_m$ of spatial coordinates of designated lesion voxels.
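The subtraction images mentioned above can be illustrated with a minimal sketch. It assumes that each voxel signal is stored as a sequence whose first entry is the precontrast value, and that a case is held as a dictionary from voxel coordinates to signals. The function name `subtraction_image` and this data layout are hypothetical and not taken from the original work.

```python
def subtraction_image(signals, post_index):
    """Return a dict mapping each voxel coordinate p to s[post_index] - s[0].

    signals:    dict p -> sequence of signal values (assumption: s[0] is the
                precontrast value, s[1:] are the postcontrast values)
    post_index: index of the postcontrast image chosen for the subtraction
    """
    if post_index < 1:
        raise ValueError("post_index must refer to a postcontrast image")
    return {p: s[post_index] - s[0] for p, s in signals.items()}


# Hypothetical toy case with two voxels and three time points.
signals = {
    (0, 0, 0): (100.0, 180.0, 170.0),  # strongly enhancing voxel
    (0, 0, 1): (100.0, 105.0, 108.0),  # weakly enhancing voxel
}
difference = subtraction_image(signals, post_index=1)
```

In such a representation, strongly enhancing voxels receive large difference values, which is what makes lesions and blood vessels stand out for a suitable choice of the postcontrast image.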

[Figure 6.2 (diagram). Recoverable labels: Pool of Training Cases; DCE-MRI Sequences (signal information); Lesion Segmentations (spatial information); Histopathological Report (histopathologic information); Artificial Neural Network.]

Figure 6.2: For the ANN adaptation, information from different sources is derived. The DCE-MRI sequences of a group of cases provide information about the signal domain. Spatial information about the corresponding lesions is derived from a manual lesion segmentation. A classification of the segmented lesions as malignant or benign is obtained from the laboratory report of the histopathological examination.

Although the manual lesion segmentations yield information about the location of suspicious signals, they provide no information about the distribution of benign and malignant signals inside the lesions. Since lesion tissue is typically heterogeneous, a reliable classification of the temporal kinetic signals associated with individual lesion voxels requires a manual voxel-by-voxel evaluation by an experienced radiologist. Such a voxel-by-voxel evaluation of a large number of lesions is, however, impracticable under the prevailing circumstances of clinical diagnosis. An alternative source of information, albeit suboptimal for the purpose of preparing a set of labelled training signals, is the outcome of the histological examination of the lesion. The microscopic analysis of tissue samples extracted by core-needle biopsy examines the lesion tissue at the level of individual cells and allows for a reliable classification of the type and grade of lesions. If the outcome of the histological examination is utilised as a label for temporal kinetic signals of lesion voxels, two aspects need to be considered:

• It is commonly not known which subsets of voxels exactly correspond to the tissue samples extracted for the histological examination. Thus, the histological report provides only a classification of the entire lesion without any further information about the exact location of malignant and benign signals inside the lesion. It has to be assumed that the distributions of benign and malignant training examples overlap considerably if all signals of heterogeneous lesions are labelled according to the outcome of the corresponding histological examination.

• The histological examination evaluates tissue at a cellular level using features reflecting the type and configuration of individual cells. Consequently, the histological diagnosis does not necessarily reflect the diagnosis which a radiologist would have derived solely from the examination of the DCE-MRI data.

The advantage of signal labelling based on histological reports is that histological examinations are often routinely performed for cases exhibiting suspicious lesions. Consequently, the DCE-MRI sequences and the corresponding histological reports are frequently available for a number of cases which is sufficient for adapting data-driven learning algorithms.

Training Data Selection

Despite the mentioned shortcomings of a signal label based on the histological examination, labelled examples for adapting the supervised learning algorithms are sampled by utilising the breast masks, the manual lesion segmentations and the outcomes of the corresponding histological examinations. Examples for temporal kinetic signals caused by malignant tissue are selected from the DCE-MRI data of the cases $M_m$, $m = 1, \ldots, n_M$, exhibiting lesions which were histologically classified as malignant:

$$\Gamma_{\text{Malignant}} = \{(x, y)_p \mid p \in P^{\text{Lesion}}_m,\ m \in \{M_1, \ldots, M_{n_M}\},\ y = (1, 0, 0)\}.$$

Examples representing signals caused by benign tissue are, correspondingly, selected from the cases $B_m$, $m = 1, \ldots, n_B$, exhibiting lesions classified as benign:

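$$\Gamma_{\text{Benign}} = \{(x, y)_p \mid p \in P^{\text{Lesion}}_m,\ m \in \{B_1, \ldots, B_{n_B}\},\ y = (0, 0, 1)\}.$$

The set

$$\Gamma_{\text{Normal}} = \{(x, y)_p \mid p \in P^{\text{Breast}}_m \wedge p \notin P^{\text{Lesion}}_m,\ m \in \{M_1, \ldots, M_{n_M}, B_1, \ldots, B_{n_B}\},\ y = (0, 1, 0)\}$$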

of examples of normal tissue signals is selected from all positions marked by the breast mask, excluding the lesion segmentation. Subsequently, each of the three sets $\Gamma_{\text{Malignant}}$, $\Gamma_{\text{Benign}}$ and $\Gamma_{\text{Normal}}$ is divided into two subsets, one used for adapting the learning algorithm and one for selecting the algorithm's hyperparameters. The size of each subset is chosen with regard to the computational expense of the adaptation and hyperparameter selection steps and is described in more detail in sections 6.2.4 and 6.2.5.
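The selection rules above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the case dictionaries, the keys `signals`, `breast`, `lesion` and `histology`, and the functions `select_examples` and `split` are hypothetical, and the random split with a free `fraction` parameter stands in for the subset sizes that the text defers to sections 6.2.4 and 6.2.5.

```python
import random

# 1-of-n class labels as defined in the text.
LABELS = {"malignant": (1, 0, 0), "normal": (0, 1, 0), "benign": (0, 0, 1)}


def select_examples(cases):
    """Collect (x, y) pairs for the malignant, benign and normal sets.

    Each case is a dict (hypothetical layout) with
      "signals":   dict mapping voxel coordinate p -> feature tuple x
      "breast":    set of coordinates marked by the breast mask
      "lesion":    set of coordinates of the manual lesion segmentation
      "histology": "malignant" or "benign" (outcome for the whole lesion)
    """
    gamma = {"malignant": [], "normal": [], "benign": []}
    for case in cases:
        kind = case["histology"]
        if kind not in ("malignant", "benign"):
            raise ValueError("histology must be 'malignant' or 'benign'")
        signals = case["signals"]
        # All lesion voxels inherit the histological label of the lesion.
        for p in sorted(case["lesion"]):
            gamma[kind].append((signals[p], LABELS[kind]))
        # Normal examples: inside the breast mask, outside the lesion.
        for p in sorted(case["breast"] - case["lesion"]):
            gamma["normal"].append((signals[p], LABELS["normal"]))
    return gamma


def split(examples, fraction, seed=0):
    """Randomly divide examples into two disjoint subsets (assumed scheme)."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)
    cut = int(round(fraction * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]
```

Representing the masks as sets of coordinates mirrors the set notation of the definitions, so the condition for normal tissue becomes a plain set difference.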

Feature Description

Examination of temporal kinetic patterns $s$ as measured for small ROIs is a common method for characterising lesion masses. However, the categorisation of such signal time courses based on visual examination is insufficiently standardised [Szabo et al., 2003] and is therefore subjective and dependent on the radiologist's expertise. Although none of the different interpretation strategies used in the literature has evolved into a generally adopted approach, some basic features for the quantitative evaluation of temporal kinetic signals are widely used. Szabo et al. [2003] examined the value of different morphologic and kinetic features for the formulation of a systematic scoring scheme for lesion characterisation. Among others, the following temporal kinetic features were examined:

• The percentage enhancement reflecting the increase of signal value in the $n$-th postcontrast image relative to the value in the precontrast image:

$$E_n(s) = \frac{s_n - s_{\text{pre}}}{s_{\text{pre}}}$$

• The initial slope reflecting the slope of the signal uptake between the precontrast value and the signal peak:

$$\text{Slope}(s) = \frac{E_{\text{peak}}(s)}{T_{\text{peak}}(s)} \qquad (6.5)$$

with the maximum percentage enhancement $E_{\text{peak}}(s)$ and the time-to-peak $T_{\text{peak}}(s)$.

• The washout ratio reflecting the downslope from the signal's peak to its value in the $n$-th postcontrast image ($n > \text{peak}$):

$$W_{\text{peak}-n}(s) = \frac{s_{\text{peak}} - s_n}{s_{\text{peak}}} \cdot 100 \qquad (6.6)$$
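The three features listed above can be sketched for a single signal as follows. The sketch rests on several assumptions that are not stated in the text: the first sample `s[0]` is the precontrast value, the peak is the postcontrast sample with the largest signal value, the time-to-peak is measured from the precontrast acquisition time, and acquisition times are passed in as a separate sequence `times`. The percentage enhancement is computed as the formula appears here, without an additional factor of 100. All function names are hypothetical.

```python
def percentage_enhancement(s, n):
    """E_n(s) = (s_n - s_pre) / s_pre, with s[0] taken as s_pre (assumption)."""
    s_pre = s[0]
    return (s[n] - s_pre) / s_pre


def peak_index(s):
    """Index of the postcontrast sample with the maximum signal value."""
    return max(range(1, len(s)), key=lambda i: s[i])


def initial_slope(s, times):
    """Slope(s) = E_peak(s) / T_peak(s), cf. eq. (6.5).

    T_peak is taken as times[peak] - times[0] (assumption).
    """
    peak = peak_index(s)
    return percentage_enhancement(s, peak) / (times[peak] - times[0])


def washout_ratio(s, n):
    """W_{peak-n}(s) = (s_peak - s_n) / s_peak * 100, cf. eq. (6.6)."""
    peak = peak_index(s)
    if n <= peak:
        raise ValueError("n must index an image acquired after the peak")
    return (s[peak] - s[n]) / s[peak] * 100


# Hypothetical signal: precontrast value followed by three postcontrast values.
s = [100.0, 150.0, 200.0, 180.0]
times = [0.0, 1.0, 2.0, 3.0]  # acquisition times in arbitrary units
```

For this toy signal the peak lies at index 2, so the initial slope relates an enhancement of 1.0 to a time-to-peak of 2.0 time units, and the washout ratio towards the last image is about 10.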

Additional quantitative features of temporal kinetic signals have been examined by Abdolmaleki et al. [2001]. The mentioned features can be computed for signals which relate to a single voxel as well as for averaged signals of ROIs. In the latter case, the spatial variance of the different features in the ROI can also be considered, as is done in the CAD system proposed by Chen et al. [2004]. In general, the definition of features and the selection of reasonable thresholds for the discrimination of benign and malignant signals often depend on parameters of the DCE-MRI protocol such as the spatial and temporal resolution.

To emphasise the fundamental idea of a data-driven approach to tissue characterisation and to avoid explicitly defined quantitative features, two types of signal transformations

$$T : \mathcal{S} \to \mathcal{X}, \quad s \mapsto x \qquad (6.7)$$

for the computation of a feature description $x$ of the measured signal $s$ are investigated. The first feature description, referred to as the raw-feature, consists of the unprocessed signal values. The second feature, referred to as the allratio-feature, describes the temporal course of signal intensity as the ratios of intensity values at two different points in time:

$$x = \left(\frac{s_j}{s_k}\right) \qquad (6.8)$$

with $j, k = 1, \ldots, n_t$ and $k < j$. Both features have already been employed in chapter 5 for the detection of suspicious tissue masses.
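Both signal transformations can be sketched briefly. The sketch assumes that each ratio is formed as $s_j / s_k$ with $k < j$, i.e. a later value divided by an earlier one, and that the ratios are ordered by increasing $j$ and then increasing $k$. The ordering and the function names are assumptions made for illustration.

```python
def raw_feature(s):
    """raw-feature: the unprocessed signal values."""
    return tuple(s)


def allratio_feature(s):
    """allratio-feature: all ratios s_j / s_k with k < j.

    The orientation of the ratio and the ordering of the components
    (by increasing j, then increasing k) are assumptions of this sketch.
    """
    return tuple(s[j] / s[k] for j in range(len(s)) for k in range(j))


# Hypothetical signal with n_t = 3 time points.
s = [2.0, 4.0, 8.0]
x_raw = raw_feature(s)
x_allratio = allratio_feature(s)
```

For a signal with $n_t$ time points, the raw-feature has $n_t$ components, while the allratio-feature defined this way has $n_t (n_t - 1) / 2$ components.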

In document Datengetriebene Analyse dynamischer Magnet-Resonanz-Tomographie-Aufnahmen für die Brustkrebsdiagnostik (Page 109-112)