• No results found

4.2 Breast Segmentation

5.1.6 Improvements in the algorithm

In order to improve the performance of the algorithm in our study case of density mammogram classification, we have added some new features as the possibility to test new parameters in each node in the moment of the tree building or adding texture features to the feature vector.

5.1.6.1 Different operation modes

First of all we thought that might be useful to add the functionality to choose between three possibilities to test in each node in the learning and testing step. In each node, a simple value is compared to a random threshold and the simple answer if the value is greater or minor that the threshold can propagate the sample until reaching the final or leaf node. The implemented and measured tests are the following ones:

1. Absolute value of the pixel corresponding to the feature used in this node (already implemented). 2. Pixel difference value between the value corresponding of the feature used and other feature chosen

randomly.

3. Possibility to compute a test according to the maximum between two pixels.

The operation mode it will be controlled by a input variable and the test done in each node will be done according to the selected mode. In the testing, it will also be done the same test in each node that in the training to be coherent with the algorithm.

5.1.6.2 Adding new features to the feature vector

Feature scale:

We increased the dimension of the feature vector in one position for adding a feature related to the scale of the extracted sub-window. Before being resized to 16x16, the size of these sub-windows is chosen randomly. So we have decided to compute the scale between the size of this sub-window with the size of the image, computed as:

f eaturescale=

subwindowdim

imagedim

This feature gives us a rate between the extracted random sub-window and the dimension of the image, giving us a measure to compute if the sub-window are comparable to the image dimension or if it is so small to give us some information about the possible classification.

Texture features:

Furthermore, we have thought that adding texture features in our algorithm might help in distinguishing between the more similar classes. Texture classification is responsible for segmenting a textured image into several regions with the same texture features. Texture features can be applied in three main different ways: statistical, structural and spectral. In statistical approaches, texture statistics such as the moments of the gray-level histogram, or statistics based on gray-level co-occurrence matrix are computed to discriminate

CHAPTER 5. METHODOLOGY FOR DENSITY CLASSIFICATION 42

different textures. For structural approaches, “texture primitive”, is used to form more complex texture pattern by grammar rules which specify the generation of texture pattern. And in spectral approaches, the textured image is transformed into frequency domain[38].

For biomedical images not all the methods are available and due to the characteristics of the pixels in the texture pattern are not similar everywhere, we thought that the statistic approach might help in the classification step. In our algorithm, there are some texture features that they are still underlying in the implementation, like the mean(average intensity). So, we have thought about which features, that they are not already implemented, and could help in distinguishing between density classes.

Our extracted features, as in the work of Seenapa H. et al.[38], are based on the statistical features and use the mathematical model for the moments to compute them. However the used expressions are different and we are not going to use the information of intensity histogram.

These texture features are added in the feature vector giving them a certain weight related to the absolute value of the pixels. We have added some positions of each extracted feature to give them more importance or less in moment of the classification. He have tested the results using 1 and 10 positions of the vector. He have tested with texture features of the random extracted sub-windows before resizing them and of the whole image. The results are presented in the table 6.12.

He have calculated these features in the moment of the random subwindow extraction and it was necessary to compute its histogram, mean and variance to compute them. The used expressions to compute the mathematical moments are:

• Moment n: µn= 1 N· N −1 X i=0 (zi− m)n

where zi is the value intensity of the pixel, m is the mean and N are the size of the subwindow (rows·cols).

• Mean m = 1 N· L−1 X i=0 zi • Variance: σ2= 1 N· L−1 X i=0 (zi− m)2 • Standard deviation: σ =σ2

• Smoothness: measure of the relative smoothness of the intensity region.

R = 1 − 1

(1 + σ2) • Skewness: measure of pdf’s symmetry.

µ3 σ3 = 1 N· 1 σL−1 X i=0 (zi− m)3

CHAPTER 5. METHODOLOGY FOR DENSITY CLASSIFICATION 43

• Kurtosis: measure of the "peakedness" of the pdf

µ4 σ4 = 1 N· 1 σL−1 X i=0 (zi− m)4

• Uniformity: measure of the uniformity of the measures based on the Coefficient of Uniformity (CU).

U = 1 − σ m

Chapter 6

Results

6.1

Tests on mini-MIAS Database

We have tested our segmentation and classification algorithm in a public well known and very referenced database, called mini-MIAS, which was explained previously, to compare our results with the state of the art.

The results of the segmentation part will be presented in two different sections, one dedicated to the results of the pre-processing step of removing labels, artifacts and background from the original image, and the other related to the ability to remove the pectoral muscle from the breast. The results of the first part will be done by a simple probability, as the number of images with the correct label removal divided by the total amount of images, and the second one will be presented subjectively observing if the result image is correct according to the amount of pectoral muscle showed in the original one.

In the other hand, the classification results are calculated, taking into account a lot of parameters and the results are presented in terms of probability, variance, computational time, number of nodes in the trees or confusion matrices.

Related documents