
2.6 Peripheral blood digital cell image processing: state of the art

2.6.4 Feature extraction

In the feature extraction step, the characteristics of the object are obtained through quantitative measures. In the lymphoid cell recognition problem, these features are calculated for the entire cell, the cytoplasm, the nucleus and the region around the cell. They can represent qualitative morphologic features usually employed by the hematologist [3] or abstract quantitative parameters [35]. One of the main objectives of this thesis is to analyze PB lymphoid cells of patients with lymphoid neoplasms, and the methods used to describe normal WBCs and the blood cells present in leukemia can also be relevant to the DIP process. The state of the art is therefore divided into three parts, covering feature extraction of normal WBCs, of cells from leukemias and of neoplastic lymphoid cells, respectively.


2.6.4.1 Feature extraction of normal WBCs

To recognize WBCs, Ongun et al. [46, 61] calculate 57 features grouped mainly into two categories: shape-based features (moments and affine invariants, length of the cell boundary, curvature and boundary energy) and color/texture-based features (mean and standard deviation for the cell, cytoplasm and nucleus in the L*a*b* color space, and HSV color histogram features). Sinha and Ramakrishnan [34] present a methodology for automatically differentiating WBCs that extracts several parameters: shape features (eccentricity of the nucleus and cytoplasm, nuclear compactness, area ratio and number of nucleus lobes), color features (means of each component of the RGB color space) and texture features computed from the gray level co-occurrence matrix (GLCM) and the autocorrelation matrix (energy, entropy and correlation for the GLCM, and coarseness and busyness for the autocorrelation matrix). Sanei and Lee [62] use principal component analysis (PCA) in a way similar to face recognition, but extended to the YIQ color space (Y is the luma, I and Q are the chrominance components), to obtain the eigencells; after a linear transformation, these are used as features to describe WBCs. In order to describe WBCs, Piuri and Scotti [27, 39] extract various geometric features for the nucleus and cytoplasm, such as area, perimeter, convex area, solidity, major axis length, orientation, filled area, eccentricity, rectangularity, circularity and number of nucleus lobes. The best features are then selected by a technique named forward selection. Ramoser et al. [50] propose the extraction of 18 color statistical features for the nucleus and cytoplasm (mean, standard deviation and skewness of each component of the HSV color space), five nucleus shape features (convexity, principal axes ratio, compactness, circular variance and elliptical variance) and three geometric features (sizes of the nucleus and cytoplasm, and number of detected nucleus regions) to obtain information about the WBCs. Pan et al. [63] employ image-based features rather than concrete features: the RGB color histogram of the whole cell and the intensity histograms of the nucleus and the cytoplasm are combined into a single feature vector, which is then reduced by kernel PCA to give a quantitative characterization of blood and bone marrow cells. Rodrigues et al. [64] use shape descriptors based on spatial moments with invariance to translation and rotation (row and column moments of inertia, aspect, spread and Hu descriptors), texture features (mean, standard deviation, skewness, kurtosis, first and second neighbor contrasts) and some geometric features (mean, area and perimeter of the cytoplasm and nucleus, and circularity) to recognize WBCs.
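As an illustration of the GLCM texture descriptors mentioned above (energy, entropy and correlation), the following is a minimal sketch and not the implementation of any of the cited works. It assumes an image already quantized to a small number of gray levels and stored as a list of lists, a single pixel offset, and a symmetric matrix; the function names `glcm` and `glcm_features` are hypothetical. Energy is taken here as the sum of squared matrix entries (the angular second moment); some libraries report its square root instead.

```python
import math


def glcm(image, levels, dx=1, dy=0, symmetric=True):
    """Normalized gray level co-occurrence matrix for one offset (dx, dy).

    `image` is a list of rows holding integer gray levels in [0, levels).
    """
    counts = [[0.0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                i, j = image[r][c], image[r2][c2]
                counts[i][j] += 1
                if symmetric:
                    counts[j][i] += 1
    total = sum(map(sum, counts))
    if total == 0:
        raise ValueError("no pixel pairs exist for this offset")
    return [[v / total for v in row] for row in counts]


def glcm_features(p):
    """Energy, entropy and correlation of a normalized GLCM `p`."""
    n = len(p)
    cells = [(i, j, p[i][j]) for i in range(n) for j in range(n)]
    energy = sum(v * v for _, _, v in cells)
    entropy = -sum(v * math.log2(v) for _, _, v in cells if v > 0)
    mu_i = sum(i * v for i, _, v in cells)
    mu_j = sum(j * v for _, j, v in cells)
    var_i = sum((i - mu_i) ** 2 * v for i, _, v in cells)
    var_j = sum((j - mu_j) ** 2 * v for _, j, v in cells)
    if var_i > 0 and var_j > 0:
        cov = sum((i - mu_i) * (j - mu_j) * v for i, j, v in cells)
        correlation = cov / math.sqrt(var_i * var_j)
    else:
        correlation = math.nan  # undefined for a constant image
    return {"energy": energy, "entropy": entropy, "correlation": correlation}


# Toy 4-level patch standing in for a quantized nucleus region (made-up data).
patch = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [2, 2, 3, 3],
    [2, 2, 3, 3],
]
features = glcm_features(glcm(patch, levels=4))
```

In practice the matrix is usually computed for several offsets and directions and the resulting descriptors are averaged or concatenated; the single horizontal offset used here only keeps the sketch short.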

2.6.4.2 Feature extraction of blast cells from acute leukemias

Markiewicz et al. [60, 65] and Siroic et al. [66] present the extraction of four groups of features of WBCs from bone marrow smears of patients suffering from acute leukemia: texture

State of the Art of Digital Blood Cell Image Processing

(applied to the three RGB color components of the nucleus and cytoplasm), geometric (of the cell), statistical (color distribution of the cell image) and morphologic (mathematical morphology operations); the most relevant features are then selected by a genetic algorithm or by a Support Vector Machine (SVM) based feature selection. Mohapatra et al. [51–53] calculate various nuclear features to detect acute lymphoblastic leukemia from PB smear images: perimeter roughness (by fractal geometry), contour signatures, shape features (area, perimeter, compactness, solidity, eccentricity, elongation, form factor), color features (means of each component of the RGB and HSV color spaces) and four second-order statistical features based on the GLCM of a grayscale version of the image.

After a segmentation process, González et al. [54] extract several features corresponding to the geometry (perimeter, area, major and minor axis, orientation, Euler number, among others) and the texture (gray threshold of the segmentation, sum of the histogram, maximum and minimum of the histogram, mean, standard deviation and variance), together with another type of feature obtained from PCA of the bone marrow cells, with the purpose of identifying possible leukemias. In order to differentiate between normal lymphocytes and abnormal lymphoblast cells, Madhloom et al. [67] obtain shape features (area, eccentricity, perimeter, circularity, elliptical features of the nucleus, cell area and ratio of the nucleus to the whole cell) and texture features (first- and second-order statistical features); subsequently, Fisher’s discrimination is used to select the best and uncorrelated features. Aimi et al. [68] obtain different quantitative parameters to describe WBCs from patients with acute leukemia: size-based features (cell area, nucleus area, cytoplasm area, nucleus-cytoplasm ratio, cell perimeter and nucleus perimeter), shape-based features (roundness, compactness, central moment and affine invariant moment of the nucleus) and color-based features (mean and standard deviation of the intensity and of the RGB color space for the nucleus and cytoplasm).
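Several of the size and shape descriptors listed above (area, perimeter, circularity, eccentricity and the ratio of the nucleus to the whole cell) can be computed directly from the binary masks produced by segmentation. The sketch below is only an illustration under stated assumptions: masks are lists of rows of 0/1 values, the perimeter is approximated by counting exposed pixel edges (which overestimates the length of curved contours), circularity is taken as 4πA/P², and eccentricity is derived from the second-order central moments of the region. The function names are hypothetical and do not come from the cited works.

```python
import math


def area(mask):
    """Number of foreground pixels."""
    return sum(1 for row in mask for v in row if v)


def perimeter(mask):
    """Count of pixel edges separating foreground from background."""
    rows, cols = len(mask), len(mask[0])
    edges = 0
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c]:
                continue
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                rr, cc = r + dr, c + dc
                inside = 0 <= rr < rows and 0 <= cc < cols
                if not inside or not mask[rr][cc]:
                    edges += 1
    return edges


def eccentricity(mask):
    """Eccentricity of the ellipse with the same second moments as the region."""
    pts = [(r, c) for r, row in enumerate(mask) for c, v in enumerate(row) if v]
    n = len(pts)
    mr = sum(r for r, _ in pts) / n
    mc = sum(c for _, c in pts) / n
    mu_rr = sum((r - mr) ** 2 for r, _ in pts) / n
    mu_cc = sum((c - mc) ** 2 for _, c in pts) / n
    mu_rc = sum((r - mr) * (c - mc) for r, c in pts) / n
    spread = math.sqrt((mu_rr - mu_cc) ** 2 + 4 * mu_rc ** 2)
    lam_major = (mu_rr + mu_cc + spread) / 2
    lam_minor = (mu_rr + mu_cc - spread) / 2
    if lam_major == 0:
        return 0.0  # single-pixel region
    return math.sqrt(max(0.0, 1 - lam_minor / lam_major))


def shape_features(cell_mask, nucleus_mask):
    """Small dictionary of size and shape descriptors for one cell."""
    a_cell, a_nuc = area(cell_mask), area(nucleus_mask)
    if a_cell == 0 or a_nuc == 0:
        raise ValueError("cell and nucleus masks must be non-empty")
    p_nuc = perimeter(nucleus_mask)
    return {
        "cell_area": a_cell,
        "nucleus_area": a_nuc,
        "nucleus_perimeter": p_nuc,
        "nucleus_circularity": 4 * math.pi * a_nuc / p_nuc ** 2,
        "nucleus_eccentricity": eccentricity(nucleus_mask),
        "nucleus_cell_ratio": a_nuc / a_cell,
    }
```

The ratio is defined here as nucleus area over whole-cell area; a nucleus-cytoplasm ratio would instead divide by the cytoplasm area (cell area minus nucleus area).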

2.6.4.3 Feature extraction of neoplastic lymphoid cells

Comaniciu et al. [49] and Foran et al. [21] extract the following features of the nucleus: area, shape (elliptical Fourier descriptors) and texture features based on a multiresolution simultaneous autoregressive model, within the development of an image-guided decision support system for pathology to characterize neoplastic and normal lymphoid cells. Benattar et al. [69] propose a scoring system for lymphocytes in B-cell neoplasms using various morphometric parameters: nuclear shape, cell shape, cell area, nucleus-cytoplasm ratio, nuclear red/blue ratio, cytoplasmic green/blue ratio and the proportion of cells with nucleoli. Angulo et al. [58, 59] extract several quantitative parameters from the lymphocytes to define some qualitative morphologic features: nuclear and cell sizes, nucleus-cytoplasm ratio, nuclear excentration, chromatin density (texture) by granulometric curves, regular and irregular nuclear shapes through some simple parameters (form factor, circularity, eccentricity) and a specific analysis of the nuclear lobes and other irregularities, number of large or medium nucleoli, cytoplasmic basophilia (mean of each color component of the L*a*b* color space), cytoplasmic granulations, and cytoplasmic shape by binary granulometry. Ushizima et al. [70] study the leukocyte recognition problem by calculating shape and size features (perimeter, area, circularity, bending energy, nucleus-cytoplasm ratio, etc.) and texture features based on the GLCM, applied with different block sizes over a grayscale version of the cell image. Afterwards, they apply feature selection by exhaustive search and by heuristic search with forward sequential selection and backward sequential selection. In a subsequent work, Ushizima et al. [71, 72] extend the method by applying the texture features to the components of the RGB color space and adding some statistical features; the feature selection method is then applied again to choose the most important and independent features. Jahanmehr et al. [73] analyze quantitative and qualitative cytological parameters of lymphocytes from B-cell neoplasms (CLL, MCL and B-PL). In particular, B-cell area, B-cell diameter, cytoplasm area, nuclear area, nuclear/cell ratio and nuclear density are evaluated, and the study shows that these features can be useful to differentiate the lymphoid neoplasms. For the purpose of describing neoplastic lymphoid cells and blast cells, Tuzel et al. [74, 75] represent each cell by its texture structure, using textons inside both the nucleus and the cytoplasm to build two texton histograms.
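Forward sequential selection, mentioned above as one of the heuristic search strategies, can be sketched as a greedy loop that adds one feature at a time while a chosen criterion keeps improving. The example below is an assumption-laden illustration: the criterion (training accuracy of a nearest-centroid rule) and the names `nearest_centroid_accuracy` and `forward_selection` are hypothetical choices made for brevity, and the cited works may rely on different criteria and classifiers.

```python
def nearest_centroid_accuracy(X, y, features):
    """Training accuracy of a nearest-centroid rule using only `features`.

    X is a list of feature vectors, y the class labels, and `features`
    the list of column indices to use. This criterion is an assumption
    made for illustration only.
    """
    classes = sorted(set(y))
    centroids = {}
    for k in classes:
        rows = [x for x, label in zip(X, y) if label == k]
        centroids[k] = [sum(r[f] for r in rows) / len(rows) for f in features]
    correct = 0
    for x, label in zip(X, y):
        dist = {
            k: sum((x[f] - m) ** 2 for f, m in zip(features, centroids[k]))
            for k in classes
        }
        if min(classes, key=dist.get) == label:
            correct += 1
    return correct / len(y)


def forward_selection(X, y, score, max_features=None):
    """Greedy forward sequential selection.

    Starting from an empty set, repeatedly add the feature whose
    inclusion gives the highest score, and stop when no candidate
    improves on the current best score.
    """
    n_features = len(X[0])
    limit = n_features if max_features is None else min(n_features, max_features)
    selected, best_score = [], float("-inf")
    while len(selected) < limit:
        candidates = [f for f in range(n_features) if f not in selected]
        trials = [(score(X, y, selected + [f]), f) for f in candidates]
        # Highest score wins; ties are broken by the lowest feature index.
        top_score, top_feature = max(trials, key=lambda t: (t[0], -t[1]))
        if top_score <= best_score:
            break
        selected.append(top_feature)
        best_score = top_score
    return selected, best_score


# Made-up data: column 1 separates the two classes, columns 0 and 2 do not.
X_demo = [[5, 0.1, 1.0], [1, 0.2, 2.0], [4, 0.9, 1.5], [2, 1.0, 3.0]]
y_demo = [0, 0, 1, 1]
chosen, chosen_score = forward_selection(X_demo, y_demo, nearest_centroid_accuracy)
```

Backward sequential selection follows the same pattern in reverse: it starts from the full feature set and removes, at each step, the feature whose removal degrades the criterion the least. A criterion estimated on held-out data or by cross-validation would normally replace the training accuracy used here.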