Niranjan Chatap
, IJRIT 723 International Journal of Research in Information Technology
(IJRIT)
www.ijrit.com ISSN 2001-5569
Automatic Blasts Counting for Acute Leukemia Cells Based on Blood Samples
Niranjan Chatap1, Sini Shibu2
1Mtech Scholar, Department of Computer Science Engineering, NIIST, RGPV University Bhopal, MP, India
2Assistant professor, Department of Computer Science Engineering, NIIST, RGPV University Bhopal, MP, India
Abstract
This paper describes a preliminary study of developing a detection of leukemia types using microscopic blood samples images. Analysing through images is very important as from images , diseases can be detected and diagnosed at earlier stage. The identification of blood disorders, it can lead to classification of certain diseases related to blood. Several classification techniques are investigated till today. One of the best methods for classification techniques is SVM .The system will focus on automatic blast count of acute leukemia cell using the SVM classifier. It is possible to count the acute leukemia cells.
Keywords- White Blood Cell, Microscopic images, Leukemia ,SVM.
1. INTRODUCTION
Leukemia is a type of blood cancer involving white blood cells. It is a bone marrow disorder that arises when abnormal white blood cell begins to continuously replicate itself [1]. These cells do not function normally which is to fight off infections. As they accumulate, they inhibit the production of other normal blood cells in the marrow, leading to anemia, bleeding, and recurrent infections. Over time, the leukemic cells spread through the bloodstream where they continue to divide, sometimes forming tumors and damaging organs such as the kidney and liver. Acute leukemia is classified according to the French- American- British (FAB) classification into two types: Acute Lymphoblastic Leukemia (ALL) and Acute Myelogenous Leukemia (AML). Acute leukemia is a rapidly progressing disease that affects mostly cells that are unformed (not yet fully developed or differentiated) [2]. ALL is most common in children while AML mainly affects adults but can occur in children and adolescents.
In 2009, it is estimated that approximately 31,490 individuals will be diagnosed with leukemia and 44,510 individuals will die of the disease in the United States [3]. In Malaysia, a total of 529 cancer cases
Niranjan Chatap
, IJRIT 724
of Myeloid Leukemia and 433 cases of Lymphatic Leukemia were reported comprising 4.5% of the total number of cancers in Peninsular Malaysia in 2003 and registered under National Cancer Registry. Out of this value, males predominated at a ratio of 1.7: 1 for Lymphatic Leukemia & 1.1: 1 for Myeloid Leukemia.
It was the 4th commonest cancer with 7.1% in males after Lung Cancer, 13.8%, Nasopharynx Cancer, 8.8%, and Colon Cancer, 7.6%. In females, it was the 7th commonest cancer with 4.0% after Breast Cancer, 31.0%, Cervix Uteri, 12.9%, Colon Cancer, 6.0%, Corpus Uteri, 4.3%, Rectum Cancer, 4.1%, and Ovary Cancer, 4.1% [4].Based on the high number of leukemia cases, the requirement for fast and cost- effective production of blood cell count reports and analyzing are of paramount importance in the healthcare industry.
The early and fast identification of the leukemia type, greatly aids in providing the appropriate treatment for the particular type. Its detection starts with a complete blood count (CBC). The patient should perform bone marrow biopsy if there are abnormalities in this count. Therefore, to confirm the presence of leukemic cells, a study of morphological bone marrow and peripheral blood slide analysis is done. In order to classify the abnormal cells in their particular types and subtype of leukemia, a hematologist will observed some cells under a light microscopy looking for the abnormalities presented in the nucleus or cytoplasm of the cells. This classification is very important in predicting the clinical behavior of the disease and the prognosis in order to determine which treatment should be given to the patient [5].
A typical blood microscope image is plotted in Fig. 1. The principal cells present in the peripheral blood are red blood cells, and the white cells (leucocytes). Leucocyte cells containing granules are called granulocytes (composed by neutrophil, basophil, eosinophil). Cells without granules are called agranulocytes (lymphocyte and monocyte). The percentage of leucocytes in human blood typically ranges between the following values: neutrophils 50-70%. Eosinophils 1-5%. basophils 0-1%. monocytes 2-10%, lymphocytes 20.45% [6].
Figure . 1. Blood’s white cells marked with colorant: basophil (b), eosinophil (e), lymphocyte (l), monocyte (m), and neutrophil (n). Arrows indicate platelets. Others elements are red cells.
Conventionally, manual counting done under the microscope was performed in order to count the white blood cells in leukemia slides. This way is troublesome and time consuming if the counting process is
Niranjan Chatap
, IJRIT 725
interrupted; it has to be started all over again. Thus, the traditional method of manual counting under the microscope affords susceptible to error procedure and put an intolerable amount of stress on the medical laboratory technicians.Although there are hardware solutions such as the Automated Hematology Counter to perform counting, certain developing countries are not capable to deploy such an expensive machine in every hospital laboratory in the country [7]. Furthermore, hematologists still carry out the manual counting based on slide of blood and bone marrow samples to confirm the case. Therefore, here we propose to provide an alternative solution to the problem where white blood cell counting can be carried out automatically at low cost by designing an automated system to count white blood cells in leukemia slides.
2. RELATED WORK
Over years numerous blood smear image segmentation methods have been proposed [12–14]. Those methods can be broadly classified as edge-based [15], region-based [16], threshold-based [17], and watershed-based [18, 19] segmentation schemes.
• Wu et al. [15] stated that cell boundaries are not sharp enough to perform edge-based segmentation in leukocyte images. An improved seeded region growing algorithm for cell segmentation was presented by Mehnert and Jackway [20]. However, determining the initial seed points is the drawback of all region-based methods.
• Few studies have been reported in the literature which employs thresholding for white blood cell (WBC) segmentation. Liao and Deng [21] introduced a gray level threshold based WBC image segmentation. Fuzzy divergence is employed by Ghosh et al. [5] for threshold estimation in leukocyte images. All threshold-based approachs are able to segment the nucleus from the background with acceptable accuracy.
• Color images are very rich source of information and regions can be segmented better in terms of color as compared to grayscale images. A two-step color image segmentation process using K- means clustering followed by EM algorithm was proposed by Sinha and Ramakrishnan [22].
Comaniciu and Meer [23] applied mean shift algorithm for color image segmentation of leukocyte images.
• Blood cell contour detection using active contour model was first presented by kass Ei al. [24].
Another variation of active contour model (snakes) was explored successfully for WBC nucleus segmentation by Ongun et al. [25]. The application of morphological operators has also been investigated for WBC background separation [26]. In recent years, clustering technique has been incorporated intelligently for leukocyte image segmentation [27].
• As a general purpose segmentation method, feature space clustering has the advantage that is straight forward for classification [12]. Drawbacks associated with standard-clustering algorithms are the predetermination of number of clusters [28] and overlapping of morphological regions that is, cytoplasm and nucleus.
• As per Kumar et al. [29] selection of color space is also a vital issue in color-based clustering.
Assumption of leukocytes as circular in shape-based methods is untenable in many cases; thus, diagnostic accuracy can drastically fall.
Niranjan Chatap
, IJRIT 726
There are several similar findings on blood-cell segmentation in the literature. It was also observed that standard segmentation methods are able to extract the WBC nucleus with acceptable level of accuracy but fails badly with cytoplasm. Cytoplasm is a decisive morphological component of blood; hence, utmost care should be taken while extracting it. To summarize the study reveals that the segmentation performances are limited by factors like smear preparation, staining, and image grabbing. So much work has to be done to meet real clinical demands. Uncertainty arises due to color pixel similarity between cytoplasm and background (plasma) region. Misclassification of color pixel is an inherent problem in standard color based clustering schemes [30]
3.RESEARCH METHODOLOGY
Support Vector Machines (SVM):The support vector machine (SVM) is a classification algorithm that provides state of the art performance in a wide variety of application domains, including handwriting recognition, object recognition, speaker identification, face detection and text categorization. Two main motivations suggest the use of SVMs in computational biology.
Figure 2 :The SVM system
The scheme for the SVM algorithm is shown in Fig.2. The detailed step is described as follows.
Step 1: The original data are pre-processed by the Wilcoxonrank sum test.
Wilcoxon rank sum test uses Wilcoxon statistical variables,and its advantage lies in its combination of good propertiesof the t test and TNOM. The statistics formula is asbelow:
S (g) =ƩƩI ((Xj(g)- Xi (g)) ≤0) (1) i∈N0j∈N1
Here I is the distinguishing function whose value is 1 if thelogic expression is true, 0 if otherwise. Xi
(g)is the expressionvalue of sample i in gene g, N0 and N1 represent samplequantities belonging to different class, s (g) can represent themeasurement of the expression difference of one gene in twoclasses. Owing that when the value reaches 0 or reach themaximum N0 ×N1, the corresponding gene is more importantto classification, we use the below expression to weighthe importance of the gene.
q (g) = max(s(g), N1 ×N0 −s(g)) (2) Wilcoxon
Preprocess
Original Data Population Evaluation of
the Fitness
Final Gene Subset
Analysis of appearing frequency of each
gene
Important gene Subsets
Classifier
Terminationc ondition
Yes No
Niranjan Chatap
, IJRIT 727
Each gene is evaluated and sorted according to Eq. 2, and thefirst P genes are selected to form new subset.
Step 2: The different gene subsets are obtained by usingSVM on different training sets.
Ifthe training set has N samples, the LOOCV(leave-one-outcross validation) will results in Ntraining subsets with N −1 samples and N correspondingtest subsets with one samples. Using this scheme, we can getN highly informative gene subsets.
After this step, the N important gene subsets are obtained.
Step 3: The final high informative genes are selected andtested classification performance by a SVM classifier.
The frequency of appearance for each gene in N genesubsets are all calculated, and the higher the frequency, themore important the gene. According to this approach, the firstK genes appearing most frequently areselected as the finalselected genes.
The system we propose firstly individuates in the blood image the leucocytes from the others blood cells, then it extract the lymphocyte cells (the ones interested by acute leukemia), it extracts morphological indexes from those cells and finally it classifies the presence of the leukemia.
The main modules which compose the overall system are plotted in Figure 3.
• The Single-cell Selector module firstly enhances the input image and identifies the single cells. It has been composed by adaptive pre filtering and segmentation algorithms.
• Secondly, the White-cells Identifier module selects the white cells present into the image by separating them from others blood’s components (red cells and platelets).
• The third module (the Lymphocyte Identifier) can recognize a lymphocyte with respect to the other selected white cells.
The system we propose in this paper is the sub-system which has to recognize if a lymphocyte is blast or normal, and it is composed by the modules inside the dashed line in Figure 3: the Feature Extraction module and the Classification module.
• The Feature Extraction module processes a sub-image containing a lymphocyte coming from the Lymphocyte Identifier module and it produces in output a set of morphological indexes.
• The classification module processes those indexes in order to classify the cell as blast or normal.
If the system finds a blast cell, the blast cell counter is increased; otherwise a new lymphocyte will be processed.
Niranjan Chatap
, IJRIT 728
Figure 3. The structure of modules composing the acute leukemia classification system.
The classification of the lymphocyte sub-image is quite complicated since even an expert operator can have dubs in classifying some lymphocyte cells. Actually, the morphological distinctive aspects of blast and normal lymphocytes are very smooth.
The features we want to extract from the cell are mostly the features that the operators qualitatively observe in the blood film to classify the cell as blast or normal. The most common leukemia classification is the FAB method [8]; nowadays it has been updated with the immunologic classification [9] which it is not image-based. Otherwise, the FAB method is still valid for image-based morphological classification.
Concerning the ALL, the FAB classification is three-partitioned as follow:
• L1: Blasts are small and homogeneous. The nuclei are round and regular with little clefting and inconspicuous nucleoli. Cytoplasm is scanty and usually without vacuoles.
• L2: blasts are large and heterogeneous. The nuclei are irregular and often clefted. One or more, usually large nucleoli are present. The volume of cytoplasm is variable, but often abundant and may contain vacuoles.
• L3: blasts are moderate-large in size and homogeneous. The nuclei are regular and round-oval in shape. One or more prominent nucleoli are present. The volume of cytoplasm is moderate and contains prominent vacuoles.
Figure 4 shows the great variability in shape and pattern of the blast cells according to the FAB classification. Our goal is to detect without differentiation the presence of all three types of blasts in the blood film.
Niranjan Chatap
, IJRIT 729
Figure 4. Morphological variability associated to the blast cells according to the FAB classification
4. DATABASE
The ALL-IDB image datasets (Acute Lymphoblastic leukaemia Image Database) [31] is provided by Fabio Scotti to test andfairly compare algorithms for cell segmentation and classification of the ALL infection. There are two types of datasets are available. The ALL-IDB1 can be used both for testing segmentation capability of algorithms, as well as the classification systems and image pre-processing methods and ALL-IDB2 has segmented WBCs to test the classification of blast cells. The examples of ALL- IDB1dataset images are shown in below figure
Figure 5:
Healthy blood images examples (a), blood with ALL blasts (b). (a1-2) and (b1-4) are zoomed subplots ofthe (a) and (b) images centered on lymphocytes andlymphoblast respectively.
5. CONCLUSION
the development of a fully automated screening system prototype for blood cell segmentation and classification may provide the specialist with significant aid in the effort to detect and classify Luckemia cells more effectively and efficiently. The SVM algorithm can handle this case by using tradeoff (cost) parameter and penalty parameters (slack variables). The real power of SVM is to map the data (implicitly) to a higher dimensional space via a kernel functionand then identify themaximum-margin hyperplane that separates training instances The proposed method will detect the cell is blast or normal .The counting of cell is done if the blast cell is detected. it will helpful to not only diagnose the disease but also for health care student.
References
[1] Leukemia Disease, http://en.wikipedia.org/wiki/Leukemia, retrieved on 1 June 2011.
Niranjan Chatap
, IJRIT 730
[2] S. Rajendran, “Image Acquisition and Retrieval Systems For Leukaemia Cells” Master’s thesis, Department of Computer Science, UM, 2007.
[3] A. Jemal, R. Siegel, E. Ward, Y. Hao, J. Xu, and M. J. Thun, “Cancer statistics, 2009,” CA: A Cancer Journal for Clinicians, 2009.
[4] National Cancer Registry, Malaysia (2003). Second Report of the National Cancer Registry: Cancer Incidence in Malaysia 2003.
[5] C.Reta, L.Alamirano, J.A.Gonzales, R. Diaz,and J.S.Guichard, “ Segmentation of Bone Marrow for Morphological Classification of Acute Leukemia,” in Proceding of the Twenty-Third International Florida Artificial Intelligence Research Society Conference (FLAIRS 2010).
[6] K. B. Taylor and J. B. Schorr, “Blood vol.4,” 1978.
[7] G.P.M Priyankara, O.W Seneviratne, R.K.O.H Silva, W.V.D Soysa and C.R. De Silva, “An Extensible Computer Vision Application for Blood Cell Recognition and Analysis”, 2006.
[8] J.M. Bennett et al. “Proposals for the classification of the acute leukaemias. French-American-British (FAB) co-operative group”, Br J Haematol.; Vol. 4, No. 33, pp. 451-458, August, 1976
[9] A. Biondi, G. Cimino, R. Pieters, C. H. Pui, “Biological and therapeutic aspects of infant leukemia”, Blood, Vol. 96, No. 1, pp. 24-33, July, 2000
[10] Ruggero Donida Labati, Vincenzo Piuri, Fabio Scotti, “All-Idb: The Acute Lymphoblastic Leukemia Image Database For Image Processing”, IEEE 2011
[11] A. Sinha and A. Ramakrishnan, “Automation of differential blood count,” in Proceedings of the Conference on Convergent Technologies for Asia-Pacific Region, pp. 547–551, 547–551, 2003.
[12] R. Adollah, M. Mashor, N. M. Nasir, H. Rosline, H. Mahsin, and H. Adilah, “Blood cell image segmentation: a review,” in Proceedings of the 4th Kuala Lumpur International Conference on Biomedical Engineering, N. A. Osman, F. Ibrahim, W. W. Abas, H. A. RahmanTing, and H. Ting, Eds., vol. 21, pp.
141– 144, Springer, 2008.
[13] C. Di Rubeto, A. Dempster, S. Khan, and B. Jarra, “Segmentation of blood images using morphological operators,” in Proceedings of the 15th International Conference on Pattern Recognition, vol. 3, pp. 397–400, 2000.
[14] D. Anoraganingrum, “Cell segmentation with median filter and mathematical morphology operation,”
in Proceedings of the International Conference on Image Analysis and Processing, pp. 1043–1046, 1999.
[15] J. Wu, P. Zeng, Y. Zhou, and C. Olivier, “A novel color image segmentationmethod and its application to white blood cell image analysis,” in Proceedings of the 8th International Conference on Signal Processing, vol. 2, 2006.
[16] T. Mouroutis, S. J. Roberts, and A. A. Bharath, “Robust cell nuclei segmentation using statistical modelling,” Bioimaging, vol. 6, no. 2, pp. 79–91, 1998.
[17] H. S. Wu, J. Barba, and J. Gil, “Iterative thresholding for segmentation of cells from noisy images,”
Journal of Microscopy, vol. 197, no. 3, pp. 296–304, 2000.
Niranjan Chatap
, IJRIT 731
[18] G. Lin, U. Adiga, K. Olson, J. F. Guzowski, C. A. Barnes, and B. Roysam, “A hybrid 3D watershed algorithm incorporating gradient cues and object models for automatic segmentation
of nuclei in confocal image stacks,” Cytometry Part A, vol. 56, no. 1, pp. 23–36, 2003.
[19] M. Ghosh, D. Das, S.Mandal et al., “Statistical pattern analysis of white blood cell nuclei morphometry,” in Proceedings of the IEEE Students’ Technology Symposium (TechSym ’10), pp. 59– 66, April 2010.
[20] A. Mehnert and P. Jackway, “An improved seeded region growing algorithm,” Pattern Recognition Letters, vol. 18, no. 10, pp. 1065–1071, 1997.
[21] Q. Liao and Y. Deng, “An accurate segmentation method for white blood cell images,” in Proceedings of the IEEE International Symposium on Biomedical Imaging, pp. 245–258, 2002.
[22] A. Sinha and A. Ramakrishnan, “Automation of differential blood count,” in Proceedings of the Conference on Convergent Technologies for Asia-Pacific Region, pp. 547–551, 547–551, 2003.
[23] D. Comaniciu and P. Meer, Cell Image Segmentation for Diagnostic Pathology, Springer, New York, NY, USA, 2001.
[24] M. Kass, A. Witkins, and D. Terzopoulos, “Snakes: active contour models,” in Proceedings of the 1st International Conference on Computer Vision, pp. 259–268, 1987.
[25] G. Ongun, U. Halici, K. Leblebicioˇglu, V. Atalay, S. Beksac, and M. Beksac¸, “Automated contour detection in blood cell images by an efficient snake algorithm,” Nonlinear Analysis, Theory, Methods and Applications, vol. 47, no. 9, pp. 5839–5847, 2001.
[26] C. Di Ruberto, A. Dempster, S. Khan, and B. Jarra, “Analysis of infected blood cell images using morphological operators,” Image and Vision Computing, vol. 20, no. 2, pp. 133–146, 2002.
[27] G. Ongun, U. Halici, K. Leblebicioglu, V. Atalay, M. Beksac, and S. Beksak, “A modified fuzzy clustering for white blood cell segmentation,” in Proceedings of the 3rd International Symposium on Biomedical Engineering, pp. 356–359, 2008.
[28] K. Jiang, Qing-Min, and S.-Y. Dai, “Red blood cell segmentation scheme utilizing various image segmentation techniques,” in Proceedings of the 2nd International Conference on Machine Learning and Cybernetics, 2003.
[29] B. R. Kumar, D. K. Joseph, and T. Sreenivas, “Teager energy based blood cell segmentation,” in Proceedings of the 14th International Conference on Digital Signal Processing, Bangalore, India, 2002.
[30] S. Mohapatra and D. Patra, “Automated cell nucleus segmentation and acute leukemia detection in blood microscopic images,” in Proceedings of the International Conference on Systems inMedicine and Biology (ICSMB ’10), pp. 49–54, 2010.
[31] http://www.dti.unimi.it/fscotti/all
[32] Ms.Minal D. Joshi, Prof. Atul H. Karode, Prof. S.R.Suralka “White Blood Cells Segmentation and Classification to Detect Acute Leukemia” International Journal of Emerging Trends & Technology in Computer Science (IJETTCS)Volume 2, Issue 3, May – June 2013.