Citrus Leaf Disease Detection using SVM Classifier
D. Sruthi Dr. P. Prakash
Department of Computer Science and Engineering
Department of Computer Science and Engineering
Amrita School of Engineering, Coimbatore
Amrita School of Engineering, Coimbatore
Amrita Vishwa Vidyapeetham Amrita Vishwa Vidyapeetham
India India
Abstract
Technology is progressing, but farmers still face problems in identifying plant diseases. Crop diseases threaten farmers by reducing crop productivity. Every plant is subject to different diseases, and manual analysis of those diseases is very difficult: it calls for a large amount of work, experience with plant diseases and extensive processing time. All of these factors reduce crop productivity. Image processing combined with machine learning paves the way for automated identification of diseases. In this project, a citrus leaf disease dataset is used for identifying the diseases. K-Means segmentation is applied, and the features extracted from the segmented images are given as inputs to the classifier. Both kNN and SVM classifiers are evaluated, of which SVM achieved the better result with an accuracy of 70.31%.
Keywords: Image Processing, Machine learning, K-Means, SVM, GLCM, leaf disease detection
_______________________________________________________________________
1. Introduction
In many developing nations, agriculture is the backbone of the economy. In particular, agriculture plays an important role in India's economic growth: about 70% of the Indian population relies on agriculture for its livelihood. Harm to the harvest therefore leads to massive production loss and ultimately affects the economic growth of the country. One cause of such harm is plant disease, and detecting it early increases the quality and quantity of production. Leaves are the most sensitive part of a plant and show disease symptoms at an early stage, so it is very important to detect leaf disease early [1]. After a disease is detected, steps should be taken to prevent it from spreading to other regions of the farm. In general, farmers monitor the colour and shape of plant leaves to detect disease. This process requires long-standing expertise and a lot of regular hard work, which is almost impossible on big farms. Variations in signs, spots, colour and so on can be monitored to identify infections occurring in different parts of the plant.
The major need of the farming sector is a time-efficient, automated diagnosis method for improving crop productivity. In recent times, image processing techniques have been used to address various farming problems such as detecting diseased leaves, stems and fruit. Most work has concentrated on image processing for the estimation and identification of leaf disease.
1.1. Challenges in detecting leaf diseases
In the identification of plant diseases with image processing techniques, there are a few challenges, which are listed below:
a. Data set collection: Creating an image database is the fundamental requirement of image processing. One has to travel to various locations to capture images of plant diseases. Gathering data is challenging because different varieties of plant infection may be unavailable at some farms, and some diseases occur only in certain seasons.
b. Image background: Segmentation, in which the required part of the image is separated out, is one of the major steps. Leaf image segmentation is difficult because of other plants, leaves and green components in the background.
c. Condition for image capture: Automatic plant disease detection systems give stable and reliable results only if all pictures are taken under similar conditions. In laboratories, images can be captured under the same conditions, but in the uncontrolled environment of a farm this is difficult [6].
d. Segmentation of symptom: Most plant infection symptoms have no distinct boundaries. They fade gradually on the plant, which prevents a good segmentation and affects the final result.
e. Variations in symptom: Symptoms depend on the climate, the disease and the plant. A change in any of these factors may cause the symptoms to change, which makes it difficult to diagnose plant pathogens.
Section 2 offers information on the related research for the identification of leaf disease.
2. Literature Survey
Plant diseases can be identified in different ways. Abirami Devaraj et al. (2019) stated that farming was not just a technique but the main source of food for an ever-growing population [2]. Almost 70% of the total population of Asian nations depended on agriculture for their livelihood, yet different types of diseases reduced crop quality. Losses to farming could be prevented by efficient disease detection. The major purpose of that work was to develop a software system for the automatic detection and classification of disease. The disease detection process included several phases, and infections within plants were detected using leaf images. It was therefore advantageous to apply image processing techniques to plant disease detection and classification in the farming sector.
Sharath D M et al. (2019) stated that plant diseases had become a serious issue in farming because they caused production losses [3]. Monitoring plant health and identifying different plant infections manually was an extremely complex job that required expertise in infection detection and was very time consuming. Plant infections were therefore identified using image processing. The process included several infection detection stages, and the disease affecting the plant was assessed from the output of these stages. In that work, images of infected plants were used to discuss the technique implemented for plant infection recognition.
Vijai Singh et al. (2015) proposed a novel image segmentation algorithm, which was used to detect and classify infections in plant leaves [4]. That work reviewed various classification algorithms that could be applied to detect different leaf diseases, and used a genetic algorithm for image segmentation. There are many segmentation and classification algorithms, and the choice depends on the leaf under study. R Anand et al. (2016) proposed a novel technique for plant leaf infection detection [5]. In that work, image processing and artificial neural networks were used to detect brinjal leaf infections. The focus was on the brinjal leaf rather than the whole plant, since nearly 85-95% of infections arose on the leaf. These infections included Bacterial Wilt, Cercospora Leaf Spot and Tobacco mosaic virus (TMV). The K-means clustering algorithm was used for segmentation, and classification was done using a neural network. The leaf infections were detected efficiently by the suggested model.
Pranjali B. Padol et al. (2016) stated that infections within plant leaves were detected and classified using image processing [6]. The main aim of that work was to support the recognition and classification of grape leaf infections with an SVM classifier. First, the K-means clustering algorithm was used for segmentation to detect the infected region. Next, both colour and texture features were extracted. Finally, the different types of leaf infection were detected using the classification algorithm. The proposed system achieved an accuracy of 88.89% in the detection and classification of the diagnosed diseases. Rashmi Pawar et al. (2017) [7] reviewed different pomegranate plant disease detection techniques applied to plant leaf pictures. R. Anand et al. (2016) identified diseases on brinjal leaves using K-Means segmentation and an ANN. The mean, area, perimeter, centroid and diameter of the leaf were extracted using GLCM, and those features were fed into an ANN classifier [12].
Namrata R. Bhimte et al. (2018) used a simple image processing algorithm to detect infections in cotton leaves [8]. An SVM was used for classification, with suitable features such as colour and image texture selected. The infected region was obtained with a colour-based segmentation approach, and feature extraction was performed on the segmented picture. Different plant diseases could also be classified according to the disease category they belong to. Umut Bariş Korkut et al. (2018) used image processing and machine learning techniques to detect plant infections automatically [9]. Leaf images of different plant species were gathered in that work, and transfer learning was used to extract important features from the images. The proposed model obtained an accuracy of 94% using different machine learning techniques. The created datasets were used to train a Random Forest to classify diseased and healthy images. V. Sahithya et al. (2019) used K-means for image segmentation and PCA for selecting suitable features. SVM and ANN were compared on images with and without noise. All these techniques were applied to a lady's finger leaf dataset [11].
Section 3 outlines the phases required for identifying the diseases from the leaf.
3. Phases Involved
3.1. Phases of Disease Detection on leaf
The identification of plant diseases has the following phases:
1. Pre-processing of image: The main aim of pre-processing is to improve the image data by suppressing unwanted artefacts or enhancing image features that matter for later processing. The pre-processing stage uses a range of methods, for example adjusting image shape and size, noise filtering, image conversion and image enhancement.
2. Segmenting images: Dividing a digital picture into several segments is called image segmentation. The main motive of segmentation is to identify objects or retrieve information from images, which simplifies the task of picture inspection. Image segmentation is used to locate objects and boundaries in pictures. It assigns a label to every pixel of an image such that pixels with the same label share certain features. K-means clustering, Otsu's algorithm and thresholding are some of the techniques used for image segmentation.
3. Feature extraction: The result obtained so far is the region of interest, and this step extracts features from it. Feature extraction is the method of extracting a set of characteristic values from images; these features provide information about the picture for further processing. Features such as colour, texture, morphology and colour coherence vector are generally used to identify infection within a plant. Feature extraction can be performed with different techniques, including the grey-level co-occurrence matrix (GLCM), the colour co-occurrence method, the spatial grey-level dependence matrix and histogram-based feature extraction. The GLCM method is a statistical technique for texture classification.
4. Classification: Classification is the categorisation of the observed pattern. Supervised and unsupervised are the two main types of classification. Supervised classification needs training: the user selects sample pixels to define each class. Unsupervised classification needs no training: the results depend on the software's analysis, without sample classes. To detect plant diseases, classification algorithms such as support vector machines, neural networks, k-nearest neighbour and fuzzy logic are used. Figure 1 gives an overview of the workflow.
Figure 1. Phases of leaf disease detection
Section 4 describes the structure of the proposed system.
4. System Architecture
In this system, the original images are taken as input, and pre-processing is followed by segmentation and feature extraction. Figure 2 depicts the system architecture of the proposed method.
Figure 2. System architecture diagram. Flow: START → input the image for plant disease detection → algorithm for segmentation → algorithm for feature extraction → classification model for disease detection → analysis of performance in terms of certain parameters → STOP.
Section 5 describes the methodology of the proposed method.
5. Methodology
The phases in the given system architecture are image acquisition, image preprocessing, image segmentation, feature extraction and classification. These stages are explained below:
5.1. Image Acquisition
Citrus leaf disease detection is carried out as a classification task. The source of the dataset is [10]. The dataset contains 759 images, diseased and healthy, of which 609 are leaf images spread across the classes Black spot, Canker, Greening, Melanose and Healthy. Of these, 427 images of Canker, Greening and Healthy leaves are taken for disease classification. Each image has dimensions of 256 × 256 with 72 dpi resolution.
Figure 3 shows the diseased and healthy citrus leaves.
Figure 3. Four different citrus leaf diseases and healthy leaf.
5.2. Image Preprocessing
In this system, the input image is first resized to 256x256, which is a suitable resolution for the subsequent steps. The image contrast is then enhanced so that the diseased area of the leaf can be clearly observed. This helps the image segmentation, which segments according to colour. The output is sent to the segmentation algorithm.
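The two pre-processing operations can be sketched as follows. The text does not state which resizing or enhancement method is used, so nearest-neighbour resizing and a linear min-max contrast stretch are assumptions chosen for simplicity; the function names are illustrative, and the sketch works on a single grey channel in pure Python, whereas in practice an image library would normally be used.

```python
def resize_nearest(img, new_h, new_w):
    """Resize a 2-D list of pixel values with nearest-neighbour sampling."""
    old_h, old_w = len(img), len(img[0])
    return [
        [img[r * old_h // new_h][c * old_w // new_w] for c in range(new_w)]
        for r in range(new_h)
    ]


def stretch_contrast(img, lo=0, hi=255):
    """Linearly rescale pixel values so they span the full [lo, hi] range.

    Assumption: min-max stretching stands in for the unspecified
    contrast-enhancement step.
    """
    flat = [p for row in img for p in row]
    p_min, p_max = min(flat), max(flat)
    if p_max == p_min:  # flat image: nothing to stretch
        return [row[:] for row in img]
    scale = (hi - lo) / (p_max - p_min)
    return [[round(lo + (p - p_min) * scale) for p in row] for row in img]


# Tiny 2x2 example, upscaled to 4x4 and then contrast-stretched.
tiny = [[100, 110], [120, 150]]
resized = resize_nearest(tiny, 4, 4)
enhanced = stretch_contrast(resized)
```

For a real leaf image the same calls would be made with a target size of 256 by 256, once per colour channel.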
5.3. Image Segmentation
The pre-processed image is taken as input to the K-Means segmentation algorithm, which is used to segment the diseased part from the leaf. The segmentation is colour-based. The pre-processed image is converted to the L*a*b* colour space (L*: luminosity; a* and b*: chromaticity). The luminosity layer L* holds the brightness, the chromaticity layer a* indicates where the colour falls along the red-green axis, and the chromaticity layer b* indicates where the colour falls along the blue-yellow axis. Only the colour component a* is extracted. Each cluster contains pixels with similar colour values.
The K-Means algorithm follows a few steps:
a. The data points are randomly assigned to clusters, so that the image is split into k clusters.
b. The distance between each data point and each cluster centre is calculated using the Euclidean distance.
c. A data point that is closest to its current cluster stays in that cluster.
d. A data point that is not closest to its current cluster is moved to the nearest cluster.
e. The cluster centres are updated and the whole process is repeated until the clusters no longer change; the process then stops with those clusters.
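As an illustration of the colour conversion, the sketch below converts one 8-bit sRGB pixel to L*a*b* and keeps only the a* channel. It assumes the input is sRGB with a D65 white point, which the text does not state; the function names are illustrative.

```python
def srgb_to_lab(r, g, b):
    """Convert one 8-bit sRGB pixel to CIE L*a*b* (assumed D65 white point)."""
    def to_linear(c):
        c = c / 255.0
        return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4

    rl, gl, bl = to_linear(r), to_linear(g), to_linear(b)

    # Linear sRGB -> CIE XYZ
    x = 0.4124564 * rl + 0.3575761 * gl + 0.1804375 * bl
    y = 0.2126729 * rl + 0.7151522 * gl + 0.0721750 * bl
    z = 0.0193339 * rl + 0.1191920 * gl + 0.9503041 * bl

    def f(t):
        delta = 6 / 29
        return t ** (1 / 3) if t > delta ** 3 else t / (3 * delta ** 2) + 4 / 29

    xn, yn, zn = 0.95047, 1.0, 1.08883  # D65 reference white
    fx, fy, fz = f(x / xn), f(y / yn), f(z / zn)
    lightness = 116 * fy - 16
    a_star = 500 * (fx - fy)   # red (+) to green (-)
    b_star = 200 * (fy - fz)   # yellow (+) to blue (-)
    return lightness, a_star, b_star


def a_channel(rgb_image):
    """Return the a* value of every pixel of a 2-D list of (r, g, b) tuples."""
    return [[srgb_to_lab(*px)[1] for px in row] for row in rgb_image]


a_red = srgb_to_lab(255, 0, 0)[1]
a_green = srgb_to_lab(0, 255, 0)[1]
a_grey = srgb_to_lab(128, 128, 128)[1]
```

Reddish or brownish lesion pixels give larger a* values than healthy green tissue, which is why a single channel can already separate the two.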
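The steps above can be sketched on one-dimensional data such as a* values. This is a minimal illustration, not the exact procedure used in this work: it seeds the cluster centres from randomly chosen data values (a common variant of the random start), and the function name, the seed and the sample values are assumptions.

```python
import random


def kmeans_1d(values, k, max_iter=100, seed=0):
    """Cluster scalar values into k groups; returns (labels, centres)."""
    rng = random.Random(seed)
    # Assumption: start from k distinct data values picked at random.
    centres = rng.sample(sorted(set(values)), k)
    labels = [0] * len(values)
    for _ in range(max_iter):
        # Assign every value to its nearest centre (1-D Euclidean distance).
        new_labels = [
            min(range(k), key=lambda j: abs(v - centres[j])) for v in values
        ]
        # Move each centre to the mean of the values assigned to it.
        for j in range(k):
            members = [v for v, lab in zip(values, new_labels) if lab == j]
            if members:
                centres[j] = sum(members) / len(members)
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
    return labels, centres


# Hypothetical a* values: three greenish pixels and three reddish pixels.
pixels = [-20, -18, -19, 30, 32, 31]
labels, centres = kmeans_1d(pixels, k=2)
```

On an image, the a* channel would be flattened into such a list, clustered, and the labels reshaped back to the image size to obtain one mask per cluster.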
Figure 4 shows the segmentation of the enhanced image with the diseased region separated.
Figure 4. Input image with the segmented images using K-Means Segmentation
5.4. Feature Extraction
In this step, the Grey-Level Co-occurrence Matrix (GLCM) is used to extract texture features: Mean, Standard Deviation, Variance, Correlation, Entropy, Energy, IDM, RMS, Homogeneity, Contrast, Skewness and Kurtosis. The following features are extracted from the segmented images.
a. Mean: The average of all the pixels in the segmented image.
b. Standard Deviation: It indicates how widely the data is spread.
c. Variance: It measures the heterogeneity of the image and is the inverse of homogeneity. It is based on the sum of differences between the intensity of the central pixel and its neighbourhood. If variance increases, homogeneity decreases.
d. Correlation: If an image contains a significant amount of linear structure, the correlation is high.
e. Entropy: Entropy is a measure of information content and reflects the complexity of an image. It quantifies the randomness of the intensity distribution.
f. Energy: It measures the uniformity of the texture and detects disorder in the texture. The energy value is high when the pixel window is very orderly.
g. Inverse Difference Moment (IDM): It describes the local homogeneity of the image. The IDM value is directly proportional to homogeneity.
h. Root Mean Square (RMS): It measures the magnitude of a set of pixels and indicates the typical size of the values.
i. Homogeneity: It describes how uniform the image is. If all image pixels are the same, the homogeneity is 1. For constant energy, high contrast means low homogeneity.
j. Contrast: It gives the amount of local variation in the image.
k. Skewness: It measures the asymmetry of the data distribution about its mean.
l. Kurtosis: It describes the heaviness of the tails of the distribution, that is, the presence of outliers in the data.
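To make the co-occurrence features concrete, the sketch below builds a normalised, symmetric GLCM for one pixel offset and derives contrast, energy, homogeneity, IDM and entropy from it. The offset (one pixel to the right), the number of grey levels and the exact formulas are assumptions, since the text does not specify them; energy is taken here as the sum of squared entries, while some libraries report its square root. Mean, standard deviation, skewness and kurtosis are first-order statistics of the pixel values and are left out.

```python
import math


def glcm(img, levels, dr=0, dc=1):
    """Normalised, symmetric co-occurrence matrix for the offset (dr, dc)."""
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(img), len(img[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dr, c + dc
            if 0 <= r2 < rows and 0 <= c2 < cols:
                i, j = img[r][c], img[r2][c2]
                counts[i][j] += 1
                counts[j][i] += 1  # count the pair in both directions
    total = sum(map(sum, counts))
    return [[v / total for v in row] for row in counts]


def glcm_features(p):
    """Texture features of a normalised GLCM p (assumed formulas)."""
    n = len(p)
    cells = [(i, j, p[i][j]) for i in range(n) for j in range(n)]
    return {
        "contrast": sum(v * (i - j) ** 2 for i, j, v in cells),
        "energy": sum(v * v for _, _, v in cells),
        "homogeneity": sum(v / (1 + abs(i - j)) for i, j, v in cells),
        "idm": sum(v / (1 + (i - j) ** 2) for i, j, v in cells),
        "entropy": -sum(v * math.log2(v) for _, _, v in cells if v > 0),
    }


# Two 2x2 images with grey levels {0, 1}: one uniform, one checkerboard.
uniform = glcm_features(glcm([[1, 1], [1, 1]], levels=2))
checker = glcm_features(glcm([[0, 1], [1, 0]], levels=2))
```

The uniform image gives homogeneity 1 and contrast 0, in line with the description of homogeneity above, while the checkerboard gives the opposite extreme for this offset.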
5.5. Classification
In this method, an SVM classifier is used to classify leaf diseases. Once the features are extracted, the SVM takes them as input and classifies the leaves as Greening, Canker or Healthy. The data is split into training and testing sets in a 70:30 ratio before being sent to the SVM classifier. 10-fold cross-validation is used to reduce variance, so that the performance estimate is less sensitive to the partitioning of the data.
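The data partitioning can be sketched as follows. The sketch only produces index lists and assumes that the 10 folds are drawn from the training portion; the text does not say whether cross-validation was applied to the training portion or to the whole dataset. The function names and the random seed are illustrative.

```python
import random


def train_test_split_indices(n, train_fraction=0.7, seed=0):
    """Shuffle the indices 0..n-1 and split them into train and test lists."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    n_train = round(n * train_fraction)
    return idx[:n_train], idx[n_train:]


def k_fold_indices(indices, k=10):
    """Yield (train, validation) index lists for k-fold cross-validation."""
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        validation = folds[i]
        train = [x for j, fold in enumerate(folds) if j != i for x in fold]
        yield train, validation


# 427 images, as used for the three-class problem in this work.
train_idx, test_idx = train_test_split_indices(427)
splits = list(k_fold_indices(train_idx, k=10))
```

With 427 images this gives 299 training and 128 test samples. Each of the 10 folds serves once as the validation set while the classifier is fitted on the other nine.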
Section 6 addresses outcomes of the proposed methodology.
6. Results and analysis
In this system, the K-Means segmentation algorithm is performed on 427 images of the citrus leaf dataset with three classes: Greening, Canker and Healthy. The extracted features are then sent to the SVM and kNN classifiers for classification. Comparing the two classifiers, SVM provides better results than kNN: the accuracy of SVM is 70.31% and that of kNN is 64%.
Table 4 and Table 5 show the confusion matrices of SVM and kNN respectively.
Table 6 shows the performance metrics of the SVM model.
Table 4. Confusion matrix of SVM
            Canker   Greening   Healthy
Canker      45       16         2
Greening    8        42         6
Healthy     1        5          3
Accuracy: 70.31%

Table 5. Confusion matrix of kNN
            Canker   Greening   Healthy
Canker      31       13         2
Greening    14       48         7
Healthy     5        5          3
Accuracy: 64%

Table 6. Performance metrics of the SVM model
                    Canker   Greening   Healthy
Sensitivity         0.833    0.66       0.27
Specificity         0.75     0.78       0.94
Misclassification   24.08
Accuracy            70.31
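The per-class metrics can be recomputed from the SVM confusion matrix (Table 4) with the short sketch below. It assumes that columns hold the actual class and rows the predicted class; the tables do not state the orientation, but this reading gives sensitivity and specificity values that approximately match those in Table 6. The function name is illustrative.

```python
# SVM confusion matrix, rows and columns ordered Canker, Greening, Healthy.
svm_cm = [
    [45, 16, 2],
    [8, 42, 6],
    [1, 5, 3],
]


def per_class_metrics(cm):
    """Return (accuracy, sensitivity list, specificity list).

    Assumption: cm[predicted][actual], i.e. columns are the actual class.
    """
    n = len(cm)
    total = sum(map(sum, cm))
    accuracy = sum(cm[i][i] for i in range(n)) / total
    sensitivity, specificity = [], []
    for k in range(n):
        tp = cm[k][k]
        fn = sum(cm[r][k] for r in range(n)) - tp  # actual k, predicted other
        fp = sum(cm[k]) - tp                       # predicted k, actual other
        tn = total - tp - fn - fp
        sensitivity.append(tp / (tp + fn))
        specificity.append(tn / (tn + fp))
    return accuracy, sensitivity, specificity


accuracy, sensitivity, specificity = per_class_metrics(svm_cm)
```

The overall accuracy is 90 correct predictions out of 128 test samples, which is the reported 70.31%.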
7. Conclusion
The precise identification and diagnosis of plant disease is essential for the successful cultivation of crops, and this can be achieved by image processing and machine learning. This work uses the K-Means segmentation algorithm, and texture features are then extracted from the segmented images using the GLCM algorithm. As this is a multi-class classification problem, the extracted features are given to an SVM classifier, which classifies the diseased and healthy leaves.
SVM performed better than kNN, with an accuracy of 70.31%. In future work, more disease classes will be added to the dataset, and an ensemble classifier with Histogram of Oriented Gradients as the feature descriptor will be used.
References
1. Rekha Balodi, Sunaina Bisht, Abhijeet Ghatak and K.H. Rao, “Plant disease diagnosis: technological advancements and challenges”, Indian Phytopath. 70 (3): 275-281 (2017)
2. Abirami Devaraj, Karunya Rathan, Sarvepalli Jaahnavi, K Indira, “Identification of Plant Disease using Image Processing Technique”, International Conference on Communication and Signal Processing (ICCSP), Year: 2019, IEEE Conference Paper
3. Sharath D M, Akhilesh, S Arun Kumar, Rohan M G, Prathap C, “Image based Plant Disease Detection in Pomegranate Plant for Bacterial Blight”, International Conference on Communication and Signal Processing (ICCSP), Year: 2019, IEEE Conference Paper
4. Vijai Singh, Varsha, A K Misra, “Detection of unhealthy region of plant leaves using image processing and genetic algorithm”, International Conference on Advances in Computer Engineering and Applications, Year: 2015, IEEE Conference Paper
5. R Anand, S Veni, J Aravinth, “An application of image processing techniques for detection of diseases on brinjal leaves using k-means clustering method”, International Conference on Recent Trends in Information Technology (ICRTIT), Year: 2016, IEEE Conference Paper
6. Pranjali B. Padol, Anjali A. Yadav, “SVM classifier based grape leaf disease detection”, Conference on Advances in Signal Processing (CASP), Year: 2016, IEEE Conference Paper
7. Rashmi Pawar, Ambaji Jadhav, “Pomogranite disease detection and classification”, IEEE International Conference on Power, Control, Signals and Instrumentation Engineering (ICPCSI), Year: 2017, IEEE Conference Paper
8. Namrata R. Bhimte, V. R. Thool, “Diseases Detection of Cotton Leaf Spot Using Image Processing and SVM Classifier”, Second International Conference on Intelligent Computing and Control Systems (ICICCS), Year: 2018 IEEE Conference Paper
9. Umut Bariş Korkut, Ömer Berke Göktürk, Oktay Yildiz, “Detection of plant diseases by machine learning”, 26th Signal Processing and Communications Applications Conference (SIU), Year: 2018, IEEE Conference Paper
10. https://data.mendeley.com/datasets/3f83gxmv57/2
11. V. Sahithya, B. Saivihari, V. K. Vamsi, P. S. Reddy and K. Balamurugan, "GUI based Detection of Unhealthy Leaves using Image Processing Techniques," 2019 International Conference on Communication and Signal Processing (ICCSP)
12. R. Anand, S. Veni and J. Aravinth, "An application of image processing techniques for detection of diseases on brinjal leaves using k-means clustering method," 2016 International Conference on Recent Trends in Information Technology (ICRTIT)
Authors
D. Sruthi is pursuing her M.Tech in Computer Science and Engineering (2018-2020) at Amrita Vishwa Vidyapeetham, Coimbatore.
P. Prakash received his ME in Computer Science and Engineering from SSN College of Engineering during 2007–2009 and his PhD in Information and Communication Engineering from Anna University in 2016. He is currently an Assistant Professor in the Department of Computer Science and Engineering at Amrita University. His research interests include cloud computing and big data analysis.