Fruit Recognition Using Deep Convolutional Neural Network With Color Feature


Dr.E.Gothai, Dr.R.Thamilselvan, Dr.R.R.Rajalaxmi, Dr.P.Natesan

Abstract: Deep learning is a major branch of the machine learning family. It learns data representations rather than relying on task-specific algorithms, as traditional methods do. Deep learning architectures take several forms, and the learning may be supervised, unsupervised or semi-supervised. The most common form is the deep neural network; others include deep belief networks and recurrent networks. These architectures have been applied to images, audio and video, where they have produced results comparable to those of human experts. This paper considers a new dataset of fruit images. Image quality is very important for obtaining a good classifier: many datasets contain built-in noise, and noisy images can lead to incorrect identification and classification of the objects of interest. Our objective is to train a network to recognize fruits, a goal that is relevant to companies currently working in augmented reality. We describe the training and testing data and the performance obtained. Finally, we suggest a few ways to improve the accuracy.

Keywords: Augmented reality, Deep Convolutional Neural network, Feature Recognition

————————————————————

I. INTRODUCTION

Deep learning works in a way similar to the human brain: it processes data and creates patterns that are used to make decisions. Deep learning can learn from data that is unstructured or unlabeled. It plays a vital role in the digital era, which has brought about an explosion of data in all forms and from every region of the world. This rapidly growing body of data, known as big data, comes from sources such as social media applications, search engines, online cinemas, e-commerce portals and many other applications. This enormous amount of data is readily accessible and is shared through cloud computing. Because the data arrives in unstructured form and is huge in volume, it would take humans many years to extract information from it. Machine learning algorithms are therefore used to process big data; they produce useful analysis reports and suggest meaningful patterns. A company that uses digital payments, for example, is well placed to apply machine learning algorithms, and deep learning algorithms can carry out the required processing steps. Traditional programs build their analysis in a linear way, whereas deep learning algorithms follow a non-linear approach. Earlier methods used only a few parameters, such as the transaction amount, to identify fraud. A deep learning method, in contrast, uses nearly all available parameters for analysis, including the time, the location, IP details, the retailer's details and any other information that may point to fraudulent activity.

Deep learning is particularly successful in areas where the inputs, and even the outputs, are analog: they may be images in pixel form, text or audio. Convolution neural networks are now used in many areas because they scale well and can be trained using backpropagation. Deep learning is composed of computational models that contain multiple processing layers. The processing done in these layers helps the model learn representations of the data at several levels of abstraction. Deep learning is used to build and train models that support decision-making. By contrast, most other modern machine learning algorithms are said to be "shallow".

II. CONVOLUTION NEURAL NETWORK

The Convolution Neural Network (CNN) is a kind of artificial neural network that is useful for image identification, classification, feature extraction and data processing. Artificial neural networks use deep learning to perform generative and descriptive tasks, often in computer vision applications that involve image recognition. A CNN resembles a multi-layer perceptron designed to reduce processing requirements. A CNN consists of an input layer, an output layer and a set of hidden layers. The hidden layers include multiple convolution layers, pooling layers and fully connected layers. By reducing the processing requirements, a CNN handles images more efficiently than the methods used previously. The convolution layer is the core building block of the convolution neural network. It relies on two properties: parameter sharing, in which the same filter weights are reused across the image, and local connectivity, in which each neuron is connected only to a part of the given image. Both reduce the number of parameters in the system, which in turn makes the computation very efficient.

Machine vision combined with machine learning is currently one of the hottest topics in automatic fruit detection. An automated system identifies fruit species and counts the number of leaves, and it improves accuracy. For the system to be robust, images must be taken that cover the natural variation, including environmental conditions such as lighting and soil type as well as the development stage of the plant. The automated system counts the leaves of fruit plants before fruit management techniques are applied. A major issue is that leaves overlap and are occluded by the plant. Hidden leaves must still be counted to assess the growth of the plant so that the correct treatment can be given. Manual counting of leaves is difficult even for experts in the area. In an automatic system, leaves are counted using computer vision, typically on binary images: the plant is segmented from the background and the leaves are then counted. However, segmentation-based approaches again fail to identify overlapping leaves. Giuffrida introduces a method that counts leaves by converting the images from RGB space to log-polar space. Properties of the log-polar images are extracted and a vector regression method is used to count the leaves. The drawback of this approach is that it relies on segmented images in both the training phase and the final model, which makes it difficult to implement. In recent years, CNNs have shown significant achievements in computer vision and machine learning because of their ability to extract effective features. CNNs are used to solve problems such as classification of plant species, detection of fruits, classification of pests, and detection and diagnosis of plant diseases.

Ren and co-authors discuss the use of recursive neural networks for this task. Their approach also requires training images in which the leaves are fully segmented at the instance level, and it takes minutes per plant to do this correctly. A related approach uses encoders and decoders on the same dataset with a convolutional neural network, which uses regression to count the leaves. The method considered is based on a CNN that is trained on image maps for nine different classes. A deep convolutional network shares weights among the nodes between layers. This makes it easier to classify images taken in uncontrolled conditions and gives a lower error rate than other classifiers. The CNN consists of many layers with sampling filters, which are used for feature extraction, followed by fully connected layers.

Dr. E. Gothai*, Department of CSE, Kongu Engineering College, Perundurai, India.
Dr. R. Thamilselvan, Department of IT, Kongu Engineering College, Perundurai, India.
Dr. R. R. Rajalaxmi, Department of CSE, Kongu Engineering College, Perundurai, India.

III. LITERATURE REVIEW

Research on CNNs for fruit detection has expanded rapidly in recent times, and researchers use many different terms to describe models that combine different algorithms. Horea Mureșan and Mihai Oltean of Babeș-Bolyai University introduce a new dataset of fruit images and present the results of training a neural network to identify the fruits. Their work is part of a more complex project whose goal is a classifier that can identify an array of objects from images, starting with fruit recognition. They first describe the Fruits-360 dataset, including how it was created and what it contains. They then present the framework used, TensorFlow, summarize the training and testing data, and discuss the performance of the system. S. Arivazhagan et al. propose an improved method of fruit recognition in which a minimum distance classifier is used for recognition; experimental results are reported for different fruit classes. Woo Chaw Seng proposes a new fruit detection system. Even when many fruit images have similar color and shape values, the proposed method remains effective enough to identify and distinguish them. The system then shows the fruit name and a short description to the user.

IV. FEATURE EXTRACTION

A CNN is a class of deep, feed-forward neural networks used to analyze visual imagery. Convolutional networks were inspired by biological processes, in that the connectivity pattern among the neurons resembles the organization of the animal visual cortex. A CNN consists of an input layer, an output layer and, between them, the hidden layers. The hidden layers consist of convolution layers, pooling layers and ReLU layers. Finally, there is a fully connected layer together with normalization and loss function layers. The convolution layer is the core building block of a CNN. Its parameters are a set of filters. Each filter is convolved across the width and the height of the input, computing the dot product between the entries of the filter and the input at each position. This produces an activation map for that filter. As a result, the filter identifies a feature together with the spatial position at which it occurs.
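The sliding dot product described above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the implementation used in this work: the function name `conv2d_valid`, the 4x4 image and the 2x2 filter are all hypothetical, and the sketch uses stride 1 with no padding.

```python
def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` (stride 1, no padding) and return the activation map.

    Each output entry is the dot product between the kernel and the image
    patch under it. As in most CNN libraries, the kernel is not flipped, so
    strictly speaking this is a cross-correlation.
    """
    img_h, img_w = len(image), len(image[0])
    k_h, k_w = len(kernel), len(kernel[0])
    activation = []
    for top in range(img_h - k_h + 1):        # move down the height
        row = []
        for left in range(img_w - k_w + 1):   # move across the width
            acc = 0
            for i in range(k_h):
                for j in range(k_w):
                    acc += image[top + i][left + j] * kernel[i][j]
            row.append(acc)
        activation.append(row)
    return activation


# Hypothetical 4x4 grayscale patch with a vertical edge between columns 1 and 2.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
# Hypothetical 2x2 filter that responds to a dark-to-bright step from left to right.
kernel = [
    [-1, 1],
    [-1, 1],
]
activation_map = conv2d_valid(image, kernel)
print(activation_map)  # [[0, 2, 0], [0, 2, 0], [0, 2, 0]]
```

The non-zero column in the activation map marks where the filter found the edge, which is the sense in which a filter reports both a feature and its position.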

The pooling layer is the step that follows convolution. The most commonly used pooling functions are max pooling and average pooling. Pooling divides the input into a set of non-overlapping parts and outputs a single value for each part: the maximum, in the case of max pooling. The pooling layer reduces the spatial size of the representation, which reduces the number of parameters and the amount of computation and therefore helps to control overfitting. It is common to insert a pooling layer periodically between successive convolution layers. Pooling also provides a form of translation invariance.
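As a minimal sketch of max pooling over non-overlapping parts, assuming a 2x2 window and a hypothetical 4x4 feature map (the function name `max_pool2d` and the values are illustrative only):

```python
def max_pool2d(feature_map, size=2):
    """Split the map into non-overlapping size x size parts and keep each part's maximum."""
    height, width = len(feature_map), len(feature_map[0])
    pooled = []
    for top in range(0, height - size + 1, size):
        row = []
        for left in range(0, width - size + 1, size):
            window = [feature_map[top + i][left + j]
                      for i in range(size) for j in range(size)]
            row.append(max(window))
        pooled.append(row)
    return pooled


# Hypothetical 4x4 activation map.
feature_map = [
    [1, 3, 2, 0],
    [4, 6, 1, 1],
    [0, 2, 9, 5],
    [1, 1, 3, 7],
]
pooled = max_pool2d(feature_map)
print(pooled)  # [[6, 2], [2, 9]]
```

The 4x4 input shrinks to 2x2, so later layers see a quarter as many values.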


ReLU, the rectified linear unit, is the most commonly employed activation function for the neurons in the convolution and fully connected layers of a CNN. It enables better training of deeper networks. Softmax, another widely used function, is typically applied at the output layer to convert class scores into probabilities.

The ReLU layer applies the non-saturating activation function f(x) = max(0, x). It increases the non-linear properties of the decision function and of the overall network without affecting the receptive fields of the convolution layer.
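The function f(x) = max(0, x) is simple enough to show directly. The sketch below also includes softmax, the other activation function mentioned above, to illustrate how raw class scores can be turned into probabilities. The input values are hypothetical.

```python
import math


def relu(values):
    """Apply f(x) = max(0, x) element-wise."""
    return [max(0.0, v) for v in values]


def softmax(scores):
    """Turn raw class scores into probabilities that sum to 1."""
    largest = max(scores)
    exps = [math.exp(s - largest) for s in scores]  # subtract the max for numerical stability
    total = sum(exps)
    return [e / total for e in exps]


hidden = relu([-2.0, -0.5, 0.0, 1.5])
probabilities = softmax([2.0, 1.0, 0.1])
print(hidden)  # [0.0, 0.0, 0.0, 1.5]
print(probabilities)
```

Negative inputs are clipped to zero while positive inputs pass through unchanged, which is why ReLU does not saturate for large positive values.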

After several convolution and max pooling operations, the high-level reasoning in the neural network is done through the fully connected layers. The neurons in a fully connected layer have connections to all activations in the previous layer of the CNN.

The final layer, the loss layer, specifies how training penalizes the deviation between the predicted labels and the true labels. Different loss functions, appropriate for different tasks, may be used in this layer.

Fig 2: Convolutional Neural Architecture

The training stages of Convolution Network for fruit recognition are as given below:

Step 1: Initialize with random values for all filters and parameters or weights.

Step 2: The network takes a training image as input and performs forward propagation: the convolution, ReLU and max pooling operations, followed by forward propagation through the fully connected layer. The output probabilities are then determined for each class of image. For example, consider a fruit image whose output probabilities are [0.2, 0.4, 0.1, 0.3]. Because the network weights are assigned randomly for the first training sample, these output probabilities are effectively random as well.

Step 3: Calculate the total prediction error by summing over all classes in the output layer.

$$\text{Total Error} = \sum \tfrac{1}{2}\,(\text{target probability} - \text{output probability})^2 \quad (1)$$

Step 4: Use backpropagation to calculate the gradients of the prediction error with respect to all weights in the CNN architecture. Using gradient descent, update the filter values and weights so as to minimize the total error; each weight is adjusted according to its contribution to the error in the class label prediction. The hyperparameters of the CNN are chosen so as to reduce the output prediction error.

Step 5: Repeat Steps 2 to 4 for all images in the training dataset. These steps train the convolution network, meaning that all of its weights and parameters have been optimized. The trained model is then used to classify the new, unseen images in the test dataset.
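The five steps can be traced on a toy problem. The sketch below is a simplification under loudly stated assumptions, not the network trained in this work: a single linear layer stands in for the whole CNN, the one-hot target, the 3-feature input and the learning rate are hypothetical, and the error is the half squared difference summed over classes as in Step 3.

```python
import random


def forward(weights, x):
    """Step 2 (simplified): one linear layer producing one output per class."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]


def total_error(target, output):
    """Step 3: sum over classes of 1/2 * (target - output)^2."""
    return sum(0.5 * (t - o) ** 2 for t, o in zip(target, output))


def gradient_step(weights, x, target, learning_rate):
    """Step 4: gradient descent, using dE/dw[k][i] = -(target[k] - output[k]) * x[i]."""
    output = forward(weights, x)
    return [
        [w + learning_rate * (t - o) * xi for w, xi in zip(row, x)]
        for row, t, o in zip(weights, target, output)
    ]


# Step 3 applied to the example output of Step 2, with an assumed one-hot target.
example_output = [0.2, 0.4, 0.1, 0.3]
assumed_target = [0.0, 0.0, 0.0, 1.0]   # assumption: the true class is the fourth one
example_error = total_error(assumed_target, example_output)
print(round(example_error, 4))  # 0.35

# Steps 1, 2, 4 and 5 on a hypothetical 3-feature input.
random.seed(0)
x = [1.0, 0.5, -0.5]
weights = [[random.uniform(-0.1, 0.1) for _ in x] for _ in assumed_target]  # Step 1
errors = []
for _ in range(50):                                                      # Step 5
    errors.append(total_error(assumed_target, forward(weights, x)))      # Steps 2 and 3
    weights = gradient_step(weights, x, assumed_target, learning_rate=0.1)  # Step 4
print(errors[0], errors[-1])
```

In a real CNN the gradient in Step 4 is propagated back through the pooling, ReLU and convolution layers as well, but the update rule has the same form: each weight moves against its own contribution to the error.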

During preprocessing, each image in the dataset is converted to a common format and reduced from its original size to 256 × 256 pixels. It is then given to the deep learning model, which is built with a ConvNet framework and contains 3 dense layers and an output layer. The convolution layers extract the features of the fruit, and the accuracy of the trained model is then evaluated.

V. RESULTS AND DISCUSSION

The dataset contains features extracted from the image set, which are used to classify the fruit images. Each feature either describes an anatomical part or is an image-level descriptor. The proposed feature extraction method is based on the gray-level co-occurrence matrix (GLCM), with 126 different color features and 60 different texture features.
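For readers unfamiliar with the gray-level co-occurrence matrix, the sketch below shows how one is built and how a single texture statistic (contrast) is derived from it. It is an illustration only: the 4x4 image with four gray levels, the horizontal offset and the choice of contrast are assumptions, and the sketch does not reproduce the 126 color and 60 texture features used in this work.

```python
def glcm(image, levels, dx=1, dy=0):
    """Count how often gray level i has gray level j at offset (dy, dx)."""
    matrix = [[0] * levels for _ in range(levels)]
    height, width = len(image), len(image[0])
    for r in range(height):
        for c in range(width):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < height and 0 <= c2 < width:
                matrix[image[r][c]][image[r2][c2]] += 1
    return matrix


def contrast(matrix):
    """Texture statistic: squared gray-level difference weighted by co-occurrence frequency."""
    total = sum(sum(row) for row in matrix)
    return sum(((i - j) ** 2) * count / total
               for i, row in enumerate(matrix)
               for j, count in enumerate(row))


# Hypothetical 4x4 image quantized to gray levels 0..3.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
]
cooccurrence = glcm(image, levels=4)   # pairs (pixel, right-hand neighbor)
texture_contrast = contrast(cooccurrence)
print(cooccurrence)  # [[2, 2, 1, 0], [0, 2, 0, 0], [0, 0, 3, 1], [0, 0, 0, 1]]
print(texture_contrast)
```

Smooth regions put most of the counts on the diagonal and give a low contrast; textured regions spread the counts away from the diagonal.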

The metric used to evaluate the model on the dataset is accuracy, one of the most commonly used metrics for evaluating classification models. Informally, accuracy is the fraction of predictions that the model gets right. Formally, accuracy is given by

$$\text{Accuracy} = \frac{\text{Number of correct predictions}}{\text{Total number of predictions}} \quad (2)$$

For binary classification, the accuracy is calculated as

$$\text{Accuracy} = \frac{TP + TN}{TP + FP + TN + FN}$$

where TP = True Positives, TN = True Negatives, FP = False Positives and FN = False Negatives.
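The binary accuracy formula can be checked on a small example. The labels below are hypothetical and are not the predictions of the model reported here; the helper names `confusion_counts` and `accuracy` are likewise illustrative.

```python
def confusion_counts(actual, predicted, positive):
    """Return (TP, TN, FP, FN) for a binary problem with the given positive label."""
    tp = tn = fp = fn = 0
    for a, p in zip(actual, predicted):
        if a == positive and p == positive:
            tp += 1
        elif a == positive:
            fn += 1
        elif p == positive:
            fp += 1
        else:
            tn += 1
    return tp, tn, fp, fn


def accuracy(tp, tn, fp, fn):
    """Accuracy = (TP + TN) / (TP + FP + TN + FN)."""
    total = tp + tn + fp + fn
    if total == 0:
        raise ValueError("no predictions to evaluate")
    return (tp + tn) / total


# Hypothetical ground truth and predictions for ten test images.
actual = ["apple", "apple", "other", "other", "apple",
          "other", "apple", "other", "apple", "other"]
predicted = ["apple", "other", "other", "apple", "apple",
             "other", "apple", "other", "apple", "other"]

counts = confusion_counts(actual, predicted, positive="apple")
print(counts)             # (4, 4, 1, 1)
print(accuracy(*counts))  # 0.8
```

Eight of the ten hypothetical predictions are correct, so both definitions of accuracy give 0.8.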


Table 1: Confusion Matrix

                   Predicted Positive   Predicted Negative
Actual Positive    TP                   FN
Actual Negative    FP                   TN

The accuracy obtained by the model is 75% (0.75). Improving the accuracy and extending the classification will be considered in the near future.

Table 2: Performance Matrix

EPOCHS   TRAINING ACCURACY (%)   VALIDATION ACCURACY (%)
10       81.25                   58.0
20       83.5                    60.5
30       85.65                   61.75
40       86.0                    58.56
50       89.90                   65.71

The model was trained for 50 epochs, giving a maximum validation accuracy of 80% (0.80) and a training accuracy of 90% (0.90).

Fig 3: Total number of test and train samples

Fig 4: Array representation of input images

Fig 5: Training the model

Fig 6: Prediction score and digits for Apple

Fig 7: Prediction score and digits for other fruits


Fig 8: Training and Testing Performance

VI. CONCLUSION

Feature extraction reduces the number of features needed to describe a large set of data. In the proposed work, a CNN is used to extract the features, and the extracted features are used to classify an arbitrary image and determine whether or not it shows a fruit. The model gives a training accuracy of nearly 90%. In the near future, steps may be taken to improve the accuracy and to evaluate the model with other metrics.


AUTHORS PROFILE

Dr.E.Gothai completed Ph.D in the area of Big Data Analytics. She has 19 years of teaching experience. She published 9 research papers in International journals and presented 16 papers in the conferences. She is working as an Associate Professor in the Department of Computer Science Engineering, Kongu Engineering College, Tamilnadu. Her areas of interest include Data mining, Big Data Analytics and Deep Learning.

Dr.R.Thamilselvan completed Ph.D from Anna University in the area of Grid Computing. He has 19 years of teaching experience. He published 14 research papers in International journals and presented 17 papers in the conferences. He is working as an Assistant Professor in the department of Information Technology, Kongu Engineering College affiliated to Anna University, Chennai, Tamilnadu. His areas of interest include Cloud computing, Grid Computing and Nature Inspired

Dr.R.R.Rajalaxmi completed M.E. Computer Science and Engineering under Bharathiar University in the year 2000. She completed her PhD in Information and Communication Engineering under Anna University in the year 2011. Currently she is working as Professor and Head of the Department of Computer Science Engineering, Kongu Engineering College, Tamilnadu. She is a life member of CSI and ISTE. Her areas of interest are Data Mining, Data Analytics and Machine Learning. She has organized nine sponsored seminars, workshops and training programmes. She has published 11 research papers in international journals and 6 papers in international conferences. She has also completed three research projects sponsored by various funding agencies.