Vol 7, No 11 (2017)


Research Article


November 2017

Computer Science and Software Engineering

ISSN: 2277-128X (Volume-7, Issue-11)

Analysis of Crop Yield Prediction of Kharif & Rabi Jowar Crops Using Data Mining Techniques

Prof. Sujata Mulik

Assistant Professor BVDU IMED Pune, Maharashtra, India

Dr. Ajit More

Professor BVDU IMED Pune, Maharashtra, India

Abstract— The agriculture sector in India faces a serious challenge in maximizing crop productivity. More than 60 percent of the crop still depends on climatic factors such as rainfall, temperature, and humidity. This paper discusses the use of data mining applications in the agriculture sector, where they are used to solve various problems, including yield prediction. Yield prediction is a major problem that remains to be solved on the basis of available data, and data mining techniques are well suited to this purpose. Different data mining techniques are used and evaluated in agriculture for estimating crop production in future years. This paper focuses on predicting the crop yield productivity of Kharif and Rabi crops.

Keywords— Agriculture, Data Mining, Yield Prediction, kharif, Rabi, Crops, Linear Regression, k-means

I. INTRODUCTION

Indian agriculture is known for its diversity, which is mainly the result of variation in resources and climate, geography, and historical and socio-economic factors. Agriculture is the backbone of the Indian economy. In India, the majority of farmers do not get the expected crop yield, for several reasons. Agricultural yield depends primarily on weather conditions such as rainfall, humidity, and temperature.

Yield prediction is an important agricultural problem. In the past, yield prediction relied on a farmer's previous experience with a particular crop or on crop-cutting experiments. The volume of crop data in Indian agriculture is enormous. Data mining is widely applied to agricultural problems; it is used to analyze large data sets and to establish useful classifications and patterns in them. The overall goal of the data mining process is to extract information from a massive data set and transform it into an understandable structure for analysis.

The main aim of this paper is to create a user-friendly interface for farmers and the Government that presents an analysis of Kharif Jowar and Rabi Jowar based on available data. Different data mining techniques were used to predict crop yield with a view to maximizing crop productivity. To date, very few farmers actually use new methods, tools, and techniques of farming for better production. Data mining can be used to predict future trends in agricultural processes.

II. OVERVIEW OF DATA

The data used for this paper were obtained for the years 2005 to 2010 for the Sangli district of Maharashtra, India. The evaluation covers only the Sangli district.

The data comprise seven variables: 'Year', 'Rainfall', 'Temperature', 'Humidity', 'Crop area', 'Crop production', and 'Crop productivity'. 'Year' specifies the year to which the record applies. 'Rainfall' specifies the average rainfall in that year, in millimeters. 'Temperature' specifies the average temperature in that year, in degrees Celsius. 'Humidity' is recorded as a percentage. 'Crop area' (area of sowing) specifies the total area sown in that year for the region, in hectares. 'Crop productivity' (yield) is given in kilograms per hectare. 'Crop production' specifies the production of the crop in that year, in tons. The study distinguishes dependent and independent variables: rainfall, humidity, and temperature are the independent variables, and crop production and crop productivity are the dependent variables.

III. OVERVIEW OF TOOLS

3.1] Overview of Weka:


ISSN(E): 2277-128X, ISSN(P): 2277-6451, pp. 79-85

Weka provides visualization tools and algorithms for data analysis and predictive modeling, together with graphical user interfaces for easy access to this functionality. It is a collection of machine learning algorithms for solving real-world data mining problems. Data mining tools predict future trends and behavior.

3.2] Overview of Ms-Excel

Ms-Excel is a powerful spreadsheet that is easy to use and allows you to store, manipulate, analyze, and visualize data.

IV. RESEARCH METHODOLOGY

In this paper, a statistical method (linear regression) and a data mining method (k-means) are used for crop yield analysis.

4.1] Linear Regression:

Linear regression is the most basic and commonly used form of predictive analysis. Regression estimates are used to describe data and to explain the relationship between one dependent variable and one or more independent variables. The dependent variable is called the 'predictand' and the independent variables are called 'predictors'.
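A minimal sketch of simple linear regression with one predictor, fitted by ordinary least squares, may help make this concrete. The function name `fit_line` and the rainfall and yield figures are illustrative assumptions, not code or values from this study.

```python
def fit_line(xs, ys):
    """Return (intercept, slope) of the least-squares line y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of the predictor, and of cross-deviations.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical predictor (rainfall, mm) and predictand (yield, kg/ha).
rainfall = [10.0, 20.0, 30.0, 40.0]
yield_kg_ha = [300.0, 500.0, 700.0, 900.0]

intercept, slope = fit_line(rainfall, yield_kg_ha)
predicted = intercept + slope * 25.0  # prediction for a rainfall value not in the data
print(intercept, slope, predicted)
```

With more than one predictor the same idea applies, but the coefficients are usually obtained with a matrix solver rather than these closed-form sums.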

The correlation coefficient measures the strength of the relationship between two variables. Pearson's correlation coefficient is one of the most commonly used correlation coefficients and measures the linear relationship between two variables. The value of the correlation coefficient, denoted r, ranges from -1 to +1 and indicates both the strength of the relationship and whether it is negative or positive. When r is greater than zero, the relationship is positive; when it is less than zero, the relationship is negative. A value of zero indicates that there is no linear relationship between the two variables.
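Pearson's r can be computed directly from its definition, as in the following sketch. The function name `pearson_r` and the sample values are illustrative assumptions and are not taken from the study's data.

```python
import math


def pearson_r(xs, ys):
    """Pearson's correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    # r is undefined (division by zero) if either variable is constant.
    return cov / math.sqrt(var_x * var_y)


# Illustrative values only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
rising = [2.0, 4.0, 6.0, 8.0, 10.0]   # perfectly linear, increasing
falling = [10.0, 8.0, 6.0, 4.0, 2.0]  # perfectly linear, decreasing
mixed = [2.0, 1.0, 4.0, 3.0, 5.0]     # positive but imperfect relationship

r_rising = pearson_r(x, rising)
r_falling = pearson_r(x, falling)
r_mixed = pearson_r(x, mixed)
print(r_rising, r_falling, r_mixed)
```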

Measure                       RJR       RJMinT   RJMAXT   RJMH   RJEH
Correlation coefficient       0         1        0        1      0
Mean absolute error           12.744    0        0.5576   0      4.1528
Root mean squared error       15.1663   0        0.6387   0      5.6066
Relative absolute error       100%      0        100%     0      100%
Root relative squared error   100%      0        100%     0      100%
Total number of instances     5         5        5        5      5

The above table shows that minimum temperature (MinT) and morning humidity have a strong positive relationship with Rabi jowar productivity.

Measure                       KJR      KJMinT   KJMAXT   KJMH   KJEH
Correlation coefficient       0        1        1        1      1
Mean absolute error           5.2464   0        0        0      0
Root mean squared error       6.2882   0        0        0      0
Relative absolute error       100%     0        0        0      0
Root relative squared error   100%     0        0        0      0
Total number of instances     5        5        5        5      5

The above table shows that minimum temperature (MinT), maximum temperature (MaxT), morning humidity, and evening humidity have a strong positive relationship with kharif jowar productivity.
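The error measures reported in the two tables above can be sketched as follows. The function name `error_measures` is an assumption, and the definitions follow the usual conventions for these measures, with relative errors taken against a baseline that always predicts the mean. Applied to the seasonal rainfall values from this paper's weather tables with such a mean-only predictor, the sketch gives the mean absolute error and root mean squared error printed in the KJR and RJR columns. That reading of those columns is inferred from the numbers and is not stated in the paper.

```python
import math


def error_measures(actual, predicted):
    """Return (MAE, RMSE, relative absolute error %, root relative squared error %)."""
    n = len(actual)
    mean_actual = sum(actual) / n
    abs_err = [abs(p - a) for p, a in zip(predicted, actual)]
    sq_err = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    mae = sum(abs_err) / n
    rmse = math.sqrt(sum(sq_err) / n)
    # Relative measures compare the model with a predictor that always outputs the mean.
    rae = 100.0 * sum(abs_err) / sum(abs(mean_actual - a) for a in actual)
    rrse = 100.0 * math.sqrt(sum(sq_err) / sum((mean_actual - a) ** 2 for a in actual))
    return mae, rmse, rae, rrse


# Seasonal rainfall in mm, 2009-2010 to 2013-2014, from the paper's weather tables.
rabi_rainfall = [30.4, 18.8, 14.8, 0.0, 45.1]
kharif_rainfall = [28.35, 22.57, 9.55, 21.02, 16.55]

# Assumed baseline: always predict the sample mean.
rabi_baseline = [sum(rabi_rainfall) / len(rabi_rainfall)] * len(rabi_rainfall)
kharif_baseline = [sum(kharif_rainfall) / len(kharif_rainfall)] * len(kharif_rainfall)

rabi_result = error_measures(rabi_rainfall, rabi_baseline)
kharif_result = error_measures(kharif_rainfall, kharif_baseline)
print(rabi_result)
print(kharif_result)
```

A relative error of 100% therefore indicates a model that does no better than predicting the mean, while 0 indicates that every instance was predicted exactly.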

4.2] K-means:

Clustering is the formation of groups of data points on the basis of their attributes and is used to discover patterns in the data. k-means is one clustering algorithm. It is a centroid-based technique: each cluster is associated with a centroid (center point), and each point is assigned to the cluster with the closest centroid.
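The assignment and update steps described above can be sketched in plain Python. This is not the Weka implementation used in the study: the function name `kmeans`, the choice of the first k points as starting centroids, and the sample points are all assumptions made for illustration.

```python
import math


def kmeans(points, k, max_iter=100):
    """Basic k-means (Lloyd's algorithm). Returns (centroids, labels)."""
    # Assumption: seed the centroids with the first k points (deterministic).
    centroids = [tuple(p) for p in points[:k]]
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point goes to the cluster with the closest centroid.
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments no longer change, so the centroids are stable
        labels = new_labels
        # Update step: move each centroid to the mean of its assigned points.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return centroids, labels


# Hypothetical two-attribute data points.
points = [(1.0, 1.0), (9.0, 9.0), (1.5, 2.0), (8.0, 8.5), (1.0, 0.5)]
centroids, labels = kmeans(points, k=2)
print(centroids, labels)
```

The result of k-means depends on the starting centroids, so practical tools typically use randomized seeding.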

Crop Name: Kharif Jowar

Year        Area in "00" ha   Production in "00" Tonnes   Productivity in Kg/ha
2010-2011   872               1087                        1247
2011-2012   604               453                         751
2012-2013   558               266                         476
2013-2014   618               533                         862

The highest productivity of Kh. Jowar was in the year 2010-11 (1247 kg/ha) and the lowest productivity was in the year 2012-13 (476 kg/ha).

Crop Name: Rabi Jowar

Year        Area in "00" ha   Production in "00" Tonnes   Productivity in Kg/ha
2009-2010   1861              1117                        600
2010-2011   1806              1030                        570
2011-2012   1772              419                         236
2012-2013   1864              439                         236
2013-2014   2176              1245                        572

The highest sowing area for Rb. Jowar was in the year 2013-14 (217600 ha). However, the highest productivity was in the year 2009-10 (600 kg/ha).


Kharif season weather data

The lowest rainfall measured for the Kharif season was in the year 2011-12 (9.55 mm) and the highest rainfall was in the year 2009-10 (28.35 mm).

Rabi season weather Data:

Year        Rainfall (mm)   Temp. min (°C)   Temp. max (°C)   Humidity morning (%)   Humidity evening (%)
2009-2010   30.4            18.5             30.04            92.03                  60.19
2010-2011   18.8            16.09            31.75            92.95                  54.03
2011-2012   14.8            18.5             31.62            85.91                  43.45
2012-2013   0               17.1             31.06            92.82                  56.64
2013-2014   45.1            17.44            30.59            88.88                  54.85



The lowest rainfall measured for the Rabi season was in the year 2012-13 (0 mm) and the highest rainfall was in the year 2013-14 (45.1 mm).

Rainfall:

Year        KJR (mm)   RJR (mm)   KRJR_diff
2009-2010   28.35      30.4       2.05 (Rb)
2010-2011   22.57      18.8       3.77 (Kh)
2011-2012   9.55       14.8       5.25 (Rb)
2012-2013   21.02      0          21.02 (Kh)
2013-2014   16.55      45.1       28.55 (Rb)

Temperature (°C):

Year        KJ Min   KJ Max   RJ Min   RJ Max   KRJT_diff Min   KRJT_diff Max
2009-2010   21.27    31.15    18.5     30.04    2.77 (Kh)       1.11 (Kh)
2010-2011   17.58    30.27    16.09    31.75    1.49 (Kh)       1.48 (Rb)
2011-2012   22.35    30.15    18.5     31.62    3.85 (Kh)       1.47 (Rb)
2012-2013   22.07    31.08    17.1     31.06    4.97 (Kh)       0.02 (Kh)
2013-2014   20.98    32.27    17.44    30.59    3.54 (Kh)       1.68

Humidity (%), M = morning, E = evening:

Year        KJH M   KJH E   RJH M   RJH E   KRJH_diff M   KRJH_diff E
2009-2010   91.76   66.77   92.03   60.19   0.27 (Rb)     6.58 (Kh)
2010-2011   95.97   57.23   92.95   54.03   3.02 (Kh)     3.2 (Kh)
2011-2012   93.97   66.28   85.91   43.45   8.06 (Kh)     22.83 (Kh)
2012-2013   93.52   66.93   92.82   56.64   0.7 (Kh)      10.29 (Kh)
2013-2014   93.82   66.89   88.88   54.85   4.94 (Kh)     12.04 (Kh)

The above table compares the predictor (independent) variables between the kharif and Rabi seasons.
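The difference columns can be reproduced with a short calculation, shown here for rainfall. The sketch assumes that the (Kh) or (Rb) tag marks the season with the larger value, which is how the rainfall figures read; the function and variable names are hypothetical.

```python
# Seasonal rainfall in mm, taken from the comparison table above.
kharif_rain = {
    "2009-2010": 28.35, "2010-2011": 22.57, "2011-2012": 9.55,
    "2012-2013": 21.02, "2013-2014": 16.55,
}
rabi_rain = {
    "2009-2010": 30.4, "2010-2011": 18.8, "2011-2012": 14.8,
    "2012-2013": 0.0, "2013-2014": 45.1,
}


def seasonal_difference(kharif, rabi):
    """Return {year: (absolute difference, season with the larger value)}."""
    result = {}
    for year, k in kharif.items():
        r = rabi[year]
        if k > r:
            season = "Kh"
        elif r > k:
            season = "Rb"
        else:
            season = "="  # equal values: neither season is larger
        result[year] = (round(abs(k - r), 2), season)
    return result


rain_diff = seasonal_difference(kharif_rain, rabi_rain)
for year, (diff, season) in rain_diff.items():
    print(year, diff, season)
```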

V. RESULT ANALYSIS / DISCUSSION

          KJR      KJMINT   KJMAXT   KJMH     KJEH    KJA       KJPRO     KJP
minimum   9.55     17.58    30.15    91.76    57.23   558       266       476
maximum   28.35    22.35    32.27    95.97    66.93   907       1087      1247
mean      19.608   20.85    30.984   93.808   64.82   711.8     622.2     837.4
stddev    7.03     1.912    0.851    1.498    4.251   164.196   316.995   276.874

          RJR      RJMINT   RJMAXT   RJMH     RJEH     RJA       RJPRO     RJP
minimum   0        16.09    30.04    85.91    43.45    1772      419       236
maximum   45.1     18.5     31.75    92.95    60.19    2176      1245      600
mean      21.82    17.526   31.012   90.518   53.832   1895.8    850       442.8
stddev    16.956   1.018    0.714    3.058    6.268    161.342   391.917   189.154

The above tables show the minimum, maximum, mean, and standard deviation of the data for the independent variables (rainfall, temperature, humidity) and for crop area, production, and productivity.
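These summary statistics can be recomputed from the seasonal values, as in the sketch below for kharif rainfall (KJR) and Rabi productivity (RJP). The reported stddev figures match the sample standard deviation (divisor n - 1); that is an inference from the numbers, and the helper name `summarize` is an assumption.

```python
import statistics

# Values for 2009-2010 to 2013-2014, taken from the tables in this paper.
kharif_rainfall = [28.35, 22.57, 9.55, 21.02, 16.55]  # KJR, mm
rabi_productivity = [600, 570, 236, 236, 572]         # RJP, kg/ha


def summarize(values):
    """Minimum, maximum, mean, and sample standard deviation of a sequence."""
    return {
        "minimum": min(values),
        "maximum": max(values),
        "mean": statistics.mean(values),
        "stddev": statistics.stdev(values),  # sample standard deviation (n - 1)
    }


kjr_summary = summarize(kharif_rainfall)
rjp_summary = summarize(rabi_productivity)
print(kjr_summary)
print(rjp_summary)
```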


Abbreviations:

1) K – kharif
2) J – jowar
3) R – Rabi
4) KJR – kharif jowar rainfall
5) KJMINT – kharif jowar minimum temperature
6) KJMAXT – kharif jowar maximum temperature
7) KJMH – kharif jowar morning humidity
8) KJEH – kharif jowar evening humidity
9) KJPro – kharif jowar production
10) KJP – kharif jowar productivity

K means Analysis:

Kharif jowar analysis by k-means:

Cluster 0: year 2009-2010, 2011-2012, 2012-2013, and 2013-2014

Cluster 1: year 2010-2011

1) The first cluster contained 4 data points, which is 80% of the total of 5 data points. Crop productivity is medium.
2) The second cluster contained 1 data point, which is 20% of the total of 5 data points. Crop productivity is high.

Rabi jowar analysis by k-means:

Clustered Instances
0    2 (40%)
1    3 (60%)

Cluster 0: year 2011-2012, 2012-2013

Cluster 1: year 2009-2010, 2010-2011, 2013-2014

1) The first cluster contained 2 data points, which is 40% of the total of 5 data points. Crop productivity is low.
2) The second cluster contained 3 data points, which is 60% of the total of 5 data points. Crop productivity is medium.
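The low and medium labels can be related to the data by averaging productivity within each reported cluster. The sketch below takes the Rabi jowar cluster memberships listed above and the productivity figures from the Rabi Jowar table. It does not re-run the clustering, and the variable names are assumptions.

```python
# Rabi jowar productivity in kg/ha, from the Rabi Jowar table.
rabi_productivity = {
    "2009-2010": 600, "2010-2011": 570, "2011-2012": 236,
    "2012-2013": 236, "2013-2014": 572,
}

# Cluster memberships as reported in the k-means analysis above.
reported_clusters = {
    0: ["2011-2012", "2012-2013"],
    1: ["2009-2010", "2010-2011", "2013-2014"],
}

total_years = len(rabi_productivity)
cluster_means = {}
cluster_shares = {}
for cluster, years in reported_clusters.items():
    values = [rabi_productivity[year] for year in years]
    cluster_means[cluster] = sum(values) / len(values)        # mean productivity, kg/ha
    cluster_shares[cluster] = 100.0 * len(years) / total_years  # share of instances, %

print(cluster_means)
print(cluster_shares)
```

On these figures the two years in cluster 0 average 236 kg/ha, well below the roughly 581 kg/ha average of the three years in cluster 1.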

VI. CONCLUSION

Agriculture is the most significant application area, particularly in developing countries like India. Using information technology in the agriculture sector can change how decisions are made and help farmers obtain better yields. In this paper we have focused on how data mining plays a crucial role in decision making on several issues related to agriculture. We have also used MS-Excel and the Weka tool to analyze data on Kharif Jowar and Rabi Jowar. We found that some patterns in the independent variables have an effect on crop yield productivity; this has been represented graphically in this paper.
