Data Processing - Determination and modelling of energy consumption in wheat production using n

4.4 Survey

4.4.3 Data Processing

To convert the different quantitative data to energy and to process the data, it was necessary to design effective spreadsheets that provide appropriate spaces for original and calculated data. The final estimations were calculated from conversion coefficients, formulae, and equations in the spreadsheets. Data were entered manually into a Microsoft Excel spreadsheet; Excel was chosen for its facilities for mathematical and statistical calculations and analyses. The main spreadsheet contained all the necessary energy coefficients and was used to calculate energy inputs and outputs for each farm.
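The coefficient-based conversion described above can be sketched in a few lines of Python. This is only an illustration of the spreadsheet logic, not the study's actual workbook: the input names, the `FarmRecord` structure, and in particular the coefficient values are hypothetical placeholders and are not the energy equivalents used in the study.

```python
from __future__ import annotations

from dataclasses import dataclass

# Placeholder energy equivalents in MJ per unit of input.
# Illustrative values only; NOT the coefficients used in the study.
ENERGY_COEFFICIENTS = {
    "diesel_l": 40.0,         # MJ per litre (hypothetical)
    "electricity_kwh": 10.0,  # MJ per kWh (hypothetical)
    "nitrogen_kg": 60.0,      # MJ per kg (hypothetical)
    "seed_kg": 15.0,          # MJ per kg (hypothetical)
}


@dataclass
class FarmRecord:
    """One row of the (hypothetical) main spreadsheet."""
    farm_id: str
    wheat_area_ha: float
    inputs: dict[str, float]  # quantity of each input used on the wheat area


def energy_by_input(farm: FarmRecord, coefficients=ENERGY_COEFFICIENTS) -> dict[str, float]:
    """Convert each input quantity to energy (MJ) using its coefficient."""
    return {name: qty * coefficients[name] for name, qty in farm.inputs.items()}


def energy_per_hectare(farm: FarmRecord, coefficients=ENERGY_COEFFICIENTS) -> float:
    """Total energy input of the farm divided by its wheat area (MJ/ha)."""
    total_mj = sum(energy_by_input(farm, coefficients).values())
    return total_mj / farm.wheat_area_ha


# Example with made-up quantities.
farm = FarmRecord("F01", 20.0, {"diesel_l": 1000.0, "nitrogen_kg": 2000.0})
print(energy_by_input(farm))
print(energy_per_hectare(farm))
```

Keeping the coefficients in a single table, as the main spreadsheet did, means that a coefficient taken from the literature can be revised in one place without touching the per-farm calculations.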

Energy equivalents for all input parameters were determined from the literature review. After the survey was completed, several indices and new variables were defined to support the analysis and to examine how interactions between variables influence energy use in wheat production. The most important new variables and indices in this study were wheat area/total farm area, crop area/total farm area, average size of wheat paddocks, average size of crop paddocks, tractor power (hp)/farm area (ha), average age of tractors on each farm, and average power of tractors on each farm.
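As a rough sketch of how such farm-level indices might be derived from the survey records, the following Python fragment computes several of them. The `Farm` and `Tractor` structures and their field names are hypothetical, and two interpretations are assumptions: that average paddock size is area divided by number of paddocks, and that tractor power per hectare uses the summed power of all tractors on the farm.

```python
from dataclasses import dataclass, field


@dataclass
class Tractor:
    power_hp: float
    age_years: float


@dataclass
class Farm:
    total_area_ha: float
    wheat_area_ha: float
    crop_area_ha: float
    wheat_paddocks: int
    tractors: list = field(default_factory=list)


def farm_indices(farm: Farm) -> dict:
    """Derive ratio and average variables for one farm."""
    n_tractors = len(farm.tractors)
    total_hp = sum(t.power_hp for t in farm.tractors)
    total_age = sum(t.age_years for t in farm.tractors)
    return {
        "wheat_area_ratio": farm.wheat_area_ha / farm.total_area_ha,
        "crop_area_ratio": farm.crop_area_ha / farm.total_area_ha,
        # Assumption: average paddock size = area / number of paddocks.
        "avg_wheat_paddock_ha": farm.wheat_area_ha / farm.wheat_paddocks,
        # Assumption: summed tractor power divided by total farm area.
        "tractor_hp_per_ha": total_hp / farm.total_area_ha,
        # Averages are undefined for a farm without tractors.
        "avg_tractor_age": total_age / n_tractors if n_tractors else None,
        "avg_tractor_power": total_hp / n_tractors if n_tractors else None,
    }


# Example with made-up values.
example = Farm(200, 50, 120, 5, [Tractor(100, 10), Tractor(140, 4)])
print(farm_indices(example))
```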

The formulae and equations were entered manually into the main spreadsheet. The inputs, including direct and indirect factors, were placed in approximately 140 columns, and 46 columns were used to calculate the energy use of different sources and operations. Finally, the energy consumption per hectare was calculated for each farm. The data, formulae, and equations were checked several times to avoid errors.

For better analysis, a series of spreadsheets was designed for various aspects of the study, and the main spreadsheet was linked to them. For example, in one spreadsheet the final estimations of average energy consumption by energy source were calculated separately for irrigated farms, dryland farms, and all farms; operational energy consumption was calculated in the same spreadsheet. In addition to the amount of energy in each operation or energy source, the percentage share of that operation or energy source was calculated. Graphs were drawn in another spreadsheet, and additional spreadsheets were designed for statistical calculations and fuel use estimations.

To gain an insight into energy consumption in wheat production, operational (direct) energy, including human energy, fuel, and electricity use, was employed to calculate energy use for farm operations including tillage, drilling, spraying, fertilizer distributing, irrigating, and harvesting. All energy inputs (direct and indirect) were entered in another spreadsheet that contained direct energy inputs, including labour, electricity, and fuel, and indirect energy inputs, including fertilizer, pesticides, machinery (production, maintenance and service), and seed. Furthermore, the relationship between farm inputs and outputs was examined.

As the farms were of different sizes, an average per hectare was calculated for each input by summing the total amount of that input over all farms and dividing the result by the total area of all farms. These averages are therefore area-weighted rather than simple means of the per-farm values, so larger farms carried relatively more weight in the final average estimations.

In the first step of data analysis, the Pearson product-moment correlation coefficient was used to explore the relationships between different variables. It is the most common measure of correlation and is denoted by the letter r. The correlation coefficient gives insight into the relationship between two variables, and it was used for comparing and analysing different variables. It ranges between -1 and +1: a coefficient of +1 indicates a perfect positive correlation between two variables, a coefficient of -1 indicates a perfect negative correlation, and a coefficient of zero indicates no linear relationship between the two variables. In practice, the correlation coefficient usually falls strictly between -1 and +1. The correlation coefficient, r, is given by the formula:

$$r = \frac{\sum (X - \bar{X})(Y - \bar{Y})}{N \, S_X \, S_Y}$$ (4-4)

where $\bar{X}$ and $\bar{Y}$ are the means of the X and Y values being correlated, $S_X$ and $S_Y$ are their standard deviations, respectively, and N is the sample size.

In analysing the relationship between two variables, it is important to know whether the correlation coefficient between them is significant. To establish the significance of a correlation, the critical value was obtained from the table of critical values of the Pearson product-moment correlation coefficient. If the magnitude of the calculated correlation coefficient for a given number of degrees of freedom is greater than or equal to the value in this table, the correlation is significant at the level of significance given. A significant correlation between two variables is important for analysing their relationship, and it was also used for data reduction in this study.

One common problem with correlation is that if the x-y relationship is curvilinear, the r value cannot describe the relationship completely. Conversely, a linear relation can be mistakenly assumed to be the best fit on the basis of the correlation coefficient alone. In general, it is not advisable to extrapolate relations beyond the range of the data collected because, even if the linear relationship is good within that range, it may be nonlinear outside it. To reduce these common mistakes in this study, graphs were drawn for all data that had significant correlations, and their relationships were examined.
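The correlation formula, the table-based significance check, and the curvilinear caveat can be illustrated with a short Python sketch. It is a minimal example under stated assumptions, not the study's procedure: the function names are hypothetical, the standard deviations are taken as population values (divided by N) so that they agree with the N in the denominator of the formula, and the critical value must be supplied by the caller from a published table.

```python
import math


def pearson_r(xs, ys):
    """Pearson r as sum((X - mean_X)(Y - mean_Y)) / (N * S_X * S_Y)."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("xs and ys must have the same length (at least 2)")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Assumption: population standard deviations (divide by N).
    s_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / n)
    s_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / n)
    if s_x == 0 or s_y == 0:
        raise ValueError("r is undefined when a variable is constant")
    cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return cross / (n * s_x * s_y)


def is_significant(r, r_critical):
    """Compare |r| with a critical value read from a table for df = N - 2."""
    return abs(r) >= r_critical


# Perfectly linear data: r is 1 up to floating-point rounding.
r_linear = pearson_r([1, 2, 3, 4, 5], [2, 4, 6, 8, 10])

# Curvilinear data (y = x^2, symmetric about zero): y depends entirely
# on x, yet the linear correlation coefficient is zero.
r_curved = pearson_r([-2, -1, 0, 1, 2], [4, 1, 0, 1, 4])

# 0.878 is the standard two-tailed critical value at the 0.05 level
# for df = 3 (N = 5); other sample sizes need their own table entry.
print(r_linear, is_significant(r_linear, 0.878))
print(r_curved, is_significant(r_curved, 0.878))
```

The second data set shows why plotting the significant pairs is a useful safeguard: a coefficient near zero rules out only a linear relationship, not a relationship altogether.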

In document Determination and modelling of energy consumption in wheat production using neural networks : "A case study in Canterbury Province, New Zealand" (Page 125-128)