ESTIMATING ECONOMETRIC MODEL OF AVERAGE TOTAL MILK COST: A SUPPORT VECTOR MACHINE REGRESSION APPROACH

(1)

ESTIMATING ECONOMETRIC MODEL OF AVERAGE TOTAL MILK COST: A SUPPORT VECTOR MACHINE REGRESSION APPROACH

Reet Põldaru, Jüri Roots, Ants-Hannes Viira

Institute of Informatics, Estonian Agricultural University, Estonia

This paper gives an overview of the basic ideas underlying support vector machines (SVM) for regression and function estimation. A summary of currently used algorithms for training SVM is be presented. Application of SVM regression for estimating parameters of econometric model of average total milk cost in Estonian farms is considered and possibilities of application of SVM regression in rural areas are discussed. Studies on implementation of SVM regression methods (algorithms) and software packages in agricultural research and business must be extended.

Key words: econometric models, data mining, support vector machine regression, average total milk cost.

Introduction

In recent years the long-term prospects for the agricultural world markets have been subject of intensive dis- cussions mainly for two reasons:

• The rising concern is given to the food security situation in a number of developing countries.

• The high level of support for the agricultural sec- tors and large production surpluses in many developed countries, which could be exported only with the exten- sive use of export subsidies, result in the need for international efforts to liberalise the markets.

Through last decades, the use of economic models in relation to agricultural policy issues has increased substantially, and a big number of literature sources on these issues is available. A number of different modelling ap- proaches have been applied.

Estonia is one of the new members of the European Union. The EU enlargement brings for East European countries a lot of changes in their agriculture. These are changes at political, economic and technical level. This means that information systems on agriculture (databases, models, etc.) have to move along with those changes.

Consequently, the economic models in Estonia have to be created, developed and renewed, and must be harmonised with the European requirements. Hopefully, new information technology can be used to lead such evolutions.

A variation in the behavioural characteristics of agricultural production systems over time as well as between countries is recognised. The diverse nature of agricultural production systems and agri-food markets across the EU poses a challenge to anyone seeking to develop a model that can be used to analyse policy at the EU and its mem- ber state level.

The guiding principle in constructing the national level commodity models is that the models are first and foremost economic models and such economic theory is

our first guide in specifying the models. Economic relationships in the national commodity models are based, in so far as is practicable, on time series econometric estimates of these relationships. Theory and expert judgement are also used in the verification and, if necessary, adjust- ment of econometrically estimated equations, particularly when used to generate projections for further periods.

Improving the competitiveness of Estonian agriculture is a priority objective of Estonian agricultural policy. The outcome and impacts of those policy actions will strongly depend on development of agricultural world markets.

Dairy sector is the most competitive branch of Esto- nian agriculture. Consequently, the need to make Esto- nian dairy farms more competitive is obvious.

New data analysis procedures provided by current data mining (DM) (Andriaans and Zantinge 2003, Dunham 2003, Fayyad et al. 1996, Friedman 1997) have substantially changed the situation in the field of data processing (DP). The situation in data mining is the most challenging one. Data mining, often called knowledge discovery in databases (KDD), started to depart from the statistics and machine learning ghettos and move into the mainstream almost 10 years ago.

Data mining is the process of discovery of useful information from large collections of data. It has common frontiers with several fields including Data Base Man- agement (DBM), Artificial Intelligence (AI), Machine Learning (ML), Pattern Recognition (PR), and Data Visualisation (DV).

The researchers of the Institute of Informatics of the EAU have investigated the possibilities of using some new DM methods and also have some experience in im- plementing algorithms used in DM packages (Bayesian statistical methods (Põldaru and Roots 2001b, 2001c, 2003b), neural networks (Põldaru and Roots 2002a, 2003a), principal components method (Põldaru and Roots 2001a), decision trees and rules (e.g. CART – classification

(2)

and regression trees) (Põldaru et al. 2003a, 2003d), association rules discovery (Põldaru et al. 2003b), fuzzy regression methods (Põldaru et al. 2004b) and support vector machine regression (Põldaru et al. 2004a, 2004b)). At Estonian Agricultural University, some experience has been gained in teaching the new data analysis procedures (principal component analysis, Bayesian methods, neural networks) for postgraduate students (Põldaru and Roots 2001b, 2003b, Põldaru et al. 2003c, Põldaru et al. 2004a).

The results of the research are published in many papers and conference theses (see references by Põldaru, Roots, Ruus and Viira).

Considering the reasons outlined above, there is a con- tinuing need for the application of more accurate and in- formative estimation techniques to econometric analysis.

The objective of this study is estimation of parameters of econometric model of average total milk cost and analysis of the results. The paper provides an overview about the support vector machines regression (SVMR), describes the potential implementation of it in rural areas and discusses the implementation of this method for analysing the dairy sector in Estonia (estimating econometric model of average total milk cost). The data used is an unbalanced panel of milk producers drawn from the FADN (Farm Accountancy Data Network) database of Estonian milk producers. The parameters are estimated on the basis of alternative models of SVM regression using non-linear model. The results are compared mutually and with results of ordinary linear regression. For model (parameter) estimation the SVM module of Programming Environment R is used. R is an inte-

grated suite of software facilities for data manipulation, calculation and graphical display.

Methods of investigation

Support vector machines have been successfully applied to a number of applications ranging from particle identification, face identification and text categorisation to engine knock detection, bioinformatics and database marketing. The approach is systematic and properly mo- tivated by statistical learning theory (Vapnik 1998).

Training (model parameter estimation) involves optimisa- tion of a convex cost function: there is no false local minimum to complicate the estimation process. The approach has many other benefits, for example, the model constructed has an explicit dependence on the most in- formative patterns in the data (the support vectors), hence interpretation is straightforward and data cleaning could be implemented to improve performance. SVMs are the best known from the class of algorithms, which use the idea of kernel substitution and which we will broadly re- fer to as kernel methods.

Suppose beeing given statistical data {(x1, y1), . . . , (xn, yn)}. The goal in SVM regression is to find the function f(x) that has at most ε deviation from the actually obtained targets yi for all the (training) data, and at the same time, is as flat as possible. SVM regression uses the ε -insensitive loss function shown in Figure 1. If the deviation between the actual and predicted value is less than ε , then the regression function is not considered to be in error.

Figure 1. A piecewise linear ε -insensitive loss function and plot of f

( )

x =a⋅x+b versus y with ε -insensitive tube. Points outside tube are errors

Thus, mathematically it looks like −ε≤a⋅x_i+b−y_i≤ε. Geometrically, this can be visualized as a band or tube of size 2ε around the hypothesis function f(x) and any points outside this tube can be viewed as errors (Figure 1). All training (data) points (xi, yi) for which f

( )

x_i −y_i ≥ε are known as support vectors; it is only these points that de- termine the parameters of f(x). In other words, errors are not considered as long as they are less than ε, but any deviation larger than this will not be accepted. To begin the

case of simple linear functions f(x), taking the following form is described:

( )

x a x b

f = ⋅ + (1).

For estimating the parameters of model (1) this prob- lem can be written as a convex optimization problem (Vapnik 1998):

minimise

_∑ ( )

=

+

⋅

+ ⁿ

i ξi ξi

C a

1 2 *

0 Loss

- ε ε

0 5 10 15

ε ε

(3)

subject to

⎪⎩

⎪⎨

⎧

≥

+

≤

− +

⋅

+

≤

−

⋅

− 0 , ^*

* i

i

i i i

i i

i

ξ ξ

ξ ε y b x a

ξ ε b x a y

(2),

where ξ_i; ξ_i^* are slack variables to cope with otherwise in- feasible constraints of the optimization problem (2). The constant C > 0 is specified beforehand. C is a regulariza- tion parameter that controls the trade-off between the flatness of f(x) and minimizing the training error. If C is too small then insufficient stress will be placed on fitting the training data. If C is too large then the algorithm will overfit the training data. The formulation above corresponds to dealing with a so called ε-insensitive loss function ξ_ε described by

⎩⎨

⎧

−

= ≤

otherwise ε

ξ

ε ξ ξ_ε if

....

0 (3).

It turns out that the optimization problem (3) can be solved more easily in its dual formulation. Hence a standard dual method utilizing Lagrange multipliers will be used. In the case of the Lagrangian dual (supporting) op- timisation problem needs to be optimised:

maximise

( ) ( )

∑ ∑

= =

−

⋅ + +

−

⎪⎩ −

⎪⎨

⎧− − ⋅ − ⋅ ⋅

n i

i i

i i n i

n

j i i j j i j

α α y α

α ε

x x α α α α

1 1

*

* 1 1

*

2 1

subject to

( )

⎪⎩

⎪⎨

⎧

≤

=

∑

−

=

C α α

α α

i i n

i i i

* 1

*

, 0

0 (4),

where αi and αi* are Lagrangian multiplier.

The value of regression parameter a and predicted value f(x) can be calculated as follows:

( )

∑

=

⋅

−

= ⁿ

i αi αi xi

a

1

* (5)

and

( ) _∑ ( )

=

+

⋅

−

= ⁿ

i αi αi xi x b

x f

1

* (6).

This is the so-called support vector expansion, i.e.

the regression coefficient a can be completely described as a linear combination of the training patterns xi. In a sense, the complexity of a function's representation by SVs is independent of the dimensionality of the input space X, and depends only on the number of SVs. More- over, the complete algorithm can be described in terms of dot products between the data. Even when evaluating f(x)

explicit computing of a is not needed (although this may be computationally more efficient in the linear setting).

These observations will come handy for the formulation of a non-linear extension.

The next step is to make the SVM algorithm non- linear. This, for instance, could be achieved by simply preprocessing the training patterns xi by a map into some feature space F, as described in (Vapnik 1998) and then applying the standard SVM regression algorithm.

Firstly, a mapping must be defined from the space X of regressors to the possibly infinite dimensional hypothesis space H, in which an inner product < , > is defined. This map is formally described as

Η Χ →

Φ: or x_aΦ

( )

x (7).

The choice of regression function f(x) is limited to the class of functions which can be expressed as inner products in H, taken between some weight vector a and the mapped regressor Φ

( )

x :

( )

x aΦ

( )

x b

f = , + (8).

The regression function in the hypothesis space is consequently linear, and thus the non-linear regression problem of estimating f

( )

x has become a linear regression problem in the hypothesis space H. Note that the mapping Φ need never be computed explicitly; in-

( )

^⋅

stead, the fact that if H is the reproducing kernel Hilbert space induced by k

( )

⋅,⋅ , then writing Φ

( ) ( )

x =k x,⋅ is used. This gives

( )

xi Φ

( ) (

xj k xi xj

)

Φ , = , (9).

The latter requirement is met for kernels fulfilling the Mercer conditions (Vapnik 1998). These conditions are satisfied for a wide range of kernels, including Gaussian radial basis function (RBF)

(

xi^,xj

)

^exp

{

γ

(

xi xj

)

²

}

k = − ⋅ − (10),

and polynomial function

(

xi xj

) (

γ xi xj g

)

^d

k , = − ⋅ ⋅ + (11).

It is emphasised that the feature space need never be defined explicitly, since only the kernel is used in SVM regression algorithms. Indeed, it is possible for multiple feature spaces to be included by a single kernel.

Consequently, this allows to rewrite the SV algorithm (formulas (4)…(6)) as follows:

(4)

maximise

( ) ( ) ( )

( ) _∑ ( )

∑

∑ ∑

=

= =

−

⋅ + +

⋅

−

⎪⎩ −

⎪⎨

⎧− − ⋅ − ⋅

n

i i i i

n

i i i

n i

n

j i i j j i j

α α y α

α ε

x x k α α α α

1

* 1

* 1 1

*

* ,

2 1

subject to

( )

⎪⎩

⎪⎨

⎧

≤

=

∑

−

=

C α α

α α

i i n

i i i

1 *

*

, 0

0 (12),

and regression parameter a and predicted valuef

( )

x can be calculated as follows:

( ) ^{( )}

∑

=

⋅

−

= ⁿ

i αi αi Φ xi

a

1

* (13)

( ) ∑ ( ) ⁽ ⁾

=

+

⋅

−

= ⁿ

i αi αi k xi x b

x f

1

* , (14).

The difference to the linear case is that a is no longer explicitly given. However due to the theorem of Fischer-Riesz (see e.g. (Riesz and Nagy, 1955)) it is al- ready uniquely defined in the weak sense by the dot products a,Φ

( )

x . Also note that in the non-linear setting, the optimization problem corresponds to finding the flattest function in feature space, not in input space.

Results and discussion

Next the potential implementation of SVM regression in rural areas is considered and the implementation of this method for estimating an econometric model of

the average total milk cost (average total milk cost per kg output) is discussed.

The econometric model is defined by

7 7 6 6 5 5 4 5

3 3 2 2 1 1

0x b x b x b x

bb b x b x b x

y ⋅ + ⋅ + ⋅ + ⋅

+= + ⋅ + ⋅ + ⋅ +

(15),

where

y represents the average total milk cost per unit of output (Estonian kroons per kg output),

x1 represents the average milk yield per cow (kg), x2 represents the farm total labor input (hours per hectare),

x3 represents the manufactured (purchased) milk price (Estonian kroons per 100 kg milk),

x4 represents the total labor input per 100 kg of milk (hours),

x5 represents the wage per hour (Estonian kroons), x6 represents the total costs of feed per cow (Esto- nian kroons),

x7 represents the invested capital per hectare (Esto- nian kroons).

The data is an unbalanced panel of milk producers drawn from the FADN (Farm Accountancy Data Net- work) database of Estonian milk producers. Some previous studies (Põldaru et al. 2004b, Viira 2003) have also based on the FADN database. The characteristics of the data are reported in Table 1.

Table 1. Data summary statistics

Statiatic Y x₁ x₂ x₃ x₄ x₅ x₆ x₇

Mean 2.11 5128 127.7 264.7 5.1 10.4 6889.5 16184.2

Median 1.99 5013.8 87.2 274.6 4.7 8.1 6122 10189.1

Standard Deviation 0.68 1154.6 115.7 53.8 2.6 9.3 3486.6 17180.8

Min 0.81 2447.3 13.2 95.3 0.7 6.0 613.8 425.2

Max 3.99 9706.7 769.8 409.4 16 44.4 22243.7 115657.9

The total number of observations n = 436.

In the case of the average total milk cost the non-linear functions are the most acceptable and must exhibit characteristics stated in the law of diminishing returns. According to the law of diminishing returns when one or more variable inputs are added to one or more fixed inputs the extra production obtained will, after a point, decline.

For linear and non-linear model (parameter) estimation the SVM module (Meyer 2003) of Programming En- vironment R (Venables et al. 2003) is used. R is an inte- grated suite of software facilities for data manipulation, calculation and graphical display.

When using SVM module for any given task, it is always necessary to specify a set of parameters (the parameters must be chosen in advance). Normally the archi- tecture of the SVM is specified in advance and weights and biases are estimated by supervised learning. These parameters include such indexes as whether one is inter- ested in regression estimation or pattern recognition, what kernel is used, what scaling is to be done on the data, etc.

Previous studies (Põldaru et al.. 2004a, 2004b) show that the most influential parameters are kernel type, parameter gamma and parameter epsilon. The some non- linear SVM models are sensitive to “overfitting” and

(5)

polynomial kernel function gave more acceptable results.

Previous experience has been considered for selecting parameter sets in Table 2.

The summary of (given) parameter set for various vari- ants of the non-linear model is reported in Table 2. Two other parameters (parameter g and d in formula (11)) have constant values for all alternatives, whereas g = 5, and d = 3.

The SVM models in Table 2 are compared with the results of neural network models (varnn1 and varnn2) and

ordinary linear regression (OLS). In the case of alternative “varnn1” there is one hidden node and in the case of alternative “varnn2” – two hidden nodes.

The study shows that the parameter sets in Table 2 give the most acceptable results.

Table 2 presents also result summaries of the results of various model alternatives. Summary characteristics for various alternatives are number of support vectors and coefficient of determination R².

Table 2. Parameter set for various alternatives and summary characteristics of the models Specified set of parameters Summary characteristics Variant

Kernel type Epsilon Gamma Cost C Number of SV R²

var1 Polynomial 0.5 0.08 1 61 0.886

var2 Polynomial 0.4 0.08 1 79 0.900

var3 Polynomial 0.3 0.08 1 122 0.909

var4 Radial 0.3 0.40 1 184 0.921

var5 Linear 0.3 0.14 1 188 0.806

varnn1 x x x x x 0.818

varnn2 x x x x x 0.848

OLS x x x x x 0.804

* parameters epsilon, gamma and C are specified for standardised data

Further the summary characteristics in Table 2 are discussed. For different alternatives the number of support vectors are different. The number of support vector depends mainly on the value of parameter epsilon (ε).

The SVM models have, in general, offered greater accu- racy than have their statistical forebears (OLS and neural network models). The values of the coefficient of determination R²are higher than in linear model case (OLS and var4 in Table 2). The minimal value in Table 2 (0.886) for non-linear SVM models is higher than maximal value for linear models (0.806). Consequently, the non-linear SVM regression models work well.

For econometric models the values of the coefficient of determination R²are not the only characteristic for estimating the models generalizing capacity. Next exhibiting

of the characteristics, stated in the law of diminishing returns, by selected models is analysed. That may be done by calculating the rules for partial derivatives of the models.

Next the partial derivatives for average milk yield per cow (x1) for considered alternatives are computed and analyzed. The values of partial derivatives are computed from predicted values numerically. The graph of derivatives with respect to independent variables for average milk yield per cow (x1) is shown in Figure 2. The dot line on the graphs presents OLS regression coefficient.

From the economic point of view the derivative for average milk yield per cow, as for a production factor or resource, must be negative (increasing the milk yield the milk cost decreases) and increase (increasing the milk yield the cost decrease diminish).

Figure 2. Graphs of partial derivatives with respect to value of independent variable for average milk yield per cow -1,2

-1 -0,8 -0,6 -0,4 -0,2 0 0,2 0,4

-3 -2 -1 0 1 2 3

Ave ra ge milk yie ld pe r cow Partial derivatives for average milk yield per cow

Var3 Varnn2

Varnn1 OLS

Var4

(6)

From Figure 2, it can be seen that when the Gaussian radial basis kernel is used (var4), the graph of the partial derivative is essentially non-linear (can not be founded from economic point of view), differ substantially from other graphs on Figure 2 and varies within [–0.73, 0.33]

(has positive values). Consequently, the case of “overfitting” is observed. Although, in this case, (of the alternative var4 ) the value of the coefficient of determination is maximal (0.921) that alternative is not acceptable from economic point of view. Therefore, for estimating the parameters of econometric model the Gaussian radial basis kernel can not be recommended.

From Figure 2, it can be seen that when the neural network models are used (varnn1 and varnn2), values of derivatives are negative and the graphs of the partial derivative are concave (can not be founded from economic point of view).

From the economic point of view the alternative var3 is the most acceptable one.

The study shows that the graphs of derivatives with respect to the other independent variables behave analo- gously. Consequently, the most acceptable kernel from economic point of view is polynomial kernel, and in the following discussion the potential implementation of alternative var1, var2 and var3 (based on polynomial kernel) is considered.

Next the partial derivative for the most essential independent variables is computed and analyzed: average milk yield per cow (x1), total labor input per 100 kg of milk (x4), the wage per hour (x5), and the total costs of feed per cow (x6) for considered alternatives.

The graph of derivatives with respect to independent variables for average milk yield per cow is shown in Figure 3.

Figure 3. Graphs of partial derivatives with respect to value of independent variable for average milk per cow

Analysis of the graphs may bring to the following conclusions:

• The graphs of partial derivatives are analogous (moderately non-linear and convex).

• The non-linear SVM regression models with lower value of epsilon are more flexible (the value of derivative varies more (see var3).

• From the economic point of view the values of partial derivative for average milk yield per cow, as for a production factor, are negative (increasing the milk yield the milk cost decreases) and increase (increasing the milk yield the cost decrease diminishes).

Consequently, every relation is acceptable and the considered alternatives can be recommended for practical use.

Next the partial derivative with respect to independ- ent variable for total labor input per 100 kg of milk (x4 ) is computed and analyzed.

The graph of the partial derivative with respect to the independent variables for total labor input per 100 kg of milk is shown in Figure 4.

The graph shows that the relation for alternative var2 and var3 is essentially non-linear. At the same time for alternatives var1 graph is moderately non-linear and decreasing (Figure 4). Consequently, all considered alternatives are acceptable from economic point of view.

Figure 5 shows the graphs of partial derivatives with respect to wage per hour (x5).

The last graph shows that all relations are moderately non-linear and convex. The graphs of partial derivatives for alternative var2 and var3 have a tendency to increase.

At the same time in the case of alternative var1 the graph is decreasing and, consequently, that variant is acceptable from economic point of view.

The graph of derivatives with respect to independent variables for total costs of feed per cow (x6) is shown in Figure 6.

-1 -0,8 -0,6 -0,4 -0,2 0

-3 -2 -1 0 1 2 3

Milk yield per cow Partial derivatives for Milk yield per cow

Var1

Var2 Var3

OLS

(7)

Figure 4. Graphs of partial derivatives with respect to value of independent variable for total labor inputs (hours per 100 kg milk pro- duced)

Figure 5. Graphs of partial derivatives with respect to value of independent variable for wage per hour (Estonian kroons per hour)

Figure 6. Graphs of partial derivatives with respect to value of independent variable for total cost of feed per cow (Estonian kroons)

Analysis of these graphs may bring to the following conclusions:

• The graphs of partial derivatives are analogous and moderately non-linear.

0 0,1 0,2 0,3 0,4 0,5

-3 -2 -1 0 1 2 3

Total labor inputs (hours per 100 kg of milk) Partial derivatives for total labor inputs

Var3

Var1

Var2

0,3 0,4 0,5 0,6 0,7 0,8 0,9

-3 -2 -1 0 1 2 3

Wage per hour Partial derivatives for wage per hour

Var2

Var3 Var1 OLS

0,7 0,8 0,9 1 1,1 1,2

-3 -2 -1 0 1 2 3

Total cost of feed per cow Partial derivatives for total cost of feed per cow

OLS Var1

Var3

Var2

(8)

• From the economic point of view the values of partial derivative for total costs of feed per cow, as for a production factor, are positive (increasing the total costs of feed per cow, the milk cost increases) and decrease (increasing the total costs of feed per cow the cost decrease diminish). Consequently, every relation is acceptable and the considered alternatives can be recommended for practical use.

Figures 3, 4, 5 and 6 and Table 2 show that the considered models are acceptable from economic point of view and the models have high generalising capacity (R² varies within [0.886, 0.909]. In comparison with the other methods, the SVM regression gives better results.

From considered alternative models the most suitable for practical use are the alternatives var1 and var2 when there is polynomial kernel, epsilon = 0.5 (va1) and epsilon = 0.4 (var2) and gamma = 0.08.

Conclusions

SVM regression provides a new approach to the problem of parameter estimation of linear and especially non-linear econometric models. In this paper a brief ex- position of SVM regression and their flexibility in han- dling economic data is given. Different SVM regression models are used for estimation of the econometric model of average total milk cost in Estonian farms. The results are compared mutually and with results of ordinary linear regression. The discussion may be summarised in the following conclusions:

1. Application of the SVM classification and regression in many fields of science and engineering (including econometrics) is rapidly increasing.

2. The SVM regression models may be used for estimating the parameters of linear and non-linear econometric models.

3. The SVM regression estimates and the least square estimates of econometric model of average total milk cost are similar, whereas the estimates for some independent variables are essentially equiva- lent.

4. Using of polynomial kernel function gives more acceptable results.

5. The non-linear SVM regression models are sensitive to “overfitting”.

6. The suitable parameter selection allows diminish the “overfitting” problem.

7. Programming Environment R can be used to find the model parameter values.

8. Model combination and Bayesian methods can partially overcome these methods, but require many models to be trained and are hence computationally expensive.

9. SVM regression, as a potential model estimation method, can replace neural networks to solve non- linear problems in econometric modelling.

This analysis has demonstrated that interesting new methods can be implemented for parameter estimation of econometric models. This paper is expected to encourage the use of SVM regression for econometric analysis.

References

1. Adriaans P. and D. Zantinge (2003). Data Mining. Pearson Education, Indian Branch, Delhi, India, 158 p.

2. Dunham M. H. (2003) Data Mining Introductory and Ad- vanced Topics. Pearson Education, Indian Branch, Delhi, India, 315 p.

3. Fayyad, U., G. Piatetsky-Shapiro, and P. Smyth. (1996).

From Data Mining to Knowledge Discovery in Databases (a survey), AI Magazine, 17(3): Fall, pp. 37-54.

4. Friedman J. H., (1997). Data Mining and Statistics: What's the Connection? Available at http://www- stat.stanford.edu/~jhf/ftp/dm-stats.ps (Nov 1997)

5. Meyer D. (2003). Support Vector Machines. The Interface to libsvm in package e1071. User Guide. Available at http://cran.r-project.org/src/contrib/e1071_1.3-15.tar.gz.

(December 10, 2003).

6. Põldaru R., J. Roots. (2001a). On the Implementation of the Principal Component regression for the Estimation of the Econometric Model of Grain Yield in Estonian Coun- ties. "Problems and Solutions for Rural Development" In- ternational Scientific Conference Reports (Poceedings).

Jelgava, pp. 340-345, Latvia University of Agriculture.

7. Põldaru R., J. Roots. (2001b). On the Implementation of the Bayesian Statistics in Agricultural Research and Edu- cation. Proceedings of the International Conference "Third Nordic-Baltic Agrometrics Conference". Jelgava, pp. 48- 53, Latvia University of Agriculture.

8. Põldaru R., J. Roots. (2001c). Bayesian Statistics (BUGS) in the Estimation of the Econometric Model of Grain Yield in Estonian Counties. Agriculture in Globalising World. Pro- ceedings (volume II) of International Scientific Conference on June 1-2, 2001 in Tartu, 64 Kreutzwaldi Street dedicated to the 50-th Anniversary of the Estonian Agricultural Uni- versity. Tartu, pp. 178-189, Estonian Agricultural Univer- sity.

9. Põldaru R., J. Roots. (2002a). The estimation of the Econometric Model of Grain Yield in Estonian Counties Using Neural Networks. - Theses of International Scien- tific Conference "Information Technologies in Agriculture:

Research and Development", 16-17 October 2002, Kaunas, pp. 14-18. Lithuanian University of Agriculture.

10. Põldaru R., J. Roots. (2002b). Changes in Using Statistical Methods. - Theses of International Scientific Conference

"Rural Development Strategies", 14 -15 November 2002, Kaunas, 2, pp. 46-47, Lithuanian University of Agricul- ture.

11. Põldaru R., Roots J. and Ruus R. (2003a). A Perspective of Using Data Mining in Rural Areas. Rural Development 2003. Globalization and Integration Challenges to the Ru- ral Areas of East and Central Europe. Proceedings of Inter-

(9)

national Scientific Conference, Kaunas, 2003. p. 256-257, Lithuanian University of Agriculture.

12. Põldaru R., Roots J., (2003a). The Estimation of the Econometric Model of Grain Yield in Estonian Counties Using Neural Networks. ”VAGOS”, Nr. 57 Mokslo Darbai 57 (10). Akademija, Kaunas, pp. 124-130, Lithuanian Uni- versity of Agriculture.

13. Põldaru R., Roots J., (2003b). Perspectives of teaching and training new data analysis procedures. Information Tech- nology for Better Agri-Food Sector, Environment and Ru- ral Living, Proceedings EFITA 2003, 4-th Conference of the European Federation for Information Technology in Agriculture, Food and Environment, 5-9 July, 2003, De- brecen-Budapest, Hungary, 2003. pp. 525-530, University of Debrecen.

14. Põldaru R., Roots J., Ruus R. (2003b). A Perspective of Using Data Mining (Association Rules) in Rural Areas.

Transactions of the Estonian Agricultural University, No 218, Perspectives of the Baltic States’ Agriculture under the CAP Reform, 19-20 September, 2003. Proceedings of International Scientific Conference, Tartu, pp. 184-199, Estonian Agricultural University.

15. Põldaru, R., Roots, J., Ruus R. (2003c). Implementation of Data Mining Methods in Agricultural Research and Educa- tion. In: Ulf Olsson and Jaak Sikk (Eds): Forth Nordic- Baltic Agrometrics Conference. Uppsala, Sweden, June 15-17, 2003. Conference Proceedings. Uppsala, SLU, De- partment of Biometry and Informatics, Report 81, pp. 109- 118, Swedish University of Agricultural Sciences.

16. Põldaru R., Roots J., Ruus R. (2003d). A Perspective of Using Data Mining in Rural Areas. ”VAGOS”, No. 61 Mokslo Darbai 61 (14). Akademija, Kaunas, pp. 133-141.

Lithuanian University of Agriculture.

17. Põldaru R., Roots J., Ruus R. (2004a) On the Implement- ing of New Teaching and Training Methods in Agricultural Education, In: M. Vlachopoulou, V. Manthou, L. Illiadis, S. Gertsis, and M. Salampasis (Eds): 2^nd HAICTA Interna- tional Conference on Information Systems & Innovative

Technologies in Agriculture, Food and Environment.

Thessaloniki, Greece, March 18-20, 2004. Conference Pro- ceedings – Volume I. Thessaloniki, Greece, pp. 127-136, Technological Education Institute of Thessaloniki.

18. Põldaru R., Roots J., Ruus R. (2004b). Using Fuzzy Re- gression in Rural Areas. Economic Science for Rural De- velopment – Possibilities of Increasing Competitiveness, Proceedings of the International Scientific Conference No 7. Jelgava, pp. 43-48, Latvia University of Agriculture.

19. Põldaru R., Jakobson R., Roosmaa T., Roots J., Ruus R., Viira A-H. (2004). Support Vector Machine Regression in Estimating Econometric Model Parameters. Information Technologies and Telecommunication for Rural Develop- ment, Proceeding of the International Scientific Con- ference Jelgava, Latvia, 6 – 7 May, 2004. Jelgava, pp. 66- 77, Latvia University of Agriculture.

20. Põldaru R., Roots J., Viira A.-H., (2004c) The Estimation of the Econometric Model of Milk Yield per Cow: A Sup- port Vector Machine Regression Approach. Operations Research 2004 International Conference, Tilburg Univer- sity, Netherlands, 1-3 September, Tilburg, (Submitted), Tilburg University.

21. Riesz F., Nagy B.S. (1955). Functional Analysis. Frederick Ungar Publishing Co.

22. Vapnik V. (1998) Statistical Learning Theory, Springer, N.Y.

23. Venables W. N., Smith D. M. and the R Development Core Team. (2003) An Introduction to R. Notes on R: A Pro- gramming Environment for Data Analysis and Graphics Version 1.8.1. Available at http://cran.at.r-project.org.

(2003-11-21).

24. Viira A.-H., (2003) The Problems Related to Usage of the FADN Data for Modelling the Milk Supply in Estonia.

Transactions of the Estonian Agricultural University, No 218, Perspectives of the Baltic States’ Agriculture under the CAP Reform, 19-20 September, 2003. Proceedings of International Scientific Conference, Tartu, pp. 267-275, Estonian Agricultural University.

ESTIMATING ECONOMETRIC MODEL OF AVERAGE TOTAL MILK COST: A SUPPORT VECTOR MACHINE REGRESSION APPROACH