3. Chapter 3: Efficiency Measurement, Methods, Estimation, Model Specification
3.7 Statistical Parametric Representation of the Production Possibility Set
3.7.1 Deterministic Frontier
Moving from transformation functions to frontiers, the data points could be enveloped using an arbitrarily chosen function (Coelli et al., 2005). Early economists assumed that all producers were efficient (i.e. production took place on the frontier), since perfect competition [71] implies a market free of inefficiency. If this were the case, estimation would be straightforward: simple regression analysis would suffice, because the residual would capture only random error (noise). However, the transformation process can and does diverge from the ideal hypothesis of perfect competition [72], so not all producers are able to achieve their potential (ideal) output. There are many cases in which perfect competition is not a workable hypothesis, so any divergence from it should be measured. Therefore, to portray the real world accurately, inefficiency must be accounted for. Under a deterministic frontier, all deviations from the frontier are attributed to technical inefficiency, since no allowance is made for measurement error or other sources of statistical noise.
Under the statistical approach, the production function can be represented by:
$$ q_k = f(x_{k1}, \dots, x_{kM})\, e^{-u_k} \tag{1} $$
where $q_k$ is the output of producer $k$ and $x_{km}$ is the amount of the $m$-th input ($m = 1, \dots, M$) used by producer $k$. The exponent $u_k \geq 0$ represents the inefficiency of producer $k$ (Lovell, 1993), and a specific distribution is assumed for $u_k$ (Førsund et al., 1980). Taking a log-linear (Cobb-Douglas, CD, technology) version of the equation gives:
$$ \ln q_k = \ln f(x_{k1}, \dots, x_{kM}) - u_k = \beta_0 + \sum_{m=1}^{M} \beta_m \ln x_{km} - u_k $$
Here $u_k \geq 0$ represents the inefficiency of producer $k$. The technical efficiency (TE) of producer $k$, $TE_k$, is the ratio of the actual (observed) output of producer $k$ to the maximum possible (ideal) output that it could achieve, as represented by the production frontier. Thus, technical efficiency [73] is measured by:
$$ TE_k = \frac{q_k}{f(x_{k1}, \dots, x_{kM})} = e^{-u_k} $$
$$ \ln q_k = \ln f(x_{k1}, \dots, x_{kM}) + \ln TE_k = \ln f(x_{k1}, \dots, x_{kM}) - u_k $$
[72] Many situations may prevent competition from being perfect:
i. Imperfect markets (monopolies, oligopolies, market power, or markets with excessive entry barriers)
ii. Information asymmetries (price information is not always available prior to production)
iii. Agency issues and misaligned incentives between owners and executives
[73] Here we use a production frontier framework. If cost is the centre of attention then, instead of TE, cost efficiency (CE) is calculated. Cost efficiency is the cost-function analogue of technical efficiency: it is the ratio of potential costs to observed costs, i.e.
$$ CE_k = \frac{c(q_k; w_k)}{C_k} $$
where $0 < CE_k \leq 1$, so that $\ln C_k = \ln c(q_k; w_k) - \ln CE_k = \ln c(q_k; w_k) + u_k$. Note that $u_k \geq 0$ reflects cost inefficiency, with $u_k = -\ln CE_k \approx 1 - CE_k$; larger values denote lower cost efficiency. Cost inefficiency is the percentage by which observed costs need to decrease for the DMU to attain 100 percent cost efficiency (i.e. to produce the observed output at minimum cost).
Note that $u_k$ represents technical inefficiency, so $u_k = -\ln TE_k \approx 1 - TE_k$ (the approximation holds for small $u_k$); it cannot be negative, and larger values denote lower technical efficiency. Technical inefficiency is the percentage by which observed output needs to increase for the DMU to become 100 percent technically efficient. Since $TE_k = \exp(-u_k)$ with $u_k \geq 0$, it also follows that $0 < TE_k \leq 1$.
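The relationship between the inefficiency term and technical efficiency can be illustrated with a minimal Python sketch. It evaluates TE = exp(-u) for a few inefficiency values and compares u with the approximation 1 - TE. The function name `technical_efficiency` and the values of u are hypothetical and chosen only for illustration.

```python
import math


def technical_efficiency(u):
    """Return TE = exp(-u) for a non-negative inefficiency term u."""
    if u < 0:
        raise ValueError("the inefficiency term u must be non-negative")
    return math.exp(-u)


# Hypothetical inefficiency values (not taken from any dataset).
for u in (0.0, 0.05, 0.10, 0.50):
    te = technical_efficiency(u)
    # 1 - TE is the first-order approximation of u = -ln(TE).
    print(f"u = {u:.2f}   TE = {te:.4f}   1 - TE = {1 - te:.4f}")
```

The gap between u and 1 - TE widens as u grows, so reading u as a percentage shortfall is reasonable only for producers that are fairly close to the frontier.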
A particular functional form is assumed for the production function in Equation (1). A variety of econometric techniques can be used to estimate inefficiency ($u_k$), including corrected ordinary least squares (COLS), modified ordinary least squares (MOLS), and maximum likelihood estimation (MLE) (Lovell, 1993). However, some caveats apply to parametric techniques. The models used may be misspecified, even though the estimates of the frontier parameters are testable. These methods also cannot handle multiple inputs together with multiple outputs, which is the case in the HE context. In addition, OLS estimates suffer from a displaced constant term (intercept); therefore, if we want to continue with regression models, we have no alternative but to "fix" the regression model (Greene, 2008). Two approaches have been suggested in the literature to bridge this gap in OLS. Both COLS and MOLS rest on the result that the OLS estimator of the slope parameters is consistent and unbiased, so the OLS residuals are pointwise consistent estimators of linear translations of the original $u_k$'s.
The first attempt to estimate a Cobb-Douglas (CD) production frontier using cross-section data on firms was made by Aigner and Chu (1968). [74] Later, Afriat (1972) assumed that the $u_k$'s were gamma-distributed random variables and applied MLE for estimation purposes. The main issue, then, is how to estimate $u_k$. The parameters are first estimated by simple OLS; the regression line is then shifted up [75] (in a production setting) until all residuals are non-positive (ensuring that the $u_k$ are non-negative) and at least one is zero. The function is "hung" on that observation, so that it envelops all observations. This is possible because the OLS slope estimates remain consistent when the residual is non-normal. This approach is referred to here as COLS.
$$ \beta_{COLS} = \beta^{*} + \max_k e_k $$
The COLS residuals are $e_{k,COLS} = e_k - \max_k e_k$, and technical inefficiency [76] $u_k$ for DMU $k$ is given by $\max_k e_k - e_k$. The logic of the estimator was first suggested by Winsten (1957), and much later the consistency of the COLS estimator was proved by Gabrielsen (1975) and Greene (1980a). A lengthy application with an extension to panel data [77] appears in Simar (1992). A couple of methodological problems have been identified; nevertheless, the method used to be a popular approach in the analysis of panel data (see Cornwell et al. (1990) and Evans et al. (2000a, 2000b)). It should be stressed
[74] They used linear and quadratic programming; the actual task was to minimise the sum of the $u_k = x_k'\beta - \ln q_k$ subject to $u_k \geq 0$.
[75] In a cost framework, the line is shifted down.
[76] Cost inefficiency for DMU $k$ ($u_k$) is given by $e_k - \min_k e_k$.
that no distribution is specified for the residual term, and the entire deviation from the frontier for a particular DMU is attributed to inefficiency (Johnes, 2004).
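The COLS steps described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation used in the study: it assumes a single-input log-linear (CD) technology so that OLS has a closed form, the helper names `ols_simple` and `cols` are hypothetical, and the input and output figures are made up.

```python
import math


def ols_simple(x, y):
    """OLS intercept and slope for y = b0 + b1 * x (single regressor)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x
    return b0, b1


def cols(ln_x, ln_q):
    """Corrected OLS: shift the OLS intercept up by the largest residual."""
    b0, b1 = ols_simple(ln_x, ln_q)
    resid = [q - (b0 + b1 * x) for x, q in zip(ln_x, ln_q)]
    shift = max(resid)                  # largest OLS residual
    b0_cols = b0 + shift                # corrected intercept
    u = [shift - e for e in resid]      # inefficiency: max(e) - e_k >= 0
    te = [math.exp(-ui) for ui in u]    # technical efficiency in (0, 1]
    return b0_cols, b1, u, te


# Hypothetical input and output levels for five DMUs.
inputs = [2.0, 4.0, 6.0, 8.0, 10.0]
outputs = [1.8, 2.9, 3.1, 4.6, 4.4]
ln_x = [math.log(v) for v in inputs]
ln_q = [math.log(v) for v in outputs]

b0_cols, b1, u, te = cols(ln_x, ln_q)
for k, (uk, tek) in enumerate(zip(u, te), start=1):
    print(f"DMU {k}: u = {uk:.4f}, TE = {tek:.4f}")
```

By construction exactly one DMU (the one with the largest OLS residual) obtains u = 0 and TE = 1, and every other deviation from the shifted line is counted as inefficiency.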
An alternative to COLS, namely modified OLS (MOLS), was introduced by Richmond (1974) (Lovell, 1993). Instead of shifting the regression line by the maximum (or, for a cost frontier, minimum) residual, it shifts the line by an amount derived from the model's residual sum of squares. The OLS residuals $e^{*}$ of the transformation function are, save for the constant displacement, pointwise consistent estimates of inefficiency, $u_k$ (Greene, 2008). Since the displacement is constant, [79] the variance of the residuals $e^{*}$ [78] is a consistent estimator of the variance of inefficiency ($u_k$). The variance of $e^{*}$ is known and is given by the model's residual sum of squares. We can therefore use this information to derive an estimate of $E(u_k)$, if we assume that $u_k$ follows a one-parameter distribution. [80] This is commonly a half-normal distribution, on the assumption that higher inefficiency is less likely than lower inefficiency, although an exponential distribution might alternatively be used (Lovell, 1993). The technical inefficiency [81] of DMU $k$ ($u_k$) is given by $\hat{E}(u_k) - e_k$, [82] where the regression line is shifted by $\hat{E}(u_k)$. MOLS is less severe than COLS, but it requires more restrictive assumptions about the distribution of the residuals, and these assumptions cannot be tested.
Thus, the parameters of the regression are identified using OLS, and an additional parameter, namely the mean of inefficiency $E(u_k)$, is estimated and identified through the variance of the residuals. The estimated frontier function can then be displaced upward by this estimate of $E(u_k)$. Apart from the known limitations of deterministic residuals, MOLS has the disadvantage that the production function is not necessarily shifted far enough to ensure that all observations lie on or below the frontier, so some residuals may have the wrong sign (Førsund et al., 1980; Lovell, 1993). In this sense MOLS is a little less orthodox than COLS, since it cannot guarantee a full set of non-positive residuals.
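The MOLS shift can be sketched along the same lines. The Python below is an illustrative sketch only. It assumes a single-input log-linear technology, the standard half-normal moments $E(u) = \sigma_u\sqrt{2/\pi}$ and $Var(u) = (1 - 2/\pi)\sigma_u^2$, the exponential moments $E(u) = \sigma_u$ and $Var(u) = \sigma_u^2$, and a residual variance computed with an n - 2 divisor. The helper names and the data are hypothetical.

```python
import math


def ols_simple(x, y):
    """OLS intercept and slope for y = b0 + b1 * x (single regressor)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    return mean_y - b1 * mean_x, b1


def mols(ln_x, ln_q, dist="half-normal"):
    """Modified OLS: shift the OLS intercept up by an estimate of E(u)."""
    b0, b1 = ols_simple(ln_x, ln_q)
    resid = [q - (b0 + b1 * x) for x, q in zip(ln_x, ln_q)]
    n = len(resid)
    # Residual variance; the n - 2 divisor is an assumption of this sketch.
    s2 = sum(e * e for e in resid) / (n - 2)
    if dist == "half-normal":
        sigma_u = math.sqrt(s2 / (1 - 2 / math.pi))
        mean_u = sigma_u * math.sqrt(2 / math.pi)
    elif dist == "exponential":
        mean_u = math.sqrt(s2)          # mean equals standard deviation
    else:
        raise ValueError("dist must be 'half-normal' or 'exponential'")
    b0_mols = b0 + mean_u               # displaced intercept
    u = [mean_u - e for e in resid]     # may be negative for some DMUs
    return b0_mols, b1, mean_u, u


# Hypothetical input and output levels for five DMUs.
ln_x = [math.log(v) for v in (2.0, 4.0, 6.0, 8.0, 10.0)]
ln_q = [math.log(v) for v in (1.8, 2.9, 3.1, 4.6, 4.4)]

b0_mols, b1, mean_u, u = mols(ln_x, ln_q)
print(f"estimated E(u) = {mean_u:.4f}")
print("DMUs above the shifted frontier:", [k for k, uk in enumerate(u, 1) if uk < 0])
```

Because the shift is an average rather than the largest residual, nothing prevents an observation from lying above the displaced line, which is the wrong-sign problem noted above.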
MLE production (or cost) frontiers differ significantly from those produced by the classical OLS regression approach, since the relationship underlying inputs and outputs is non-linear. This allows efficient observations (that is, those lying on the frontier) to differ in terms of technology from observations lying inside the frontier (Lovell, 1993). The logic underlying ML rests on the idea that a sample of observations is more likely to have been generated by some distributions than by others. The ML estimate of an unknown parameter is thus the value of the parameter that maximises the probability (likelihood) of randomly drawing the particular sample of observations (Coelli et al., 2005). Therefore,
[78] The mean of $e^{*}$ is zero by construction, so it carries no information.
[79] We assume that the shift from the average production (or cost) to the frontier is constant.
[80] A one-parameter distribution in this setting means that the expected value (mean) of the distribution depends only on the variance of the distribution.
[81] Cost inefficiency of DMU $k$ ($u_k$) is given by $\hat{E}(u_k) + e_k$.
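[82] Exponentially distributed inefficiency: $\hat{E}(u_k) = \hat{\sigma}_u$, where $\hat{\sigma}_u$ is estimated from the OLS residual sum of squares.
Half-normally distributed inefficiency: $\hat{E}(u_k) = \left(\sqrt{2}/\sqrt{\pi}\right)\hat{\sigma}_u$.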
utilising panel data in this context gives rise to what has been stressed by Cornwell and Schmidt (1996): "repeated observation of the same firm makes it possible to estimate its level of efficiency more precisely".
The joint probability density function (PDF), known as the likelihood function, for a vector of observations $q = (q_1, q_2, \dots, q_K)'$ is:
$$ L(q \mid \beta, \sigma^2) = (2\pi\sigma^2)^{-K/2} \exp\left\{ -\frac{1}{2\sigma^2} \sum_{k=1}^{K} (q_k - x_k'\beta)^2 \right\} $$
This is the likelihood of observing the sample observations as a function of the unknown parameters $\beta$ and $\sigma^2$.
In log form, we maximise the log-likelihood function with respect to $\beta$:
$$ \ln L = -\frac{K}{2}\ln(2\pi) - \frac{K}{2}\ln(\sigma^2) - \frac{1}{2\sigma^2} \sum_{k=1}^{K} (q_k - x_k'\beta)^2 $$
When a linear regression model is used with normally distributed errors, $v_k \sim iid\,N(0, \sigma^2)$, the ML estimate is equivalent to the OLS estimate.
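The equivalence between ML and OLS under normal errors can be checked numerically. The sketch below is a minimal illustration, assuming a single regressor and made-up data: it evaluates the normal log-likelihood at the OLS coefficients, with the variance set to the residual sum of squares divided by K, and confirms that nearby parameter values give a lower log-likelihood. The function names are hypothetical.

```python
import math


def ols_simple(x, y):
    """OLS intercept and slope for y = b0 + b1 * x (single regressor)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    return mean_y - b1 * mean_x, b1


def log_likelihood(b0, b1, sigma2, x, q):
    """Normal log-likelihood of a linear model with error variance sigma2."""
    K = len(q)
    ssr = sum((qk - (b0 + b1 * xk)) ** 2 for xk, qk in zip(x, q))
    return (-K / 2 * math.log(2 * math.pi)
            - K / 2 * math.log(sigma2)
            - ssr / (2 * sigma2))


# Hypothetical data in logs for five DMUs.
ln_x = [math.log(v) for v in (2.0, 4.0, 6.0, 8.0, 10.0)]
ln_q = [math.log(v) for v in (1.8, 2.9, 3.1, 4.6, 4.4)]

b0, b1 = ols_simple(ln_x, ln_q)
ssr = sum((q - (b0 + b1 * x)) ** 2 for x, q in zip(ln_x, ln_q))
sigma2_ml = ssr / len(ln_q)     # ML variance estimate: SSR / K

ll_hat = log_likelihood(b0, b1, sigma2_ml, ln_x, ln_q)
print(f"log-likelihood at the OLS estimates: {ll_hat:.4f}")
print("perturbed intercept:", log_likelihood(b0 + 0.05, b1, sigma2_ml, ln_x, ln_q))
print("perturbed slope:", log_likelihood(b0, b1 - 0.05, sigma2_ml, ln_x, ln_q))
```

The coefficient estimates coincide because, for any fixed variance, maximising the log-likelihood is the same as minimising the sum of squared residuals. Only the variance estimate differs from the usual unbiased OLS one, through its divisor.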
All these methods share the disadvantage of statistical parametric deterministic models: they do not take into account the stochastic element of the transformation process. COLS and MOLS have their place, since they are simple to estimate and rest on solid theoretical foundations even in small datasets; it should be stressed, however, that they produce a ranking of producers identical to that of OLS. In addition, the methods may be unsuitable for applications with multiple outputs as well as multiple inputs, despite their robustness in providing efficiency estimates under modest measurement error (Johnes, 2004). An informative comparison of these three deterministic methods can be found in Lovell (1993). [83]
The frontier functions specified above, labelled deterministic frontier functions, assume that the econometric model is perfectly specified and the data are free of error. Thus, any deviation of an observation from the theoretical maximum is attributed solely to the inefficiency of the DMU. Because the methods discussed contain no stochastic element, [84] there is a need for a frontier specification in which the maximum output that a producer can obtain is determined both by the production function and by random events. This motivates recasting the models as what the literature widely labels stochastic frontier production models.
[83] See Appendix 8 of Chapter 3 for the graphical illustration.