
3. Chapter 3: Efficiency Measurement, Methods, Estimation, Model Specification

3.7 Statistical Parametric Representation of the Production Possibility Set

3.7.1 Deterministic Frontier

Moving from transformation functions to frontiers, data points could be enveloped using an arbitrarily chosen function (Coelli et al., 2005). Early economists assumed that all producers were efficient (i.e. production happened on the frontier), and perfect competition⁷¹ implies a market free of inefficiency. If this were the case, estimation would be straightforward using simple regression analysis, since the residual would capture only random error (noise). However, the transformation process can and does diverge from the ideal hypothesis of perfect competition,⁷² and so not all producers are able to achieve potential (ideal) output. There are many cases in which perfect competition is not a tenable hypothesis, so any divergences should be measured. Therefore, if we want to portray the real world accurately, there is a need to account for inefficiencies. With a deterministic frontier, all deviations from the frontier are attributed to technical inefficiency, since no allowance is made for measurement errors and other sources of statistical noise.

Under the statistical approach, the production function can be represented by:

$$q_k = f(x_{k1}, \dots, x_{kN})\, e^{-u_k} \qquad (1)$$

where $q_k$ is the output of producer $k$ and $x_{ki}$ is the amount of the $i$-th input ($i = 1, \dots, N$) used by producer $k$. The term $u_k \ge 0$ in the exponent represents the inefficiency of producer $k$ (Lovell, 1993), and a specific distribution is assumed for $u_k$ (Førsund et al., 1980). Taking a log-linear version (Cobb-Douglas, CD, technology) of the equation gives:

$$\ln q_k = \ln f(x_{k1}, \dots, x_{kN}) - u_k = \beta_0 + \sum_{i=1}^{N} \beta_i \ln x_{ki} - u_k$$
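The step from Equation (1) to the log-linear form can be spelled out as follows. This is a sketch that assumes $f$ takes the usual Cobb-Douglas product form with parameters $\beta_0, \beta_1, \dots, \beta_N$; the explicit product is written here for illustration only.

```latex
\begin{align*}
f(x_{k1}, \dots, x_{kN}) &= e^{\beta_0} \prod_{i=1}^{N} x_{ki}^{\beta_i}, \\
q_k &= e^{\beta_0} \Big( \prod_{i=1}^{N} x_{ki}^{\beta_i} \Big) e^{-u_k}, \\
\ln q_k &= \beta_0 + \sum_{i=1}^{N} \beta_i \ln x_{ki} - u_k .
\end{align*}
```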

Here $u_k \ge 0$ represents the inefficiency of producer $k$. The technical efficiency (TE) of producer $k$, $TE_k$, is the ratio of the actual (observed) output of producer $k$ to the maximum possible (ideal) output that it could achieve, as represented by the production frontier. Thus, technical efficiency⁷³ is measured by:

$$TE_k = \frac{q_k}{f(x_{k1}, \dots, x_{kN})} = e^{-u_k}$$

$$\ln q_k = \ln f(x_{k1}, \dots, x_{kN}) + \ln TE_k = \ln f(x_{k1}, \dots, x_{kN}) - u_k$$

⁷² Many situations may prevent competition from being perfect:

i. Imperfect markets (monopolies, oligopolies, market power, or markets with excessive entry barriers)

ii. Information asymmetries (price information is not always available prior to production)

iii. Agency issues and misaligned incentives between owners and executives

⁷³ Here we use a production frontier framework. If cost is the centre of attention then, instead of TE, cost efficiency (CE) is calculated. Applied to cost functions, this type of efficiency is a similar notion to technical efficiency. Cost efficiency is the ratio of potential costs to observed costs, i.e.

$$CE_k = \frac{f(q_k; w_k)}{C_k}$$

where $0 < CE_k \le 1$, so that $\ln C_k = \ln f(q_k; w_k) - \ln CE_k = \ln f(q_k; w_k) + u_k$. Note that $u_k \ge 0$ reflects cost inefficiency, so $u_k = -\ln CE_k \approx 1 - CE_k$; larger values denote lower cost efficiency. Cost inefficiency is the percentage by which observed costs need to decrease in order for the DMU to attain 100 percent cost efficiency (produce the observed output at minimum cost).


Note that $u_k$ represents technical inefficiency, so $u_k = -\ln TE_k \approx 1 - TE_k$ (the approximation is close when $u_k$ is small); it cannot be negative, and larger values denote lower technical efficiency. Technical inefficiency is the percentage by which observed output needs to increase in order for the DMU to become 100 percent technically efficient. Since $TE_k = e^{-u_k} = \exp(-u_k)$ with $u_k \ge 0$, it follows that $0 < TE_k \le 1$.
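As a quick numerical check of the relation $TE_k = \exp(-u_k)$ and of the approximation $u_k \approx 1 - TE_k$, the following minimal Python sketch may help. The inefficiency values and the function name `technical_efficiency` are hypothetical and chosen purely for illustration.

```python
import math

# Hypothetical inefficiency values u_k >= 0 (illustrative, not taken from any dataset).
u_values = [0.0, 0.05, 0.10, 0.50]


def technical_efficiency(u: float) -> float:
    """Return TE_k = exp(-u_k); u_k must be non-negative so that 0 < TE_k <= 1."""
    if u < 0:
        raise ValueError("u_k must be non-negative")
    return math.exp(-u)


for u in u_values:
    te = technical_efficiency(u)
    # 1 - TE tracks u closely only when u is small (first-order expansion of exp).
    print(f"u = {u:.2f}   TE = {te:.4f}   1 - TE = {1 - te:.4f}")
```

For small $u_k$ the two columns nearly coincide, while for larger values $1 - TE_k$ understates $u_k$.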

A particular functional form is assumed for the production function in Equation (1). A variety of econometric techniques can be used to estimate inefficiency ($u_k$), including corrected ordinary least squares (COLS), modified ordinary least squares (MOLS), and maximum likelihood estimation (MLE) (Lovell, 1993). However, some caveats apply to parametric techniques: the models used may be misspecified, even though the estimated frontier parameters can be tested. Also, these methods cannot handle a situation with multiple inputs and multiple outputs, which is the case in the higher education (HE) context. In addition, OLS estimates suffer from a displaced constant term (intercept); therefore, if we want to continue with regression models, we have no alternative but to ‘fix’ the regression model (Greene, 2008). Two approaches have been suggested in the literature to bridge this gap in OLS. Both COLS and MOLS rest on the result that the OLS estimator of the slope parameters is consistent and unbiased, so the OLS residuals are pointwise consistent estimators of linear translations of the original $u_k$s.

The first attempt to estimate a Cobb-Douglas (CD) production frontier using cross-section data on firms was made by Aigner and Chu (1968).⁷⁴ Later, Afriat (1972) assumed that the $u_k$s were gamma-distributed random variables and applied MLE for estimation purposes. Hence, the main issue here is how to estimate $u_k$. Using a simple OLS setting to estimate the parameters, the regression line is shifted up⁷⁵ (for production) until all residuals are non-positive (ensuring that the $u_k$ are non-negative) and at least one is zero; the function ‘hangs’ on that observation, so that it envelops all observations. This is possible because the OLS slope estimates remain consistent even when the residual is non-normal. This approach is referred to here as COLS.

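$$\beta_{COLS} = \beta^{*} + \max_k e_k$$

where $\beta^{*}$ is the OLS estimate of the intercept and the $e_k$ are the OLS residuals.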

The COLS residuals are $e_{k,COLS} = e_k - \max_k e_k$, and technical inefficiency⁷⁶ $u_j$ for the $j$-th DMU is given by $\max_k(e_k) - e_j$. The logic of the estimator was first suggested by Winsten (1957), and much later the consistency of the COLS estimator was proved by Gabrielsen (1975) and Greene (1980a). A lengthy application with an extension to panel data⁷⁷ appears in Simar (1992). A couple of methodological problems have been identified; however, the method used to be a popular approach in the analysis of panel data (see Cornwell et al. (1990) and Evans et al. (2000a, 2000b)). It should be stressed that no distribution is specified for the residual term, and the entire deviation from the frontier for a particular DMU is attributed to inefficiency (Johnes, 2004).

⁷⁴ They used linear and quadratic programming, and the actual task was to minimise the sum of $u_k = x_k'\beta - \ln q_k$ subject to $u_k \ge 0$.

⁷⁵ In a cost framework it is shifted down.

⁷⁶ Cost inefficiency for DMU $j$ ($u_j$) is given by $e_j - \min_k e_k$.
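The COLS procedure described above (fit OLS, shift the intercept by the largest residual, read inefficiency off the shifted residuals) can be sketched in a few lines of Python. This is a minimal illustration on simulated data, not the estimation code used in this study: the data-generating values, the random seed, and the function name `cols` are all assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated data (assumption): K producers, N inputs, Cobb-Douglas technology in logs.
K, N = 50, 2
ln_x = rng.uniform(0.0, 2.0, size=(K, N))
beta_true = np.array([1.0, 0.4, 0.5])          # intercept followed by slopes
u_true = rng.exponential(scale=0.2, size=K)    # one-sided inefficiency, u_k >= 0
ln_q = beta_true[0] + ln_x @ beta_true[1:] - u_true


def cols(ln_q, ln_x):
    """Corrected OLS: fit OLS, then shift the intercept up by the largest residual."""
    X = np.column_stack([np.ones(len(ln_q)), ln_x])
    beta_ols, *_ = np.linalg.lstsq(X, ln_q, rcond=None)
    e = ln_q - X @ beta_ols          # OLS residuals e_k
    beta_cols = beta_ols.copy()
    beta_cols[0] += e.max()          # beta_COLS = beta* + max_k e_k
    u_hat = e.max() - e              # u_j = max_k(e_k) - e_j, non-negative by construction
    te_hat = np.exp(-u_hat)          # TE_j = exp(-u_j)
    return beta_cols, u_hat, te_hat


beta_cols, u_hat, te_hat = cols(ln_q, ln_x)
print("COLS intercept and slopes:", beta_cols.round(3))
print("producer on the frontier:", int(te_hat.argmax()))
```

By construction exactly one producer (or several, in the case of ties) has $u_j = 0$ and so lies on the estimated frontier; every other producer lies below it.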

An alternative to COLS, namely modified OLS (MOLS), was introduced by Richmond (1974) (Lovell, 1993). Instead of shifting the regression line by the maximum (or, for a cost frontier, minimum) residual, it shifts the line by an amount derived from the model’s residual sum of squares. The OLS residuals $e^{*}$ of the transformation function are, save for the constant displacement, pointwise consistent estimates of inefficiency $u_k$ (Greene, 2008). The variance of the residuals $e^{*}$,⁷⁸ since the displacement is constant,⁷⁹ is a consistent estimator of the variance of inefficiency ($u_k$). The variance of $e^{*}$ is known and is obtained from the model’s residual sum of squares. We can therefore use this information to derive an estimate of $E(u_k)$, if we assume that $u_k$ follows a one-parameter distribution.⁸⁰ This is commonly a half-normal distribution, if we assume that higher inefficiency is less likely than lower inefficiency, although an exponential distribution might alternatively be used (Lovell, 1993). The technical inefficiency⁸¹ of DMU $j$ ($u_j$) is given by $\hat{E}(u_k) - e_j$,⁸² where the regression line is shifted by $\hat{E}(u_k)$. MOLS is less severe than COLS, but it requires more restrictive assumptions about the distribution of the residuals, and these cannot be tested.

Thus, the parameters of the regression are identified using OLS, and an additional parameter, namely the mean of the inefficiency $E(u_k)$, is estimated and identified through the variance of the residuals. The estimated frontier function can then be displaced upward by this estimate of $E(u_k)$. Apart from the known limitations of deterministic residuals, MOLS has the disadvantage that the production function is not necessarily shifted far enough to ensure that all observations lie on or below the frontier, and so some residuals may have the wrong sign (Førsund et al., 1980; Lovell, 1993). In this sense MOLS is a little less orthodox than COLS, since it cannot guarantee a full set of non-positive residuals.
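A minimal Python sketch of MOLS under the exponential assumption is given here for illustration. It uses the exponential-case formula in which the estimated mean inefficiency equals the residual standard deviation, computed as the square root of the residual sum of squares divided by the number of observations minus the number of estimated coefficients. The simulated data, the seed, and the function name `mols_exponential` are assumptions made for the example, not part of the original analysis.

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulated data (assumption): exponentially distributed inefficiency.
n_obs, n_inputs = 200, 2
ln_x = rng.uniform(0.0, 2.0, size=(n_obs, n_inputs))
u_true = rng.exponential(scale=0.3, size=n_obs)
ln_q = 1.0 + ln_x @ np.array([0.4, 0.5]) - u_true


def mols_exponential(ln_q, ln_x):
    """Modified OLS assuming u_k is exponential, so that E(u_k) = sigma_u."""
    X = np.column_stack([np.ones(len(ln_q)), ln_x])
    beta_ols, *_ = np.linalg.lstsq(X, ln_q, rcond=None)
    e = ln_q - X @ beta_ols                    # OLS residuals
    n, n_coef = X.shape                        # n_coef = number of inputs + 1
    sigma_u = np.sqrt((e @ e) / (n - n_coef))  # sqrt(RSS / degrees of freedom)
    mean_u = sigma_u                           # exponential case: mean equals std dev
    beta_mols = beta_ols.copy()
    beta_mols[0] += mean_u                     # shift the intercept by the estimate of E(u_k)
    u_hat = mean_u - e                         # u_j = E_hat(u_k) - e_j
    return beta_mols, u_hat, mean_u


beta_mols, u_hat, mean_u = mols_exponential(ln_q, ln_x)
print("estimated E(u):", round(float(mean_u), 3))
print("wrong-signed inefficiency estimates:", int((u_hat < 0).sum()))
```

The last line counts the observations that end up above the shifted frontier, which is the wrong-sign problem noted above; whether any occur depends on the sample.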

MLE production (or cost) frontiers differ significantly from those produced by the classical OLS regression approach, since the relationship underlying inputs and outputs is non-linear. This allows efficient observations (that is, those lying on the frontier) to differ in terms of technology from observations lying inside the frontier (Lovell, 1993). The computational logic underlying ML rests on the idea that a sample of observations is more likely to have been generated from some distributions than from others. So, the ML estimate of an unknown parameter is the value of the parameter that maximises the probability (likelihood) of randomly drawing a particular sample of observations (Coelli et al., 2005). Therefore, utilising panel data in this context gives rise to what has been stressed by Cornwell and Schmidt (1996): ‘repeated observation of the same firm makes it possible to estimate its level of efficiency more precisely’.

⁷⁸ The mean of $e^{*}$ is zero by construction, and so is uninformative.

⁷⁹ We assume that the shift from the average production (or cost) function to the frontier is constant.

⁸⁰ A one-parameter distribution in this setting means that the expected value (mean) of the distribution depends only on the variance of the distribution.

⁸¹ Cost inefficiency of DMU $j$ ($u_j$) is given by $\hat{E}(u_k) + e_j$.

⁸² Exponentially distributed inefficiency: $\hat{E}(u_k) = \hat{\sigma}_u$, with $\hat{\sigma}_u = \sqrt{RSS / (n - (K + 1))}$. Half-normally distributed inefficiency: $\hat{E}(u_k) = (\sqrt{2}/\sqrt{\pi})\, \hat{\sigma}_u$.

The joint probability density function (PDF), known as the likelihood function, for a vector of observations $q = (q_1, q_2, \dots, q_K)'$ is:

$$L(q \mid \beta, \sigma^2) = (2\pi\sigma^2)^{-K/2} \exp\left\{ -\frac{1}{2\sigma^2} \sum_{k=1}^{K} (q_k - x_k'\beta)^2 \right\}$$

This is the likelihood of observing the sample observations as a function of the unknown parameters $\beta$ and $\sigma^2$.

In log form, we maximise the log-likelihood function with respect to $\beta$:

$$\ln L = -\frac{K}{2}\ln(2\pi) - \frac{K}{2}\ln(\sigma^2) - \frac{1}{2\sigma^2} \sum_{k=1}^{K} (q_k - x_k'\beta)^2$$

When a linear regression model is used with normally distributed errors, $v_k \sim iid\ N(0, \sigma^2)$, the ML estimate is equivalent to the OLS estimate.
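The equivalence of ML and OLS under normal errors can be checked numerically. The following Python sketch evaluates the log-likelihood above at the OLS coefficients and at a few perturbed coefficient vectors; the simulated data, the seed, and the function name `log_likelihood` are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)

# Simulated linear model with normal errors (assumption): q_k = x_k' beta + v_k.
K = 100
X = np.column_stack([np.ones(K), rng.normal(size=(K, 2))])
q = X @ np.array([1.0, 0.5, -0.3]) + rng.normal(scale=0.4, size=K)


def log_likelihood(beta, sigma2, q, X):
    """ln L = -(K/2) ln(2 pi) - (K/2) ln(sigma^2) - RSS / (2 sigma^2)."""
    resid = q - X @ beta
    k = len(q)
    return (-0.5 * k * np.log(2 * np.pi)
            - 0.5 * k * np.log(sigma2)
            - (resid @ resid) / (2 * sigma2))


beta_ols, *_ = np.linalg.lstsq(X, q, rcond=None)
rss = np.sum((q - X @ beta_ols) ** 2)
sigma2_ml = rss / K   # the ML variance estimate divides by K, with no degrees-of-freedom correction

ll_at_ols = log_likelihood(beta_ols, sigma2_ml, q, X)
# Any move away from the OLS coefficients raises the RSS, so the log-likelihood falls.
ll_perturbed = [log_likelihood(beta_ols + d, sigma2_ml, q, X)
                for d in 0.05 * rng.normal(size=(5, 3))]
print("log-likelihood at OLS:", round(float(ll_at_ols), 3))
print("best perturbed value: ", round(float(max(ll_perturbed)), 3))
```

Because the sum of squared residuals enters the log-likelihood only through the last term, maximising $\ln L$ over $\beta$ is the same problem as minimising the residual sum of squares.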

All these methods suffer from the disadvantages of statistical parametric deterministic models, since they do not take into account the stochastic element of the transformation process. Although COLS and MOLS have their place, since they are simple to estimate and rest on solid theoretical foundations even in small datasets, it should be stressed that they produce a ranking of producers identical to that of OLS. In addition, the methods might be unsuitable for applications in which there are multiple outputs as well as multiple inputs, despite their robustness in providing efficiency estimates under modest measurement error (Johnes, 2004). An informative comparison of these three deterministic methods can be found in Lovell (1993).⁸³
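The identical-ranking point follows because both COLS and MOLS subtract the OLS residual from a constant. A small Python sketch with hypothetical residuals illustrates this; the residual values and the MOLS shift of 0.15 are made up for the example.

```python
import numpy as np

# Hypothetical OLS residuals e_k for five producers (illustrative values only).
e = np.array([0.12, -0.30, 0.05, -0.08, 0.21])

u_cols = e.max() - e     # COLS: shift by the largest residual
mean_u = 0.15            # assumed MOLS shift, standing in for an estimate of E(u_k)
u_mols = mean_u - e      # MOLS: shift by the estimated mean inefficiency

# Order producers from most to least efficient under each method.
rank_ols = np.argsort(-e)        # larger residual = closer to the frontier
rank_cols = np.argsort(u_cols)   # smaller inefficiency = more efficient
rank_mols = np.argsort(u_mols)

print(rank_ols, rank_cols, rank_mols)
print("MOLS estimates with the wrong sign:", int((u_mols < 0).sum()))
```

All three orderings coincide, while the MOLS estimate for the best producer is negative in this example, which is the wrong-sign issue mentioned earlier.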

The frontier functions specified above, labelled deterministic frontier functions, assume that the econometric model is perfectly specified and the data are free of error. Thus, any deviation of an observation from the theoretical maximum is attributed solely to the inefficiency of the DMU. Because the methods discussed lack any stochastic element,⁸⁴ there is a need for a specification of the frontier in which the maximum output that a producer can obtain is assumed to be determined both by the production function and by random events. This motivates recasting the models as what the literature widely labels stochastic frontier production models.

⁸³ See appendix 8 chapter 3 for the graphical illustration.


In document Drivers of efficiency in higher education in England (Page 64-69)