The Almost Ideal Demand System
3.2 The Estimation of the AIDS Model
3.2.1 The Empirical Model Assumptions
The budget share consumption equations (3.1.10) are non-linear in the parameters, and can be represented as:

$$w_{iht} = f_i(X_{ht}, \beta) + \varepsilon_{iht} \qquad (3.2.1)$$

where $w_{iht}$ is the observed consumption of commodity $i$ (budget share of total expenditure) by household $h$ in period $t$; $X_{ht}$ is the matrix of explanatory variables consisting of total expenditure, the price vector $P_t$ and the household size vector² $s_{ht}$; $\beta$ is the parameter vector consisting of the AIDS parameters and the household size parameters $\delta_i$; and finally $\varepsilon_{iht}$ is a vector of stochastic errors, distributed³ $N(0, \Omega)$. We also assume that $\varepsilon_{iht}$ is independent of $X_{ht}$, and that, for each $h$:

$$E(\varepsilon_{iht}\,\varepsilon_{ihs}) = 0 \quad \text{for } t \neq s \qquad (3.2.2)$$
1 On the possible introduction of bias in the estimates of the AIDS parameters when using this approximation see Pashardes, 1993. A demographically augmented linear version of the AIDS model has also been used by Rossi, 1988 to analyse the consumption behaviour of Italian households from aggregated data.
2 As shown in Chapter 2, household size enters model (3.2.1) linearly. A non-linear form did not perform any better. For an Equivalence Scales approach to introducing demographic effects into an AIDS type demand system, see Ray, 1986.
3 For an application of the AIDS model under an assumption of autoregressive errors, see Xepapadeas and Habib, 1995.
The parameters were estimated by an ML procedure⁴ based on a modified Newton optimisation algorithm that approximates the inverse of the matrix of second derivatives of the objective function, the Hessian, at each iteration step by adding a correction matrix to the previous approximation. This modified matrix determines the change in the parameter estimates at each successive iteration (see Judge et al., 1985, App. B.2.4). As the model converges, the latest approximation of the Hessian is used to estimate the covariance matrix of the estimates. This method is fast and adaptable to most types of functions, but when the surface of the objective function (in our case the log-likelihood (LL) function for the sample data) is irregular, or "lumpy", with many local maxima, it might approach saddle points and then follow the ridge⁵, never reaching convergence (see Cossarini and Michelini, 1971).
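The quasi-Newton idea described above can be sketched in a few lines. The example below is a minimal illustration only, not the SHAZAM routine: it assumes the BFGS correction formula (one common member of this family; the text does not say which update SHAZAM applies), a toy normal sample `DATA`, and hypothetical helper names (`neg_loglik`, `gradient`, `bfgs`). It maximises the log-likelihood by minimising its negative, and returns the last inverse-Hessian approximation, which plays the role of the estimated covariance matrix of the estimates.

```python
import math

# Hypothetical toy sample; not data from the study.
DATA = [1.2, 2.3, 1.9, 2.8, 2.1, 1.7, 2.5, 2.0]


def neg_loglik(theta):
    """Negative normal log-likelihood (constant dropped); theta = (mu, log sigma)."""
    mu, log_s = theta
    ss = sum((x - mu) ** 2 for x in DATA)
    return len(DATA) * log_s + ss / (2.0 * math.exp(2.0 * log_s))


def gradient(theta):
    """Analytic gradient of neg_loglik with respect to (mu, log sigma)."""
    mu, log_s = theta
    var = math.exp(2.0 * log_s)
    d_mu = -sum(x - mu for x in DATA) / var
    d_ls = len(DATA) - sum((x - mu) ** 2 for x in DATA) / var
    return [d_mu, d_ls]


def bfgs(theta, tol=1e-8, max_iter=200):
    """Quasi-Newton minimisation with the BFGS inverse-Hessian update (assumed)."""
    n = len(theta)
    # Start from the identity matrix as the inverse-Hessian approximation.
    H = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    g = gradient(theta)
    for _ in range(max_iter):
        if max(abs(v) for v in g) < tol:
            break
        # Search direction: d = -H g.
        d = [-sum(H[i][j] * g[j] for j in range(n)) for i in range(n)]
        slope = sum(gi * di for gi, di in zip(g, d))
        # Backtracking line search: halve the step until the objective decreases enough.
        step, f0 = 1.0, neg_loglik(theta)
        while neg_loglik([t + step * di for t, di in zip(theta, d)]) > f0 + 1e-4 * step * slope:
            step *= 0.5
        new_theta = [t + step * di for t, di in zip(theta, d)]
        new_g = gradient(new_theta)
        s = [a - b for a, b in zip(new_theta, theta)]
        y = [a - b for a, b in zip(new_g, g)]
        sy = sum(si * yi for si, yi in zip(s, y))
        if sy > 1e-12:  # update only when the curvature condition holds
            Hy = [sum(H[i][j] * y[j] for j in range(n)) for i in range(n)]
            yHy = sum(yi * hyi for yi, hyi in zip(y, Hy))
            for i in range(n):
                for j in range(n):
                    # Correction matrix added to the previous approximation.
                    H[i][j] += ((1.0 + yHy / sy) * s[i] * s[j] / sy
                                - (Hy[i] * s[j] + s[i] * Hy[j]) / sy)
        theta, g = new_theta, new_g
    return theta, H


theta_hat, inv_hess = bfgs([0.0, 0.0])
print("mu =", round(theta_hat[0], 4), " sigma =", round(math.exp(theta_hat[1]), 4))
print("approximate variances:", inv_hess[0][0], inv_hess[1][1])
```

In this toy problem the estimates can be checked against the closed-form ML solution (the sample mean and the uncorrected sample variance). The returned matrix is only the last approximation built up during the iterations, so its quality depends on the path the algorithm happened to take.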
A drawback of this method is that there is no certainty that the estimation procedure will reach convergence, and even when it does, there is no certainty that the parameter estimates correspond to a global maximum of the LL function. It is therefore imperative to re-estimate the model several times with different sets of starting values for the parameters, to verify that a global maximum has been attained and that the estimates are effectively ML. If different starting values regularly generate different final estimates, the iterative estimation process is inherently unstable, and its
4 Part of the SHAZAM econometric package in its "Power Mac Version 7". All computations were performed on a Power Macintosh 6100/60 computer.
5 We found an interesting example of such a likelihood function when estimating the Linear Expenditure System for the Italian data by the Kmenta procedure in Chapter 2. As explained there, the estimation proceeds in steps, by successively estimating the variance-covariance matrix from the OLS residuals and then using it for a GLS estimation of the model parameters. The procedure can be repeated over and over again until two successive GLS estimates of the parameters are close enough to be accepted as identical, and therefore to represent the "true" estimates. The solution is not found by maximising an objective function, but relies on the fact that successive GLS covariance matrices will become more and more similar as they successively generate one another. In the case of the Italian data the procedure needed 188 iterations to converge (instead of the usual 20-25), and the likelihood function kept increasing from one set of estimates to another up to the 49th iteration, hit a ridge between the 50th and the 58th, decreased up to the 165th and then increased again up to its maximum at the 188th in the following pattern:
Iteration   Value of the LL      Iteration   Value of the LL
   42         2793.82               75         2793.90
   49         2793.92              100         2793.86
   50         2793.93 **           125         2793.82
   53         2793.93              150         2793.75
   55         2793.93              165         2793.87
   58         2793.93 **           175         2797.24
   59         2793.92              180         2802.54
   65         2793.91              188         2802.80
An attempt to find an ML estimator by a gradient method in a situation like the one described above would be likely to generate local maximum estimates only, possibly in the ridge region marked by the asterisks.
"chaotic"⁶ behaviour must cast serious doubts on the suitability of the model to fit and explain the data.
We initially estimated the full AIDS system of four equations without imposing any of the constraints in (3.1.11), and then tested all of them. First, we tested all the constraints together with a joint Wald test; then we tested the homogeneity and symmetry constraints separately. Both constraints were clearly rejected at any level of significance.
As a consequence of the adding-up condition, which is an essential part of any budget share model like (3.2.1), $\Omega$ is singular. To overcome this difficulty, we have two alternatives: either to constrain $\Omega$ itself (see for example Winters, 1984), or to delete one equation from the demand system (see Barten, 1969). We chose the latter solution, as we felt that it was less arbitrary than imposing ad hoc restrictions on $\Omega$, and during estimation we deleted Equation 3 (Housing Operations). All the parameters of Equation 3 appear in the other equations as well, and can therefore be obtained by constrained estimation; but $\beta_3$ and $\delta_3$ will have to be computed separately as residuals. From condition (3.1.11) it follows that $\beta_3 = (1 - \sum_{i \neq 3} \beta_i)$, and to satisfy adding-up it must be $\delta_3 = (1 - \sum_{i \neq 3} \delta_i)$.
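Why adding-up makes $\Omega$ singular can be checked numerically. The sketch below uses simulated toy disturbances rather than the survey data, and the names (`errors`, `covariance`, `det3`) are illustrative assumptions. Because the disturbances of the four share equations sum to zero for every observation, each row of their covariance matrix sums to zero, so the matrix has no inverse; dropping one equation (here the third) leaves a non-singular matrix.

```python
import random

random.seed(0)
N_OBS = 200

# Toy disturbances for a four-equation share system. Equations 1, 2 and 4 get
# independent draws; equation 3 is fixed by adding-up (disturbances sum to zero).
errors = []
for _ in range(N_OBS):
    e1, e2, e4 = (random.gauss(0.0, 1.0) for _ in range(3))
    errors.append([e1, e2, -(e1 + e2 + e4), e4])


def covariance(rows):
    """Sample covariance matrix (divisor n) of the columns of rows."""
    n, k = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(k)]
    return [[sum((r[i] - means[i]) * (r[j] - means[j]) for r in rows) / n
             for j in range(k)] for i in range(k)]


def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


omega = covariance(errors)
# Each row sums to zero (up to rounding), so omega times a vector of ones is zero.
row_sums = [sum(row) for row in omega]

# Delete equation 3 (index 2): the reduced 3x3 covariance matrix is invertible.
kept = [0, 1, 3]
omega_reduced = [[omega[i][j] for j in kept] for i in kept]

print("row sums of full matrix:", row_sums)
print("determinant after deleting equation 3:", det3(omega_reduced))
```

Which equation is deleted is a matter of convenience in this toy setting; the choice of Equation 3 simply mirrors the text.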
3.2.2 The Estimation Results for New Zealand
For the New Zealand data, the estimation of the AIDS model with only the adding-up constraint proved almost impossible. On the few occasions when convergence was achieved, either the parameter values were unacceptable (too large or too small), or the standard errors were meaningless (again too large or too small), or both. In most cases, the procedure did not converge, even after thousands of iterations⁷.
The situation improved substantially after we imposed the symmetry condition $\gamma_{ij} = \gamma_{ji}$ (for all $i, j = 1, \ldots, n$), and estimation became much easier, with the iterative procedure converging in a reasonable number of iterations from most sets of starting values⁸.
6 For a definition of chaotic systems as crucially dependent on initial conditions see Devaney R L, 1992, Ch 10. A discussion of the non-convergence of the Newton-Raphson iterative method from the perspective of chaotic dynamic systems can be found in Ch 13.
7 Nelson (1988, p. 1305) reports a similar lack of convergence in estimating AIDS parameters for US data when more than three commodities were considered.
8 The sets of starting parameter values we used more often were zeros and ones, or the parameter estimates obtained from the linearised version of the AIDS model (3.1.13).
However, even after imposing symmetry, the Maximum Likelihood (ML) estimates for the New Zealand data were extremely sensitive to the set of parameter values used to start the iterative estimation procedure, with different sets of starting values often generating totally different ML estimates. Although the values of the log-likelihood function (LL) corresponding to different sets of estimates were usually different, a fact allowing us to identify the set of estimates with the highest likelihood, most values of the maximised LL function were extremely close to one another, and there does not appear to be a well defined maximum, but lots of local maxima. An equally unsatisfactory result of the estimation procedure is that changes in the convergence criterion sometimes generate different sets of final estimates. In spite of the above difficulties, we are fairly confident that the parameter estimates we report in Table 3.1 are indeed ML estimates, as the LL function values associated with them are the highest we have obtained, within the parameters' domain, over a very large number of trial estimation runs, in which we have used different sets of parameter values to start the iterative procedure, and different convergence criteria.
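The multi-start strategy used here can be illustrated with a toy example. The sketch below is not the AIDS likelihood: `toy_ll` is a hypothetical one-parameter function with two local maxima, and `climb` is a plain gradient-ascent routine standing in for the quasi-Newton optimiser. It runs the local search from several starting values, records where each run ends, and keeps the solution with the highest function value.

```python
def toy_ll(x):
    """Hypothetical objective with two local maxima (near x = -1 and x = +1)."""
    return -(x * x - 1.0) ** 2 + 0.3 * x


def toy_ll_grad(x):
    """First derivative of toy_ll."""
    return -4.0 * x * (x * x - 1.0) + 0.3


def climb(x, step=0.05, tol=1e-10, max_iter=10000):
    """Fixed-step gradient ascent from starting value x; returns the end point."""
    for _ in range(max_iter):
        g = toy_ll_grad(x)
        if abs(g) < tol:
            break
        x += step * g
    return x


starts = [-2.0, -0.5, 0.0, 2.0]
runs = []
for s in starts:
    x_hat = climb(s)
    runs.append((toy_ll(x_hat), x_hat, s))

# Keep the run with the highest objective value.
best_ll, best_x, best_start = max(runs)
# Rounded end points show how many different local maxima were reached.
distinct = sorted({round(x_hat, 3) for _, x_hat, _ in runs})

for ll, x_hat, s in runs:
    print(f"start {s:5.1f} -> x = {x_hat:7.4f}, LL = {ll:8.4f}")
print("distinct end points:", distinct)
print("best end point:", round(best_x, 4))
```

Here the starting values fall into two groups that end at different local maxima, and only the comparison of the final function values identifies the better one. With many parameters and nearly equal LL values, as reported for the New Zealand data, the same comparison becomes far less clear-cut.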
The parameters which proved most unstable and difficult to estimate were the price parameters, $\gamma_{ij}$, and the $\alpha_i$ intercepts. Contrary to what is sometimes stated in the literature (e.g. Deaton-Muellbauer, 1980, p. 316, also Winters, 1984, p. 248), the $\alpha_0$ parameter (the intercept of the price index equation (3.1.9)) proved relatively easy