Variable-Coefficient Models
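$$
\int f(y_T \mid \theta_j, y^{T-1})\,P(\theta_j \mid y^{T-1})\,d\theta_j
\times \int f(y_{T-1} \mid \theta_j, y^{T-2})\,P(\theta_j \mid y^{T-2})\,d\theta_j \cdots
\times \int f(y_{T_1+1} \mid \theta_j, y^{T_1})\,P(\theta_j \mid y^{T_1})\,d\theta_j, \tag{6.6.28}
$$

where $P(\theta_j \mid y^{T})$ denotes the posterior distribution of $\theta_j$ given observations from 1 to $T$. While the formula may look formidable, the Bayes updating formula is fairly straightforward to compute. For instance, consider the model (6.6.5). Let $\theta' = (\beta', \bar{\gamma}')$, and let $\theta_t$ and $V_t$ denote the posterior mean and variance of $\theta$ based on the first $t$ observations. Then

$$
\theta_t = V_t \left( Q_t' \Omega^{-1} y_t + V_{t-1}^{-1} \theta_{t-1} \right), \tag{6.6.29}
$$

$$
V_t = \left( Q_t' \Omega^{-1} Q_t + V_{t-1}^{-1} \right)^{-1}, \qquad t = T_1 + 1, \ldots, T, \tag{6.6.30}
$$

and

$$
P(y_{t+1} \mid y^{t}) = \int P(y_{t+1} \mid \theta, y^{t})\,P(\theta \mid y^{t})\,d\theta
\sim N\!\left( Q_{t+1}\theta_t,\; \Omega + Q_{t+1} V_t Q_{t+1}' \right), \tag{6.6.31}
$$

where $y_t' = (y_{1t}, y_{2t}, \ldots, y_{Nt})$, $Q_t = (x_t', w_t')$, $x_t = (x_{1t}, \ldots, x_{Nt})$, $w_t = (w_{1t}, \ldots, w_{Nt})$, $\Omega = E u_t u_t'$, and $u_t' = (u_{1t}, \ldots, u_{Nt})$ (Hsiao, Appelbe, and Dineen (1993)). Each row of $Q_t$ thus holds the regressors of one cross-sectional unit at time $t$.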
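The recursion in (6.6.29)–(6.6.31) can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' code: the function names `bayes_update` and `predictive`, the simulated data, and the prior values `theta0` and `V0` are all assumptions made for the example. It takes $Q_t$ to be $N \times K$ (one row per unit) and premultiplies by the updated variance $V_t$, which is the form implied by normal conjugacy. The final lines check that updating one period at a time reproduces the posterior mean obtained from all periods at once.

```python
import numpy as np

def bayes_update(theta_prev, V_prev, Q_t, y_t, Omega):
    """One recursion of the posterior mean and variance.

    Q_t is N x K (one row per unit), y_t has length N, Omega is N x N.
    """
    V_prev_inv = np.linalg.inv(V_prev)
    Omega_inv_Q = np.linalg.solve(Omega, Q_t)            # Omega^{-1} Q_t
    V_t = np.linalg.inv(Q_t.T @ Omega_inv_Q + V_prev_inv)
    theta_t = V_t @ (Omega_inv_Q.T @ y_t + V_prev_inv @ theta_prev)
    return theta_t, V_t

def predictive(theta_t, V_t, Q_next, Omega):
    """Mean and covariance of the normal one-step-ahead predictive density."""
    return Q_next @ theta_t, Omega + Q_next @ V_t @ Q_next.T

# Hypothetical data: N units, K coefficients, T periods.
rng = np.random.default_rng(0)
N, K, T = 4, 2, 30
theta_true = np.array([1.0, -0.5])
Omega = 0.25 * np.eye(N)
Q = rng.normal(size=(T, N, K))
y = Q @ theta_true + rng.normal(scale=0.5, size=(T, N))

# Assumed prior (plays the role of the posterior based on the first T_1 periods).
theta0, V0 = np.zeros(K), 10.0 * np.eye(K)
theta, V = theta0, V0
for t in range(T - 1):
    theta, V = bayes_update(theta, V, Q[t], y[t], Omega)
pred_mean, pred_cov = predictive(theta, V, Q[T - 1], Omega)

# Batch posterior from the same T - 1 periods, for comparison.
Omega_inv = np.linalg.inv(Omega)
prec = np.linalg.inv(V0) + sum(Q[t].T @ Omega_inv @ Q[t] for t in range(T - 1))
rhs = np.linalg.inv(V0) @ theta0 + sum(Q[t].T @ Omega_inv @ y[t] for t in range(T - 1))
theta_batch = np.linalg.solve(prec, rhs)
```

Because each step only needs the previous mean and variance, the predictive density for period $t+1$ can be evaluated as soon as period $t$ has been absorbed, which is what makes the product in (6.6.28) cheap to compute.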
Hsiao and Sun (2000) have conducted limited Monte Carlo studies to evaluate the performance of these model selection criteria in selecting the random-, fixed-, and mixed random–fixed-coefficient specification. They all appear to have very good performance in selecting the correct specification.
6.7 DYNAMIC RANDOM-COEFFICIENT MODELS
For ease of exposition and without loss of the essentials, rather than generalizing (6.6.5) to a dynamic model, in this section we generalize the random-coefficient model (6.2.1) to the dynamic model of the form$^{17}$

$$
y_{it} = \gamma_i y_{i,t-1} + \beta_i' x_{it} + u_{it}, \qquad |\gamma_i| < 1, \qquad i = 1, \ldots, N, \quad t = 1, \ldots, T, \tag{6.7.1}
$$
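where $x_{it}$ is a $k \times 1$ vector of exogenous variables, and the error term $u_{it}$ is assumed to be independently, identically distributed over $t$ with mean zero and variance $\sigma_{u_i}^2$ (written $\sigma_i^2$ below), and is independent across $i$. The coefficients $\theta_i = (\gamma_i, \beta_i')'$ are assumed to be independently distributed across $i$ with mean $\bar{\theta} = (\bar{\gamma}, \bar{\beta}')'$ and covariance matrix $\Delta$. Let

$$
\theta_i = \bar{\theta} + \alpha_i, \tag{6.7.2}
$$

where $\alpha_i = (\alpha_{i1}, \alpha_{i2}')'$. We have

$$
E\alpha_i = 0, \qquad
E\alpha_i \alpha_j' =
\begin{cases}
\Delta & \text{if } i = j, \\
0 & \text{otherwise,}
\end{cases} \tag{6.7.3}
$$

and$^{18}$

$$
E\alpha_i x_{jt}' = 0. \tag{6.7.4}
$$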
Stacking the $T$ time-series observations of the $i$th individual in matrix form yields

$$
\underset{T \times 1}{y_i} = Q_i \theta_i + u_i, \qquad i = 1, \ldots, N, \tag{6.7.5}
$$

where $y_i = (y_{i1}, \ldots, y_{iT})'$, $Q_i = (y_{i,-1}, X_i)$, $y_{i,-1} = (y_{i0}, \ldots, y_{i,T-1})'$, $X_i = (x_{i1}, \ldots, x_{iT})'$, $u_i = (u_{i1}, \ldots, u_{iT})'$, and for ease of exposition we assume that the $y_{i0}$ are observable.$^{19}$

We note that because $y_{i,t-1}$ depends on $\gamma_i$, we have $E\,Q_i \alpha_i \neq 0$; that is, the independence between the explanatory variables and $\alpha_i$ (equation (6.2.3)) is violated. Substituting $\theta_i = \bar{\theta} + \alpha_i$ into (6.7.5) yields

$$
y_i = Q_i \bar{\theta} + v_i, \qquad i = 1, \ldots, N, \tag{6.7.6}
$$

where

$$
v_i = Q_i \alpha_i + u_i. \tag{6.7.7}
$$

Since

$$
y_{i,t-1} = \sum_{j=0}^{\infty} (\bar{\gamma} + \alpha_{i1})^{j}\, x_{i,t-j-1}' (\bar{\beta} + \alpha_{i2})
+ \sum_{j=0}^{\infty} (\bar{\gamma} + \alpha_{i1})^{j}\, u_{i,t-j-1}, \tag{6.7.8}
$$

it follows that $E(v_i \mid Q_i) \neq 0$. Therefore, contrary to the static case, the least-squares estimator of the common mean, $\bar{\theta}$, is inconsistent.
Equations (6.7.7) and (6.7.8) also demonstrate that the covariance matrix $V$ of $v_i$ is not easily derivable. Thus, the procedure of premultiplying (6.7.6) by $V^{-1/2}$ to transform the model into one with serially uncorrelated errors is not implementable. Nor does the instrumental-variable method appear implementable, because instruments that are uncorrelated with $v_i$ are most likely uncorrelated with $Q_i$ as well.
Pesaran and Smith (1995) have noted that as $T \to \infty$, the least-squares regression of $y_i$ on $Q_i$ yields a consistent estimator $\hat{\theta}_i$ of $\theta_i$. They suggest finding a mean group estimator of $\bar{\theta}$ by taking the average of $\hat{\theta}_i$ across $i$,

$$
\hat{\bar{\theta}} = \frac{1}{N} \sum_{i=1}^{N} \hat{\theta}_i. \tag{6.7.9}
$$
The mean group estimator (6.7.9) is consistent and asymptotically normally distributed so long as $\sqrt{N}/T \to 0$ as both $N$ and $T \to \infty$ (Hsiao, Pesaran, and Tahmiscioglu (1999)).
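A minimal sketch of the mean group estimator (6.7.9) on simulated data follows. Everything about the design is an assumption made for illustration: a scalar regressor $x_{it}$, $\gamma_i$ drawn uniformly within 0.2 of $\bar{\gamma}$, normally distributed $\beta_i$, unit error variance, and the hypothetical helper names `simulate_panel` and `mean_group`. The panel is deliberately long ($T = 200$), since the unit-by-unit least-squares estimates are consistent only as $T \to \infty$.

```python
import numpy as np

def simulate_panel(N, T, gamma_bar, beta_bar, rng, burn=50):
    """Simulate y_it = gamma_i*y_{i,t-1} + beta_i*x_it + u_it with scalar x_it."""
    gam = gamma_bar + rng.uniform(-0.2, 0.2, size=N)   # heterogeneous gamma_i
    bet = beta_bar + rng.normal(scale=0.2, size=N)     # heterogeneous beta_i
    x = rng.normal(size=(N, T + burn))
    u = rng.normal(size=(N, T + burn))
    y = np.zeros((N, T + burn + 1))
    for t in range(T + burn):
        y[:, t + 1] = gam * y[:, t] + bet * x[:, t] + u[:, t]
    # Drop the burn-in: y keeps T + 1 columns (y_i0, ..., y_iT), x keeps T.
    return y[:, burn:], x[:, burn:]

def mean_group(y, x):
    """Average of unit-by-unit least-squares estimates of (gamma_i, beta_i)."""
    N = y.shape[0]
    theta_hat = np.empty((N, 2))
    for i in range(N):
        Q_i = np.column_stack([y[i, :-1], x[i]])       # (y_{i,-1}, X_i)
        theta_hat[i] = np.linalg.lstsq(Q_i, y[i, 1:], rcond=None)[0]
    return theta_hat.mean(axis=0), theta_hat

rng = np.random.default_rng(42)
y, x = simulate_panel(N=200, T=200, gamma_bar=0.4, beta_bar=1.0, rng=rng)
theta_mg, theta_hat = mean_group(y, x)
print("mean group estimate of (gamma_bar, beta_bar):", theta_mg)
```

Rerunning the same sketch with a small `T` is a simple way to see the finite-$T$ bias of the individual least-squares estimates carry over to their average.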
Panels with large $T$ are the exception in economics. Nevertheless, under the assumption that the $y_{i0}$ are fixed and known and that $\alpha_i$ and $u_{it}$ are independently normally distributed, we can implement the Bayes estimator of $\bar{\theta}$ conditional on $\sigma_i^2$ and $\Delta$ using the formula (6.6.19), just as in the mixed-model case discussed in Section 6.6. The Bayes estimator conditional on $\Delta$ and $\sigma_i^2$ is equal to

$$
\hat{\bar{\theta}}_B =
\left\{ \sum_{i=1}^{N} \left[ \sigma_i^2 (Q_i' Q_i)^{-1} + \Delta \right]^{-1} \right\}^{-1}
\sum_{i=1}^{N} \left[ \sigma_i^2 (Q_i' Q_i)^{-1} + \Delta \right]^{-1} \hat{\theta}_i, \tag{6.7.10}
$$

which is a weighted average of the least-squares estimators of the individual units, with weights inversely proportional to the individual variances. When $T \to \infty$, $N \to \infty$, and $\sqrt{N}/T^{3/2} \to 0$, the Bayes estimator is asymptotically equivalent to the mean group estimator (6.7.9) (Hsiao, Pesaran, and Tahmiscioglu (1999)).
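The weighted average in (6.7.10) is straightforward to compute once $\sigma_i^2$ and $\Delta$ are given. The sketch below is illustrative only: the function name `bayes_mean_estimator` and its argument layout are assumptions, and the two-unit numbers at the end are made up so that the result can be checked by hand.

```python
import numpy as np

def bayes_mean_estimator(theta_hat, QtQ, sigma2, Delta):
    """Weighted average of unit-level least-squares estimates.

    theta_hat : (N, K) array of individual least-squares estimates
    QtQ       : (N, K, K) array holding Q_i'Q_i for each unit
    sigma2    : (N,) array of error variances sigma_i^2
    Delta     : (K, K) covariance matrix of the random coefficients
    """
    K = theta_hat.shape[1]
    weight_sum = np.zeros((K, K))
    weighted_sum = np.zeros(K)
    for th_i, QtQ_i, s2_i in zip(theta_hat, QtQ, sigma2):
        # Weight = inverse of the total variance of theta_hat_i around theta_bar.
        W_i = np.linalg.inv(s2_i * np.linalg.inv(QtQ_i) + Delta)
        weight_sum += W_i
        weighted_sum += W_i @ th_i
    return np.linalg.solve(weight_sum, weighted_sum)

# Hand-checkable scalar case (K = 1): total variances 2 and 4, weights 1/2 and 1/4.
theta_hat = np.array([[1.0], [3.0]])
QtQ = np.array([[[1.0]], [[1.0]]])
sigma2 = np.array([1.0, 3.0])
Delta = np.array([[1.0]])
theta_bayes = bayes_mean_estimator(theta_hat, QtQ, sigma2, Delta)
# (0.5 * 1 + 0.25 * 3) / 0.75 = 5/3
```

When every unit has the same $\sigma_i^2 (Q_i'Q_i)^{-1}$, the weights are equal and the function returns the simple average, that is, the mean group estimator.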
In practice, the variance components $\sigma_i^2$ and $\Delta$ are rarely known, so the Bayes estimator (6.7.10) is rarely calculable. One approach is to substitute consistent estimates of $\sigma_i^2$ and $\Delta$, say (6.2.8) and (6.2.9), into the formula (6.7.10) and treat them as if they were known. For ease of reference, we shall call (6.7.10) with known $\sigma_i^2$ and $\Delta$ the infeasible Bayes estimator, and we shall call the estimator obtained by replacing $\sigma_i^2$ and $\Delta$ in (6.7.10) with their consistent estimates the empirical Bayes estimator.
The other approach is to follow Lindley and Smith (1972) and assume that the prior distributions of $\sigma_i^2$ and $\Delta$ are independent and are distributed as

$$
P\!\left( \Delta^{-1}, \sigma_1^2, \ldots, \sigma_N^2 \right)
= W\!\left( \Delta^{-1} \mid (\rho R)^{-1}, \rho \right) \prod_{i=1}^{N} \sigma_i^{-1}, \tag{6.7.11}
$$

where $W$ represents the Wishart distribution with scale matrix $(\rho R)$ and degrees of freedom $\rho$ (e.g., Anderson (1958)). Incorporating this prior into the model (6.7.1)–(6.7.2), we can obtain the marginal posterior densities of the parameters of interest by integrating out $\sigma_i^2$ and $\Delta$ from the joint posterior density. However, the required integrations do not yield closed-form solutions. Hsiao, Pesaran, and Tahmiscioglu (1999) have suggested using the Gibbs sampler to calculate the marginal densities.
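The Gibbs sampler is an iterative Markov-chain Monte Carlo method that requires knowledge only of the full conditional densities of the parameter vector (e.g., Gelfand and Smith (1990)). Starting from some arbitrary initial values, say $(\phi_1^{(0)}, \phi_2^{(0)}, \ldots, \phi_k^{(0)})$, for a parameter vector $\phi = (\phi_1, \ldots, \phi_k)$, it samples alternately from the conditional density of each component of the parameter vector, conditional on the values of the other components sampled in the latest iteration. That is:

1. Sample $\phi_1^{(j+1)}$ from $P(\phi_1 \mid \phi_2^{(j)}, \phi_3^{(j)}, \ldots, \phi_k^{(j)}, y)$.
2. Sample $\phi_2^{(j+1)}$ from $P(\phi_2 \mid \phi_1^{(j+1)}, \phi_3^{(j)}, \ldots, \phi_k^{(j)}, y)$.
   ...
k. Sample $\phi_k^{(j+1)}$ from $P(\phi_k \mid \phi_1^{(j+1)}, \ldots, \phi_{k-1}^{(j+1)}, y)$.
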
As the number of iterations $j$ approaches infinity, the sampled values can in effect be regarded as draws from the true joint and marginal posterior densities. Moreover, the ergodic averages of functions of the sampled values will be consistent estimates of their expected values.
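The alternating scheme can be seen in a toy problem that is not taken from the text: a standard bivariate normal with correlation $\rho$, for which each full conditional is univariate normal with mean $\rho$ times the other component and variance $1 - \rho^2$. The function name `gibbs_bivariate_normal`, the starting values, and the burn-in length are arbitrary choices for the sketch. The ergodic averages of the retained draws should approach the true mean (zero) and correlation.

```python
import numpy as np

def gibbs_bivariate_normal(rho, n_iter, seed=0):
    """Gibbs sampler for a standard bivariate normal with correlation rho."""
    rng = np.random.default_rng(seed)
    cond_sd = np.sqrt(1.0 - rho**2)
    z1, z2 = 5.0, -5.0                        # arbitrary starting values
    draws = np.empty((n_iter, 2))
    for j in range(n_iter):
        z1 = rng.normal(rho * z2, cond_sd)    # z1 | z2 from the latest iteration
        z2 = rng.normal(rho * z1, cond_sd)    # z2 | z1 just sampled
        draws[j] = z1, z2
    return draws

draws = gibbs_bivariate_normal(rho=0.8, n_iter=20000)
kept = draws[1000:]                           # discard early draws as burn-in
ergodic_mean = kept.mean(axis=0)
ergodic_corr = np.corrcoef(kept.T)[0, 1]
print("ergodic mean:", ergodic_mean, "ergodic correlation:", ergodic_corr)
```

Successive draws are serially dependent, so the averages converge more slowly than they would with independent sampling; the stronger the correlation between components, the longer the chain needs to be.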
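Under the assumption that the prior of $\bar{\theta}$ is $N(\bar{\theta}^{*}, \Psi)$, the relevant conditional distributions needed to implement the Gibbs sampler for (6.7.1)–(6.7.2) are easily obtained: the conditionals of $\theta_i$ and $\bar{\theta}$ are normal, that of $\Delta^{-1}$ is Wishart, and that of each $\sigma_i^2$ is inverse gamma (IG).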
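A compact Gibbs sampler for the model (6.7.1)–(6.7.2) with a scalar regressor might look as follows. It is a sketch under stated assumptions, not the implementation of Hsiao, Pesaran, and Tahmiscioglu (1999). The conditionals used are the standard conjugate forms implied by the priors above: $\theta_i \sim N\{A_i(\sigma_i^{-2} Q_i' y_i + \Delta^{-1}\bar{\theta}), A_i\}$ with $A_i = (\sigma_i^{-2} Q_i'Q_i + \Delta^{-1})^{-1}$; $\bar{\theta} \sim N\{B(\Delta^{-1}\sum_i \theta_i + \Psi^{-1}\bar{\theta}^{*}), B\}$ with $B = (N\Delta^{-1} + \Psi^{-1})^{-1}$; $\Delta^{-1}$ Wishart with $\rho + N$ degrees of freedom and matrix parameter $[\sum_i (\theta_i - \bar{\theta})(\theta_i - \bar{\theta})' + \rho R]^{-1}$; and $\sigma_i^2 \sim IG[T/2, (y_i - Q_i\theta_i)'(y_i - Q_i\theta_i)/2]$, which reads the prior in (6.7.11) as flat on $\log \sigma_i$. The hyperparameter values, the simulated data, and the names `sym_inv` and `gibbs_dynamic_rc` are illustrative assumptions, and the stationarity restriction $|\gamma_i| < 1$ is not imposed on the draws.

```python
import numpy as np

def sym_inv(M):
    """Inverse of a symmetric positive-definite matrix, symmetrized for safety."""
    M_inv = np.linalg.inv(M)
    return 0.5 * (M_inv + M_inv.T)

def gibbs_dynamic_rc(y, x, n_iter=400, burn=100, rho=2, seed=0):
    """Posterior mean of theta_bar = (gamma_bar, beta_bar) for scalar x_it.

    y is N x (T+1) with y[:, 0] = y_i0 treated as fixed; x is N x T.
    """
    rng = np.random.default_rng(seed)
    N, T = x.shape
    K = 2
    Q = [np.column_stack([y[i, :-1], x[i]]) for i in range(N)]
    Y = [y[i, 1:] for i in range(N)]
    QtQ = [q.T @ q for q in Q]
    Qty = [q.T @ yi for q, yi in zip(Q, Y)]
    # Hyperparameters: illustrative assumptions, not values from the text.
    theta_star, Psi_inv = np.zeros(K), np.eye(K) / 100.0
    R = 0.1 * np.eye(K)
    # Initial values: unit-by-unit least squares.
    theta = np.array([np.linalg.solve(a, b) for a, b in zip(QtQ, Qty)])
    theta_bar = theta.mean(axis=0)
    sigma2 = np.ones(N)
    Delta_inv = np.eye(K)
    kept = []
    for it in range(n_iter):
        for i in range(N):
            # theta_i | rest is normal.
            A = sym_inv(QtQ[i] / sigma2[i] + Delta_inv)
            mean_i = A @ (Qty[i] / sigma2[i] + Delta_inv @ theta_bar)
            theta[i] = rng.multivariate_normal(mean_i, A)
            # sigma_i^2 | rest is inverse gamma: draw the precision from a gamma.
            resid = Y[i] - Q[i] @ theta[i]
            sigma2[i] = 1.0 / rng.gamma(T / 2.0, 2.0 / (resid @ resid))
        # theta_bar | rest is normal.
        B = sym_inv(N * Delta_inv + Psi_inv)
        mean_bar = B @ (Delta_inv @ theta.sum(axis=0) + Psi_inv @ theta_star)
        theta_bar = rng.multivariate_normal(mean_bar, B)
        # Delta^{-1} | rest is Wishart; with integer degrees of freedom it can be
        # drawn as a sum of outer products of normal vectors.
        dev = theta - theta_bar
        scale = sym_inv(dev.T @ dev + rho * R)
        Z = rng.multivariate_normal(np.zeros(K), scale, size=rho + N)
        Delta_inv = Z.T @ Z
        if it >= burn:
            kept.append(theta_bar)
    return np.mean(kept, axis=0)

# Hypothetical short panel with gamma_bar = 0.4 and beta_bar = 1.
rng = np.random.default_rng(1)
N, T = 30, 20
gam = 0.4 + rng.uniform(-0.2, 0.2, size=N)
bet = 1.0 + rng.normal(scale=0.2, size=N)
x = rng.normal(size=(N, T))
y = np.zeros((N, T + 1))
for t in range(T):
    y[:, t + 1] = gam * y[:, t] + bet * x[:, t] + rng.normal(size=N)
theta_bar_post = gibbs_dynamic_rc(y, x)
print("posterior mean of (gamma_bar, beta_bar):", theta_bar_post)
```

The returned value is the ergodic average of the retained draws of $\bar{\theta}$. In applied work one would also monitor convergence and examine the sensitivity of the results to $\rho$, $R$, and $\Psi$.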
Hsiao, Pesaran, and Tahmiscioglu (1999) have conducted Monte Carlo experiments to study the finite-sample properties of six estimators: (6.7.10), referred to as the infeasible Bayes estimator; the Bayes estimator using (6.7.11) as the prior for $\Delta$ and $\sigma_i^2$, obtained through the Gibbs sampler and referred to as the hierarchical Bayes estimator; the empirical Bayes estimator; the mean group estimator (6.7.9); the bias-corrected mean group estimator, obtained by directly correcting the finite-$T$ bias of the least-squares estimator $\hat{\theta}_i$ using the formula of Kiviet (1995) and Kiviet and Phillips (1993) and then taking the average; and the pooled least-squares estimator. Table 6.4 presents the bias of the different estimators of $\bar{\gamma}$ for $N = 50$ and $T = 5$ or 20.

Table 6.4. Bias of the short-run coefficient $\bar{\gamma}$

                                          Bias-corrected  Infeasible  Empirical  Hierarchical
T   Case  γ̄     Pooled OLS   Mean group   mean group      Bayes       Bayes      Bayes
5   1     0.3    0.36859     −0.23613     −0.14068         0.05120    −0.12054   −0.02500
    2     0.3    0.41116     −0.23564     −0.14007         0.04740    −0.11151   −0.01500
    3     0.6    1.28029     −0.17924     −0.10969         0.05751    −0.02874    0.02884
    4     0.6    1.29490     −0.18339     −0.10830         0.06879    −0.00704    0.06465
    5     0.3    0.06347     −0.26087     −0.15550         0.01016    −0.18724   −0.10068
    6     0.3    0.08352     −0.26039     −0.15486         0.01141    −0.18073   −0.09544
    7     0.6    0.54756     −0.28781     −0.17283         0.05441    −0.12731   −0.02997
    8     0.6    0.57606     −0.28198     −0.16935         0.06258    −0.10366   −0.01012
20  9     0.3    0.44268     −0.07174     −0.01365         0.00340    −0.00238    0.00621
    10    0.3    0.49006     −0.06910     −0.01230         0.00498    −0.00106    0.00694
    11    0.35   0.25755     −0.06847     −0.01209        −0.00172    −0.01004   −0.00011
    12    0.35   0.25869     −0.06644     −0.01189        −0.00229    −0.00842    0.00116
    13    0.3    0.07199     −0.07966     −0.01508        −0.00054    −0.01637   −0.00494
    14    0.3    0.09342     −0.07659     −0.01282         0.00244    −0.01262   −0.00107
    15    0.55   0.26997     −0.09700     −0.02224        −0.00062    −0.01630    0.00011
    16    0.55   0.29863     −0.09448     −0.02174        −0.00053    −0.01352    0.00198

Source: Hsiao, Pesaran, and Tahmiscioglu (1999).

The infeasible Bayes estimator performs very well. It has small bias even for $T = 5$: its bias falls within the range of 3 to 17 percent of $\bar{\gamma}$. For $T = 20$, the bias is at most about 2 percent. The hierarchical Bayes estimator also performs well,$^{20}$ followed by the empirical Bayes estimator when $T$ is small; but the latter improves quickly as $T$ increases. The empirical Bayes estimator gives very good results even for $T = 5$ in some cases, but its bias is quite large in certain other cases. As $T$ gets larger, its bias decreases considerably. The mean group and the bias-corrected mean group estimators both have large bias when $T$ is small, with the bias-corrected estimator performing slightly better. However, the performance of both improves as $T$ increases, and both are still much better than the pooled least-squares estimator. The pooled least-squares estimator yields substantial bias, and its bias persists as $T$ increases.
The Bayes estimator is derived under the assumption that the initial observations $y_{i0}$ are fixed constants. As discussed in Chapter 4 or Anderson and Hsiao (1981, 1982), this assumption is clearly unjustifiable for a panel with finite $T$. However, in contrast to the sampling approach, where the correct modeling of initial observations is quite important, the Bayesian approach appears to perform fairly well in the estimation of the mean coefficients for dynamic random-coefficient models even when the initial observations are treated as fixed constants. The Monte Carlo study also cautions against the practice of justifying the use of certain estimators on the basis of their asymptotic properties. Both the mean group and the bias-corrected mean group estimators perform poorly in panels with very small $T$. The hierarchical Bayes estimator appears preferable to the other consistent estimators unless the time dimension of the panel is sufficiently large.
6.8 AN EXAMPLE --- LIQUIDITY CONSTRAINTS