Chapter 3 A Fresh Look at Calorie-Income Elasticities in China
3.3 Theoretical framework
3.4.3 Analytical methods
Calorie-income elasticity: parametric models. To estimate the effect of income on calorie intake, we begin by estimating functions of the following form:
ππππΆπΌ = πΌ0+ πΌ1ππππΌππΆ + πΌ2πΌ + πΌ3π» + πΌ4πΆ + πΌ5π + π (4)
where logCI is log-transformed individual 3-day average calorie intake, and logINC is log-transformed household per capita income. I is a matrix of individual controls, H is a matrix of household controls, and C is a matrix of community controls. T is a matrix of year dummies with 1991 as the reference
36
year, Ξ΅ is a matrix of random disturbance errors, and Ξ±1 is the key coefficient of interest whose magnitude is the calorie-income elasticity.
Given the possible nonlinear nexus of income and calorie intake, we also use two more flexible specifications in this parametric framework: a log-log squared model (LLS, transformed calorie intake and income with squared log-transformed income) and a log-log inverse model (LLI, log-log-transformed calorie intake and income with inverse income). Such treatment is similar to that of Skoufias et al. (2009). The detailed specifications of these two models are as follows:
ππππΆπΌ = π0 + π1ππππΌππΆ + π2(ππππΌππΆ)2+ π3πΌ + π4π» + π5πΆ + π6π + π (5)
ππππΆπΌ = πΏ0+ πΏ1ππππΌππΆ + πΏ2βπΌππΆ+ πΏ3πΌ + πΏ4π» + πΏ5πΆ + πΏ6π + π (6)
INC in model (6) is household per capita income; the other parameters are the same as in model (4).
The corresponding calorie-income elasticities of models (5) and (6) are shown in models (7) and (8), respectively:
βlogCI βlogINCβ = π1+ 2π2ππππΌππΆ (7)
βlogCI βlogINCβ = πΏ1β πΏ2βπΌππΆ (8)
In these models, if Ξ΄2 or ΞΈ2 are significant, calorie-income elasticities vary with income level. For instance, in model (8), if Ξ΄1>0 and Ξ΄2<0, then calorie-income elasticities decrease as income increases; if Ξ΄1=Ξ΄2/INC, they equal 0; and if Ξ΄1<Ξ΄2/INC, they become negative. However, when Ξ΄2=0, the calorie-income elasticity becomes a constant equal to Ξ΄1, which neither equals 0 nor changes sign
37
when Ξ΄1=0 (see Huang and Gale, 2009, for a detailed discussion on the advantages of LLI models).
We perform our OLS estimates on models (4), (5), and (6); however, it is worth noting that such estimations are only valid if
πΈ(πβ²π) = 0, π = π(ππππΌππΆ, πΌ, π», πΆ) (9)
If unobservable heterogeneity (e.g., individual food preferences and eating habits or social norms) is correlated with income, then OLS estimates of the calorie-income elasticities will be biased (Lu and Luhrmann, 2012). Another potential issue is measurement error, a problem, however, that is likely to be minimal in our study because the quality of the income and calorie data in the CHNS is high.
The estimates may also be biased if the analysis fails to account for community, household, and individual fixed effects (Behrman and Deolalikar, 1990), so we also employ individual-level fixed effect models to control for community, household, and individual unobservables. These specifications include both one-way and two-one-way fixed-effects models:
ππππΆπΌπππ‘ = πΎ0+ πΎ1ππππΌππΆπππ‘+ πΌπππ‘πΎ2+ π»ππ‘πΎ3+ πΆππ‘πΎ4+ πΌπ + ππππ‘ (10)
ππππΆπΌπππ‘ = πΎ0+ πΎ1ππππΌππΆπππ‘+ πΌπππ‘πΎ2+ π»ππ‘πΎ3+ πΆππ‘πΎ4+ πΎ5π + πΌπ + ππππ‘ (11)
where logCIict is the 3-day average calorie intake of individual i in community c at time t and logINCict is household per capita income of individual i in community c at time t. Iict is a matrix of individual covariates, Hict is a matrix of household level covariates, and Cct is a matrix of community level covariates. T denotes time effects, πΌπ is an individual time-invariant fixed effect, Ξ΅ict is the
38
idiosyncratic error term, and ο§1 is the key coefficient of interest. We also take into account clustering at the individual level.
Calorie-income elasticity: nonparametric model. When assumptions such as normality do not hold, parametric estimates will be inefficient (DiNardo and Tobias, 2001). Moreover, rather than estimating at a single point, it is important to explore the full range of calorie-intake responses to income (Gibson and Rozelle, 2002). One possible way to take nonlinearity into account is to use nonparametric estimates, which are more flexible primarily because the functions to be estimated are based on pliable assumptions (DiNardo and Tobias, 2001).
The model we estimate is a univariate nonparametric model:
π¦π = π(π₯π) + ππ, ππ~πππ(0, ππ2) (12)
where yi denotes 3-day average individual calorie intake and m(xi) is an unknown functional form of household per capita income. Because of the underlying biases stemming from kernel smoothing estimates, in this study, we employ the locally weighted scatterplot smoothing (LOWESS) developed by Cleveland (1979), which uses changeable bandwidths and mitigates the possible biases associated with boundary problems.
Calorie-income elasticity: semiparametric model. Although nonparametric methods help us to identify potentially nonlinear calorie-income relations, their most evident limitation is the dimensionality that inhibits the introduction of control variables (DiNardo and Tobias, 2001). To solve this obstacle, we employ a semiparametic approach using a partially linear model (PLM):
π¦π = πππ½ + π(π§π) + ππ, πΈ(ππ|ππ, π§π) = 0, π = 1, β― , π (13)
39
where Xi is a matrix of explanatory variables, and zi is the variable of household per capita income with unknown functional form.
For this model, we use the Robinson (1988) double residual estimator with clustering at the community level, which first takes the conditional expectation on zi:
πΈ(π¦π|π§π) = πΈ(ππ|π§π)π½ + π(π§π) + πΈ(ππ|π§π), π = 1, β― , π (14)
Subtracting (13) from (14) then yields
π¦πβ πΈ(π¦π|π§π) = (ππβ πΈ(ππ|π§π))π½ + ππ, π = 1, β― , π (15)
E(yi|zi) and E(Xi|zi) are first estimated using nonparametric techniques and then replaced into model (15), so that Ξ² can be estimated consistently. Finally, Ξ»(zi) is estimated by regressing (16) nonparametrically (Verardi and Debarsy, 2012):
π¦πβ πππ½Μ = π(π§π) + ππ, π = 1, β― , π (16)
Semiparametric fixed-effects model. Instead of traditional fixed-effects techniques, we employ Baltagi and Liβs (2002) series estimator of partially linear panel data models with fixed effects, which is better able to identify an unknown or complicated nexus of two potential variables (Libois and Verardi, 2013). The general panel-data semiparametric model is as follows:
π¦ππ‘ = πππ‘β²πΎ + π(π§ππ‘) + πΌπ+ πππ‘, π = 1, β― , π; π‘ = 1, β― , π, π βͺ π (17)
where Xit is a matrix of explanatory variables assumed to be exogenous, zit
represents strictly exogenous variables, Ξ±i denotes time-invariant fixed effects, and Ξ΅it is the idiosyncratic error term. In this semiparametric fixed-effects approach, a series of pk(z) (the first k terms of a sequence of functions) is the
40
approximation of f(z) (Libois and Verardi, 2013). Again, we account for clustering at the individual level.
3.5. Results