
4. RESEARCH DESIGN

4.3 Research Methodologies

4.3.2 Spatial Hedonic Model

4.3.2.3 Theory of the Spatial Hedonic Model

For ordinary least squares (OLS) to be the best linear unbiased estimator and predictor, a set of ideal conditions must be satisfied: the explanatory variables must be independent of the errors, and the errors themselves must be independent, homoskedastic, and normally distributed. However, with cross-sectional data on geographic units, spatial autocorrelation or dependence (in either the dependent variable or the error term) violates these basic assumptions of the OLS estimator. Hence, using an OLS estimator when there is significant spatial autocorrelation can lead to misleading model interpretations (Anselin, 1988).14
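One common way to check for the spatial autocorrelation described above is Moran's I, a diagnostic that this section does not itself name. The sketch below is only an illustration under stated assumptions: the five units on a line, their contiguity matrix, the two value patterns, and the function names `row_standardize` and `morans_i` are all hypothetical. In practice the statistic would typically be computed on OLS residuals rather than on raw values.

```python
def row_standardize(w):
    """Scale each row of a contiguity matrix so that it sums to 1."""
    out = []
    for row in w:
        total = sum(row)
        out.append([v / total if total else 0.0 for v in row])
    return out


def morans_i(values, w):
    """Moran's I = (n / S0) * (z'Wz) / (z'z), with z the deviations from the mean."""
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]
    s0 = sum(sum(row) for row in w)  # sum of all weights
    num = sum(w[i][j] * z[i] * z[j] for i in range(n) for j in range(n))
    den = sum(v * v for v in z)
    return (n / s0) * num / den


# Hypothetical layout: five units on a line; neighbors share a border.
contiguity = [
    [0, 1, 0, 0, 0],
    [1, 0, 1, 0, 0],
    [0, 1, 0, 1, 0],
    [0, 0, 1, 0, 1],
    [0, 0, 0, 1, 0],
]
W = row_standardize(contiguity)

clustered = [1.0, 2.0, 3.0, 4.0, 5.0]    # similar values sit next to each other
alternating = [1.0, 5.0, 1.0, 5.0, 1.0]  # neighbors are as different as possible

i_clustered = morans_i(clustered, W)      # positive: about 0.6
i_alternating = morans_i(alternating, W)  # negative: about -1.0
print(i_clustered, i_alternating)
```

A clearly positive value indicates that similar values cluster in space, which is the situation in which the OLS independence assumption is most doubtful.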

A typical example in housing market research is housing prices: the prices in a neighborhood affect, and are affected by, the prices in adjacent neighborhoods. Two alternative specifications incorporate spatial dependence in the model explicitly (see Anselin, 1988; Anselin and Hudak, 1992; Smirnov and Anselin, 2001 for detailed discussions). They represent two closely related but different spatial effects. Figure 4.10 illustrates the concept of the maximum likelihood (ML) spatial lag and spatial error models.

14 Anselin (2006) and Anselin and Lozano-Gracia (2008a) provide a recent comprehensive review of this field. The explicit consideration of spatial effects through the application of spatial econometrics has become more commonplace in empirical studies of housing and real estate markets following pioneering work by Dubin (1988) and Can (1990, 1992), among others. Reviews of the basic specifications and estimation methods applied to these spatial hedonic models are provided in Anselin (1988); Basu and Thibodeau (1998); Can and Megbolugbe (1997); Pace, Barry, and Sirmans (1998); Dubin, Pace, and Thibodeau (1999); Gillen, Thibodeau, and Wachter (2001); Kelejian and Prucha (1998); and Pace and LeSage (2004), among others.

Figure 4.10. ML Spatial Lag and Error Models. Source: Baller et al., 2001.

The first is dependence in the spatially lagged dependent variable (similar to autocorrelation in time series), which Anselin and Rey (1991) refer to as "substantive dependence" because the model form is intended to capture interaction effects, market heterogeneity, or both. This type of dependence suggests that spatial spillover is dominant in the development. The substantive dependence model can be expressed as:

y = ρWy + βX + ε [4.7]

Where W is a spatial weight matrix describing the spatial linkage among spatial units; Wy is the spatially lagged dependent variable; ρ is the spatial autoregressive coefficient of the spatially lagged dependent variable; X is the n × k matrix of unit characteristics with the associated k × 1 coefficient vector β; and ε is the error term. Ignoring this form of spatial autocorrelation has consequences similar to those of omitting a significant explanatory variable from the right-hand side of the OLS regression model.
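The spillover implied by the spatial lag specification can be illustrated with a small numerical sketch. It does not estimate anything; it only solves the structural equation y = ρWy + βX + ε for given parameter values by fixed-point iteration. Everything specific here is an assumption made for illustration: three units on a line, a row-standardized W, ρ = 0.5, β = 2, a single explanatory variable that is nonzero only in the first unit, zero errors, and the hypothetical helper names `mat_vec` and `solve_spatial_lag`.

```python
def mat_vec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def solve_spatial_lag(w, c, rho, tol=1e-12, max_iter=10_000):
    """Solve y = rho * W y + c by fixed-point iteration.

    Assumes |rho| < 1 and a row-standardized W, so the iteration converges.
    """
    y = list(c)
    for _ in range(max_iter):
        lag = mat_vec(w, y)  # the spatially lagged dependent variable Wy
        new_y = [rho * l + ci for l, ci in zip(lag, c)]
        if max(abs(a - b) for a, b in zip(new_y, y)) < tol:
            return new_y
        y = new_y
    raise RuntimeError("fixed-point iteration did not converge")


# Hypothetical row-standardized weights for three units on a line.
W = [
    [0.0, 1.0, 0.0],
    [0.5, 0.0, 0.5],
    [0.0, 1.0, 0.0],
]
rho = 0.5
beta = 2.0
x = [1.0, 0.0, 0.0]    # the characteristic is present only in unit 0
eps = [0.0, 0.0, 0.0]  # errors switched off to isolate the spillover

c = [beta * xi + ei for xi, ei in zip(x, eps)]
y = solve_spatial_lag(W, c, rho)
spatial_lag = mat_vec(W, y)

print(y)  # approximately [7/3, 2/3, 1/3]
```

Units 1 and 2 have no characteristic of their own, yet their values are nonzero because the effect in unit 0 spills over through Wy and decays with distance. Unit 0 itself ends up above β = 2 because part of the spillover feeds back to it.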

The spatial lag model is an appropriate tool when capturing neighborhood price spillover effects. That is, this model assumes that the spatially weighted sum of neighborhood housing prices (the spatial lag) enters as an explanatory variable in the specification of housing price formation. This spillover effect only occurs among neighborhoods in close proximity. This specification, therefore, is in accord with the standard real-estate appraisal process of using comparable sales prices.

The second is dependence in the regression's error term, or "nuisance" dependence, which Anselin and Rey (1991) refer to as spatial autocorrelation in omitted variables, or unobserved externalities and heterogeneities relegated to the error term. It most likely results from a mismatch between the boundaries of the spatial process and those of the data collection units. The nuisance dependence model usually takes a spatially autoregressive error of the following form:

y = βX + ε

ε = λWε + µ [4.8]

Where λ is the spatial autoregressive coefficient of the error; W is the spatial weight matrix; ε is the spatially autocorrelated error term; and µ is a second error term. The spatial multiplier now pertains to the unobserved variables (the errors µ) but not to the explanatory variables of the model (X). In other words, the price at any location is a function of the local characteristics and also of the omitted variables at neighboring locations. The residuals, µ, are assumed to be uncorrelated with each other, because the dependence is accounted for through the spatial weight matrix; that is, µ is an independent and identically distributed (i.i.d.) error term.
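The role of the spatial multiplier in the error model can be made explicit with a short derivation. This is a sketch under stated assumptions: it takes the autoregressive error form ε = λWε + µ, assumes that (I − λW) is invertible, and assumes that µ has constant variance σ². The identity matrix I and the variance σ² are symbols introduced here for illustration. The derivation keeps the βX notation used for the systematic part of the model.

```latex
\begin{align*}
\varepsilon &= \lambda W \varepsilon + \mu \\
(I - \lambda W)\,\varepsilon &= \mu \\
\varepsilon &= (I - \lambda W)^{-1}\mu \\
y &= \beta X + (I - \lambda W)^{-1}\mu \\
\operatorname{Var}(\varepsilon) &= \sigma^{2}\left[(I - \lambda W)^{\top}(I - \lambda W)\right]^{-1}
\end{align*}
```

The multiplier (I − λW)^{-1} acts only on µ, so X affects y only at its own location. The off-diagonal elements of Var(ε) are generally nonzero, which is the error correlation that OLS ignores.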

The error term in a statistical model is an unobservable random variable representing the effects of all the unexplained factors that cause a property's value to differ from the population mean. It accounts for omitted variables, an incorrect functional form, and inadequate sampling. The consequences of ignoring spatial error dependence are similar to those of ignoring heteroskedasticity: OLS estimates of the spatial error model remain unbiased but are inefficient because the correlation between error terms is ignored. As a result, inference based on t and F statistics will be misleading and indications of fit based on R² will be incorrect (Anselin, 1988).
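The claim that OLS stays unbiased while its conventional standard errors become unreliable can be illustrated with a small Monte Carlo sketch. All specifics are assumptions chosen for illustration rather than anything taken from the study: 20 units arranged on a ring with each unit's two neighbors weighted 0.5, λ = 0.6, β = 2, a single smoothly varying regressor with no intercept, standard normal µ, and the hypothetical function names `spatial_errors` and `simulate`.

```python
import random
import statistics


def spatial_errors(mu, lam, tol=1e-10, max_iter=1000):
    """Solve eps = lam * W eps + mu on a ring where W averages the two neighbors."""
    n = len(mu)
    eps = list(mu)
    for _ in range(max_iter):
        new = [lam * 0.5 * (eps[i - 1] + eps[(i + 1) % n]) + mu[i] for i in range(n)]
        if max(abs(a - b) for a, b in zip(new, eps)) < tol:
            return new
        eps = new
    raise RuntimeError("fixed-point iteration did not converge")


def simulate(reps=1000, n=20, beta=2.0, lam=0.6, seed=0):
    """Return mean OLS estimate, its empirical std. dev., and the mean naive std. error."""
    rng = random.Random(seed)
    x = [(i - (n - 1) / 2) / 5 for i in range(n)]  # smooth, centered regressor
    sxx = sum(v * v for v in x)
    estimates, naive_ses = [], []
    for _ in range(reps):
        mu = [rng.gauss(0.0, 1.0) for _ in range(n)]
        eps = spatial_errors(mu, lam)
        y = [beta * xi + ei for xi, ei in zip(x, eps)]
        b_hat = sum(xi * yi for xi, yi in zip(x, y)) / sxx  # OLS through the origin
        rss = sum((yi - b_hat * xi) ** 2 for xi, yi in zip(x, y))
        naive_ses.append((rss / (n - 1) / sxx) ** 0.5)  # SE that assumes i.i.d. errors
        estimates.append(b_hat)
    return statistics.mean(estimates), statistics.stdev(estimates), statistics.mean(naive_ses)


mean_b, sd_b, mean_naive_se = simulate()
print(mean_b, sd_b, mean_naive_se)
```

Under these assumptions the average estimate should sit close to the true β, while the spread of the estimates across replications should exceed the average conventional standard error. In that case t statistics built on the conventional standard error would overstate significance.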

In house pricing models, the error term also accounts for a transaction error that represents the difference between transaction prices and the expected market price relative to other houses in the market (Can and Megbolugbe, 1997). The spatial error model uses the correlated errors on nearby properties to improve the overall prediction.

4.3.3 Alternatives of ML Spatial Hedonic Models