Difference-in-difference (DID) HPM regression

1.3 Empirical applications

1.3.2 Difference-in-difference (DID) HPM regression

Bin and Polasky (2004) observed that previous studies which estimated the implicit price of flood risk and found a large discount of between 4 and 12% were all in areas that had experienced recent flooding (Donnelly, 1989; MacDonald, Murdoch, and White, 1987; Shilling, Benjamin, and Sirmans, 1985; Speyrer and Ragas, 1991), whereas the study of Harrison, Smersh, and Schwartz (2001), which found a discount of between 1 and 3%, was unusual in covering a location that had not experienced any major flooding in the recent past. This led the authors to propose that recent experience with flooding raises the perception of flood risk and the associated discount for living in a floodplain. This argument follows from prospect theory, which suggests that people poorly integrate risk into their decisions, especially when the risk is high consequence and low probability, as with natural disasters (Camerer and Kunreuther, 1989; Kunreuther, 1976, 1996; Kunreuther and Slovic, 1978; McDaniels, Kamlet, and Fischer, 1992; Slovic, 1987; Smith, 1986). Authors such as Hallstrom and Smith (2005), Kousky (2010), Atreya and Ferreira (2011a, 2012b) and Bin and Landry (2013) argue that since floods are low probability events (floodplains are delineated as areas with a 1% or 0.2% annual probability of flooding), individuals often neglect the associated risk, 𝜋. However, recent experience with flooding provides new information that leads individuals to update their subjective assessment of risk, 𝑝(𝑖, 𝑟), which causes a change in the price of floodplain property.

Since Bin and Polasky (2004), there has been a growing body of research examining the effect of new information about flood risk, and how this is capitalised into the prices of floodplain-designated properties, using a difference-in-difference (DID) approach. The strategy for identification relies on the occurrence of natural events as a source of exogenous variation in the explanatory variable and introduces a temporal element to the analysis through a before-after comparison. These natural events are usually changes or spatial variation in rules governing behaviour, which are assumed to satisfy the randomness criterion (Rosenzweig and Wolpin, 2000). The vast majority of DID HPM applications in the flood risk literature use the occurrence of a flood event (or hurricane) as a source of exogenous variation to identify the information effect on property prices. However, Harrison, Smersh, and Schwartz (2001), Troy and Romm (2004) and Pope (2008) look at the information effect of changes in regulations for properties in the floodplain, and Samarasinghe and Sharp (2010) evaluate the effect of floodplain zoning. Hallstrom and Smith (2005) suggest that changes in information can also be due to media coverage and education programs, amongst others.

Authors such as Meyer (1995), Rosenzweig and Wolpin (2000) and Carbone, Hallstrom, and Smith (2006) emphasise that the use of weather events as an identification strategy does not guarantee a random treatment. As is clear from equations (19) and (20), flood hazard is a spatially delineated risk, such that the probability of occurrence of a flood and the amount of losses are highly related to location and to individuals’ preferences towards risk and other environmental amenities. Therefore, the application of DID models in the flood risk literature has been described as a quasi-experimental (quasi-random) approach, where the control is not guaranteed to meet the standards of a completely random assignment (Carbone, Hallstrom, and Smith, 2006; Meyer, 1995).⁶

Following Meyer (1995), Carbone, Hallstrom, and Smith (2006) and Parmeter and Pope (2012), there are two dimensions distinguishing the structure of a quasi-experiment: the group assignment for each unit (house) in the study, i.e. inside or outside a floodplain, and the timing (𝑡) of the potential outcome that is observed for each unit. The price 𝑃_𝑖𝑡 designates the outcome we observe for each house, 𝑖. Therefore there exist two potential prices, 𝑃_𝑖0 and 𝑃_𝑖1, for before and after the treatment. Carbone, Hallstrom and Smith (2006) argue that combining temporal variation in risk perceptions with spatial variation in risk characteristics can also help to avoid the endogeneity problems discussed earlier.

Hedonic studies using a DID approach can be distinguished according to the kind of data they use. The most common approach is to model property prices as a pooled cross-section over time, i.e. the prices of houses at different points in time do not correspond to sales of the same property. Housing sales in the same region are observed over time and unobserved heterogeneity is controlled for using region or neighbourhood level fixed effects (Parmeter and Pope, 2012). The second approach is known as the repeat-sales model. It uses actual panel data by considering sale prices of the same houses that have been sold multiple times over a given time period in which some of the houses experienced an environmental change which is not uniform across properties; a flood event can be regarded as one such change. Below, we describe these two applications and discuss the evidence.

⁶ As Meyer (1995) and Carbone, Hallstrom and Smith (2006) pointed out, in economics natural experiments are usually induced by policy changes, government randomization, or other events, and examine outcome measures for observations in treatment groups and comparison groups that are not randomly assigned.

Pooled Cross-Section over time

Empirical applications of the DID framework using pooled cross-section data over time can be described using the general regression model in equation (25), which assumes a linear HPF specification as in (22). The treatment group is distinguished by floodplain location (𝐹𝑃100_𝑖 or 𝐹𝑃500_𝑖) and the treatment refers to the occurrence of a flood. The timing is the date of the sale in relation to the flood event (𝐹𝑙𝑜𝑜𝑑_𝑖); for studies analysing the information effect of policy changes, the timing is given with respect to the date on which the new policy was implemented.

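$$
\begin{aligned}
\ln P_{it} = {} & \beta_0 + \theta_1 \mathit{FP100}_i + \theta_2 \mathit{FP500}_i + \alpha \mathit{Flood}_i + \psi_1 (\mathit{Flood}_i \times \mathit{FP100}_i) \\
& + \psi_2 (\mathit{Flood}_i \times \mathit{FP500}_i) + \sum_{j=1} \beta_j Z_{ij} + \gamma r_i + \varepsilon_{it}
\end{aligned}
\tag{25}
$$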

The variables 𝑍 and 𝑟 are observable sources of heterogeneity and have the same interpretation as in equation (23); 𝜀_𝑖𝑡 captures unobservable sources of heterogeneity which vary with the property, and the usual iid assumptions apply. The variable 𝐹𝑙𝑜𝑜𝑑 is a dummy variable equal to one if the sale happened after the flood event of interest, and 𝐹𝑃100 and 𝐹𝑃500 are dummy variables which take the value of one if the house is located within a 100-year or 500-year floodplain, respectively. The parameter 𝜃_𝑖 represents the group effect, i.e. the pre-flood relative price differential between the control group (no floodplain location) and the treatment group (100-year and/or 500-year floodplain location); 𝛼 captures the time effect, i.e. the relative price difference for all properties that were sold after the flood; and 𝜓_𝑖 represents the treatment response, i.e. the incremental effect due to information conveyed by the flood (treatment) in known risky locations (floodplains). That is,

$$
\hat{\psi}_1 = \left( \overline{\ln P}_{1,\mathit{FP100}} - \overline{\ln P}_{0,\mathit{FP100}} \right) - \left( \overline{\ln P}_{1} - \overline{\ln P}_{0} \right)
\tag{26}
$$

By introducing separate variables to control for different levels of risk it is possible to analyse how new information about flood risk is capitalised at different levels of risk. A similar expression applies for 𝜓̂₂ in the 500-year floodplain. The key assumption for identification is that 𝐸[𝜀_𝑖𝑡|𝐹𝑙𝑜𝑜𝑑_𝑖] = 0, for 𝑡 = 0, 1 (before and after the flood).
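The link between the interaction coefficient in equation (25) and the difference of mean log prices in equation (26) can be checked numerically. The sketch below is illustrative only: the data are simulated, the parameter values are arbitrary assumptions, the floodplain zones are assumed to be mutually exclusive, and the controls 𝑍 and 𝑟 are omitted so that the design is saturated.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 6000

# Hypothetical simulated sample: each sale lies in exactly one of three zones.
zone = rng.choice(["none", "fp100", "fp500"], size=n, p=[0.7, 0.15, 0.15])
fp100 = (zone == "fp100").astype(float)
fp500 = (zone == "fp500").astype(float)
flood = rng.integers(0, 2, size=n).astype(float)  # 1 if sold after the flood

# Assumed "true" parameters, chosen only for illustration.
beta0, theta1, theta2, alpha, psi1, psi2 = 12.0, -0.03, 0.01, 0.05, -0.10, -0.02
ln_p = (beta0 + theta1 * fp100 + theta2 * fp500 + alpha * flood
        + psi1 * flood * fp100 + psi2 * flood * fp500
        + rng.normal(0.0, 0.2, size=n))

# OLS on the saturated DID design (no Z or r controls in this sketch).
X = np.column_stack([np.ones(n), fp100, fp500, flood,
                     flood * fp100, flood * fp500])
coef, *_ = np.linalg.lstsq(X, ln_p, rcond=None)
psi1_ols = coef[4]

# Difference of mean log prices, following the structure of equation (26).
control = zone == "none"
treated = zone == "fp100"
after, before = flood == 1, flood == 0
psi1_means = ((ln_p[treated & after].mean() - ln_p[treated & before].mean())
              - (ln_p[control & after].mean() - ln_p[control & before].mean()))

print(f"psi1 from OLS:   {psi1_ols:.4f}")
print(f"psi1 from means: {psi1_means:.4f}")
```

Without additional controls the two numbers coincide up to floating-point error; once 𝑍 and 𝑟 enter the regression, the OLS coefficient becomes a covariate-adjusted version of the same contrast.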

In this case, it is possible to obtain an expression for the post-flood price differential for properties located in the floodplain, Θ_𝑖, which is defined as the sum of two terms: the pre-flood price differential (𝜃_𝑖) plus the incremental effect due to an update of flood risk perception for properties within known risky locations (𝜓_𝑖), as shown in equation (27).

$$
\Theta_i = \theta_i + \psi_i, \quad i = 1, 2 \text{ for FP100 and FP500, respectively.}
\tag{27}
$$
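As a small worked example of equation (27), the sketch below adds a pre-flood differential to an information effect and converts the resulting log-point differential into a percentage price change. The coefficient values and the function names are hypothetical, chosen only for illustration; the conversion 100·(exp(Θ) − 1) is the usual one for a dummy variable in a log-linear price equation.

```python
import math


def post_flood_differential(theta: float, psi: float) -> float:
    """Equation (27): post-flood log-price differential, theta + psi."""
    return theta + psi


def log_points_to_percent(coef: float) -> float:
    """Percentage price change implied by a log-price coefficient."""
    return 100.0 * (math.exp(coef) - 1.0)


# Hypothetical coefficient values, for illustration only.
theta_1, psi_1 = -0.03, -0.10
big_theta_1 = post_flood_differential(theta_1, psi_1)

print(f"post-flood differential (log points): {big_theta_1:.3f}")
print(f"implied price change (%): {log_points_to_percent(big_theta_1):.2f}")
```

For small coefficients the log-point value and the percentage change are close, but they diverge as the differential grows, which matters when comparing large reported discounts across studies.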

Recent applications of the DID HPF approach to the flood risk literature focus on the issue of potential correlation between spatially delineated risks and unobserved characteristics, and the exogeneity assumption 𝐸[𝜀_𝑖𝑡|𝐹𝑙𝑜𝑜𝑑_𝑖] = 0 of the treatment variable for the identification of 𝜓𝑖. In particular, authors such as Atreya and Ferreira (2012a), Atreya,

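spatial specification of the DID HPM; these applications can be generalized using the following equation:

$$
\begin{aligned}
\ln P_{it} = {} & \beta_0 + \rho W_i \ln P_i + \theta_1 \mathit{FP100}_i + \theta_2 \mathit{FP500}_i + \alpha \mathit{Flood}_i + \psi_1 (\mathit{Flood}_i \times \mathit{FP100}_i) \\
& + \psi_2 (\mathit{Flood}_i \times \mathit{FP500}_i) + \sum_{j=1} \beta_j Z_{ij} + \gamma r_i + (I - \lambda W_i)^{-1} \varepsilon_{it}
\end{aligned}
\tag{28}
$$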

Equation (28) includes two different types of spatial process models, which can be derived by putting restrictions on the parameters 𝜌 and 𝜆. Meldrum (2013) and Bin and Landry (2013) estimate a spatial lag model, which is obtained in the case where 𝜌 ≠ 0 and 𝜆 = 0; as in equation (24), 𝑊 represents a spatial weights matrix. On the other hand, authors such as Atreya and Ferreira (2012c), Atreya and Ferreira (2012a) and Atreya, Ferreira, and Kriesel (2012) estimate a Spatial Autoregressive Model with Autoregressive Disturbances (SARAR) (Anselin and Florax, 1995); this model is obtained in the case where 𝜌 ≠ 0 and 𝜆 ≠ 0. In this model, spatial interactions in the dependent variable are modeled with the spatial lag structure, with spatial weights 𝑊, and the error term assumes a spatially weighted error structure accounting for unobserved spatial correlation.
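The nesting of these models in equation (28) can be illustrated by simulating log prices from its reduced form under different restrictions on 𝜌 and 𝜆. This is a minimal sketch under stated assumptions: the k-nearest-neighbour weights matrix, the helper names and all parameter values are hypothetical and are not taken from the studies cited.

```python
import numpy as np


def row_standardised_weights(coords: np.ndarray, k: int = 3) -> np.ndarray:
    """k-nearest-neighbour spatial weights matrix with rows summing to one."""
    n = coords.shape[0]
    dist = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=2)
    np.fill_diagonal(dist, np.inf)          # a house is not its own neighbour
    W = np.zeros((n, n))
    nearest = np.argsort(dist, axis=1)[:, :k]
    W[np.arange(n)[:, None], nearest] = 1.0 / k
    return W


def simulate_log_prices(X, beta, W, rho, lam, eps):
    """Reduced form of (28): solve (I - rho W) y = X beta + u,
    where the disturbance u solves (I - lam W) u = eps."""
    identity = np.eye(W.shape[0])
    u = np.linalg.solve(identity - lam * W, eps)
    return np.linalg.solve(identity - rho * W, X @ beta + u)


rng = np.random.default_rng(1)
n = 200
coords = rng.uniform(0, 10, size=(n, 2))     # hypothetical house locations
W = row_standardised_weights(coords)
X = np.column_stack([np.ones(n), rng.integers(0, 2, n)])  # constant + one dummy
beta = np.array([12.0, -0.05])               # assumed values
eps = rng.normal(0, 0.1, n)

y_plain = simulate_log_prices(X, beta, W, rho=0.0, lam=0.0, eps=eps)  # no spatial terms
y_lag = simulate_log_prices(X, beta, W, rho=0.4, lam=0.0, eps=eps)    # spatial lag
y_sarar = simulate_log_prices(X, beta, W, rho=0.4, lam=0.3, eps=eps)  # SARAR

print(np.allclose(y_plain, X @ beta + eps))
```

Setting both parameters to zero returns the non-spatial specification, which is why the spatial lag and SARAR models can be read as restrictions on a single general equation. Estimating 𝜌 and 𝜆 from data requires maximum likelihood or instrumental-variable methods, which this sketch does not attempt.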

A shortcoming of this approach is the amount of information it requires, since information on all the major structural and locational characteristics (𝑍_𝑖 and 𝑟_𝑖) influencing the value of a house must be included in the regression to ensure unbiased estimates (Palmquist, 1982, 2005). A focus of debate has been the existence of unobserved, time-invariant omitted variables causing econometric issues due to spatial autocorrelation that hinder estimation of the hedonic price model (Bin and Polasky, 2004; Hallstrom and Smith, 2005). Hallstrom and Smith (2005) and Carbone, Hallstrom, and Smith (2006) argue that the hedonic model cannot isolate the effects of new information even when considering transactions before and after a flood event. For these reasons authors such as Hallstrom and Smith (2005), Kousky (2010) and Lamond, Proverbs, and Antwi (2007a) have suggested the use of the repeat-sales model.

Repeat-sales model

The repeat-sales model uses actual panel data by considering sale prices of the same houses that have been sold multiple times over a given time period in which some of the houses experienced an environmental change which is not uniform across properties. This model can be derived from the hedonic price model, and it is used to remove unobservable, time-invariant characteristics of a property from the specification of the hedonic model (Kousky, 2010; Palmquist, 1982, 2005). Palmquist (1982) argues that between the sales of a house there are changes in some characteristics, such as age, environmental quality and the general real estate price level; however, other characteristics of the house (structural and locational) remain the same. Therefore, by considering two sales of the same property it is possible to recover estimates for the effect of those aspects of a home’s location that change over time.

In the flood risk literature, the specification of the repeat-sales model to assess the effects of new information on property prices assumes that the effect of a flood event as new information is to introduce a constant differential between homes located within a floodplain and those which are not (Hallstrom and Smith, 2005). Here we present the basic theoretical framework of the repeat-sales model. The exposition closely follows that of Palmquist (1982, 2005) and Kousky (2010).

Formally, let 𝑍_𝑖 again represent the set of structural, locational and environmental characteristics, 𝑟_𝑖 the site attributes related to flood risk and 𝐶_𝑖 the set of unobserved characteristics for house 𝑖. All these variables are assumed to remain unchanged between the two sale periods, 𝑡 = 0, 1. Thus, the time of the first sale is denoted by 0 and that of the second sale by 1. The variable 𝑎𝑔𝑒 represents the age of the structure at the time of sale (for 0 or 1). As in equation (25), 𝐹𝑃100 and 𝐹𝑃500 are dummy variables which take the value of one if the house is located within a 100-year and 500-year floodplain, respectively, and the variable 𝐹𝑙𝑜𝑜𝑑_𝑖𝑡 is a dummy variable equal to one if the sale happened after the flood event of interest (𝑡 = 1). Therefore, following Palmquist (1982), the price of property 𝑖 in year 𝑡 is given by:

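$$
\begin{aligned}
P_{it} = {} & B_t \, g(Z_i, r_i, C_i) \exp(\gamma_1 \mathit{FP100}_i \times \mathit{Flood}_{it}) \exp(\gamma_2 \mathit{FP500}_i \times \mathit{Flood}_{it}) \exp(\gamma_3 \mathit{Flood}_{it}) \\
& \times \exp(\gamma_4 \mathit{age}_{it}) \exp(\varepsilon_{it})
\end{aligned}
\tag{29}
$$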

where 𝐵_𝑡 denotes the real estate price index and 𝛾’s are parameters to be estimated. 𝜀_𝑖𝑡 represents an idiosyncratic error term for which the usual iid assumptions apply. As the repeat-sales model requires at least two sales for each property, for house 𝑖 there is an earlier sale in year 𝑠 (𝑡 = 0) for which the price is explained by an equation similar to (29). Dividing the former by the latter and assuming structural, locational, and neighbourhood characteristics (𝑍_𝑖, 𝑟_𝑖, 𝐶_𝑖) are constant over the period of analysis, as well as the parameters of the hedonic price function, the term 𝑔(𝑍_𝑖, 𝑟_𝑖, 𝐶_𝑖) drops out of the equation such that unobservables, 𝐶_𝑖, are no longer a concern. Following Palmquist (1982) it is also possible to drop the 𝑎𝑔𝑒 variable as it is perfectly collinear with the price index and an estimation of depreciation is not of interest in this case (Kousky, 2010). Taking the natural logarithm of the remaining expression yields the repeat-sales model specification in equation (30).

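$$
\begin{aligned}
\Delta \ln(P_i) = {} & \beta_1 (\mathit{FP100}_i \times \mathit{Bracket}_i) + \beta_2 (\mathit{FP500}_i \times \mathit{Bracket}_i) + \beta_3 \mathit{Bracket}_i \\
& + \beta_4 \mathit{Year}_1 + \beta_5 \mathit{Year}_0 + \Delta e_i
\end{aligned}
\tag{30}
$$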
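The step from the price equation (29) to the repeat-sales specification (30) can be sketched as follows. This is an illustrative reconstruction rather than the authors' own derivation: the flood-related coefficients are relabelled 𝛽₁, 𝛽₂ and 𝛽₃ to match equation (30), the two sales are written as 𝑡 = 1 and 𝑡 = 0, and the age term is omitted as described in the text.

$$
\begin{aligned}
\frac{P_{i1}}{P_{i0}}
  &= \frac{B_1}{B_0} \cdot \frac{g(Z_i, r_i, C_i)}{g(Z_i, r_i, C_i)} \cdot
     \exp\!\bigl[ (\beta_1 \mathit{FP100}_i + \beta_2 \mathit{FP500}_i + \beta_3)(\mathit{Flood}_{i1} - \mathit{Flood}_{i0}) \bigr]
     \exp(\varepsilon_{i1} - \varepsilon_{i0}) \\
\Delta \ln(P_i)
  &= (\ln B_1 - \ln B_0) + (\beta_1 \mathit{FP100}_i + \beta_2 \mathit{FP500}_i + \beta_3)\, \mathit{Bracket}_i + \Delta e_i
\end{aligned}
$$

Here 𝐵𝑟𝑎𝑐𝑘𝑒𝑡_𝑖 = 𝐹𝑙𝑜𝑜𝑑_𝑖1 − 𝐹𝑙𝑜𝑜𝑑_𝑖0 and ∆𝑒_𝑖 = 𝜀_𝑖1 − 𝜀_𝑖0. The ratio of the two 𝑔(·) terms equals one, which is how the unobserved characteristics 𝐶_𝑖 leave the model, and the difference ln 𝐵₁ − ln 𝐵₀ is what the year terms in equation (30) capture.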

Notice that after dividing expression (29) at 𝑡 = 1 by that at 𝑡 = 0, the term identifying post-flood sales, 𝐹𝑙𝑜𝑜𝑑_𝑖, is replaced by a term, 𝐵𝑟𝑎𝑐𝑘𝑒𝑡_𝑖, identifying sales that bracket the flood, i.e. those for which the first sale is before the flood and the second one after.⁷ The interaction terms 𝐹𝑃100_𝑖 × 𝐵𝑟𝑎𝑐𝑘𝑒𝑡_𝑖 and 𝐹𝑃500_𝑖 × 𝐵𝑟𝑎𝑐𝑘𝑒𝑡_𝑖 identify those properties within the floodplain, at different levels of risk, with sales that bracket the flood. The natural logarithm of 𝐵 (in equation (29)) takes the form of coefficients on dummy variables for the year of the sale (Palmquist, 2005). Therefore, assuming there are no other changes in observable variables that contribute to price differences and that unobservables, represented by (𝑒_𝑖1 − 𝑒_𝑖0), are not correlated with the effect being measured, 𝛽̂₁ can be expressed as

$$
\hat{\beta}_1 = \left( \overline{\ln P}_{1,\mathit{FP100}} - \overline{\ln P}_{0,\mathit{FP100}} \right) - \left( \overline{\ln P}_{1} - \overline{\ln P}_{0} \right)
\tag{31}
$$

in which the coefficient on the environmental variable, 𝛽₁, represents the marginal effect of changes in environmental attributes on property values in relative terms, i.e. the panel data equivalent of 𝜓̂₁ in equation (25) (Kousky, 2010; Palmquist, 1982). A similar expression applies for 𝛽̂₂ in the 500-year floodplain. Notice that in this case it is not possible to recover an expression for the pre-flood and/or post-flood price differential for floodplain location. The only information we get is how new information, due to environmental changes, is capitalised in known risky locations with different levels of risk.
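The way differencing removes the unobserved characteristics 𝐶_𝑖 can be illustrated with a small simulation of equation (30). This is a sketch under stated assumptions: the data are simulated, all parameter values are arbitrary, sale dates are continuous with the flood placed midway through a calendar year (so that the stand-alone bracket term is not collinear with the year terms), and the year terms are coded +1 for the second-sale year and −1 for the first-sale year.

```python
import numpy as np

rng = np.random.default_rng(2)
n, n_years, flood_time = 4000, 10, 5.5     # flood occurs midway through year 5

# Two sale dates per house (continuous time, in years), first sale earlier.
times = np.sort(rng.uniform(0, n_years, size=(n, 2)), axis=1)
t0, t1 = times[:, 0], times[:, 1]
year0, year1 = t0.astype(int), t1.astype(int)

zone = rng.choice(3, size=n, p=[0.6, 0.2, 0.2])  # 0 = outside, 1 = FP100, 2 = FP500
fp100, fp500 = (zone == 1).astype(float), (zone == 2).astype(float)

# Bracket_i = Flood_i1 - Flood_i0: one only when the two sales straddle the flood.
flood0 = (t0 >= flood_time).astype(float)
flood1 = (t1 >= flood_time).astype(float)
bracket = flood1 - flood0

# Assumed "true" values, for illustration only.
b1, b2, b3 = -0.10, -0.02, 0.03
ln_index = np.cumsum(rng.normal(0.03, 0.02, n_years))  # ln B_t by calendar year
c_i = rng.normal(0, 0.5, n) - 0.2 * fp100              # unobserved, time-invariant


def ln_price(year, flood):
    """Log price with an unobserved house effect and idiosyncratic noise."""
    return (ln_index[year] + c_i + (b1 * fp100 + b2 * fp500 + b3) * flood
            + rng.normal(0, 0.05, n))


d_ln_p = ln_price(year1, flood1) - ln_price(year0, flood0)  # c_i cancels here

# Year terms: +1 in the second-sale year, -1 in the first-sale year.
# Year 0 is the omitted base, since the rows of D sum to zero.
D = np.zeros((n, n_years))
np.add.at(D, (np.arange(n), year1), 1.0)
np.add.at(D, (np.arange(n), year0), -1.0)

X = np.column_stack([fp100 * bracket, fp500 * bracket, bracket, D[:, 1:]])
coef, *_ = np.linalg.lstsq(X, d_ln_p, rcond=None)
print("beta1, beta2, beta3 estimates:", np.round(coef[:3], 3))
```

Although the simulated 𝐶_𝑖 is correlated with floodplain location, the estimates of 𝛽₁ and 𝛽₂ stay close to the assumed values because 𝐶_𝑖 never enters the differenced regression. The level effect of floodplain location is lost in the same step, which mirrors the point that pre-flood and post-flood differentials cannot be recovered from this model.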

⁷ If the flood occurred before the time of the first sale (𝑠), it also took place before the second sale (𝑡) and 𝐹𝑙𝑜𝑜𝑑_𝑖𝑡 − 𝐹𝑙𝑜𝑜𝑑_𝑖𝑠 = 0, implying 𝐵𝑟𝑎𝑐𝑘𝑒𝑡_𝑖𝑡𝑠 = 0. When both sales were before the flood the variable is also zero, and it is impossible for the flood to be before the first sale and not before the second. The only way for 𝐹𝑙𝑜𝑜𝑑_𝑖𝑡 − 𝐹𝑙𝑜𝑜𝑑_𝑖𝑠 to be equal to 1, i.e. 𝐵𝑟𝑎𝑐𝑘𝑒𝑡_𝑖𝑡𝑠 = 1, is when the two sales bracket the flood date.

Although the repeat-sales model deals with the possible omitted variable bias, it brings with it additional complications. The sample is restricted to properties that have been sold more than once; it is therefore not a random sample and is usually a small fraction of the full data set. The model also assumes that real estate depreciates at a geometric rate and that risk has a linear effect on the natural logarithm of property price (Kousky, 2010; Palmquist, 1982, 2005).

Evidence

Based on early empirical observations, studies by Tobin and Newton (1986) and Montz and Tobin (1988) describe different dynamics that house prices might experience following a flood event. The authors suggest that negative aspects of flood hazard are capitalised into property prices to an extent that varies spatially and temporally depending on the frequency, severity and spatial characteristics of flood events, and that a recovery process of house prices might follow depending on various socio-economic criteria along with the prevailing flood conditions.

Tobin and Newton (1986) propose four different price profiles, which are shown in figure 1.1; the timing of flood events is indicated by vertical dotted lines. Figure 1.1.A depicts the evolution of house prices in a location with rare flood events; the flood has an initial impact, reducing property prices, but after a period of time prices recover to levels at or near those prevailing prior to the event. Figure 1.1.B represents the situation of an area subject to regular flooding. In this case people are aware of flood risk in their community (this might also happen due to disclosure policies), the effects of floods are already capitalised into property prices and recent floods provide no new information; the market does not have sufficient time to recover before the occurrence of a subsequent flood. Lamond and Proverbs (2006) argue that in this case a DID study focusing on an individual flood event would reveal no information effect. Figure 1.1.C shows a situation in which floods are less frequent than in figure 1.1.B, such that the market is able to recover before a new flood occurs. In figure 1.1.D the occurrence of a catastrophic flood provides new information about flood risk for a community and permanently changes residents’ expectations; damage could be so great as to preclude any noticeable recovery in property prices.

Figure 1.1 Tobin and Newton (1986): Price dynamics after a flood event (panels: Figure 1.1.A, Figure 1.1.B, Figure 1.1.C, Figure 1.1.D)

Thus, it is possible to identify two main areas of research for the application of hedonic DID models in the flood risk literature: studies which focus on examining the effect of new information about flood risk and how this is capitalised in prices of properties within the floodplain (see for example Bin and Polasky, 2004; Troy and Romm, 2004; Kousky, 2010; Atreya and Ferreira, 2012a, 2012c), and those which additionally explore the persistence of this information effect over time (see for example Atreya and Ferreira, 2012a, 2012c; Atreya, Ferreira and Kriesel, 2013; Bin and Landry, 2013).

The literature suggests that the average price discount for floodplain location before a flood is about 1%. The results, however, range from a discount of 20% in Atreya and Ferreira (2012a) to a premium of 32% in Morgan (2007). Separating the estimates by level of risk, location in a 100-year floodplain before a flood is associated with an average discount of 3%, whereas location in a 500-year floodplain under the same circumstances is associated with an average premium of the same magnitude.

There is general agreement that prices of properties within a floodplain decline after a flood event, with the evidence suggesting an average reduction of around 15%. This evidence is consistent with the idea that recent experience of flooding raises the perception of flood risk and the associated discount for living in a floodplain (Bin and Polasky, 2004). However, there is no agreement on how the information update operates at different levels of risk. For instance, Kousky (2010) examines the change in the price differential for floodplain location for properties in St. Louis County, Missouri, US, after a flood in 1993 on the Missouri and Mississippi rivers, using data for 424,727 properties that were sold during the period 1979-2006. The results suggest that before the flood properties in the 100-year floodplain were significantly discounted, on average, by about 3%, whereas no significant price differential was observed for properties located in the 500-year floodplain. After the flood, property prices in the 100-year floodplain did not change significantly, but prices of properties in the 500-year floodplain experienced a significant decline of 2%. The author attributes these results to pre-flood differences in the information about flood risk available to homeowners in floodplains with different levels of risk. Kousky (2010) points out that in the US, sellers of houses at the highest level of risk (100-year floodplain) have to provide prospective buyers with a Natural Hazard Disclosure Statement (NHDS) prior to the transaction. In this way potential buyers are aware of the risk they face. Nonetheless, this disclosure clause is not applicable to properties within 500-year floodplains. Thus, the results indicate that little updating after the flood occurs in areas that had some prior knowledge of flood risk (100-year floodplain), but significant updating occurred where no prior capitalisation of flood risk into property prices had taken place (500-year floodplain).

Bin and Landry (2013) explore the change in implicit flood risk prices after Hurricanes Fran (1996) and Floyd (1999) using a sample of 8,159 properties in Pitt County, North Carolina, US, for the period 1992-2008. Their results suggest that it is properties in the 100-year floodplain that experience a significant discount of between 8 and 12% after a major flood, while properties in the 500-year floodplain do not experience significant changes. Atreya, Ferreira, and Kriesel (2013) find similar results, but with a significant price decline of between 28 and 48% for properties located in the 100-year floodplain after a major flood in Dougherty County, Georgia, US.

Following a flood event the discount in properties is expected to be large, as homeowners are also likely to have experienced flood damages. Hallstrom and Smith (2005) argue that estimates of 𝜓̂_𝑖 from DID HPM regressions also capture the effect of flood damages and how repairs and reconstruction after the flood might have affected property prices.

In document Essays on the economic valuation of flood risk (Page 48-67)