Base Case Summary Statistics
Summary statistic for the base case presented in Table 3. Population density statistics are presented at the Census block group level while vehicle-kilometers travelled are shown in the aggregate (i.e., total forecast vehicle-kilometers travelled for the study region). Per capita vehicle-kilometers travelled are also displayed for illustrative purposes.
Table 3. Base Case Summary Statistics Mean Standard Deviation Maximum Observed Minimum Observed Total Population Density,
Census Block Group (persons/mi2)
2,180 2,315 18,784 0.27 460
Total Population,
Census Block Group 1,927 1,055 8,745 2 1,664,837
Vehicle-Kilometers
Travelled, AM Peak - - - -
21,298,906
(12.8 per capita)
The study area has a low aggregate population density, although there is significant variation amongst Census block groups in terms of population density. The highest observed value is greater than seven standard deviations above the mean whereas the lowest observed value is within one standard deviation of the mean, indicating a distribution that is somewhat positively skewed. Low regional population density, in conjunction with automobile dependence in the region, result in high predicted per capita vehicle-kilometers travelled. Under base case conditions, the TRM predicts 12.8 vehicle- kilometers per capita for the AM peak alone.
34 Calibrated Model
PM2.5 concentrations in the study area are estimated by the following equation:
[6] Predicted annual average PM2.5 concentration,
µgm-3
= Total vehicle-kilometers travelled during the AM peak within a 1,000 meter buffer of the estimation point, thousands of vehicle-kilometers travelled
Population density within a 1,500 meter buffer of the estimation point, thousands of persons per square mile
The model described in equation [6] explains 87.8% of the variation in observed PM2.5 concentrations in the study area. To reflect additional uncertainty in the model stemming from limited degrees of freedom, the adjusted R2 of the calibrated model falls to 0.7969. Thus, despite the small number of observations, the model exceeds the performance of many models in the literature and exceeds the performance of all models in the literature predicting particulate matter concentrations (see Table 2). While both the model constant and the VKT_1000m parameter are statistically significant (p = 0.015 and 0.019, respectively), the POP_DEN_1500m is not statistically significant (p = 0.590). However, inclusion of the POP_DEN_1500m parameter improves the model fit (R2 = 0.826, adjusted R2 = 0.7667 without parameter). Therefore, given the small sample size and the significance of population density in previous studies (see, for example, Ross et al. 2007, Henderson et al. 2007) and the improvement in overall model performance attributable to
35
inclusion of the parameter, POP_DEN_1500m is left in the model despite statistical insignificance. See Appendix III for the outputs of statistical analysis.
Base Case PM2.5 Concentrations
Predicted annual average PM2.5 concentrations in the Triangle area are depicted in Figure 5. As expected, areas of higher predicted annual average PM2.5 concentrations are spatially correlated with significant links in the transportation network. The spatial distribution of predicted annual average PM2.5 concentrations illustrative the highly localized impacts associated with regional transportation patterns, particularly along primary transportation links such as I-40. A notable shortcoming of the model is its inability to predict variation in local air quality associated with point-source emissions, general industrial land use, and special-case uses such as airports. This limitation is attributable to the lack of sufficient such uses near ambient air quality monitoring stations in the Triangle Region. Nonetheless, the model demonstrates that regional travel parts have a significant impact on local air quality in the Triangle Region.
The highest predicted concentration, 15.62 µg m-3, occurs in a grid cell located at the western intersection of I-40 and I-440 southwest of Raleigh. This value is significantly above the estimated regional average concentration, 15.62 µg m-3 (8.64 standard deviations above the mean). The very high predicted PM2.5 concentration this cell is the result of very high predicted vehicle-kilometers (132,296) travelled during the AM peak within the 1,000 meter buffer around the centroid of this cell. The predicted concentration in this cell is illustrative of the potentially highly localized impacts of regional mobility – while this cell contains only 0.006% of the total area of the study
36
region, 0.62% of the predicted vehicle-kilometers travelled during the AM peak occur within the 1,000 meter buffer extending from the centroid of this cell.
Figure 5. Predicted PM2.5 Concentrations, Base Case
Base Case Attributable Mortality
The risk assessment model synthesizes predicted PM2.5 concentrations spatially averaged at the block group level, county-level health data, and epidemiological evidence
37
regarding the health effects of PM2.5 to estimate total deaths attributable to PM2.5 in the study region in 2010. County-level deaths from all-cause mortality and associated death rates for 2009 are presented in Table 4 (North Carolina State Center for Health Statistics, 2012). While the Base Case study year is 2010, county-level health data are not yet available for 2010; therefore, it is assumed that 2010 death rates are equal to 2009 death rates. 2009 population estimates are taken from the 5-year estimates of the 2009 American Community Survey (U.S. Census Bureau, 2009). The relative risk for all-cause mortality related to exposure to PM2.5 is taken from Pope et al., who report a mean relative risk of 1.06 for all-cause mortality per 10 µg m-3 increase in annual average PM2.5 concentrations (2002). The model constant in the calibrated air quality model is assumed to represent regional background (i.e., variation unassociated with the model explanatory variables); therefore, in order to estimate deaths attributable to PM2.5 from local sources, the model constant is subtracted from predicted PM2.5 concentrations in the study region prior to estimating attributable deaths in the risk assessment model. In total, the model estimates 82 deaths in 2010 attributable to PM2.5 above regional background.
Table 4. Attributable Deaths from PM2.5, 2010 County 2009 Total Deaths 2009 Population 2009 Death Rate 2010 Estimated Deaths Attributable to PM2.5 Alamance 1,411 144,769 0.009747 <<1a Chatham 570 61,444 0.009277 <1a Durham 1,688 256,296 0.006586 20 Franklin 469 57,201 0.008199 <1a Granville 491 55,670 0.008820 <1a Harnett 849 108,885 0.007797 <1a Johnston 1,168 156,888 0.007445 3a Nash 913 92,814 0.009837 <<1a Orange 670 124,503 0.005381 6 Person 398 37,301 0.010670 <1a Wake 4,150 828,759 0.005007 50
TOTAL Attributable Deaths: 82 a
38
The spatial distribution of the estimated death rate per 100,000 persons attributable to exposure to PM2.5 above the regional background is presented in Figure 6. The areas with the highest predicted death rate attributable to PM2.5 are generally in close proximity to both freeways and of high population density. Thus, this analysis suggests that the co-location of areas affected by regional mobility needs and high population density is problematic from a public health perspective. Furthermore, Figure 6 suggests that regional mobility needs and associated transportation patterns have highly localized air quality impacts, and thus may have highly localized health impacts if population is located along, and particularly at the intersections of, major regional transportation links.
39
Figure 6. Estimated Deaths Attributable to PM2.5, 2010
Discussion
The final calibrated model described in equation [6] performs quite well despite the small number of observations. However, the small sample size does limit the statistical power of the model. While the model constant and the coefficient for the VKT_1000m parameter are statistically significant at the 95% level, the coefficient for the POP_DEN_1500m parameter is not (see Appendix III). Despite the statistical
40
insignificance of this model parameter, POP_DEN_1500m is not removed from the model because of the theoretical basis for its inclusion and improved observed performance of the model resulting from its inclusion. First, while the VKT_1000m parameter captures a significant portion of vehicle-kilometers travelled within the study area, the TRM is limited to predicting traffic flows only on primary and secondary streets. Thus, may neighborhood streets are unaccounted for in this parameter. The inclusion of household population density helps capture variation in neighborhood-level vehicle kilometers traveled, assuming that households have roughly similar trip generation characteristics on the aggregate. Second, the model parameter has been found to be significant in a number of existing studies that used a greater number of observations (see, for example, Ross et al. 2007, Moore et al. 2007). Third, household density may serve as a very rough proxy for capturing general variability in land use intensity at the neighborhood level that is not captured by vehicle-kilometers travelled, which is predicted using data at the regional scale. Fourth, the population density data seem to be independent of the vehicle-kilometers travelled data (see Appendix II); therefore, the data are capturing some meaningful variation, albeit partially random variation, that helps predict PM2.5 concentrations in the study area. Finally, including the POP_DEN_1500m improves the coefficient of determination from 0.8781 (0.7969 adjusted) to 0.8257 (0.7667 adjusted).
The model performs very well in predicting observed PM2.5 concentrations at monitoring stations. The mean square error of the estimate is fairly low and is driven primarily by the concentration measured at Station 37-20. Station 37-20, although surrounded primarily by rural land uses, may be affected by nearby point-source
41
emissions not accounted for in the model and may thus be a spatial anomaly compared to other monitoring stations in the Triangle Region. Table 5 reports the model estimated values at each monitoring station as well as the error in the estimate at each monitoring station.
Table 5. Model Error Terms Station Observed Value (µgm-3) Predicted Value (µgm-3) Error (µgm-3) Percentage Error Square Error (µg2m-6) 37-15a 8.93 9.09 0.17 1.85% 0.027 37-3 13.44 12.94 -0.50 -3.76% 0.255 37-14 10.19 10.74 0.55 5.42% 0.305 37-07 11.66 11.06 -0.59 -5.08% 0.350 37-01 11.87 12.58 0.71 6.01% 0.508 37-15b 10.21 9.89 -0.32 -3.14% 0.103 37-20a 10.03 7.93 -2.10 -20.95% 4.411 37-04a 8.91 7.92 -0.99 -11.13% 0.985
Mean Square Error: 0.868 a Observed values removed for model calibration, see Appendix II for rationale
A fundamental assumption of multiples linear regression is the independence of model parameters; thus, it is critical to consider the independence of the selected explanatory variables. As illustrated in Figure 42 in Appendix II, the variables VKT_1000m and POP_DEN_1500m are independent of each other around monitoring stations. It should be noted that inclusion of the two rural monitoring stations biases the relationship as both parameters are near zero around these two stations (see Figure 40,
Appendix II). However, as discussed earlier, removal of observations from these monitoring stations is justified. Therefore, the parameters in the final model may be considered independent of one another. Conceptually, this is representative of the regional nature of travel patterns (and thus vehicle-kilometers travelled) in the region that is generally disassociated from population density at the neighborhood scale.
42
Additionally, the Moran’s I test reveals no evidence of spatial autocorrelation amongst observed PM2.5 concentrations. Moran’s Index has a value of -0.400, suggesting dispersion of observations; however, the z-score of the test is non-significant at -0.939. Thus, the Moran’s I test suggests no significant difference in observed PM2.5 concentrations from a random spatial distribution. Performing the Moran’s I test on model residuals reveals a similar result, with a Moran’s Index of 0.125 and a z-score of 1.061. It can therefore be concluded that both observed PM2.5 concentrations and model residuals are not spatially autocorrelated, thereby fulfilling the fundamental assumption of independence of observations in linear regression models. However, it should be noted that the results of the Moran’s I Test are not surprising considering the very small sample size. For the Moran’s I Test to reveal spatial autocorrelation, observed values would need to be highly clustered for the test to suggest spatial autocorrelation. Moran’s I Test results are presented in Appendix III. Therefore, the final calibrated land use regression model for the Triangle Region fulfills the underlying assumptions of linear regression and should be considered a valid model, despite the limited number of observations.
43