Exposure data: Aerometric Information Retrieval System

Air pollution data were obtained from the Aerometric Information Retrieval System (AIRS) of the U.S. Environmental Protection Agency. AIRS contains information on all of the routine pollution monitoring in the United States. Pollutant exposures were assigned by means of geocoding. Each participant in NHANES III was assigned a longitude and latitude corresponding to the population centroid of the census block group in which they lived. Block groups are collections of adjoining blocks, selected to be uniform in socio-economic status, with populations (in 1990) of about 1,000 persons. The longitude and latitude of each monitor in the United States were obtained from AIRS. Persons were assigned exposure values equal to the average of measurements from all monitors in their county of residence and adjoining counties, with the average weighted in proportion to the inverse of the square of the distance between their residence and the monitor. I created an exposure variable that expresses this measurement as a deviation from the mean of all values (the grand mean). I refer to this as the unpartitioned exposure measure to distinguish it from the partitioned exposure measure described below. Pollution monitor data are missing for participants in counties where there were no monitors, and some counties had data for some pollutants but not others. Therefore, analyses for individual pollutants do not all include the same subjects.
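The inverse-distance-squared weighting described above can be sketched as follows. This is a minimal illustration, not the code used for the analysis: the function names, the haversine distance formula, and the small distance floor `min_km` (which avoids division by zero when a centroid coincides with a monitor) are all assumptions, and the list of monitors is assumed to have already been restricted to the county of residence and adjoining counties.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    r = 6371.0  # mean Earth radius in km
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * r * math.asin(math.sqrt(a))


def idw2_exposure(residence, monitors, min_km=0.1):
    """Inverse-distance-squared weighted mean of monitor values.

    residence: (lat, lon) of the block-group centroid.
    monitors:  list of (lat, lon, value), assumed pre-filtered to the
               county of residence and adjoining counties.
    min_km:    hypothetical floor on distance (an assumption, not from
               the source) to avoid dividing by zero.
    Returns None when no monitors are available (missing exposure).
    """
    if not monitors:
        return None
    num = 0.0
    den = 0.0
    for lat, lon, value in monitors:
        d = max(haversine_km(residence[0], residence[1], lat, lon), min_km)
        w = 1.0 / d ** 2  # weight proportional to 1 / distance^2
        num += w * value
        den += w
    return num / den
```

With two monitors at equal distance, the result is the simple average of their values; a monitor twice as far away receives one quarter of the weight.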

In the current analysis, the use of the weighted average of pollutant measurements allows geographic variability in exposure within a county to be reflected in the measure of chronic exposure to PM10, SO2 and NO2. The true geographical variation is driven by local point-source and mobile-source pollution as well as by geographical and meteorological differences in the diffusion of air pollution.

The variation in air pollution between counties can contribute to average health-related outcomes among residents of a county, which is a different level of inference from associations derived from variation within a county. In the absence of significant variation within a county, average county-level exposures are the best measurement for individuals residing in the county. In relation to an outcome, analysis of air pollution in these data ought to reflect the level of analysis from which the variation in exposure is derived. To this end, I parameterized the pollution exposure variables used in the mixed models so as to arrive at separate estimates for the county-level (ecological) effect and the within-county effect, which has a more individual level of inference.

Because ambient monitors measure different characteristics of pollutants that may relate to the underlying latent pollutant characteristics in different ways, no single pollutant is particularly representative of the pertinent exposure. According to the literature, particulate air pollution associated with fossil fuel combustion is the most specific characterization of the underlying exposure. PM10, NO2 and SO2 are each indirect measures of this pollution, although each may reflect unique latent characteristics associated with fine PM. I used each in order to explore whether effects were associated with pollution from gasoline engines in urban traffic (NO2) or pollution from diesel combustion (SO2).

Exposure Variables:

Pollutants obtained from AIRS data, geocoded to each participant's residence, with inverse-distance-squared weighting for multiple monitors. Main exposures:

PM10 (µg/m3)
SO2 (ppb)
NO2 (ppb)

Mean of prior-year measures of pollutant (Manuscript #1 - Lipids)
Mean of prior-week measures of pollutant (Manuscript #2 - ALT)

Partitioning of air pollution exposure measures - For each pollutant, I created a variable that is the county mean of subjects' prior-year (prior-week for ALT) exposure concentrations of the air pollutant and subtracted the grand mean of the pollutant over all counties over the same period. The result is a county-level average air pollution exposure measure expressed as a deviation from the grand mean. To the extent that true variation in air pollution exposure is derived from between-county variation, inference is limited to the population (ecologic or county) level, such that living in a polluted area is associated with increased or decreased average lipid (or ALT) levels. For each individual, I also created a variable that is the individual's mean pollutant concentration at residence during the previous 12 months (1 week for ALT) minus the county average; this estimates an individual's air pollution exposure expressed as a deviation from the county mean. Inference on this parameter has a specific individual-level interpretation, allowing for measurement error, independent of its ecological analogue. It also tends to capture more local, rather than regional, sources of pollution.

The equations below illustrate the creation of the county level and individual level partitioned pollution exposure measurement for the ith individual in the jth county:

$\mathrm{PM}_{j} = (\text{$j$th county mean PM}) - (\text{grand mean PM})$

$\mathrm{PM}_{ij} = (\text{PM for the $i$th individual in county $j$}) - (\text{$j$th county mean PM})$
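The two partitioned measures can be computed as in the sketch below. It is illustrative only: the function name and input layout are hypothetical, and the grand mean is taken here over all individual values, which is an assumption (the text could also be read as the mean of county means).

```python
from collections import defaultdict


def partition_exposure(records):
    """Split each exposure into county-level and within-county deviations.

    records: list of (county_id, exposure) pairs, one per participant.
    Returns a list of (county_dev, individual_dev) in the same order, where
      county_dev     = county mean - grand mean
      individual_dev = individual exposure - county mean
    The grand mean is computed over all individual values (an assumption).
    """
    by_county = defaultdict(list)
    for county, x in records:
        by_county[county].append(x)

    county_mean = {c: sum(v) / len(v) for c, v in by_county.items()}
    grand_mean = sum(x for _, x in records) / len(records)

    return [
        (county_mean[county] - grand_mean, x - county_mean[county])
        for county, x in records
    ]


# Hypothetical toy data: two counties, five participants.
toy = [("A", 10.0), ("A", 14.0), ("B", 20.0), ("B", 24.0), ("B", 22.0)]
parts = partition_exposure(toy)
```

Under this definition the two components sum to each individual's exposure minus the grand mean, so the partitioned measures decompose the unpartitioned measure rather than replace it.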

When these variables are entered in a mixed model that partitions the variation at the county and within-county levels, as described below, the other parameters can be interpreted as effects at the mean county exposure and the mean individual exposure (relative to the county) to air pollution. I refer to this parameterization of the air pollution variable as the partitioned air pollution levels, in reference to the partitioning of the exposure measures to correspond with the analogous between-county and within-county variation.
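As an illustration of how the partitioned measures might enter such a model, one common form is a random-intercept linear mixed model. The specification below is a sketch under that assumption, not the model stated in this section: the symbols $Y_{ij}$ (outcome for individual $i$ in county $j$), $\beta_B$ and $\beta_W$ (between-county and within-county coefficients), $u_j$ (county random intercept) and $\varepsilon_{ij}$ (residual) are introduced here for illustration, and covariates are omitted.

```latex
\[
Y_{ij} = \beta_0 + \beta_B\,\mathrm{PM}_{j} + \beta_W\,\mathrm{PM}_{ij} + u_j + \varepsilon_{ij},
\qquad u_j \sim N(0, \sigma_u^2), \quad \varepsilon_{ij} \sim N(0, \sigma_e^2)
\]
```

Because $\mathrm{PM}_{j}$ and $\mathrm{PM}_{ij}$ are both centered, $\beta_0$ in this form corresponds to the expected outcome for an individual at the county mean exposure in a county at the grand mean exposure.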

In document Effects of air pollution on liver metabolism with relevance for cardiovascular disease: a multilevel analysis (Page 69-72)