Chapter 6: Short-term effects of air pollution on mortality

6.3.1 Controlling for season and trend

Method 1

In the first method, the seasonal variation in mortality was controlled for using harmonic waves (sinusoidal terms). Although the annual cycle of daily mortality is strongly periodic, with more deaths in winter and fewer in summer, it is highly unlikely that mortality patterns over time can be explained by a single sinusoidal function with a period of one year. Some seasonally varying risk factors may produce a six-month cycle in daily mortality, while others may produce cycles of even shorter period, for example four months or three months.

Thus, in addition to the annual cycle, the pattern of daily deaths is likely to contain a number of other cycles with periods shorter than one year. Adjusting for this complex seasonal pattern requires a sum of harmonic waves of increasing frequency.

Seasonal patterns in mortality were modelled using sinusoidal functions

$$\sum_{k} \left( \alpha_k \sin(kTt) + \beta_k \cos(kTt) \right) \qquad (6.2)$$

where $T = 2\pi/365.25$, $t$ is the day of the year, and $\alpha_k$ and $\beta_k$ are the regression coefficients to be estimated from the model fitting. The values of $k$ determine the period of the seasonal cycle. For the annual cycle, $k$ is set to 1. The integer values $k = 2, 3, 4, 5$ correspond approximately to cycles of 6 months, 4 months, 3 months and 73 days respectively. Modelling a seasonal pattern with sinusoidal functions in this way is commonly known as cosinor analysis.
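As an illustration of how the sinusoidal terms could be constructed for a regression model, the following is a minimal sketch in Python. The function names `harmonic_terms` and `cycle_length_days` are hypothetical and are not taken from the original analysis; the sketch only assumes that each harmonic k contributes one sine and one cosine column evaluated at kTt.

```python
import math

# T as defined in the text: one full cycle per 365.25 days.
T = 2 * math.pi / 365.25

def harmonic_terms(day_of_year, max_k):
    """Return [sin(1*T*t), cos(1*T*t), ..., sin(K*T*t), cos(K*T*t)] for one day.

    Each harmonic k contributes a sine and a cosine column, so the row
    has 2 * max_k entries.
    """
    row = []
    for k in range(1, max_k + 1):
        row.append(math.sin(k * T * day_of_year))
        row.append(math.cos(k * T * day_of_year))
    return row

def cycle_length_days(k):
    """Period in days of the k-th harmonic of the annual cycle."""
    return 365.25 / k

# One design-matrix row for day 100 with harmonics k = 1..6 (12 columns).
row = harmonic_terms(100, 6)
```

Under this construction, `cycle_length_days(5)` is about 73 days and `cycle_length_days(6)` is about 61 days, which matches the periods quoted in the text.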

Sinusoidal functions were included in the models to control for significant cycles with periods down to about 2 months (k = 6). The remaining short-term variation in mortality, with periods of less than 2 months, was used to estimate the association between mortality, air pollution and weather. The number of sinusoidal functions used to control for seasonal patterns was selected using the likelihood ratio test. Sinusoidal functions for the one-year cycle, the 6-month cycle and the shorter cycles were entered into the model in a forward stepwise fashion until the likelihood ratio test between the higher order model and the model nested within it showed that further additions of sinusoidal functions no longer improved the model fit significantly. Statistical significance was set at the 10% level (p-value < 0.1). The sine and cosine terms of the same period were always entered together to form each sinusoidal function.
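The forward stepwise rule can be sketched as follows. This is an illustrative sketch, not the code used in the analysis: the function names are hypothetical, the log-likelihood values are made up for demonstration, and the stopping rule assumes that selection ends at the first harmonic whose addition is not significant at the 10% level. Because each harmonic adds two coefficients (one sine and one cosine), the likelihood ratio statistic is compared with a chi-square distribution on 2 degrees of freedom, whose upper tail probability has the closed form exp(-x/2).

```python
import math

def lr_test_two_df(loglik_smaller, loglik_larger):
    """Likelihood ratio statistic and p-value for adding one sine/cosine pair.

    The statistic is twice the gain in log-likelihood. For a chi-square
    distribution on 2 df the upper tail probability is exp(-x / 2).
    """
    stat = 2.0 * (loglik_larger - loglik_smaller)
    p_value = math.exp(-max(stat, 0.0) / 2.0)
    return stat, p_value

def select_harmonics(logliks, alpha=0.10):
    """Forward selection over nested models.

    logliks[K] is the maximised log-likelihood of the model with K
    harmonics (K = 0, 1, 2, ...). Harmonics are added one at a time and
    selection stops at the first addition that is not significant.
    """
    kept = 0
    for k in range(1, len(logliks)):
        _, p = lr_test_two_df(logliks[k - 1], logliks[k])
        if p >= alpha:
            break
        kept = k
    return kept

# Made-up log-likelihoods for illustration only (not results from the study).
example_logliks = [-5200.0, -5050.0, -5030.0, -5026.0, -5025.5]
n_harmonics = select_harmonics(example_logliks)
```

With these illustrative values the fourth harmonic raises the log-likelihood by only 0.5, so selection stops with three harmonics retained.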

Long-term trends in mortality over the 13-year period were modelled using a linear time trend. Influenza or other epidemics may cause higher mortality in epidemic years than in non-epidemic years, which results in non-monotonic year-to-year fluctuations in mortality. To investigate whether such fluctuations were present, dummy variables for each year were added to the models. A statistically significant improvement in the model fit (p-value < 0.1) was taken as evidence of year-to-year fluctuations in mortality, in which case the dummy variables were retained in the models. This would control for any between-year variation in the daily number of deaths that may be caused by influenza epidemics or epidemics of other infectious diseases. Influenza or other epidemics may also cause variation in the daily number of deaths within a year, which dummy variables for each year would not control for. However, since most epidemics, including influenza, are seasonal, controlling for seasonal variation in the daily number of deaths would control for this within-year variation to some extent. Because data on influenza and other epidemics were not available, the variation in the daily number of deaths caused by epidemics could not be completely controlled for.

In order to control for daily variation in mortality across the week, dummy variables for day of the week were included in the models irrespective of whether they were statistically significant or not.
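A minimal sketch of day-of-the-week dummy coding is given below. The function name `day_of_week_dummies` and the choice of Monday as the reference category are assumptions made for illustration; the text does not state which day served as the reference.

```python
from datetime import date

def day_of_week_dummies(d, reference=0):
    """Six 0/1 indicators for the day of the week, omitting the reference day.

    date.weekday() returns 0 for Monday through 6 for Sunday, so with
    reference=0 the columns correspond to Tuesday, ..., Sunday.
    """
    return [1 if d.weekday() == day else 0 for day in range(7) if day != reference]

# 5 January 2000 was a Wednesday, so only the second column is set.
row = day_of_week_dummies(date(2000, 1, 5))
```

A date falling on the reference day yields a row of zeros, so its effect is absorbed by the model intercept.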

Method 2

Whereas seasonal variation in mortality was modelled parametrically using sinusoidal functions in Method 1, it was modelled semi-parametrically in Method 2 using regression spline functions of calendar time. Natural cubic splines of calendar time were used to fit both the seasonal variation and the long-term trend in daily mortality.

A natural cubic spline function of calendar time divides the whole period of analysis into a number of intervals, determined by the degrees of freedom of the spline, and fits a cubic polynomial to each interval. The cubic polynomials are constrained to join smoothly at the boundaries of the intervals. Beyond the outermost boundary points, a linear function is fitted.

Seven degrees of freedom (df) per year of data were chosen (a total of 81 df) to create the natural cubic splines of calendar time. Using 7 df per year of data would adjust for variation in daily mortality with periods longer than approximately two months, leaving only the short-term variation with periods of less than two months to estimate the association between mortality, air pollution and weather. This would adjust for confounding by the long-term trend in mortality, by year-to-year variation in mortality arising from influenza or other epidemics, and by seasonal variation in mortality.
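To make the spline construction concrete, the following sketch builds a natural cubic spline basis in its truncated power form. It is an illustration under stated assumptions rather than the software used in the analysis: the function names are hypothetical, the knots are assumed to be equally spaced in calendar time, and the two-year series length is chosen only to keep the example small. With df + 1 knots the basis has df columns when the intercept is left out.

```python
def natural_cubic_basis(x, knots):
    """Natural cubic spline basis at x (truncated power form, no intercept).

    With K knots this returns K - 1 columns. Each column is cubic between
    knots and linear beyond the first and last knot.
    """
    def pos_cube(u):
        return u ** 3 if u > 0 else 0.0

    last = knots[-1]

    def d(j):
        return (pos_cube(x - knots[j]) - pos_cube(x - last)) / (last - knots[j])

    n_knots = len(knots)
    basis = [float(x)]                      # linear term
    for j in range(n_knots - 2):
        basis.append(d(j) - d(n_knots - 2)) # cubic terms with linear tails
    return basis

def equally_spaced_knots(n_days, df):
    """df + 1 knots from day 0 to day n_days - 1, giving df basis columns."""
    step = (n_days - 1) / df
    return [i * step for i in range(df + 1)]

# Two years of daily data at 7 df per year (illustrative length only).
knots = equally_spaced_knots(n_days=731, df=14)
row = natural_cubic_basis(100, knots)
```

In this sketch the knots fall roughly 52 days apart, which is why a spline with 7 df per year can follow cycles longer than about two months while leaving shorter fluctuations in the residuals.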

Dummy variables for days of the week were included in the models irrespective of whether they were statistically significant or not in order to control for daily variation in mortality across the week.

Diagnostic plots

Several diagnostic plots were examined to assess whether the models adjusted adequately for seasonal variation and long-term trends in the daily mortality data. Time series plots of the residuals were examined for any long-wavelength pattern remaining in the data.

Time series plots of the model predictions against the observed data are helpful in assessing whether the models fit the seasonal variation and the long-term trend in the data reasonably well. The models adjusted for variation in daily mortality with periods longer than approximately two months. Spectral analysis of the residuals was therefore carried out to see whether the residual periodogram had large values at periods above two months. Large values at these periods would indicate insufficient adjustment for seasonal variation. The partial autocorrelation functions (PACF) of the residuals were also plotted to check for large values at the first few lags, which would likewise indicate insufficient adjustment for seasonal variation.
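The periodogram check can be sketched as follows. This is an illustrative example only: the function name `periodogram` is hypothetical, the "residuals" are synthetic and constructed to contain an unremoved annual cycle, and the 61-day threshold is used here as an approximation of two months.

```python
import math

def periodogram(x):
    """Raw periodogram at the Fourier frequencies j / n, j = 1 .. n // 2.

    Returns a list of (period_in_days, power) pairs for a daily series.
    """
    n = len(x)
    mean = sum(x) / n
    centred = [v - mean for v in x]
    out = []
    for j in range(1, n // 2 + 1):
        w = 2 * math.pi * j / n
        re = sum(v * math.cos(w * t) for t, v in enumerate(centred))
        im = sum(v * math.sin(w * t) for t, v in enumerate(centred))
        out.append((n / j, (re * re + im * im) / n))
    return out

# Synthetic "residuals" with an unremoved annual cycle plus a weaker
# weekly cycle (illustration only, not data from the study).
n_days = 730
resid = [math.sin(2 * math.pi * t / 365) + 0.3 * math.sin(2 * math.pi * t / 7)
         for t in range(n_days)]

pgram = periodogram(resid)
peak_period = max(pgram, key=lambda p: p[1])[0]
long_cycle_power = sum(p for period, p in pgram if period > 61)
short_cycle_power = sum(p for period, p in pgram if period <= 61)
```

In this synthetic case the periodogram peaks at a period of 365 days and most of the power lies at periods above two months, which is the pattern that would signal insufficient seasonal adjustment in real residuals.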

6.3.2 Controlling for weather variables