CHAPTER 3: RESEARCH METHODS
3.7 EMPIRICAL DOWNSCALING METHODS
As stated in Section 1.2.4, multiple methodologies exist for empirical
downscaling. In this dissertation, three methods are developed, applied, and evaluated.
These methods are described in the following sections. The results of the downscaling analysis are presented in Chapter 8.
3.7.1 The analog method
The analog method is the simplest of the three empirical methods applied in this study. As the name implies, in the analog method historical observations are examined to find the most similar ‘state’ (usually defined in terms of the sum of the squared differences over all grid points) to that produced by the GCM. The values of the surface variables are then modeled as those that occurred with the historical analog. While only a few studies have applied the analog method to climate downscaling (Zorita et al. 1995;
Cubasch et al. 1996; Zorita and von Storch 1999), it has long been used in weather
forecasting and short-term climate prediction (Lorenz 1969; Kruizinga and Murphy 1983;
van den Dool 1994).
The analog method requires a long historical record if a suitable analog is to be found for each GCM simulation (van den Dool 1994). Regional focus typically alleviates this problem and useful analogs have been found for most downscaling applications (Zorita and von Storch 1999). Given the simplicity of the analog approach, refinements
have been suggested, including defining the large scale state in terms of multiple variables and including information from the previous day (Cubasch et al. 1996). The specific characteristics of the analog method, as applied in this dissertation, are described in detail in Chapter 8.
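The analog search described above can be sketched as follows. This is a minimal illustration, not the implementation used in this dissertation (which is described in Chapter 8): the function name `find_analog`, the toy 2x2 grid, and the surface values are all hypothetical, and similarity is taken to be the sum of squared differences over all grid points.

```python
import numpy as np

def find_analog(gcm_field, historical_fields):
    """Return the index of the closest historical field and all distances.

    Closeness is the sum of squared differences over all grid points.
    """
    gcm_field = np.asarray(gcm_field, dtype=float)
    historical_fields = np.asarray(historical_fields, dtype=float)
    # Flatten each field to one row so the sum runs over every grid point.
    flat = historical_fields.reshape(len(historical_fields), -1)
    ssd = np.sum((flat - gcm_field.ravel()) ** 2, axis=1)
    return int(np.argmin(ssd)), ssd

# Hypothetical toy data: four historical days on a 2x2 large-scale grid,
# each paired with an observed surface value (e.g. station temperature).
historical_fields = np.array([
    [[0.0, 1.0], [2.0, 3.0]],
    [[1.0, 1.0], [1.0, 1.0]],
    [[5.0, 4.0], [3.0, 2.0]],
    [[0.5, 1.5], [2.5, 3.5]],
])
historical_surface = np.array([12.1, 8.4, 15.0, 11.7])
gcm_field = np.array([[0.4, 1.4], [2.4, 3.4]])

best, ssd = find_analog(gcm_field, historical_fields)
# The downscaled value is whatever was observed on the analog day.
downscaled_value = historical_surface[best]
```

In this toy case the fourth historical day is the analog, so its surface value is used as the downscaled estimate.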
3.7.2 Stochastic weather generator
Stochastic weather generators (SWGs) are models of daily weather sequences which can be viewed as statistical representations of the local climate (Wilks 1999).
SWGs have been widely used for agricultural (i.e., crop model) applications when the observed record is inadequate because of missing or incomplete data (Wilks and Wilby 1999) and for the development of regional climate change scenarios (Wilks 1992; Semenov and Barrow 1997). Typically, SWGs comprise three components: a precipitation occurrence component, a precipitation amount component, and an additional algorithm for other variables (often conditioned on precipitation occurrence).
3.7.2.1 Precipitation occurrence
Stochastic weather generators simulate precipitation occurrence using either Markov chain models or spell length distributions. In the Markov chain approach, for a first-order process, the occurrence of precipitation depends on two parameters:
p01, the probability of a wet day following a dry day, and p11, the probability of a wet day following a wet day. Depending on the wet/dry status and weather type of the previous day, a uniform [0,1] random number is compared to the appropriate transition probability.
If the random number is less than the transition probability, a wet day is simulated.
Otherwise, a dry day is simulated. Wilks (1999) found this type of model correctly reproduced the precipitation occurrence characteristics for stations in the central and eastern USA. For other regions, higher order models have been advocated.
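A first-order Markov chain occurrence model of this kind can be sketched as below. The function name and the parameter values (p01 = 0.25, p11 = 0.60) are hypothetical and chosen only for illustration; the weather-type conditioning mentioned above is omitted.

```python
import numpy as np

def simulate_occurrence(p01, p11, n_days, rng, first_day_wet=False):
    """Simulate a wet/dry sequence with a first-order Markov chain.

    p01: probability of a wet day following a dry day.
    p11: probability of a wet day following a wet day.
    """
    wet = np.zeros(n_days, dtype=bool)
    previous_wet = first_day_wet
    for day in range(n_days):
        # Pick the transition probability from the previous day's status.
        threshold = p11 if previous_wet else p01
        # A uniform [0, 1) number below the threshold gives a wet day.
        wet[day] = rng.random() < threshold
        previous_wet = wet[day]
    return wet

rng = np.random.default_rng(0)
wet = simulate_occurrence(p01=0.25, p11=0.60, n_days=50_000, rng=rng)

# Re-estimate the transition probabilities from the simulated sequence.
est_p01 = wet[1:][~wet[:-1]].mean()
est_p11 = wet[1:][wet[:-1]].mean()
```

For a long simulation, the re-estimated probabilities should be close to the prescribed values, and the overall wet-day fraction should approach p01 / (1 + p01 - p11).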
The spell length approach to modeling precipitation occurrence uses distributions fitted to observed lengths of wet and dry spells. Alternating wet and dry spells are then simulated by randomly sampling from the appropriate spell length distributions.
Different sequences of wet and dry spells can be simulated depending on the choice of statistical distribution. For example, if a geometric distribution is used, the resulting series will be identical to that produced by a first-order Markov chain model (Wilks and Wilby 1999).
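The spell length approach with geometric distributions can be sketched as follows. The names and parameter values are hypothetical; dry spell lengths are drawn with success probability p01 and wet spell lengths with success probability 1 - p11, which is the choice that makes the sequence statistically equivalent to the first-order Markov chain.

```python
import numpy as np

def simulate_spells(p01, p11, n_days, rng):
    """Simulate wet/dry days by alternating geometric spell lengths."""
    days = []
    wet = False  # arbitrary choice: begin with a dry spell
    while len(days) < n_days:
        # A dry spell ends with probability p01; a wet spell with 1 - p11.
        p_end = (1.0 - p11) if wet else p01
        length = int(rng.geometric(p_end))  # spell length of at least 1 day
        days.extend([wet] * length)
        wet = not wet
    return np.array(days[:n_days], dtype=bool)

rng = np.random.default_rng(2)
wet = simulate_spells(p01=0.25, p11=0.60, n_days=50_000, rng=rng)

# The implied transition probabilities match the Markov chain parameters.
est_p01 = wet[1:][~wet[:-1]].mean()
est_p11 = wet[1:][wet[:-1]].mean()
```

Replacing the geometric draw with another distribution (fitted to the observed spell lengths) changes the persistence characteristics without altering the rest of the loop.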
3.7.2.2 Precipitation amount
Once the precipitation occurrence algorithm produces a wet day, a precipitation amount must be drawn from a chosen statistical distribution. The most common choice for precipitation amount simulation has been the gamma distribution (Wilks and Wilby 1999), given by:

$$f(r) = \frac{(r/\beta)^{\alpha-1} \exp(-r/\beta)}{\beta\,\Gamma(\alpha)} \qquad (3.20)$$

where the distribution parameters are α (the shape parameter) and β (the scale parameter), r is the daily precipitation amount, and Γ is the gamma function, given by
$$\Gamma(\alpha) = \int_0^\infty t^{\alpha-1} e^{-t}\,dt \qquad (3.21)$$
The mean wet-day precipitation amount is given by $\mu = \alpha\beta$, and the variance is given by $\sigma^2 = \alpha\beta^2$.
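Because the mean is αβ and the variance is αβ², the two parameters can be recovered from sample moments as β = variance / mean and α = mean / β. The sketch below uses this moment-matching step purely as an illustration (it is an assumption here, not necessarily the fitting procedure used in this study), with hypothetical values for the wet-day mean and variance.

```python
import numpy as np

def gamma_params_from_moments(mean, variance):
    """Invert mean = alpha * beta and variance = alpha * beta**2."""
    beta = variance / mean   # scale parameter
    alpha = mean / beta      # shape parameter
    return alpha, beta

# Hypothetical wet-day statistics (mm and mm^2).
alpha, beta = gamma_params_from_moments(mean=6.0, variance=48.0)

# Draw wet-day amounts from the gamma distribution with these parameters.
rng = np.random.default_rng(1)
amounts = rng.gamma(shape=alpha, scale=beta, size=200_000)
```

With these inputs the shape parameter is 0.75 and the scale parameter is 8.0, and the sample mean and variance of the simulated amounts should be close to the values that were supplied.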
Several studies have indicated that the mixed-exponential distribution provides a better overall fit (and a particularly better fit for large precipitation amounts) than the gamma distribution (Foufoula-Georgiou and Lettenmaier 1987; Wilks 1999). The mixed-exponential distribution is a probability mixture of two single parameter exponential distributions, with probability density function:
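$$f(r) = \frac{\alpha}{\mu_1} \exp\left[-\frac{r}{\mu_1}\right] + \frac{1-\alpha}{\mu_2} \exp\left[-\frac{r}{\mu_2}\right] \qquad (3.22)$$

where r is again the daily precipitation amount, α is the mixing probability, and $\mu_1$ and $\mu_2$ are the means of the two exponential distributions. The mean and variance of the wet-day precipitation amount are given by $\alpha\mu_1 + (1-\alpha)\mu_2$ and $\alpha\mu_1^2 + (1-\alpha)\mu_2^2 + \alpha(1-\alpha)(\mu_1-\mu_2)^2$, respectively.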
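Sampling from the mixed-exponential distribution can be sketched as a two-step draw: first choose one of the two exponential components with probability α, then draw from the chosen component. The parameter values below are hypothetical, and the sample moments are compared with the theoretical mean and variance stated above.

```python
import numpy as np

def sample_mixed_exponential(alpha, mu1, mu2, size, rng):
    """Draw from a probability mixture of two exponential distributions."""
    # Choose the first component with probability alpha, else the second.
    use_first = rng.random(size) < alpha
    means = np.where(use_first, mu1, mu2)
    # rng.exponential takes the mean of each draw as its scale argument.
    return rng.exponential(scale=means)

alpha, mu1, mu2 = 0.7, 2.0, 12.0  # hypothetical parameters
rng = np.random.default_rng(6)
amounts = sample_mixed_exponential(alpha, mu1, mu2, 400_000, rng)

theory_mean = alpha * mu1 + (1 - alpha) * mu2
theory_var = (alpha * mu1**2 + (1 - alpha) * mu2**2
              + alpha * (1 - alpha) * (mu1 - mu2)**2)
```

The second component, with its larger mean, supplies the heavier upper tail that the text identifies as the advantage over the gamma distribution.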
3.7.2.3 Simulation of other variables
SWGs produce variables other than precipitation with a first-order multiple autoregression, first described by Matalas (1967) and given by:

$$X_i = A X_{i-1} + B \varepsilon_i \qquad (3.23)$$

where $X_i$ is a vector containing the current day’s standardized values of the variables, $X_{i-1}$ is a vector containing the previous day’s standardized values of the variables, $\varepsilon_i$ is a vector of independent values from a standard Normal distribution, and A and B are matrices given by

$$A = M_1 M_0^{-1} \qquad (3.24)$$

$$B B^T = M_0 - M_1 M_0^{-1} M_1^T \qquad (3.25)$$

where $M_0$ is the matrix of lag-0 cross correlations and $M_1$ is the matrix of lag-1 cross correlations. While A can be directly computed, B is computed by defining a new matrix $Z = BB^T$ (see Greene 2000). Then $Z = CLC^T$, where C is the matrix of eigenvectors of $BB^T$ and L has the eigenvalues of $BB^T$ on the diagonal and zeros elsewhere. B can then be computed as $B = CL^{1/2}C^T$.
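The computation of A and B can be sketched as follows, assuming the standard Matalas (1967) relations A = M1 M0⁻¹ and BBᵀ = M0 − M1 M0⁻¹ M1ᵀ. The correlation matrices are hypothetical two-variable examples, and clipping slightly negative eigenvalues to zero is an added numerical safeguard rather than part of the method as described.

```python
import numpy as np

def matalas_matrices(M0, M1):
    """Compute A and B for X_i = A X_{i-1} + B eps_i."""
    A = M1 @ np.linalg.inv(M0)
    Z = M0 - A @ M1.T                  # Z = B B^T, symmetric
    eigvals, C = np.linalg.eigh(Z)     # Z = C L C^T
    # Guard against tiny negative eigenvalues caused by round-off.
    eigvals = np.clip(eigvals, 0.0, None)
    B = C @ np.diag(np.sqrt(eigvals)) @ C.T   # B = C L^(1/2) C^T
    return A, B

# Hypothetical lag-0 and lag-1 cross-correlation matrices for 2 variables.
M0 = np.array([[1.0, 0.6], [0.6, 1.0]])
M1 = np.array([[0.7, 0.4], [0.3, 0.6]])
A, B = matalas_matrices(M0, M1)

# Generate a standardized residual series and check its lag-0 correlation.
rng = np.random.default_rng(3)
n_days = 50_000
X = np.zeros((n_days, 2))
for i in range(1, n_days):
    X[i] = A @ X[i - 1] + B @ rng.standard_normal(2)
sample_M0 = np.corrcoef(X.T)
```

The symmetric square root obtained from the eigendecomposition is one valid choice of B; any matrix satisfying BBᵀ = Z would reproduce the same correlation structure.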
To produce stationary time series, annual cycles of the daily means and standard deviations of the input variables are constructed using harmonic analysis (usually with separate harmonics for wet and dry days). The time series are then reduced to
standardized and stationary residual elements by subtracting the daily means and dividing by the standard deviations, as defined by the harmonics. After generation of the residual series with Eq. 3.23, the daily harmonics described above are used to produce
dimensional values of the input variables, based on wet/dry status.
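The standardization and its inverse can be sketched with a single annual harmonic fitted by least squares. This is a simplified illustration on synthetic data: the function names, the 30-year record, and the use of one harmonic are assumptions, and the separate treatment of wet and dry days is omitted.

```python
import numpy as np

def harmonic_design(day_of_year, n_harmonics=1, period=365.0):
    """Design matrix with a constant plus cosine/sine pairs."""
    t = 2.0 * np.pi * np.asarray(day_of_year, dtype=float) / period
    cols = [np.ones_like(t)]
    for k in range(1, n_harmonics + 1):
        cols += [np.cos(k * t), np.sin(k * t)]
    return np.column_stack(cols)

def fit_harmonic(day_of_year, values, n_harmonics=1):
    """Least-squares harmonic coefficients for an annual cycle."""
    X = harmonic_design(day_of_year, n_harmonics)
    coef, *_ = np.linalg.lstsq(X, values, rcond=None)
    return coef

# Synthetic record: 30 years of 365 days with a known annual cycle.
rng = np.random.default_rng(4)
n_years = 30
doy = np.tile(np.arange(365), n_years)
true_mean = 10.0 + 8.0 * np.cos(2 * np.pi * doy / 365)
true_std = 3.0 + 1.0 * np.sin(2 * np.pi * doy / 365)
series = true_mean + true_std * rng.standard_normal(doy.size)

# Fit harmonics to the calendar-day means and standard deviations.
by_day = series.reshape(n_years, 365)
days = np.arange(365)
mean_coef = fit_harmonic(days, by_day.mean(axis=0))
std_coef = fit_harmonic(days, by_day.std(axis=0, ddof=1))

# Standardize to residuals, then restore dimensional values.
smooth_mean = harmonic_design(doy) @ mean_coef
smooth_std = harmonic_design(doy) @ std_coef
residuals = (series - smooth_mean) / smooth_std
restored = residuals * smooth_std + smooth_mean
```

In a weather generator the `residuals` series would be replaced by values simulated with Eq. 3.23 before the final step restores dimensional values.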
3.7.3 MOS-based downscaling with multiple linear regression
As stated in the introduction, the MOS (model output statistics) approach to empirical downscaling is appropriate even if the agreement between large-scale observations and GCM simulations is not perfect (see Chapter 6). MOS-based
downscaling is applied in this study by using multiple linear regression (MLR) to relate free atmosphere GCM output to surface observations. The MLR model is given by:
$$y = x\beta + \varepsilon \qquad (3.26)$$
where y is the predictand, x is the matrix of predictors, β is the vector of model coefficients, and ε is the error term.
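Fitting the MLR model by ordinary least squares can be sketched as below. The predictors and predictand are synthetic, the function names are hypothetical, and the intercept is handled by prepending a column of ones to the predictor matrix, which is an assumption about how the constant term enters x.

```python
import numpy as np

def fit_mlr(x, y):
    """Estimate regression coefficients by ordinary least squares."""
    # Prepend a column of ones so the first coefficient is the intercept.
    X = np.column_stack([np.ones(len(x)), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

def predict_mlr(x, beta):
    """Apply fitted coefficients to new predictor values."""
    X = np.column_stack([np.ones(len(x)), x])
    return X @ beta

# Synthetic example: three free-atmosphere predictors, one surface predictand.
rng = np.random.default_rng(5)
x = rng.standard_normal((500, 3))
true_beta = np.array([2.0, 1.5, -0.5, 0.8])
y = true_beta[0] + x @ true_beta[1:] + 0.1 * rng.standard_normal(500)

beta = fit_mlr(x, y)
y_hat = predict_mlr(x, beta)
```

In a MOS setting, `x` would hold GCM output at the training times and `y` the matching surface observations, so that the fitted coefficients absorb systematic model error.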