5. Forecast error modelling for wind integration studies
5.2.2. Correlational structure
We will present two versions of the correlation structure in order of increasing generality. In version 1, correlations can be prescribed between the forecast error and the realisation. In version 2, additional correlations can also be prescribed between the errors in forecasts made at consecutive timesteps.
In version 1 of the model, the correlation between forecast error and realisation is specified using a correlation coefficient linking the innovations of X and those of Y:
$$\hat{E}[\epsilon^y(k,i)\,\epsilon^x(k+i)] = \rho^{xy} : 1 \le i \le N_F. \tag{5.13}$$
This is achieved in a simulation by first generating the X innovations, $\epsilon^x(k)$, and then generating the Y innovations, $\epsilon^y(k,i)$, according to
$$\epsilon^y(k,i) = \rho^{xy}\,\epsilon^x(k+i) + \sqrt{1-(\rho^{xy})^2}\;\xi(k,i), \tag{5.14}$$
where $\xi(k,i)$ are independent N(0,1) variables.
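The generation step in Equation (5.14) can be sketched as follows. This is only an illustrative sketch using the Python standard library: the function names, the array layout (`eps_y[k][i - 1]` holding the innovation for issue time k and horizon i) and the parameter values are assumptions made for the example, not part of the model specification.

```python
import math
import random


def generate_version1_innovations(eps_x, n_f, rho_xy, rng):
    """Return eps_y[k][i - 1] for 1 <= i <= n_f, following Equation (5.14).

    eps_x must hold n_k + n_f values so that eps_x[k + i] always exists.
    """
    n_k = len(eps_x) - n_f
    scale = math.sqrt(1.0 - rho_xy ** 2)
    eps_y = []
    for k in range(n_k):
        row = []
        for i in range(1, n_f + 1):
            xi = rng.gauss(0.0, 1.0)  # independent N(0,1) draw xi(k, i)
            row.append(rho_xy * eps_x[k + i] + scale * xi)
        eps_y.append(row)
    return eps_y


def sample_correlation(a, b):
    """Pearson sample correlation of two equal-length sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    return cov / math.sqrt(var_a * var_b)


# Hypothetical settings chosen only for illustration.
rng = random.Random(0)
n_f, n_k, rho_xy = 6, 20000, -0.3
eps_x = [rng.gauss(0.0, 1.0) for _ in range(n_k + n_f)]
eps_y = generate_version1_innovations(eps_x, n_f, rho_xy, rng)

# Check Equation (5.13) empirically at one horizon.
i = 3
r = sample_correlation(
    [eps_y[k][i - 1] for k in range(n_k)],
    [eps_x[k + i] for k in range(n_k)],
)
print(round(r, 2))
```

With a long enough sample, the printed correlation should lie close to the prescribed value of `rho_xy`.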
Specifying a negative $\rho^{xy}$ allows us to ensure that there is no correlation between the X-domain forecast F and the forecast error Z. We can write this condition as
$$\hat{E}[F(k,i)\,Z(k,i)] = 0. \tag{5.15}$$
If this were not the case then the forecast F would be biased, in the sense that a deterministic adjustment could be made to F that would improve its RMS error. For example, if high forecasts tended to be correlated with an overprediction then we could always improve a high forecast by lowering it in a deterministic way. However, such biasing is exactly what results if the innovations of X(k+i) and Z(k, i) are generated independently, in which case, from Equations (5.7) and (5.8),
$$\hat{E}[F(k,i)\,Z(k,i)] = \hat{E}[Z(k,i)^2] > 0. \tag{5.16}$$
A corollary is that, if the forecast is the sum of the realisation and an independent forecast error process, the increments of the forecast time series will be more volatile than those of the realisation.
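The step behind Equation (5.16) can be spelled out in one line. The sketch below assumes that the forecast decomposes as F(k, i) = X(k+i) + Z(k, i), which is consistent with Equation (5.20), and that Z has zero mean. Equations (5.7) and (5.8) are not reproduced in this section, so the decomposition should be read as an assumption here.

```latex
\begin{align*}
\hat{E}[F(k,i)\,Z(k,i)]
  &= \hat{E}\big[\big(X(k+i) + Z(k,i)\big)\,Z(k,i)\big] \\
  &= \hat{E}[X(k+i)\,Z(k,i)] + \hat{E}[Z(k,i)^2] \\
  &= \hat{E}[Z(k,i)^2] > 0
  \quad \text{when } X(k+i) \text{ and } Z(k,i) \text{ are independent.}
\end{align*}
```

The cross term vanishes only under independence, which is why a non-zero (negative) correlation between the innovations is needed to cancel the variance term.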
Note that the real forecast is made in the power output (P) domain, rather than the normalised wind level (X) domain, and Equation (5.15) does not imply that the P-domain median forecast $P^{wf}(k,i)$ is similarly unbiased. However, we have assumed that the wind power forecast is in fact a “quantile forecast” [113], characterised by its median $P^{wf}(k,i)$ but containing the full distribution of power outputs, and that this distribution always transforms to a Gaussian distribution in the X domain with mean F(k, i) and standard deviation $\sigma^z_i$. Hence, $P^{wf}$ contains the same information as F and therefore it is reasonable to constrain the synthesis of forecast errors with Equation (5.15).
The X-domain biasing can be corrected by choosing a negative value for $\rho^{xy}$ in Equation (5.14). In theory, we could ensure that Equation (5.15) holds for all i, by choosing different values $\rho^{xy}_i$ for each i. However, in practice, depending on the choice of the forecast error scale factors $s^y_i$, the set of values of $\rho^{xy}_i$ that are needed to achieve this may fluctuate wildly or need to take infeasible values for some horizons. Therefore, the approach taken in this thesis is to choose a constant value that leads to near-zero correlations between the forecast and forecast error at all horizons.
The methodology is as follows. We relate $\rho^{xy}$ to $\rho^{fz}_i$, the correlation between the forecast and the forecast error at a horizon i timesteps ahead, using
$$\rho^{fz}_i = \frac{\hat{E}[X(k+i)\,Z(k,i)] + \hat{E}[Z(k,i)^2]}{\sqrt{\hat{E}[F(k,i)^2]\;\hat{E}[Z(k,i)^2]}} \tag{5.17}$$
where
$$\hat{E}[Z(k,i)^2] = (s^y_i)^2 \sum_{j=0}^{i-1} (\psi^y_j)^2, \tag{5.18}$$
$$\hat{E}[X(k+i)\,Z(k,i)] = \sigma^x s^y_i \sum_{j=0}^{i-1} \psi^x_j \psi^y_j\, \rho^{xy}, \tag{5.19}$$
$$\hat{E}[F(k,i)^2] = (\sigma^x)^2 \sum_{j=0}^{\infty} (\psi^x_j)^2 + \hat{E}[Z(k,i)^2] + 2\,\hat{E}[X(k+i)\,Z(k,i)]. \tag{5.20}$$
We then choose the value of $\rho^{xy}$ that leads to
$$\sum_{i=1}^{N_F} \frac{\rho^{fz}_i}{i} = 0 \tag{5.21}$$
so as to attach more importance to near-future forecasts (small i) than to longer horizons. This value can be found using a non-linear root-finding algorithm.
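The root-finding step can be sketched as below, under stated assumptions. The weights `psi_x` and `psi_y`, the scale factors `s_y`, the innovation standard deviation `sigma_x` and the closed-form infinite sum `psi_x_sq_sum` are hypothetical illustrative inputs (geometric weights and a constant scale factor), not values from this thesis. Plain bisection stands in for whichever non-linear root finder is actually used, and the weighted sum divides each horizon's correlation by i as in Equation (5.21).

```python
import math


def rho_fz(i, rho_xy, psi_x, psi_y, s_y, sigma_x, psi_x_sq_sum):
    """Forecast / forecast-error correlation at horizon i, Eqs (5.17)-(5.20)."""
    e_zz = s_y[i] ** 2 * sum(psi_y[j] ** 2 for j in range(i))          # (5.18)
    e_xz = (sigma_x * s_y[i] * rho_xy
            * sum(psi_x[j] * psi_y[j] for j in range(i)))              # (5.19)
    e_ff = sigma_x ** 2 * psi_x_sq_sum + e_zz + 2.0 * e_xz             # (5.20)
    return (e_xz + e_zz) / math.sqrt(e_ff * e_zz)                      # (5.17)


def weighted_sum(rho_xy, n_f, **model):
    """Left-hand side of Equation (5.21): horizon i is weighted by 1/i."""
    return sum(rho_fz(i, rho_xy, **model) / i for i in range(1, n_f + 1))


def solve_rho_xy(n_f, lo=-0.999, hi=0.0, tol=1e-10, **model):
    """Bisection on rho_xy; requires a sign change of the sum on [lo, hi]."""
    f_lo = weighted_sum(lo, n_f, **model)
    f_hi = weighted_sum(hi, n_f, **model)
    if f_lo * f_hi > 0:
        raise ValueError("weighted sum does not change sign on the bracket")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = weighted_sum(mid, n_f, **model)
        if f_lo * f_mid <= 0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return 0.5 * (lo + hi)


# Hypothetical model inputs, for illustration only.
n_f = 6
phi_x, phi_y = 0.95, 0.9
model = {
    "psi_x": [phi_x ** j for j in range(n_f)],
    "psi_y": [phi_y ** j for j in range(n_f)],
    "s_y": [0.0] + [0.2] * n_f,                  # index 0 unused; s_y[i] for i >= 1
    "sigma_x": math.sqrt(1.0 - phi_x ** 2),
    "psi_x_sq_sum": 1.0 / (1.0 - phi_x ** 2),    # closed form of the infinite sum
}

rho_xy = solve_rho_xy(n_f, **model)
residual = weighted_sum(rho_xy, n_f, **model)
print(rho_xy, residual)
```

Bisection needs only a bracket on which the weighted sum changes sign. At `rho_xy = 0` every term is positive, so the upper end of the bracket is always positive, and the search is for a sufficiently negative lower end.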
In addition to the negative correlation between the forecast error and the realised wind power, we also expect a positive correlation between errors in the forecasts made at successive timesteps: if the forecaster makes a prediction at (say) 1pm for the wind power at 6pm, and it turns out that the prediction was too high, then it is likely that the forecast made at 2pm for the wind power at 6pm was also too high. Mathematically, we can express this requirement in the X-domain as
$$\hat{E}[Z(k,i)\,Z(k-1,i+1)] > 0. \tag{5.22}$$
If we have specified a non-zero value of $\rho^{xy}$ then this inequality will always hold to a limited extent because both Z(k, i) and Z(k−1, i+1) are correlated with X(k+i) from Equation (5.14). In version 2 of the model, we allow more precise control over the correlation between successive forecasts, while maintaining the ability to control the correlation between the forecasts and the realised wind. We achieve this by specifying an additional correlation $\rho^{yy}$ between the innovations of the Y process for a fixed future time, as generated on consecutive timesteps, while keeping the innovations independent within a forecast time series for fixed k (Equation (5.11)):
$$\hat{E}[\epsilon^y(k,i)\,\epsilon^y(k-1,i+1)] = \rho^{yy} : 1 \le i < N_F. \tag{5.23}$$
We can then force a prescribed correlation $\rho^{zz}_i$ between forecast errors Z(k, i) and Z(k−1, i+1) by choosing $\rho^{yy}$ to satisfy
$$\rho^{zz}_i = \frac{\hat{E}[Z(k,i)\,Z(k-1,i+1)]}{\sqrt{\hat{E}[Z(k,i)^2]\;\hat{E}[Z(k,i+1)^2]}} = \rho^{yy}\sqrt{1 - \frac{(\psi^y_i)^2}{\sum_{j=0}^{i}(\psi^y_j)^2}} \tag{5.24}$$
using Equations (5.10) and (5.23). More precise control of this correlation could be achieved by letting $\rho^{yy}$ take different values at different forecast horizons. In this thesis, we force $\rho^{yy}$ to be constant across all horizons for simplicity.
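Equation (5.24) is simple to evaluate in either direction. The following sketch assumes a hypothetical list `psi_y` of weights indexed from j = 0. The function names are illustrative and not taken from the thesis.

```python
import math


def rho_zz(i, rho_yy, psi_y):
    """Equation (5.24): correlation between Z(k, i) and Z(k-1, i+1)."""
    total = sum(psi_y[j] ** 2 for j in range(i + 1))
    return rho_yy * math.sqrt(1.0 - psi_y[i] ** 2 / total)


def rho_yy_for_target(i, target_rho_zz, psi_y):
    """Invert Equation (5.24) at a single horizon i.

    The result is only usable as a correlation if its magnitude is at most 1.
    """
    total = sum(psi_y[j] ** 2 for j in range(i + 1))
    return target_rho_zz / math.sqrt(1.0 - psi_y[i] ** 2 / total)


# Hypothetical geometric weights, for illustration only.
psi_y = [0.9 ** j for j in range(8)]
rho_yy = 0.6
profile = [rho_zz(i, rho_yy, psi_y) for i in range(1, 7)]
print([round(v, 3) for v in profile])
```

Because the square-root factor is below one, a constant `rho_yy` gives forecast-error correlations of smaller magnitude than `rho_yy` itself, with the shortfall depending on the horizon.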
The forecast error innovations $\epsilon^y(k,i)$ can be generated to satisfy Equations (5.13) and (5.23) simultaneously by generating them according to the following scheme, based on Cholesky decomposition. Let
$$\alpha = \frac{\rho^{xy}(1-\rho^{yy})}{1-(\rho^{xy})^2} \tag{5.25}$$
$$\beta = \rho^{yy} - \alpha\rho^{xy} \tag{5.26}$$
$$\gamma = \sqrt{1 - \alpha^2 - 2\alpha\beta\rho^{xy} - \beta^2}. \tag{5.27}$$
Then, having first generated all values of the realised wind innovations $\epsilon^x(k)$, the forecast error innovations are generated in chronological sequence as follows:
$$\epsilon^y(k,i) = \begin{cases} \rho^{xy}\,\epsilon^x(k+i) + \sqrt{1-(\rho^{xy})^2}\;\xi(k,i) & : (k=0,\ 1 \le i \le N_F) \text{ or } (k>0,\ i=N_F) \\ \alpha\,\epsilon^x(k+i) + \beta\,\epsilon^y(k-1,i+1) + \gamma\,\xi(k,i) & : k>0,\ 1 \le i \le N_F-1 \end{cases} \tag{5.28}$$
where $\xi(k,i)$ are independent N(0,1) variables as before. Note that version 1 of the model is a special case of version 2 with $\rho^{yy}$ set to $(\rho^{xy})^2$.
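The scheme in Equations (5.25) to (5.28) can be sketched as below, using only the Python standard library. Function names, the array layout (`eps_y[k][i]` with index 0 unused) and the parameter values are assumptions made for the example. The final lines check the two prescribed correlations empirically.

```python
import math
import random


def cholesky_coefficients(rho_xy, rho_yy):
    """Coefficients alpha, beta, gamma of Equations (5.25)-(5.27)."""
    alpha = rho_xy * (1.0 - rho_yy) / (1.0 - rho_xy ** 2)
    beta = rho_yy - alpha * rho_xy
    gamma = math.sqrt(1.0 - alpha ** 2 - 2.0 * alpha * beta * rho_xy - beta ** 2)
    return alpha, beta, gamma


def generate_version2_innovations(eps_x, n_f, rho_xy, rho_yy, rng):
    """Generate eps_y[k][i], 1 <= i <= n_f, in chronological order (5.28)."""
    alpha, beta, gamma = cholesky_coefficients(rho_xy, rho_yy)
    tail = math.sqrt(1.0 - rho_xy ** 2)
    n_k = len(eps_x) - n_f
    eps_y = []
    for k in range(n_k):
        row = [0.0] * (n_f + 1)  # index 0 unused so row[i] is eps_y(k, i)
        for i in range(1, n_f + 1):
            xi = rng.gauss(0.0, 1.0)
            if k == 0 or i == n_f:
                # No earlier forecast exists for target time k + i.
                row[i] = rho_xy * eps_x[k + i] + tail * xi
            else:
                row[i] = (alpha * eps_x[k + i]
                          + beta * eps_y[k - 1][i + 1]
                          + gamma * xi)
        eps_y.append(row)
    return eps_y


def sample_correlation(a, b):
    """Pearson sample correlation of two equal-length sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    return cov / math.sqrt(var_a * var_b)


# Hypothetical settings chosen only for illustration.
rng = random.Random(1)
n_f, n_k, rho_xy, rho_yy = 6, 20000, -0.3, 0.6
eps_x = [rng.gauss(0.0, 1.0) for _ in range(n_k + n_f)]
eps_y = generate_version2_innovations(eps_x, n_f, rho_xy, rho_yy, rng)

i = 2
ks = range(1, n_k)
r_xy = sample_correlation([eps_y[k][i] for k in ks], [eps_x[k + i] for k in ks])
r_yy = sample_correlation([eps_y[k][i] for k in ks],
                          [eps_y[k - 1][i + 1] for k in ks])
print(round(r_xy, 2), round(r_yy, 2))
```

Setting `rho_yy` to `rho_xy ** 2` gives `alpha = rho_xy` and `beta = 0`, which recovers the version 1 rule of Equation (5.14).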