Chapter 7. Storage value under a high wind penetration scenario
7.2. The impact of increasing wind penetration on storage value
7.2.2. An autoregressive versus static modelling approach
The APX econometric model provides some explanatory power regarding the effect of the independent variables on the price; however, the presence of heteroscedasticity and autocorrelation, combined with an R-squared value of 0.54, indicates that relevant variables are likely missing. Candidate missing variables include forecast errors, transmission constraints and embedded generation, each of which can have a significant impact on the APX market. System constraints influence market prices and yet can be almost invisible in aggregate data; for example, a peaking plant that is constrained because of transmission operational requirements is unable to provide its power to the market. This, in turn, raises prices as available supply is reduced, and in a highly volatile short-term wholesale power market this rise can be substantial. Because the data were unavailable, these elements could not be factored in, and the accuracy of the model is therefore limited, especially at price extremes.
Working at a short time resolution results in a high degree of autocorrelation. Adding two lags of the dependent variable, which gives an autoregressive AR(2) model, raises the R-squared value to 0.84, indicating a significant improvement in fit. However, under this model, the contribution of the other variables in explaining changes in the dependent variable is substantially reduced. In table 7.2, the coefficient for the first lag of the dependent variable is approximately 0.99 with a p-value of 0.000; this leads to the interpretation that, with the other variables held constant, the price at any given point is on average 0.99 times the previous price.
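For reference, the two specifications compared in this section can be sketched as below. The notation is an assumption introduced here for illustration and is not taken from the thesis: $P_t$ denotes the APX price in settlement period $t$, $x_{k,t}$ the $k$-th independent variable (generation by fuel type, imports and so on), and $\varepsilon_t$ the error term. The static model omits the two lag terms.

```latex
% Static (ST) specification: no lags of the dependent variable
P_t = \beta_0 + \sum_{k} \beta_k x_{k,t} + \varepsilon_t

% Autoregressive AR(2) specification: two lags of the dependent variable added
P_t = \beta_0 + \sum_{k} \beta_k x_{k,t} + \phi_1 P_{t-1} + \phi_2 P_{t-2} + \varepsilon_t
```

In this notation, the coefficient of approximately 0.99 discussed above corresponds to $\phi_1$.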
This loss of explanatory power relative to the static model is at odds with expectation: different types of generation should have a strong and significant effect on the wholesale price, even in the short-term markets. Yet the autoregressive model clearly shows a statistically better fit.
This paradox is the focus of Achen (2000), who looks at the problem from a general perspective; the author uses econometric examples to show that models which appear statistically superior because of lagged dependent variables can, under special circumstances, be wrong and have interpretations inconsistent with the fundamental principles of the model. The author shows that in such cases the addition of a lagged dependent variable can diminish the explanatory power of the independent variables, especially if the dependent variable is trended. According to statistical theory, the addition of a variable should not, under normal circumstances, bias the coefficients of the other variables, and a relevant variable should be included specifically to avoid omitted variable bias (Achen 2000; Dougherty 2011). However, in the presence of very high autocorrelation in both the error terms and the independent variables themselves, as is the case in the APX market and BM models, Achen (2000) shows that the coefficients of the independent variables are biased downwards, i.e. the greater the correlation, the greater the underestimation of the other coefficients. In fact, when both correlation coefficients are equal to 1, the coefficients of the independent variables tend to zero.
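The mechanism described above can be illustrated with a small simulation. The following is a minimal sketch on synthetic data, not the thesis data or Achen's own code: the sample size, the true coefficient `beta` and the autocorrelation values `rho` and `phi` are all hypothetical choices. The true model contains no lagged dependent variable, yet both the regressor and the errors are strongly autocorrelated.

```python
import numpy as np

def ar1(n, coef, rng):
    """Simulate a zero-mean AR(1) series with unit-variance innovations."""
    shocks = rng.standard_normal(n)
    series = np.zeros(n)
    for t in range(1, n):
        series[t] = coef * series[t - 1] + shocks[t]
    return series

def ols(X, y):
    """Ordinary least squares coefficients via a least-squares solve."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

rng = np.random.default_rng(0)
n, beta = 20_000, 1.0   # hypothetical sample size and true effect of x
rho = phi = 0.95        # assumed autocorrelation of the regressor and of the errors

x = ar1(n, rho, rng)    # autocorrelated independent variable
u = ar1(n, phi, rng)    # autocorrelated error term
y = beta * x + u        # true model: no lagged y

const = np.ones(n - 1)
# Static model: y_t on a constant and x_t
static_coef = ols(np.column_stack([const, x[1:]]), y[1:])
# Lagged-dependent-variable model: y_t on a constant, x_t and y_{t-1}
lagged_coef = ols(np.column_stack([const, x[1:], y[:-1]]), y[1:])

print(f"static model:   beta_hat = {static_coef[1]:.3f}")
print(f"with lagged y:  beta_hat = {lagged_coef[1]:.3f}, "
      f"lag coefficient = {lagged_coef[2]:.3f}")
```

Under these assumptions the static estimate stays close to the true value of 1, whereas the estimate from the model with the lagged dependent variable falls well below it, with the lag term absorbing most of the explanatory power.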
Following Achen's (2000) findings, the Static (ST) model should provide consistent estimates of the coefficients and standard errors when estimated using feasible generalised least squares. However, omitted variable bias can still persist, which means that the impact of wind can be overestimated. The Autoregressive (AR) model, on the other hand, shows downward bias on the independent variables, which means that the impact of wind is underestimated. Therefore, in this thesis, both the Autoregressive (AR) and Static (ST) model results are presented, as the most likely effect of wind lies between the two approaches.
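As an illustration of what feasible generalised least squares involves, the sketch below implements the iterated Cochrane-Orcutt procedure, one common FGLS variant for AR(1) errors. This is an assumption made for illustration: the text does not state which FGLS procedure was used, and the sketch does not address heteroscedasticity. The data are synthetic and the function name `cochrane_orcutt` is hypothetical.

```python
import numpy as np

def cochrane_orcutt(X, y, n_iter=10):
    """Iterated FGLS for a linear model with AR(1) errors.

    Returns the coefficient vector and the estimated error autocorrelation.
    """
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)  # start from plain OLS
    rho = 0.0
    for _ in range(n_iter):
        resid = y - X @ beta
        # Estimate rho by regressing the residuals on their own first lag.
        rho = (resid[1:] @ resid[:-1]) / (resid[:-1] @ resid[:-1])
        # Quasi-difference the data so the transformed errors are roughly white.
        X_star = X[1:] - rho * X[:-1]
        y_star = y[1:] - rho * y[:-1]
        beta, *_ = np.linalg.lstsq(X_star, y_star, rcond=None)
    return beta, rho

# Synthetic example: y = 2 + 1.5 x + u, with u following an AR(1) process.
rng = np.random.default_rng(1)
n = 5_000
x = rng.standard_normal(n)
shocks = rng.standard_normal(n)
u = np.zeros(n)
for t in range(1, n):
    u[t] = 0.8 * u[t - 1] + shocks[t]
y = 2.0 + 1.5 * x + u

X = np.column_stack([np.ones(n), x])  # constant and regressor
beta_hat, rho_hat = cochrane_orcutt(X, y)
print(f"slope = {beta_hat[1]:.3f}, estimated rho = {rho_hat:.3f}")
```

Because the constant column is quasi-differenced together with the regressors, the returned coefficients remain in the units of the original model.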
APX AR(2) Model - OLS

Variable Name        Coefficient   Std Error   t-statistic   p-value
'(Intercept)'            -2.5826       0.571        -4.526     0.000
'OIL'                     0.0104       0.001        13.284     0.000
'OCGT'                    0.0074       0.001         6.276     0.000
'CCGT'                    0.0004       0.000        33.984     0.000
'COAL'                    0.0001       0.000        12.925     0.000
'NUCLEAR'                 0.0000       0.000         0.986     0.324
'NPSHYD'                  0.0024       0.000        17.046     0.000
'WIND'                   -0.0003       0.000       -11.851     0.000
'NETIMBALANCEVOL'         0.0011       0.000        15.189     0.000
'pumping'                 0.0001       0.000         0.675     0.499
'britnedimport'           0.0000       0.000         0.681     0.496
'eastwestimport'          0.0000       0.000         0.273     0.785
'frenchimport'            0.0001       0.000         2.706     0.007
'moyleimport'            -0.0001       0.000        -0.337     0.736
'APXV'                    0.0007       0.000        10.049     0.000
'APXPLAG1'                0.9888       0.004       272.812     0.000
'APXPLAG2'               -0.2893       0.004       -81.573     0.000
'imbapricelag2'           0.0373       0.001        27.608     0.000
Table 7.2: An extract of the autoregressive model results.
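To show how the lag coefficients in table 7.2 interact with the coefficients of the other variables, the following sketch works through the standard long-run multiplier of an autoregressive model, beta / (1 - phi_1 - phi_2). This calculation is an illustration added here and is not a result reported in the thesis. It uses the rounded coefficients as printed in the table (the WIND coefficient has a single significant figure, so the result is only indicative) and holds all other variables fixed.

```python
# Coefficients copied from table 7.2 (rounded as reported there).
phi_1 = 0.9888        # APXPLAG1: first lag of the APX price
phi_2 = -0.2893       # APXPLAG2: second lag of the APX price
beta_wind = -0.0003   # WIND: immediate (same-period) effect

# Standard stationarity conditions for an AR(2) process.
is_stationary = (phi_1 + phi_2 < 1) and (phi_2 - phi_1 < 1) and (abs(phi_2) < 1)

# Combined persistence of the two lags and the implied long-run multiplier.
persistence = phi_1 + phi_2
long_run_wind = beta_wind / (1.0 - persistence)

print(f"stationary: {is_stationary}")
print(f"persistence (phi_1 + phi_2): {persistence:.4f}")
print(f"long-run WIND effect: {long_run_wind:.6f}")
```

With these rounded values the two lags sum to about 0.70, so a sustained change in wind output would imply a cumulative price effect roughly three times the immediate coefficient.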
A comparison of the predictive performance of the static and autoregressive models is shown in figure 7.2, which illustrates their ability to predict the APX market price on out-of-sample data. More precisely, both models were estimated on 2011-2014 data and their predictive ability was evaluated using 2015 data as input. Figure 7.2 shows that the static model performs less well than the autoregressive model; in particular, during the first week of January some peaks are captured by the static model while others are exaggerated, as seen at the 138th hour for example. Similarly, some troughs are captured while others are greatly underestimated, as shown in the figure at the 26th hour. It is clear that the model fails to
Figure 7.2: A comparison of the autoregressive model prediction to the static model prediction through out-of-sample testing.
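The out-of-sample procedure described above (estimate on an earlier period, evaluate on a later one) can be sketched as follows. This is a minimal illustration on synthetic data with hypothetical parameter values, not the thesis dataset: a single explanatory variable `x`, AR(1) errors, and an 80/20 chronological split standing in for the 2011-2014 and 2015 periods. The autoregressive predictions are one step ahead, i.e. they use the actual lagged prices in the test period, which is assumed here to match the comparison in figure 7.2.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 4_000
x = rng.standard_normal(n)        # hypothetical explanatory variable
shocks = rng.standard_normal(n)
u = np.zeros(n)
for t in range(1, n):
    u[t] = 0.9 * u[t - 1] + shocks[t]   # strongly autocorrelated errors
price = 40.0 + 1.0 * x + u        # synthetic "price" series

# Align everything on periods t = 2 .. n-1 so that both lags exist.
target = price[2:]
X_static = np.column_stack([np.ones(n - 2), x[2:]])
X_ar2 = np.column_stack([X_static, price[1:-1], price[:-2]])  # add P_{t-1}, P_{t-2}

split = int(0.8 * len(target))    # chronological split, no shuffling

def out_of_sample_rmse(X):
    """Fit OLS on the training window and return RMSE on the test window."""
    coef, *_ = np.linalg.lstsq(X[:split], target[:split], rcond=None)
    pred = X[split:] @ coef
    return float(np.sqrt(np.mean((target[split:] - pred) ** 2)))

rmse_static = out_of_sample_rmse(X_static)
rmse_ar2 = out_of_sample_rmse(X_ar2)
print(f"static RMSE: {rmse_static:.3f}, AR(2) RMSE: {rmse_ar2:.3f}")
```

In this synthetic setting the AR(2) model achieves the lower out-of-sample error, and it does so mainly because the lagged prices carry the autocorrelated component of the series rather than because the effect of `x` is better identified.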
Although the autoregressive model appears to present a very good fit, the fact that it uses the APX price lagged by 30 minutes raises questions about its usefulness in identifying the relationship between the dependent and independent variables, especially in light of Achen's (2000) findings discussed earlier. Thus, although the AR model is statistically relatively accurate, its usefulness is limited because its predictive ability is driven largely by strong autocorrelation.