
Chapter 4. Approaches to deriving storage value

4.11. The linear regression model

An econometric model using the absolute values of each variable is developed. Alternative specifications, such as a logarithmic model or a mean deviation model, were also evaluated but did not demonstrate any substantial improvement. Using variables in their absolute values follows from the underlying hypothesis that a change at any level has a direct and equal impact on the dependent variable, which is the half-hourly APX price in this case. In the case of wind power as an independent variable, for example, the unit change is 1 MW, or effectively 0.5 MWh at the half-hour time resolution.

A regression of prices on wind generation alone is too simplistic; to investigate the impact of wind generation on market prices, the impact of other variables must also be considered, otherwise the estimates suffer from omitted variable bias (Dougherty, 2011). The independent variables chosen are power generation from open cycle gas turbines (OCGT), power from combined-cycle gas turbines (CCGT), generation from oil power plants (OIL), nuclear power (NUCLEAR), non-pumped storage hydro-power generation (NPSHYD), coal power generation (COAL), aggregate electricity demand, net imbalance volume, system buy price (SBP), system sell price (SSP), interconnector power flows (Moyle, East-West, France), and coal, gas and oil prices (quarterly).

4.11.1. Dummy and multiplicative variables

Generally, in regression analysis, the impact of seasonal effects can be investigated using dummy variables. These can be additive or multiplicative, and both may be added to avoid any implied restrictions; for example, the inclusion of only additive seasonal dummy variables would imply that the coefficients of all other independent variables are the same throughout all seasons. This may not hold if, for example, APX prices are more responsive to changes in other independent variables, such as OCGT, in the winter period. This could be due to increasing marginal costs in winter, which cause bidding prices to be higher and therefore affect APX prices more.
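As an illustration, consider a simplified specification with a single regressor, OCGT, and a single winter dummy W_t (equal to 1 in winter and 0 otherwise). The symbols W_t, δ and γ are introduced here for illustration only and are not part of the models estimated in this chapter.

$$
\mathrm{APX}_t = \beta_0 + \beta_1 \mathrm{OCGT}_t + \delta W_t + \gamma \,(W_t \cdot \mathrm{OCGT}_t) + \varepsilon_t
$$

Here δ is the coefficient of the additive dummy and shifts the intercept in winter, while γ is the coefficient of the multiplicative dummy and changes the slope: the marginal effect of OCGT on the APX price is β1 outside winter and β1 + γ in winter. Omitting the multiplicative term imposes the restriction γ = 0.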

Thus the econometric model includes dummy variables to capture seasonal effects, weekday compared to weekend effects and time of day effects, especially at peak time. However, the addition of both the additive and multiplicative dummy variables unreasonably complicates the model; the number of variables jumps to 767. Similarly, the addition of all quadratic variables raises this number to 1400. By most standards, this specification of the model is too complex and is not justified, as the increase in the number of variables does not substantially improve the model fit; for example, adjusted R² increases from 0.842 to 0.879 in the APX AR model. An F-test given by equation 4.24 (Dougherty, 2011) shows that the small improvement in fit does not justify the additional variables.

$$
F(m-k,\; n-m) = \frac{(RSS_k - RSS_m)/(m-k)}{RSS_m/(n-m)} \tag{4.24}
$$

Where:

m: number of parameters estimated in the unrestricted model
k: number of parameters estimated in the restricted model
n: number of observations
RSS: residual sum of squares (RSS_k for the restricted model, RSS_m for the unrestricted model)
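The F statistic of equation 4.24 can be computed directly from the residual sums of squares of the two nested models. The following is a minimal sketch; the function name `f_statistic` and the numerical values are hypothetical and are not taken from the data used in this chapter.

```python
def f_statistic(rss_k, rss_m, k, m, n):
    """F statistic for nested models, following equation 4.24.

    rss_k: residual sum of squares of the restricted model (k parameters)
    rss_m: residual sum of squares of the unrestricted model (m parameters)
    n:     number of observations
    """
    if not (0 < k < m < n):
        raise ValueError("expected 0 < k < m < n")
    numerator = (rss_k - rss_m) / (m - k)   # improvement in fit per added parameter
    denominator = rss_m / (n - m)           # unexplained variation per remaining degree of freedom
    return numerator / denominator


# Hypothetical values for illustration only (not thesis results)
f_value = f_statistic(rss_k=120.0, rss_m=100.0, k=3, m=5, n=105)
print(f_value)  # 10.0
```

The resulting value is compared with the critical value of the F distribution with (m − k, n − m) degrees of freedom; the additional variables are retained only if the statistic exceeds that critical value.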

More advanced criteria, such as the Akaike Information Criterion or the Bayesian Information Criterion, were not formally tested but are unlikely to justify the added complexity. Therefore, a simpler model specification is preferred.

4.11.2. APX and BM econometric model

The APX model includes seven generation variables distinguished by fuel type or technology. Four interconnector power flows are also included, namely those between GB-Netherlands (britnedimport), Wales-Ireland (eastwestimport), GB-France (frenchimport) and Scotland-Ireland (moyleimport). To represent the time of day effect on prices, there are forty-seven half-hour dummy variables (SPintdummy2-48), with the reference period being the half-hour between 00:00 and 00:30. Similarly, monthly dummy variables are added to investigate seasonality effects, with January as the reference period. Finally, a weekday-weekend effect is accounted for by an additional dummy variable, with the weekday being the reference period. Coal, gas and oil prices are also included, using quarterly data. The lagged APX market price, which enters as lagged dependent variables (APXPHHLAG1, APXPHHLAG2), and the traded market volume (APXVHH) are added as they are thought to have a direct influence on prices. The regression equation for the APX model is given in (4.25).

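$$
\begin{aligned}
\mathrm{APX}_t ={}& \beta_0 + \beta_1 \mathrm{OIL}_t + \beta_2 \mathrm{OCGT}_t + \beta_3 \mathrm{CCGT}_t + \beta_4 \mathrm{COAL}_t + \beta_5 \mathrm{NUCLEAR}_t + \beta_6 \mathrm{NPSHYD}_t \\
&+ \beta_7 \mathrm{WIND}_t + \beta_8 \mathrm{NIV}_t + \beta_9 \mathrm{pumping}_t + \beta_{10} \mathrm{britnedimport}_t + \beta_{11} \mathrm{eastwestimport}_t \\
&+ \beta_{12} \mathrm{frenchimport}_t + \beta_{13} \mathrm{moyleimport}_t + \beta_{14} \mathrm{APXV}_t + \beta_{15} \mathrm{APXPLAG1}_t + \beta_{16} \mathrm{APXPLAG2}_t \\
&+ \beta_{17} \mathrm{imbapricelag2}_t + \beta_{18} \mathrm{quarterlyfuelpriceCOAL}_t + \beta_{19} \mathrm{quarterlyfuelpriceGAS}_t \\
&+ \beta_{20} \mathrm{quarterlyfuelpriceOIL}_t + \beta_{21} \mathrm{SPintdummy2}_t + \cdots + \beta_{67} \mathrm{SPintdummy48}_t \\
&+ \beta_{68} \mathrm{monthintdummyfeb}_t + \cdots + \beta_{78} \mathrm{monthintdummydec}_t + \beta_{79} \mathrm{weekeffdummyweekday}_t + \varepsilon_t
\end{aligned}
\tag{4.25}
$$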
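The dummy variables in (4.25) can be derived from the timestamp of each half-hourly observation. The sketch below is one possible construction, not the procedure used to build the thesis dataset. It assumes that settlement periods follow the local clock with period 1 starting at 00:00 (ignoring clock-change days), and it uses a hypothetical column name, `weekend_dummy`, for the weekday-weekend effect, set to 1 at weekends so that the weekday is the reference.

```python
from datetime import datetime

MONTHS = ["jan", "feb", "mar", "apr", "may", "jun",
          "jul", "aug", "sep", "oct", "nov", "dec"]


def dummy_row(ts: datetime) -> dict:
    """Return the 0/1 dummy variables for one half-hourly observation."""
    # Settlement period 1 covers 00:00-00:30, period 48 covers 23:30-00:00
    period = ts.hour * 2 + ts.minute // 30 + 1

    # 47 half-hour dummies; period 1 is the reference and has no dummy
    row = {f"SPintdummy{p}": int(period == p) for p in range(2, 49)}

    # 11 monthly dummies; January is the reference and has no dummy
    for month in range(2, 13):
        row[f"monthintdummy{MONTHS[month - 1]}"] = int(ts.month == month)

    # Weekday is the reference; Saturday and Sunday are coded 1 (assumed name)
    row["weekend_dummy"] = int(ts.weekday() >= 5)
    return row


# A Saturday in February at 17:30 activates exactly three dummies
example = dummy_row(datetime(2014, 2, 1, 17, 30))
active = [name for name, value in example.items() if value == 1]
print(active)
```

An observation in the reference categories (a weekday in January between 00:00 and 00:30) has all dummies equal to zero, so its expected price is captured by the intercept and the remaining regressors.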

Using a similar specification to the APX econometric model, a BM regression model was evaluated; it showed that the dummy variables are not significant. This is because the imbalance price is purely imbalance driven, and any seasonality or time of day effects are already reflected in the APX price, which enters as an independent variable (APXPHH). For this reason, the BM econometric model specification does not include dummy variables.

In the BM, although there are two prices (SBP and SSP), only one is of interest at any one time: the imbalance price, since the other price is approximately equal to the APX market price. For example, if a shortage of energy in the BM causes the SBP (calculated using the imbalance pricing method) to rise to £100/MWh, the SSP (calculated using the reverse pricing method) is essentially an average of APX prices (of different products). In this case, the imbalance price is the SBP, whereas the SSP, which is roughly equivalent to the APX price, is represented by the APX model above. The purpose of the BM econometric model is to determine the relationship between the imbalance price and the independent variables.

The BM model chosen was:

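$$
\begin{aligned}
\mathrm{Imbaprice}_t ={}& \beta_0 + \beta_1 \mathrm{OIL}_t + \beta_2 \mathrm{OCGT}_t + \beta_3 \mathrm{CCGT}_t + \beta_4 \mathrm{COAL}_t + \beta_5 \mathrm{NUCLEAR}_t + \beta_6 \mathrm{NPSHYD}_t \\
&+ \beta_7 \mathrm{WIND}_t + \beta_8 \mathrm{NIV}_t + \beta_9 \mathrm{pumping}_t + \beta_{10} \mathrm{britnedimport}_t + \beta_{11} \mathrm{eastwestimport}_t \\
&+ \beta_{12} \mathrm{frenchimport}_t + \beta_{13} \mathrm{moyleimport}_t + \beta_{14} \mathrm{APXP}_t + \beta_{15} \mathrm{APXV}_t + \beta_{16} \mathrm{APXPLAG1}_t \\
&+ \beta_{17} \mathrm{APXPLAG2}_t + \beta_{18} \mathrm{imbapricelag1}_t + \beta_{19} \mathrm{imbapricelag2}_t + \beta_{20} \mathrm{quarterlyfuelpriceCOAL}_t \\
&+ \beta_{21} \mathrm{quarterlyfuelpriceGAS}_t + \varepsilon_t
\end{aligned}
\tag{4.26}
$$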

4.11.3. Autoregressive vs Static models

The APX and BM econometric models in (4.25) and (4.26) have lagged dependent variables and represent an autoregressive AR(2) model. Since the data has half-hourly resolution, it is very likely that the markets show strong autocorrelation. In other words, the current disturbance term is correlated with that of the previous period. Autocorrelation can be problematic under the Ordinary Least Squares (OLS) method used to evaluate the econometric models; while the coefficients themselves are not biased, asymptotically converging towards the true value, the standard errors of the coefficients are biased (Dougherty 2011, chap.12). In order to generate unbiased standard errors, the regression model can be estimated using the Feasible Generalised Least Squares (FGLS) method. In FGLS, the OLS regression is run first. The residuals are then regressed on their own lag to estimate the magnitude of autocorrelation. This estimated relationship is used to eliminate autocorrelation from the original equation. Dougherty (2011, pp.441–445) gives a good description of the procedure using the Cochrane-Orcutt iterative method. FGLS also reduces bias from heteroscedasticity. In time series analysis, the addition of lagged dependent variables can often eliminate autocorrelation.
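The FGLS procedure described above can be sketched for the simplest case of one regressor and first-order autocorrelation. This is an illustrative implementation of the Cochrane-Orcutt iteration under those simplifying assumptions, not the estimation code used for the models in this chapter, which contain many regressors. The function names `ols` and `cochrane_orcutt` are hypothetical.

```python
def ols(x, y):
    """OLS of y on a constant and a single regressor x; returns (intercept, slope)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def cochrane_orcutt(x, y, tol=1e-8, max_iter=100):
    """Iterative Cochrane-Orcutt estimate of (intercept, slope, rho)."""
    a, b = ols(x, y)          # step 1: plain OLS on the original data
    rho = 0.0
    for _ in range(max_iter):
        # step 2: regress the residuals on their own lag (no constant)
        e = [yi - a - b * xi for xi, yi in zip(x, y)]
        rho_new = (sum(e[t] * e[t - 1] for t in range(1, len(e)))
                   / sum(e[t - 1] ** 2 for t in range(1, len(e))))
        # step 3: quasi-difference the data to remove AR(1) autocorrelation
        xs = [x[t] - rho_new * x[t - 1] for t in range(1, len(x))]
        ys = [y[t] - rho_new * y[t - 1] for t in range(1, len(y))]
        # step 4: re-estimate on the transformed data
        a_star, b = ols(xs, ys)
        a = a_star / (1.0 - rho_new)   # recover the original intercept
        converged = abs(rho_new - rho) < tol
        rho = rho_new
        if converged:
            break
    return a, b, rho
```

The transformed regression has the intercept a(1 − ρ), which is why the original intercept is recovered by dividing by (1 − ρ). The first observation is dropped in the transformation, as in the standard Cochrane-Orcutt procedure.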

The addition of lagged dependent variables substantially improves the model fit; the APX model without lagged dependent variables is referred to as the Static (ST) model and has an R² value of 0.54, which rises to 0.84 under the AR model specification. However, the AR model also dramatically reduces the explanatory power of the other variables. Achen (2000) investigated this problem and showed that, in the presence of autocorrelation and trended independent variables, the inclusion of lagged dependent variables provides a false sense of superior fit. In fact, he shows that even when the lagged dependent variables do not belong to the model at all, the autocorrelation and trending in the independent variables cause the regression results to show a strong and significant relationship.

Nevertheless, when the ST model is evaluated using FGLS it suffers from a lower R² value, suggesting that a substantial share of the variation in prices is not explained. Omitted variable bias, where the independent variables attempt to compensate for missing variables, is also of concern. The bias therefore likely leads to an overestimate of the impact of wind under the ST model and an underestimate under the AR model. The results of the two types of regressions are discussed further in Chapter 7, with reference to these biases. However, with knowledge of the extent of the bias, the true impact of wind power on prices should lie between the estimates of the two models, and hence meaningful information can still be derived.

Further diagnostic tests have been used to refine the model; the Breusch-Pagan test was used to detect heteroscedasticity and the Breusch-Godfrey test to detect serial correlation. In order to detect possible misspecification, the Ramsey test was performed. Using the Augmented Dickey-Fuller test, the data was found to be stationary. Residual error plots are shown in Appendix C. The models are evaluated using 2011-2014 data and validated by out-of-sample testing on 2015 data.