
Big Data Techniques Applied to Very Short-term Wind Power Forecasting


(1)


Ricardo Bessa

Senior Researcher, Center for Power and Energy Systems, INESC TEC, Portugal

Joint work with Laura Cavalcante and Marisa Reis

EWEA Technology Workshop: Wind Power Forecasting 2015, 1-2 October 2015, Leuven, Belgium

(2)

Introduction

Vector Autoregression (VAR) models can be applied to combine wind power time series distributed in space

Two important requirements for a practical implementation:

Reduce the number of non-null coefficients

Low computational time in large datasets

This work provides the following original contributions:

Explores a set of sparse structures for the VAR model

Applies the alternating direction method of multipliers (ADMM) to estimate the VAR coefficients

Explores parallel computing

(3)

Autoregressive Model

Univariate model: uses past observations from the same time series

AR(p) - Autoregressive Model of order p

→ forecasts the variable $y_t$ given the past $p$ values: $y_t = c + b_1 y_{t-1} + b_2 y_{t-2} + \cdots + b_p y_{t-p} + \varepsilon_t$

VAR(p) - Vector Autoregressive Model of order p

→ forecasts the vector of $k$ variables $Y_t = (Y_{1,t}, Y_{2,t}, \dots, Y_{k,t})$

$Y_t = c + B_1 Y_{t-1} + B_2 Y_{t-2} + \cdots + B_p Y_{t-p} + u_t$
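As a minimal sketch of how the VAR(p) equation can be put into the matrix form $Y \approx BZ$ used in the following slides, the code below stacks the lagged observations into $Z$ and estimates $B = [c, B_1, \dots, B_p]$ by ordinary least squares. The function names and the choice of NumPy are illustrative assumptions, not part of the original work, and no sparsity is imposed here.

```python
import numpy as np

def build_var_matrices(series, p):
    """Arrange a (T, k) array into Y (k, T-p) and Z (k*p + 1, T-p) so that Y ≈ B Z."""
    T, k = series.shape
    Y = series[p:].T                                          # targets Y_t, one column per time step
    lags = [series[p - l:T - l].T for l in range(1, p + 1)]   # Y_{t-1}, ..., Y_{t-p}
    Z = np.vstack([np.ones((1, T - p))] + lags)               # first row carries the intercept c
    return Y, Z

def fit_var_ols(series, p):
    """Ordinary least-squares estimate of B = [c, B_1, ..., B_p] (no sparsity)."""
    Y, Z = build_var_matrices(series, p)
    B_t, *_ = np.linalg.lstsq(Z.T, Y.T, rcond=None)           # solves Z^T B^T ≈ Y^T
    return B_t.T

def forecast_one_step(series, B, p):
    """One-step-ahead forecast from the last p observations."""
    z = np.concatenate([[1.0]] + [series[-l] for l in range(1, p + 1)])
    return B @ z
```

With `series` holding one column per wind farm, `fit_var_ols(series, 2)` would return a matrix of shape `(k, 2k + 1)`: the intercept column followed by the two lag matrices.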

(4)

Least Absolute Shrinkage and Selection Operator (LASSO)-VAR Model

The Lasso-VAR estimation minimizes the residual sum of squares subject to an L1 constraint

$$\tfrac{1}{2}\|Y - BZ\|_F^2 \quad \text{s.t.} \quad \|B\|_1 \le t$$

Equivalently, it can be defined in the Lagrangian form as

$$\tfrac{1}{2}\|Y - BZ\|_F^2 + \lambda\|B\|_1,$$

where $\|X\|_p = \left(\sum_{i=1}^{n} |x_i|^p\right)^{1/p}$, $\|X\|_F^2 = \sum_{i=1}^{m}\sum_{j=1}^{n} |x_{ij}|^2$ is the squared Frobenius norm, and the regularization parameter $\lambda \ge 0$ is inversely related to $t$

Fits the regression model and simultaneously performs variable selection by shrinking regression coefficients to zero
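The shrinkage to exactly zero can be illustrated with the soft-thresholding operator, which is the proximal operator of the L1 norm and the building block most LASSO solvers rely on. This is a small sketch with made-up coefficient values; the function name is an assumption for illustration only.

```python
import numpy as np

def soft_threshold(X, kappa):
    """Elementwise proximal operator of kappa * ||X||_1.

    Entries with magnitude below kappa become exactly zero;
    the remaining entries are pulled towards zero by kappa.
    """
    return np.sign(X) * np.maximum(np.abs(X) - kappa, 0.0)

# Hypothetical coefficient matrix: two large and two small entries.
coefficients = np.array([[0.80, -0.05],
                         [0.02, -0.40]])
shrunk = soft_threshold(coefficients, 0.1)
n_selected = int(np.count_nonzero(shrunk))   # number of coefficients kept in the model
```

The two small entries are removed from the model, which is the variable-selection effect described above.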

(5)

Lasso-VAR Model: Extensions and Generalizations

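Lasso Extensions      Penalty
Row Lasso             $\lambda \|B_i\|_1$
Matricial Lasso       $\lambda \|B\|_1$
Lag Lasso             $\lambda \sum_{l=1}^{p} \|B_l\|_1$
Group Lasso           $\lambda \sum_{i \neq j} \|((B_1)_{ij}, \dots, (B_p)_{ij})\|_2$
Sparse Group Lasso    $(1 - \alpha)\lambda \sum_{l=1}^{p} \|B_l\|_F + \alpha\lambda \|B\|_1$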
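To make the structure of the penalties more concrete, the sketch below evaluates three of them (Matricial, Group and Sparse Group Lasso) for a list of lag matrices. It assumes that $\|\cdot\|_1$ denotes the sum of absolute values of all entries and that `B_lags` is a hypothetical list `[B_1, ..., B_p]` of $k \times k$ matrices without the intercept; neither convention is stated in the slides.

```python
import numpy as np

def matricial_lasso(B_lags, lam):
    """lam * ||B||_1 over all coefficients of all lags."""
    return lam * sum(np.abs(B_l).sum() for B_l in B_lags)

def group_lasso(B_lags, lam):
    """lam * sum over i != j of the L2 norm of ((B_1)_ij, ..., (B_p)_ij).

    Each off-diagonal pair (i, j) forms one group across all lags, so the
    influence of series j on series i is kept or dropped as a whole.
    """
    stacked = np.stack(B_lags)                       # shape (p, k, k)
    group_norms = np.sqrt((stacked ** 2).sum(axis=0))
    return lam * (group_norms.sum() - np.trace(group_norms))

def sparse_group_lasso(B_lags, lam, alpha):
    """(1 - alpha) * lam * sum_l ||B_l||_F + alpha * lam * ||B||_1."""
    frobenius = sum(np.linalg.norm(B_l, "fro") for B_l in B_lags)
    return (1 - alpha) * lam * frobenius + alpha * matricial_lasso(B_lags, 1.0) * lam

# Small made-up example with k = 2 series and p = 2 lags.
B_example = [np.array([[1.0, 3.0], [0.0, 2.0]]),
             np.array([[1.0, 4.0], [0.0, 2.0]])]
```

In `B_example` the pair (2, 1) is zero in every lag, so it contributes nothing to the group penalty, while the pair (1, 2) contributes the norm of (3, 4).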

(6)

Parameter Estimation and the ADMM Algorithm

The goal is to estimate the sparse matrix of coefficients with a simple and powerful algorithm

The ADMM framework has several advantages:

Combines the problem separability offered by the dual ascent method with the convergence properties of the method of multipliers

Convex problems with nondifferentiable terms (such as the LASSO penalty) can be easily addressed

Parallel Optimization: break up large datasets into blocks and carry out the optimization over each block

(7)

ADMM Algorithm

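Lasso-VAR:

$$\text{minimize} \quad \tfrac{1}{2}\|Y - BZ\|_F^2 + \lambda\|B\|_1$$

ADMM problem form:

$$\text{minimize} \quad \underbrace{\tfrac{1}{2}\|Y - BZ\|_F^2}_{f(B)} + \underbrace{\lambda\|H\|_1}_{g(H)} \quad \text{s.t.} \quad B - H = 0$$

Augmented Lagrangian:

$$L_\rho(B, H, W) = \tfrac{1}{2}\|Y - BZ\|_F^2 + \lambda\|H\|_1 + \operatorname{tr}\!\left(W^T (B - H)\right) + \tfrac{\rho}{2}\|B - H\|_F^2$$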
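A minimal sketch of the resulting iterations for the Matricial Lasso penalty is given below. It uses the scaled dual variable $U = W/\rho$, a closed-form ridge-like update for $B$ and soft-thresholding for $H$; the stopping rule, default values and function names are assumptions made for illustration and may differ from the implementation used in the case study.

```python
import numpy as np

def soft_threshold(X, kappa):
    """Elementwise proximal operator of kappa * ||X||_1."""
    return np.sign(X) * np.maximum(np.abs(X) - kappa, 0.0)

def lasso_var_admm(Y, Z, lam, rho=1.0, tol=1e-3, max_iter=1000):
    """Minimise 0.5 * ||Y - B Z||_F^2 + lam * ||H||_1 subject to B - H = 0."""
    k, m = Y.shape[0], Z.shape[0]
    H = np.zeros((k, m))
    U = np.zeros((k, m))                  # scaled dual variable, U = W / rho
    YZt = Y @ Z.T
    G = Z @ Z.T + rho * np.eye(m)         # symmetric positive definite
    for _ in range(max_iter):
        # B-update: solve B G = Y Z^T + rho * (H - U)
        B = np.linalg.solve(G, (YZt + rho * (H - U)).T).T
        # H-update: proximal step on the L1 term
        H_old = H
        H = soft_threshold(B + U, lam / rho)
        # Dual update
        U = U + B - H
        primal_residual = np.linalg.norm(B - H)
        dual_residual = rho * np.linalg.norm(H - H_old)
        if primal_residual < tol and dual_residual < tol:
            break
    return H                              # sparse estimate of the coefficients
```

Returning `H` rather than `B` gives exact zeros, since `H` is the output of the thresholding step.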

(8)

Parallel Computing

The goal is to split data and use ADMM to solve the problem in a distributed manner (with N objective terms)

$[Z_1 \; Z_2 \; \dots \; Z_N]$ (blocks side by side) → Split data across features and use the ADMM sharing problem

$[Z_1; Z_2; \dots; Z_N]$ (blocks stacked vertically) → Split data across examples and use ADMM consensus optimization

(9)

ADMM and Parallel Computing

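Splitting Across Examples

$$\min \sum_{i=1}^{N} \underbrace{\tfrac{1}{2}\|Y_i - B_i Z_i\|_F^2}_{f_i(B_i)} + \underbrace{\lambda\|B_i\|_1}_{g(B_i)}$$

$$\min \sum_{i=1}^{N} f_i(B_i) + g(H) \quad \text{s.t.} \quad B_i - H = 0$$

$$B_i^{k+1} := \arg\min_{B_i} \left( f_i(B_i) + \tfrac{\rho}{2}\|B_i - H^k + U_i^k\|_F^2 \right)$$

$$H^{k+1} := \arg\min_{H} \left( g(H) + \tfrac{N\rho}{2}\|H - \bar{B}^{k+1} - \bar{U}^k\|_F^2 \right)$$

$$U_i^{k+1} := U_i^k + B_i^{k+1} - H^{k+1}$$

where $\bar{B}^{k+1}$ and $\bar{U}^k$ are the averages of $B_i^{k+1}$ and $U_i^k$ over the $N$ blocks

Splitting Across Features

$$\min \underbrace{\tfrac{1}{2}\Big\|Y - \sum_{i=1}^{N} B_i Z_i\Big\|_F^2}_{g\left(\sum_{i=1}^{N} B_i Z_i\right)} + \sum_{i=1}^{N} \underbrace{\lambda\|B_i\|_1}_{f_i(B_i)}$$

$$\min \sum_{i=1}^{N} f_i(B_i) + g\Big(\sum_{i=1}^{N} H_i\Big) \quad \text{s.t.} \quad B_i Z_i - H_i = 0$$

$$B_i^{k+1} := \arg\min_{B_i} \left( f_i(B_i) + \tfrac{\rho}{2}\|B_i Z_i - H_i^k + U_i^k\|_F^2 \right)$$

$$H^{k+1} := \arg\min_{H = (H_1, \dots, H_N)} \left( g\Big(\sum_{i=1}^{N} H_i\Big) + \tfrac{\rho}{2}\sum_{i=1}^{N}\|H_i - U_i^k - B_i^{k+1} Z_i\|_F^2 \right)$$

$$U_i^{k+1} := U_i^k + B_i^{k+1} Z_i - H_i^{k+1}$$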
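The consensus form (splitting across examples) can be sketched as follows. Each block `(Y_i, Z_i)` holds a subset of the time steps and its $B_i$-update is independent of the others, so the list comprehension over blocks is the part that could be sent to separate cores. The code assumes the standard consensus ADMM updates, in which the $H$-update thresholds the average of $B_i + U_i$ at $\lambda / (N\rho)$; the function names, the sequential loop and the stopping rule are illustrative assumptions.

```python
import numpy as np

def soft_threshold(X, kappa):
    """Elementwise proximal operator of kappa * ||X||_1."""
    return np.sign(X) * np.maximum(np.abs(X) - kappa, 0.0)

def consensus_lasso_var(Y_blocks, Z_blocks, lam, rho=1.0, tol=1e-3, max_iter=2000):
    """Consensus ADMM for 0.5 * sum_i ||Y_i - B Z_i||_F^2 + lam * ||B||_1."""
    N = len(Z_blocks)
    k, m = Y_blocks[0].shape[0], Z_blocks[0].shape[0]
    H = np.zeros((k, m))
    U = [np.zeros((k, m)) for _ in range(N)]
    # Per-block quantities, computed once; each block could live on its own worker.
    YZt = [Y_i @ Z_i.T for Y_i, Z_i in zip(Y_blocks, Z_blocks)]
    G = [Z_i @ Z_i.T + rho * np.eye(m) for Z_i in Z_blocks]
    for _ in range(max_iter):
        # Local B_i-updates (independent across blocks, hence parallelisable)
        B = [np.linalg.solve(G[i], (YZt[i] + rho * (H - U[i])).T).T for i in range(N)]
        # Global H-update on the block averages
        B_bar = sum(B) / N
        U_bar = sum(U) / N
        H_old = H
        H = soft_threshold(B_bar + U_bar, lam / (N * rho))
        # Local dual updates
        U = [U[i] + B[i] - H for i in range(N)]
        primal_residual = np.sqrt(sum(np.linalg.norm(B[i] - H) ** 2 for i in range(N)))
        dual_residual = rho * np.sqrt(N) * np.linalg.norm(H - H_old)
        if primal_residual < tol and dual_residual < tol:
            break
    return H
```

Only `H` and the averages need to be exchanged between workers at each iteration, which is what makes this form attractive when the number of examples is large.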

(10)

Case Study description

Apply the ADMM algorithm to several LASSO-VAR(2) variants in order to produce wind power forecasts from 1 to 6 hours ahead

Dataset:

68 wind farms (same control area)

Training period: 9 months

Test period: 3 months

Time resolution: 1 hour

LASSO and ADMM parameters estimated by 5-fold cross-validation

Calculate the improvement in terms of Root Mean Squared Error (RMSE) compared to an Autoregression model - AR(2)
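The improvement metric can be sketched as below, assuming the usual definition of relative improvement, 100 × (RMSE_AR − RMSE_model) / RMSE_AR, so that positive values mean the model beats the AR(2) reference. The slides do not state the formula explicitly, and the function names are illustrative.

```python
import numpy as np

def rmse(actual, predicted):
    """Root mean squared error between two equally sized arrays."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.sqrt(np.mean((actual - predicted) ** 2)))

def improvement_over_reference(actual, model_forecast, reference_forecast):
    """Relative RMSE improvement (%) of a model over a reference forecast.

    Positive values mean the model has a lower RMSE than the reference.
    """
    rmse_reference = rmse(actual, reference_forecast)
    rmse_model = rmse(actual, model_forecast)
    return 100.0 * (rmse_reference - rmse_model) / rmse_reference
```

In the case study this quantity would be computed per wind farm and per lead time, with the AR(2) forecast as the reference.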

(11)

RMSE Improvement over AR results

[Figure: Wind farm with best improvement. x-axis: Time Horizon (h), 1 to 6; y-axis: Improvement over AR (%), ticks from 7 to 13; series: Row L-V, Matricial L-V, Lag L-V, Group L-V, Sparse L-V, No Sparsity]

(12)

RMSE Improvement over AR result

[Figure: Wind farm with intermediate improvement. x-axis: Time Horizon (h), 1 to 6; y-axis: Improvement over AR (%), ticks from 4 to 9; series: Row L-V, Matricial L-V, Lag L-V, Group L-V, Sparse L-V, No Sparsity]

(13)

RMSE Improvement over AR result

[Figure: Wind farm with worst improvement. x-axis: Time Horizon (h), 1 to 6; y-axis: Improvement over AR (%), ticks from −8 to 2; series: Row L-V, Matricial L-V, Lag L-V, Group L-V, Sparse L-V, No Sparsity]

Number of wind farms with negative improvement (average over the time horizon): 3

Number of wind farms with negative improvement in at least one lead time: 13

Group LASSO does not have negative improvement in the first two lead times

(14)

RMSE Improvement over AR result

[Figure: Global. x-axis: Time Horizon (h), 1 to 6; y-axis: Improvement over AR (%), ticks from 2 to 7; series: Row L-V, Matricial L-V, Lag L-V, Group L-V, Sparse L-V, No Sparsity]

(15)

Running Time

Lasso Extensions    Not distributed    Distributed over Examples
Row Lasso           5.3                1.6
Matricial Lasso     1.6                0.5
Lag Lasso           1.1                0.4
Group Lasso         7.8                1.1
Sparse Lasso        11                 5.5

Table: Running time (in seconds), with the data divided across the cores of an 8-core i7 processor

The same tolerance (1e-3) was used for the ADMM

The error results for each LASSO extension are very similar
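As a small worked calculation, the speed-up obtained by distributing over examples can be derived directly from the running times reported in the table (speed-up = non-distributed time / distributed time). The dictionary layout is just an illustrative choice.

```python
# Running times in seconds, copied from the table: (not distributed, distributed over examples)
running_time_s = {
    "Row Lasso": (5.3, 1.6),
    "Matricial Lasso": (1.6, 0.5),
    "Lag Lasso": (1.1, 0.4),
    "Group Lasso": (7.8, 1.1),
    "Sparse Lasso": (11.0, 5.5),
}

# Speed-up factor of the distributed run for each Lasso extension
speedup = {
    name: serial / distributed
    for name, (serial, distributed) in running_time_s.items()
}

for name, factor in speedup.items():
    print(f"{name}: {factor:.1f}x")
```

The speed-up differs noticeably between penalties, with the Group Lasso benefiting the most and the Sparse Lasso the least.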

(16)

Final Remarks and Future Work

The adequate choice of a sparse structure can improve the forecast skill of the VAR model

The case-study results indicate that:

Information from selected distributed time series can reduce the forecast error compared to an AR model

The Group LASSO-VAR model achieves the highest global improvement and the Lag LASSO-VAR model provides the lowest improvement (mainly for the first lead times)

Future Work

Explore more complex sparse structures

Extend the statistical model to the probabilistic forecast framework

Apply this framework to other smart grid related problems

(17)

Acknowledgements

This work was carried out in the framework of the SusCity project (“MITP-TB/CS/0026/2013”), financed by national funds through Fundação para a Ciência e a Tecnologia (FCT), Portugal.
