• No results found

Experimental Setup

In document Solar Power Forecasting (Page 47-50)

2.5 Discussion

3.1.2 Case Study

3.1.2.1 Experimental Setup

The goal for this study is to make half-hourly PV power output prediction for the next day, given historical PV power data, weather data and weather forecasts.

More specifically, given: (i) a time series of historical PV power outputs up to the day d: [p1, p2, p3, ..., pd], wherepi = [pi

1, pi2, pi3, ..., pi20] is a vector of 20 half- hourly power outputs for the day i, (ii) a time series of weather vectors for the same days:

[W1, W2, W3, ..., Wd], and (iii) the weather forecast for dayd+1: W Fd+1, our goal is to forecastP Vd+1, the half-hourly power output for the next dayd+1.

A. Data Sources

The data from the three different sources that we used is summarized in Table 3.1. The PV power data is collected from one of the largest roof-top flat-panel PV plants in Australia, located at the University of Queensland, Brisbane. It has over 5000 poly- crystalline silicon solar panels across four buildings and produces up to 1.22MW elec- tricity.

CHAPTER 3. INSTANCE-BASED METHODS 34

Table 3.1: Data sources and feature sets for DWkNN study

Data source Feature

set

Num. features

Description

PV data for the current daydand previous days

PV 20 Half-hourly PV values be- tween 7am and 5pm

Weather data for the cur- rent day d and previous days

W 14

(1-6) Daily: min temperature, max temperature, rainfall, sun hours, max wind gust and av- erage solar irradiance;

(7-14) At 9 am and 3 pm: tem- perature, relative humidity, cloudiness and wind speed. Weather forecast for the

next dayd+1

WF 3 Daily: min temperature, max temperature and rainfall

Jan 2015 Jul 2015 Dec 2015 Jul 2016 Dec 2016

Time

0 500 1000

PV Power (kW)

Figure 3.6: PV power data from Jan 2015 to Dec 2016

PV data for two years was collected: from 1 January 2015 to 31 December 2016 (as shown in Fig. 3.6). As expected, we can see that the PV solar power is highest for the summer months (from December to February in Australia) and lowest for the winter months (from June to August), consistent with the amount of solar irradiation. Further, only the data for a 10-hour interval during the day, from 7am to 5pm, is collected. Outside this time interval the PV power data is zero or close to zero due to the absent or very low solar irradiation, or was not available for all days. The PV power data was obtained from [93].

Weather data for these two years was also collected from the Australian Bureau of Meteorology. For each day, we collected 14 meteorological variables, as shown in Table 3.1. They include six daily measurements (minimum and maximum temperature,

CHAPTER 3. INSTANCE-BASED METHODS 35

rainfall, sunshine hours, maximum wind gust and average solar irradiance) and also eight measurements taken at two times during the day (9am and 3pm) - temperature, relative humidity, cloudiness and wind speed. The weather data is available from [94]. The third information source includes weather forecast data for the future days that are predicted. The weather forecast feature set WF shown in Table 3.1 includes three variables: the daily minimum temperature, the daily maximum temperature and the daily rainfall. All these are core measurements, widely available and included even in the most basic weather forecasts.

Since the weather forecasts were not available retrospectively for 2015 and 2016, we used the actual weather data with added 10% noise.

B. Data Pre-processing

The PV data was measured at 1-min intervals. As we predict at 30-min intervals, the 1- min data was aggregated into 30-min data by taking the average of the interval. It was then normalized to [0, 1]. In total there are 14,620 measurements(= (365 + 366)×20)

for the PV data.

The weather data consists of 14 daily measurements, thus there are(365 + 366)×

14 = 10,234measurements. It was also normalized to the range [0,1].

There was a small number of missing values (0.02% in the PV data and 0.82% in the weather data). To replace the missing values, the following method is applied, firstly to the weather data and then to the PV data. For missing values in the weather profile vector for dayd, daysis found, the nearest neighbor ofd, which does not have missing values. The similarity is calculated using the Euclidean distance and only the available weather features in Wd. We then replace the missing values inWdwith the

corresponding values inWs.

A similar approach is followed for the missing values in the PV data. If daydhas missing values in its PV vector, we firstly find days, the most similar day todin terms of weather by using the weather vectors and then replace the missing values in P Vd

with the corresponding values ofP Vs.

C. Training, Validation and Testing Sets

The solar data and the corresponding weather data are divided into three non-overlapping subsets: training - the first 70% of the 2015 data, validation - the remaining 30% of

CHAPTER 3. INSTANCE-BASED METHODS 36

the 2015 data and testing - the 2016 data. The training dataset is used to build the pre- diction models; the validation dataset is used for parameter selection, and the testing dataset is used to evaluate the performance of the proposed method and other methods used for comparison.

D. Performance Measures

To evaluate the performance, we used two standard performance measures: Mean Ab- solute Error (MAE) and Root Mean Squared Error (RMSE), defined as follows:

M AE = 1 D×n Xn i=1 P i b Pi RM SE = s Pn i=1(Pi−Pbi)2 D×n wherePi and b

Pi are the actual and forecast half-hourly PV power outputs for day

i, D is the number of days in the testing set and n is the number of predicted daily values (n=20).

These performance measures were used in all case studies in the thesis.

E. Distance Measure

To find the nearest neighbors, we evaluated the performance of two distance mea- sures on the validation set: Euclidean and Manhatten. The results showed that the Euclidean distance performed better. Therefore, this case study uses Euclidean dis- tance.

In document Solar Power Forecasting (Page 47-50)

Related documents