
A comparison of forecast models to predict weather parameters

GUIDO GUIZZI¹, CLAUDIO SILVESTRI¹, ELPIDIO ROMANO², ROBERTO REVETRIA³

¹ Dipartimento di Ingegneria Chimica, dei Materiali e della Produzione Industriale
University of Naples Federico II
P.le Tecchio 80, 80125, Naples
ITALY

² Industrial Plants, Logistics and Transportation
International Telematic University UniNettuno
Corso Vittorio Emanuele II, 39, 00186 Roma
ITALY

³ Dipartimento di Ingegneria Meccanica, Energetica, Gestionale e dei Trasporti
University of Genoa
Via dell’Opera Pia 15, 16145, Genoa
ITALY


Abstract: - Weather forecasting is of great importance to anyone involved in the electrical energy market. The aim of this study is to provide concrete results on weather forecasting methods for those who need an accurate estimate of weather parameters in order to sell electrical energy in a day-ahead market context. Approximately one month of meteorological data is analyzed to forecast the temperature, pressure and humidity of a month. ARIMA models and exponential smoothing models are compared to understand which method is better suited to modelling the temperature behaviour in Caserta, Italy.

Key-Words: - ARIMA models, exponential smoothing, time series, R, forecasting, weather parameters

1 Introduction

The accuracy of meteorological forecasts has had a major economic impact in recent years, as energy companies need to predict the amount of energy that can be sold. Although awareness of the need for cleaner energy has grown in recent decades, in Italy over 75% of energy is produced in fossil-fuel-based stations. The amount of energy produced by the turbines of these stations is often a function of meteorological parameters, which may be predicted using statistical methods. Three main parameters determine the quantity of energy produced: temperature, pressure and humidity. While the behaviour of the climate is irregular and chaotic, when attention is restricted to forecasting these three parameters it is possible to obtain good results with linear methods. The crucial point for many energy companies is choosing the method that ensures the best performance in terms of accuracy.

In this study, three methods to analyze and forecast the time series of temperature, pressure and humidity are compared. The first method is the ARIMA model, applied to the seasonally adjusted data. The second method is the Holt-Winters additive seasonal model. The third method is a more general exponential smoothing technique, the ETS model described in Hyndman et al. (2008). The available dataset includes 4 years of observations obtained from an energy station built in Caserta, Italy. This station collects meteorological data continuously, every 15 minutes. This high frequency favours the performance of seasonal exponential smoothing, as shown later in this study.


In the literature, there are many methods used to analyze and forecast time series. A brief description of the models mentioned above is useful to understand the results of the forecasts.

2.1 ARIMA model

The autoregressive integrated moving average (ARIMA) model is a generalization of the autoregressive moving average (ARMA) model. The difference between the two is the integration term, which differences the time series to make it stationary. Stationarity of the differenced series is necessary to work with an ARIMA model. An ARIMA model has three parameters (p,d,q), which represent respectively the autoregressive order, the degree of differencing and the moving average order.

2.1.1 Differencing

Differencing is an effective way of transforming a non-stationary series into a stationary one. It is obtained by subtracting the observation in the previous period from the observation in the current period. If this transformation is applied only once to a series, the data are said to be first differenced.

$$ y'_t = y_t - y_{t-1} \tag{1} $$

This process essentially eliminates the trend if the series is growing at a fairly constant rate. If it is growing at an increasing rate, the same procedure can be applied again to the differenced data, obtaining second differenced data.

$$ y''_t = y'_t - y'_{t-1} \tag{2} $$
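As a small illustration of equations (1) and (2), the following Python sketch differences two toy series. The function name `difference` and the sample values are hypothetical and are not part of the original study.

```python
def difference(series, order=1):
    """Apply y'_t = y_t - y_{t-1} repeatedly, `order` times."""
    result = list(series)
    for _ in range(order):
        # Pair each value with its predecessor and subtract.
        result = [curr - prev for prev, curr in zip(result, result[1:])]
    return result


# A series growing at a constant rate: one difference removes the trend.
linear = [2, 4, 6, 8, 10]
first = difference(linear)

# A series growing at an increasing rate: two differences are needed.
quadratic = [1, 4, 9, 16, 25]
second = difference(quadratic, order=2)

print(first)   # [2, 2, 2, 2]
print(second)  # [2, 2, 2]
```

Each round of differencing shortens the series by one observation, which is why the outputs have four and three values respectively.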

2.1.2 ARMA model

Once the time series has been differenced, it can be described with an ARMA model:

$$ y'_t = \phi_1 y'_{t-1} + \dots + \phi_p y'_{t-p} + \theta_1 e_{t-1} + \dots + \theta_q e_{t-q} + e_t \tag{3} $$

where $y'_t$ is the differenced series. The terms on the right side of the expression include both lagged values of $y'_t$ and lagged errors. Choosing appropriate parameters p, d and q can be difficult. The best way to choose them is to use the Box-Jenkins technique (1971), analyzing the autocorrelation function (ACF) and the partial autocorrelation function (PACF) of the differenced time series.
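The sample ACF used in this kind of analysis can be sketched in a few lines of Python. This is only an illustration using the usual sample estimator; the function name `acf` and the toy data are assumptions, and the PACF is not covered here.

```python
def acf(series, max_lag):
    """Sample autocorrelation for lags 0..max_lag.

    Assumes the series is not constant (non-zero variance).
    """
    n = len(series)
    mean = sum(series) / n
    dev = [x - mean for x in series]
    c0 = sum(d * d for d in dev)  # lag-0 sum of squares
    return [
        sum(dev[t] * dev[t - k] for t in range(k, n)) / c0
        for k in range(max_lag + 1)
    ]


# Toy example: a steadily increasing series is positively autocorrelated,
# an alternating one is negatively autocorrelated at lag 1.
print(acf([1, 2, 3, 4, 5], 1))
print(acf([1, -1, 1, -1], 1))
```

A slowly decaying ACF is a common sign that further differencing is needed, while sharp cut-offs in the ACF and PACF guide the choice of q and p.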

2.2 Exponential Smoothing

Exponential smoothing is a forecasting method used when the individual components of the time series (trend and seasonal factors) may be changing over time. More recent observations are weighted more heavily than remote observations. This unequal weighting of the observations is obtained by using smoothing constants. There are many studies on exponential smoothing, from simple exponential smoothing (SES) to Holt's trend corrected exponential smoothing and, finally, the Holt-Winters method developed by Holt (1957) and Winters (1960).

2.2.1 Holt-Winters model

The additive Holt-Winters method is used for time series that exhibit a linear trend and a fixed seasonal pattern with constant variation. The time series may be described by the model

$$ y_t = (\beta_0 + \beta_1 t) + SN_t + \varepsilon_t \tag{4} $$

where $\beta_0 + \beta_1 t$ is the linear trend (the mean level at time t), $\beta_1$ is the growth rate, $SN_t$ is the seasonal factor and $\varepsilon_t$ is the error term at time t.

To implement the additive Holt-Winters method, let $\ell_{T-1}$ denote the estimate of the level (the mean) at time $T-1$ and $b_{T-1}$ the estimate of the growth rate at the same time. If $sn_{T-L}$ is the most recent estimate of the seasonal factor corresponding to time period T, where L is the number of periods in a seasonal cycle, the level of the time series at time T can be described iteratively:

$$ \ell_T = \alpha (y_T - sn_{T-L}) + (1 - \alpha)(\ell_{T-1} + b_{T-1}) \tag{5} $$

where α is a smoothing constant between 0 and 1, $(y_T - sn_{T-L})$ is the deseasonalized observation in time period T, and $(\ell_{T-1} + b_{T-1})$ is the estimate of the level of the time series in time period T based on the previous period. The estimate of the growth rate in time period T uses the smoothing constant γ and is

$$ b_T = \gamma (\ell_T - \ell_{T-1}) + (1 - \gamma) b_{T-1} \tag{6} $$

The new estimate for the seasonal factor $SN_T$ in time period T uses the smoothing constant δ and is

$$ sn_T = \delta (y_T - \ell_T) + (1 - \delta) sn_{T-L} \tag{7} $$

where $(y_T - \ell_T)$ is an estimate of the newly observed seasonal variation.
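The update step of the additive Holt-Winters method can be sketched in Python as follows. The level and seasonal updates follow equations (5) and (7); the growth-rate update is written in its standard textbook form, which is an assumption here. The function name and the numeric values are hypothetical.

```python
def holt_winters_step(y, level, growth, seasonal, alpha, gamma, delta):
    """One additive Holt-Winters update for a new observation y.

    level, growth : estimates at time T-1
    seasonal      : most recent seasonal estimate for this period (sn_{T-L})
    """
    # Level: blend the deseasonalized observation with the projected level.
    new_level = alpha * (y - seasonal) + (1 - alpha) * (level + growth)
    # Growth rate (standard form, assumed): blend the observed level change
    # with the previous growth estimate.
    new_growth = gamma * (new_level - level) + (1 - gamma) * growth
    # Seasonal factor: blend the new seasonal deviation with the old estimate.
    new_seasonal = delta * (y - new_level) + (1 - delta) * seasonal
    return new_level, new_growth, new_seasonal


# If the observation matches the model exactly (10 + 1 + 2 = 13), the
# estimates of growth and seasonal factor are left unchanged.
print(holt_winters_step(13.0, 10.0, 1.0, 2.0, alpha=0.3, gamma=0.2, delta=0.1))
```

When the new observation agrees with the current estimates, the step only advances the level by the growth rate, whatever the smoothing constants are; the constants matter only when the observation deviates from what the model expects.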

2.2.2 ETS model

Exponential smoothing models differ depending on the presence of the smoothing parameters α, γ and δ, already mentioned, but mainly on the type of the components of a time series: error, trend and seasonality (ETS). The seasonality, if present in the time series, may be of two types: additive (A) or multiplicative (M). The trend, if present, may be additive (A), additive damped (Ad), multiplicative (M) or multiplicative damped (Md). The combinations of the types of these two components, seasonality and trend, are illustrated in Table 1, where N denotes the absence of the component.

Table 1 Exponential smoothing methods

Trend component    Seasonal component
                   N          A          M
N                  (N,N)      (N,A)      (N,M)
A                  (A,N)      (A,A)      (A,M)
Ad                 (Ad,N)     (Ad,A)     (Ad,M)
M                  (M,N)      (M,A)      (M,M)
Md                 (Md,N)     (Md,A)     (Md,M)

The studies of Hyndman et al. (2002), Taylor (2003) and Hyndman et al. (2008) on exponential smoothing developed these fifteen methods. Allowing for an additive or multiplicative error component (A, M) doubles the number of possible models, from fifteen to thirty. It is then possible to describe every type of exponential smoothing model by using three letters. For example, the additive error model ETS(A,Ad,N) represents the damped Holt's method. The ETS method uses maximum likelihood to estimate the starting parameters and then iteratively estimates all the parameters needed to forecast future values of the time series. In addition, the ETS framework includes both linear and non-linear models, so time series that exhibit non-linear characteristics can be fitted well by these models. The ETS framework also provides an automatic way of selecting the best exponential smoothing model, including Holt's model, the Holt-Winters method (additive and multiplicative) and the damped trend method of Gardner and McKenzie.
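The counts of fifteen methods and thirty models follow directly from the component types listed above. A short Python sketch of the enumeration (the variable names are illustrative only):

```python
from itertools import product

TREND = ["N", "A", "Ad", "M", "Md"]   # none, additive, damped variants, multiplicative
SEASONAL = ["N", "A", "M"]            # none, additive, multiplicative
ERROR = ["A", "M"]                    # additive or multiplicative error

# Fifteen methods: every (trend, seasonal) pair of Table 1.
methods = list(product(TREND, SEASONAL))

# Thirty models: each method with either error type, named ETS(error,trend,seasonal).
models = [f"ETS({e},{t},{s})" for e, t, s in product(ERROR, TREND, SEASONAL)]

print(len(methods), len(models))   # 15 30
print("ETS(A,Ad,N)" in models)     # True: the damped Holt's method
```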

3 Analysis and results

The software used to verify the performance of the models described above is RStudio. The available dataset includes temperature, pressure and humidity values from 19 January 2011 to 12 February 2015, for a total of N = 142,656 observations. The aim of this study is to determine the amount of energy that may be sold in the Italian day-ahead market (MGP in Italian), so attention is focused on forecasting the parameters (the temperature in particular) for the next day. The temperature exhibits a daily seasonal behaviour, as shown in Fig. 1 for a randomly chosen monthly dataset. Since this seasonal time series is not stationary, it must be decomposed and deseasonalized before the models are applied. We also note that a correlation between the three parameters exists, although it is weak (Table 2).

Table 2 Correlation of parameters

Correlation    Temperature    Pressure    Humidity
Temperature    1.000          -0.060      -0.398
Pressure       -0.060         1.000       -0.179
Humidity       -0.398         -0.179      1.000
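The quoted number of observations is consistent with one reading every 15 minutes over the stated period. The following Python check assumes that both end dates are complete days and that no readings are missing, which the text does not state explicitly.

```python
from datetime import date

OBS_PER_DAY = 24 * 60 // 15          # one observation every 15 minutes -> 96

first_day = date(2011, 1, 19)
last_day = date(2015, 2, 12)

# Count both end dates as full days (assumption).
n_days = (last_day - first_day).days + 1
n_obs = n_days * OBS_PER_DAY

print(n_days, n_obs)                 # 1486 142656
```

The value 96 is also the seasonal frequency of the daily cycle in the time series.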

3.1 Dataset size

Because of the large amount of available data and the cyclical daily pattern of the time series, it is important to decide the dataset size on which the models reach their best performance. The behaviour of the time series during a single day is more comparable with the pattern of the most recent days than with more remote data. If the real observations of a day k are available, it is possible to find the dataset size (using data up to day k-1) with which a model reaches the best forecast values for day k. The same dataset size may then be used to forecast the values of day k+1 (for which real observations are not yet available), and so on from day to day.
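The selection procedure described above can be sketched in Python as follows. The forecaster used here, which averages each time-of-day slot over the training window, is only a hypothetical stand-in for the ARIMA, Holt-Winters and ETS models of the study; all names and the toy data are assumptions.

```python
def mae(actual, predicted):
    """Mean absolute error between two equally long sequences."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def seasonal_mean_forecast(train, per_day):
    """Stand-in forecaster: average each time-of-day slot over the window."""
    return [sum(train[i::per_day]) / len(train[i::per_day]) for i in range(per_day)]


def choose_window(series, per_day, candidate_days, forecaster):
    """Pick the training window (in days) that best forecasts the last day.

    `series` holds whole days only; its last day plays the role of day k.
    """
    history, target = series[:-per_day], series[-per_day:]
    scores = {}
    for days in candidate_days:
        size = days * per_day
        if size > len(history):
            continue  # not enough data for this window
        scores[days] = mae(target, forecaster(history[-size:], per_day))
    return min(scores, key=scores.get), scores


# Toy data with 2 observations per day: three old days, two recent days, day k.
series = [0, 0, 0, 0, 0, 0, 8, 18, 12, 22, 10, 20]
best, scores = choose_window(series, 2, [1, 2, 4], seasonal_mean_forecast)
print(best, scores)   # 2 {1: 2.0, 2: 0.0, 4: 7.5}
```

In the toy data the four-day window is penalized because it includes older days that no longer resemble day k, which mirrors the argument made in the text. The window found for day k would then be reused to forecast day k+1.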

3.2 Decomposition of time series

The decomposition of a time series is used to obtain a deseasonalized time series by subtracting the seasonal component from the original time series. R provides a function, stl( ), that decomposes a time series, as shown in Fig. 2.
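The idea of deseasonalizing by subtracting a seasonal component can be illustrated with a much simpler estimator than STL. The Python sketch below uses per-slot averages instead of loess smoothing and ignores any trend, so it is a simplified stand-in and not the algorithm behind stl( ); the function names are hypothetical.

```python
def seasonal_component(series, period):
    """Estimate one seasonal value per slot as slot mean minus overall mean."""
    overall = sum(series) / len(series)
    return [
        sum(series[i::period]) / len(series[i::period]) - overall
        for i in range(period)
    ]


def deseasonalize(series, period):
    """Subtract the estimated seasonal component from every observation."""
    seasonal = seasonal_component(series, period)
    return [y - seasonal[i % period] for i, y in enumerate(series)]


# Toy series with period 2 oscillating around a constant level of 2.
series = [1, 3, 1, 3]
print(seasonal_component(series, 2))  # [-1.0, 1.0]
print(deseasonalize(series, 2))       # [2.0, 2.0, 2.0, 2.0]
```

For the data of this study the period would be 96, the number of observations per day.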


3.2 Forecast models in R

The vectors of temperature, pressure and humidity must first be transformed into time series with the function ts( ), specifying the frequency of the time series (96 in this case). The forecast package is required in order to use the functions of the forecast models. Two main functions are used to fit the data: HoltWinters( ) and stlm( ). The first function applies the Holt-Winters method; either the additive or the multiplicative Holt-Winters method can be selected by setting an input parameter of the function.

The second function, instead, is used for both the ARIMA and the ETS models, by setting the appropriate input parameters. The function stlm( ) works with stl objects: it takes a time series, applies the STL decomposition and models the seasonally adjusted data using the specified model (in this case ARIMA or ETS). For ARIMA models it is possible to improve the goodness of fit by supplying regressors of the time series as input (pressure and humidity), whereas for ETS models it is possible to choose which model to use by setting the three letters ("Z" denotes an automatic selection of the component type by the function).

The forecast( ) function is used to forecast the parameters, given the model used to fit the dataset and the number of values to forecast (in this case 96); it then reseasonalizes the results by adding back the last period of the estimated seasonal component. To complete the analysis we used a last R function, stlf( ). This function combines the functions stlm( ) and forecast( ), but the model is not chosen by the user: the function automatically selects the model that is considered the best choice.
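The decompose, model and reseasonalize sequence described above can be sketched in Python under strong simplifications. The sketch is not the R implementation: the seasonal component is estimated with per-slot averages rather than STL, and the seasonally adjusted series is forecast with simple exponential smoothing, which corresponds to the ETS(A,N,N) case only. All function names and the value of `alpha` are assumptions.

```python
def ses_level(values, alpha):
    """Simple exponential smoothing; returns the final smoothed level."""
    level = values[0]
    for y in values[1:]:
        level = alpha * y + (1 - alpha) * level
    return level


def decompose_and_forecast(series, period, horizon, alpha=0.5):
    """Deseasonalize, forecast the adjusted series, add the seasonality back."""
    n = len(series)
    overall = sum(series) / n
    # Simplified seasonal estimate: slot mean minus overall mean (not STL).
    seasonal = [
        sum(series[i::period]) / len(series[i::period]) - overall
        for i in range(period)
    ]
    adjusted = [y - seasonal[i % period] for i, y in enumerate(series)]
    # Flat forecast of the seasonally adjusted series.
    level = ses_level(adjusted, alpha)
    # Reseasonalize: add back the seasonal value of each future slot.
    return [level + seasonal[(n + h) % period] for h in range(horizon)]


print(decompose_and_forecast([1, 3, 1, 3], period=2, horizon=2))  # [1.0, 3.0]
```

With the data of this study, `period` and `horizon` would both be 96, so that one full day is forecast at a time.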


3.3 Results of check

Determining the performance of the models is the most significant stage in forecasting. Although many performance measures exist for evaluating forecast models, the mean absolute error (MAE) and the mean absolute percentage error (MAPE) are the most common and informative ones in this case. They are computed as follows:

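$$ MAE = \frac{1}{N} \sum_{i=1}^{N} \left| y_i - \hat{y}_i \right| \tag{8} $$

$$ MAPE = \frac{1}{N} \sum_{i=1}^{N} \left| \frac{y_i - \hat{y}_i}{y_i} \right| \tag{9} $$

where $y_i$ is the observed value, $\hat{y}_i$ is the forecast value and N is the number of forecast values.

To compare the models we chose a random month (July 2014) to forecast day by day, with five possible dataset sizes (7, 15, 30, 45 and 60 previous days). The plot illustrated in Fig. 3 refers to a random day of the selected month, 2 July 2014.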
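The two error measures (8) and (9) can be computed with a few lines of Python. The function names and sample values are illustrative only. The MAPE below returns a fraction, as in equation (9); the values reported in the tables appear to be expressed as percentages, which would correspond to multiplying by 100.

```python
def mae(actual, forecast):
    """Mean absolute error, equation (8)."""
    return sum(abs(y - f) for y, f in zip(actual, forecast)) / len(actual)


def mape(actual, forecast):
    """Mean absolute percentage error, equation (9), as a fraction.

    Assumes no observed value is zero.
    """
    return sum(abs((y - f) / y) for y, f in zip(actual, forecast)) / len(actual)


observed = [20.0, 25.0]      # hypothetical temperatures
predicted = [18.0, 26.0]     # hypothetical forecasts

print(mae(observed, predicted))    # (2 + 1) / 2 = 1.5
print(mape(observed, predicted))   # (0.10 + 0.04) / 2, about 0.07
```

Because the MAPE divides by the observed value, it is sensitive to the scale and offset of the variable.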

Table 4 Forecasting error parameters of temperature in July 2014

                          MAE                                        MAPE
Performance parameters    ARIMA    Holt-Winters    STLF    ETS       ARIMA    Holt-Winters    STLF     ETS
Min value                 0.65     0.49            0.40    0.29      2.68     2.00            1.74     1.24
Mean value                1.68     2.15            1.42    1.41      7.27     9.23            6.13     6.10
Max value                 4.00     9.20            3.28    3.07      19.06    29.13           14.12    14.70

As shown in the plot, the ETS and stlf( ) functions achieve good forecast results, with a very small MAE. The models selected by the four functions are reported in Table 3.

Table 3 Temperature forecast model on 2 July 2014

Model used      Model
ARIMA           STL + ARIMA (3,1,1)
Holt-Winters    Holt-Winters model
STLF            STL + ETS (A,N,N)
ETS             STL + ETS (A,Ad,N)

A summary of the monthly forecast for July for all the models used is given in Table 4.

3.3.1 Pressure and humidity forecast

The same comparison, with the same methods, was carried out to forecast pressure and humidity values in the same month (July 2014). The time series plots are illustrated in Fig. 4 (humidity) and in Fig. 5 (pressure).


Fig. 4 Humidity values of a random monthly dataset


Table 5 Forecasting error parameters of humidity in July 2014

                          MAE                                        MAPE
Performance parameters    ARIMA    Holt-Winters    STLF     ETS      ARIMA    Holt-Winters    STLF     ETS
Min value                 2.21     4.44            2.95     3.13     3.34     5.90            4.97     4.75
Mean value                8.22     10.00           7.88     7.72     12.52    15.24           12.40    11.96
Max value                 23.13    29.77           17.86    19.27    27.36    35.23           29.04    27.23

Table 6 Forecasting error parameters of pressure in July 2014

                          MAE                                        MAPE
Performance parameters    ARIMA    Holt-Winters    STLF    ETS       ARIMA    Holt-Winters    STLF    ETS
Min value                 0.17     0.25            0.25    0.15      0.02     0.03            0.02    0.01
Mean value                1.04     1.42            1.01    0.95      0.10     0.14            0.10    0.09
Max value                 2.72     4.35            2.93    2.78      0.27     0.43            0.29    0.27

Although the pressure behaviour may seem less predictable than the humidity behaviour, it is important to underscore that the two time series have different scales. In fact, the daily fluctuations and the resulting variance are higher in the humidity time series. This is why the humidity performance parameters are worse than the pressure ones, as can be seen in Table 5 and Table 6.

4 Conclusions

In this study, meteorological data from Caserta were analyzed using different models belonging to two forecasting classes: ARIMA and exponential smoothing. The additive Holt-Winters model can be represented by an (A,A,A) ETS model, so it is to be expected that Holt-Winters performs worse than the more general ETS function. The real comparison is between the ETS model and the ARIMA model. The linear exponential smoothing models have an ARIMA counterpart, but the non-linear exponential smoothing models do not. The reason why exponential smoothing models (using the ETS function) perform better is that a non-stationary model is required to fit the values and the behaviour of the time series has a non-linear tendency. ARIMA models work better if the model needed has to be stationary. Moreover, although many ARIMA models have no exponential smoothing counterpart, they cannot be non-linear models. Further evidence of the better performance of exponential smoothing models in forecasting temperature is given by the choices of the STLF method: on every day of July 2014 the function stlf( ) selected an ETS model as the best temperature forecast method. One further remark is needed: the ETS function does not allow the inclusion of regressors, unlike the ARIMA function. In this case the correlation between meteorological parameters is low, especially between temperature and pressure. However, there may be parameters with a higher correlation, in which case the ARIMA function may reach better performance.

References

[1] Gardner Jr, E. S. (1985). Exponential smoothing: The state of the art. Journal of Forecasting 4(1), 1–28.

[2] Gardner Jr, E. S. (2006). Exponential smoothing: The state of the art - Part II. International Journal of Forecasting 22(4), 637–666.

[3] Hyndman, R. J., A. B. Koehler, J. K. Ord and R. D. Snyder (2008). Forecasting with exponential smoothing: the state space approach. Berlin: Springer-Verlag.

[4] Box, G. E. P., G. M. Jenkins and G. C. Reinsel (2008). Time series analysis: forecasting and control. 4th ed. Hoboken, NJ: John Wiley & Sons.

[5] Brockwell, P. J. and R. A. Davis (2002). Introduction to time series and forecasting. 2nd ed. New York: Springer.

[6] Chatfield, C. (2000). Time-series forecasting. Boca Raton: Chapman & Hall/CRC.

[7] Pena, D., G. C. Tiao and R. S. Tsay, eds. (2001). A course in time series analysis. New York: John Wiley & Sons.

[8] Shumway, R. H. and D. S. Stoffer (2011). Time series analysis and its applications: with R examples. 3rd ed. New York: Springer.

[9] Cleveland, R. B., W. S. Cleveland, J. E. McRae and I. J. Terpenning (1990). STL: A seasonal-trend decomposition procedure based on loess. Journal of Official Statistics 6(1), 3–73.

[10] Gomez, V. and A. Maravall (2001). “Seasonal adjustment and signal extraction in economic time series”. In: A course in time series analysis. Ed. by D. Pena, G. C. Tiao and R. S. Tsay. New York: John Wiley & Sons. Chap. 8, pp. 202–246.

[11] Ladiray, D. and B. Quenneville (2001). Seasonal adjustment with the X-11 method. Lecture notes in statistics. Springer-Verlag.

[12] Miller, D. M. and D. Williams (2003). Shrinkage estimators of time series seasonal factors and their effect on forecasting accuracy. International Journal of Forecasting 19(4), 669–684.

[13] Theodosiou, M. (2011). Forecasting monthly and quarterly time series using STL decomposition. International Journal of Forecasting 27(4), 1178–1195.
