
SINGULAR SPECTRUM ANALYSIS HYBRID FORECASTING METHODS WITH APPLICATION TO AIR TRANSPORT DEMAND


Academic year: 2021


SINGULAR SPECTRUM ANALYSIS – HYBRID FORECASTING METHODS WITH APPLICATION TO AIR TRANSPORT DEMAND

K. Adjenughwure, Delft University of Technology, Transport Institute, Ph.D. candidate
V. Balopoulos, Democritus Thrace University, Dep. of Civil Engineering, Associate Professor
G. Botzoris, Democritus Thrace University, Dep. of Civil Engineering, Assistant Professor

(2)


TRANSPORTATION DEMAND FORECASTING

Transportation demand forecasting is the process of estimating the number of people or vehicles that will use a specific transport facility over a particular time interval.

Accurate demand forecasting is particularly important in air transport, where it influences decisions such as ticket pricing, opening new routes or closing existing ones, purchasing aircraft, and building new terminals or abandoning old ones.

The numerous methods that have been developed for, or employed in, air transport demand forecasting may be classified as qualitative (such as market surveys, the Delphi method, and expert meetings) or quantitative (such as econometric and time-series methods).

(3)


Statistical time-series prediction methods, such as the Autoregressive Integrated Moving Average (ARIMA) model, have long been preferred for modeling airport passenger demand. Recently, artificial intelligence methods, such as Artificial Neural Networks (ANN), Fuzzy Logic, and the Adaptive Neuro-Fuzzy Inference System (ANFIS), have gained recognition and have been applied to the same task.

All time-series prediction methods are reasonably accurate, but they are inherently sensitive to noise. To increase the accuracy of time-series prediction, various methods have been developed to remove noise from raw data and to decompose a time series into its trend, its oscillatory components, and its noise components. One of these methods is Singular Spectrum Analysis (SSA), which decomposes a time series into such components.

(4)


Singular Spectrum Analysis (SSA) has been combined with classical time-series prediction methods to help improve their results. Most related research uses SSA only as a noise-removal tool. A very recent hybrid approach, however, is to first use SSA to decompose a time series into several component time series (trend, seasonal and noise), then predict each non-noise component separately with a chosen time-series prediction model, and finally employ SSA to aggregate the predicted components into predictions for the original time series.

SINGULAR SPECTRUM ANALYSIS

$$Y_t = T_t + C_t + S_t + R_t$$

where $T_t$ is the trend, $C_t$ the cyclical variation, $S_t$ the seasonal variation, and $R_t$ the random variation.

(5)


TIME-SERIES OF A VARIABLE – SINGLE DECOMPOSITION

[Figure: Heathrow airport, monthly passenger demand (thousands), Jan-05 to Sep-13, shown as the sum of a TREND panel and an OSCILLATION panel.]

(6)


The contribution of this paper is to show that SSA decomposition of a time series and the subsequent prediction of its components can improve forecasting results. We demonstrate this using the statistical data of two international airports (Heathrow, London and El. Venizelos, Athens) with very different traffic volumes and characteristics. ANFIS was chosen as the prediction method to allow easy comparison with the work of Xiao et al. (2014).

SCOPE OF THE PAPER

[Figure: Passengers (in thousands), LHR airport, 2005 to 2013, with the series divided into a training period and a testing period.]

(7)


ANFIS = ANN + FIS

The acronym ANFIS stands for adaptive neuro-fuzzy inference system. Using a given input/output data set, ANFIS constructs a Fuzzy Inference System (FIS) whose membership function parameters are tuned (adjusted) using either a backpropagation algorithm (as in an Artificial Neural Network) alone or in combination with a least-squares type of method. This adjustment allows the fuzzy system to learn from the data it is modeling.

ADAPTIVE NEURO-FUZZY INFERENCE SYSTEM (ANFIS)

[Figure: ANFIS architecture with inputs x and y (Layer 0), fuzzy sets A1, A2, B1, B2, rule weights w1 and w2, weighted rule outputs w1 f1 and w2 f2, and overall output f.]

Layer 1: Fuzzification Layer
Layer 2: Rule Layer
Layer 3: Normalization Layer
Layer 4: Defuzzification Layer
Layer 5: Summation Layer
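
The five layers listed above can be traced through a single forward pass. The sketch below is only an illustration under stated assumptions: it uses two rules (A1 with B1, A2 with B2), Gaussian membership functions, and first-order Sugeno consequents f_i = p_i x + q_i y + r_i. None of these choices, nor the names `gaussian_mf` and `anfis_forward`, come from the slides.

```python
from math import exp

def gaussian_mf(value, center, sigma):
    """Gaussian membership value (the membership shape is an assumption)."""
    return exp(-0.5 * ((value - center) / sigma) ** 2)

def anfis_forward(x, y, premise, consequent):
    """Forward pass of a hypothetical two-rule, first-order Sugeno ANFIS.

    premise:    dict mapping "A1", "A2", "B1", "B2" to (center, sigma)
    consequent: list of two (p, q, r) tuples, one per rule
    """
    # Layer 1 (fuzzification): membership of x in A1, A2 and of y in B1, B2
    mu_a = [gaussian_mf(x, *premise["A1"]), gaussian_mf(x, *premise["A2"])]
    mu_b = [gaussian_mf(y, *premise["B1"]), gaussian_mf(y, *premise["B2"])]
    # Layer 2 (rule): firing strength of each rule as a product of memberships
    w = [mu_a[0] * mu_b[0], mu_a[1] * mu_b[1]]
    # Layer 3 (normalization): firing strengths scaled to sum to one
    total = sum(w)
    w_norm = [wi / total for wi in w]
    # Layer 4 (defuzzification): normalized strength times the rule output f_i
    weighted = [wn * (p * x + q * y + r)
                for wn, (p, q, r) in zip(w_norm, consequent)]
    # Layer 5 (summation): overall output f
    return sum(weighted), w_norm

premise = {"A1": (-1.0, 1.0), "A2": (1.0, 1.0),
           "B1": (-1.0, 1.0), "B2": (1.0, 1.0)}
consequent = [(1.0, 1.0, 0.0), (0.0, 0.0, 2.0)]
output, strengths = anfis_forward(0.0, 0.0, premise, consequent)
```

In training, the premise parameters (centers and widths) would be the quantities adjusted by backpropagation, and the consequent parameters (p, q, r) the ones a least-squares step could estimate.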

(8)


To improve the generalization capability of an ANFIS model, a method known as cross-validation is used. In this method, all the available data are split into three sets: a training set, a validation (or checking) set, and a testing set. The training set is used to train the model, while the validation set is used to prevent overfitting by monitoring the error of the model output on data it has not been fitted to. Training is stopped when the error on the validation set reaches its minimum. Note that the validation data are not part of the training; they are only used to evaluate the model after it has been trained, so they can be considered an independent check on how well the trained model is doing. After training and validation, the test set is used as a second independent test of the generalization ability of the model. The final model chosen is the one that gives the minimum error on the test set.
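
A minimal sketch of the three-way split and the stopping rule described above, assuming the split is made in time order. The split fractions, the function names `chronological_split` and `best_epoch`, and the example error values are all hypothetical; the slides do not state them.

```python
def chronological_split(series, train_frac=0.7, val_frac=0.15):
    """Split a series, in time order, into training, validation and testing parts.

    The default fractions are placeholders, not values from the slides.
    """
    n = len(series)
    n_train = int(n * train_frac)
    n_val = int(n * val_frac)
    train = series[:n_train]
    val = series[n_train:n_train + n_val]
    test = series[n_train + n_val:]
    return train, val, test

def best_epoch(validation_errors):
    """Index of the training epoch at which the validation error is smallest."""
    return min(range(len(validation_errors)), key=validation_errors.__getitem__)

observations = list(range(20))
train, val, test = chronological_split(observations, train_frac=0.5, val_frac=0.25)
# Made-up validation errors recorded after each of five training epochs
stop_at = best_epoch([0.9, 0.5, 0.3, 0.35, 0.4])
```

Keeping the split chronological means the validation and testing sets always lie after the training data, which matches how a forecast is used in practice.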

(9)


SSA consists of two stages. The first stage is the decomposition of the series, and the second stage is the reconstruction of the original series from the decomposed components.

The three parameters to be selected for the SSA algorithm are the window length L, the number r of elementary matrices used for the reconstruction, and the number of groups m. The most important parameter is the window length L. The other two can be omitted, depending on how SSA is used: pure decomposition requires only the window length, and for noise removal the grouping stage can be omitted.

Because the window length is the only parameter needed for the decomposition of the time series, its choice matters most. There is currently no algorithm for selecting the window length, but many researchers suggest choosing L < N/2 as a general rule, where N is the number of observations in the time series.
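
The decomposition stage can be sketched with NumPy as follows. This follows the usual textbook steps of basic SSA (embedding into a trajectory matrix, singular value decomposition, diagonal averaging); it is an illustration, not the authors' implementation, and the function name `ssa_decompose` and the synthetic series are assumptions.

```python
import numpy as np

def ssa_decompose(series, window):
    """Split a 1-D series into elementary components that sum back to it."""
    x = np.asarray(series, dtype=float)
    n = x.size
    if not 2 <= window <= n - 1:
        raise ValueError("window length must satisfy 2 <= L <= N - 1")
    k = n - window + 1
    # Embedding: L x K trajectory matrix whose column j is x[j : j + L]
    trajectory = np.column_stack([x[j:j + window] for j in range(k)])
    # Singular value decomposition of the trajectory matrix
    u, s, vt = np.linalg.svd(trajectory, full_matrices=False)
    components = []
    for i in range(len(s)):
        # Elementary (rank-one) matrix for the i-th singular triple
        elementary = s[i] * np.outer(u[:, i], vt[i])
        # Diagonal averaging: average each anti-diagonal to get a series of length N
        flipped = elementary[::-1]
        components.append(
            [flipped.diagonal(d).mean() for d in range(-window + 1, k)]
        )
    return np.array(components), s

# Synthetic example: linear trend plus a period-12 oscillation
t = np.arange(48)
x = 0.5 * t + 10 * np.sin(2 * np.pi * t / 12)
components, singular_values = ssa_decompose(x, window=12)
reconstruction = components[:4].sum(axis=0)  # r = 4 leading components
```

Summing only the first r components gives the noise-reduced reconstruction; summing all of them returns the original series.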

(10)


For a time series with a known period T, Golyandina et al. (2001) recommend choosing L such that L/T is an integer. For instance, if the time series is seasonal with period 4, then choosing L as a multiple of 4 (4, 8, 12, 16, ...) will help capture the periodic components with period 4. If the series has multiple periods ($T_1, T_2, T_3, \ldots$), then L should be chosen such that $L/T_i$ is an integer for all i.

To extract only a trend component, L should be large enough that the trend is separable from other components such as the noise, but not too large, because large values of L mix up the trend with other components. In conclusion, L should be chosen such that all the components from the decomposition of the time series are separable, or non-correlated.
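
The two rules of thumb quoted in these slides (L < N/2, and L/T_i an integer for every known period) can be combined to list candidate window lengths. This is only a sketch; the helper name `candidate_window_lengths` is hypothetical, and the final choice among the candidates still depends on how well the components separate.

```python
from functools import reduce
from math import gcd

def candidate_window_lengths(n, periods):
    """Window lengths L with L < n/2 that are integer multiples of every period."""
    # Least common multiple of all known periods
    base = reduce(lambda a, b: a * b // gcd(a, b), periods)
    return [length for length in range(base, n, base) if 2 * length < n]

single_period = candidate_window_lengths(40, [4])
two_periods = candidate_window_lengths(100, [4, 6])
```

For a series of 40 observations with period 4 this yields 4, 8, 12 and 16, in line with the example in the text.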

(11)


• The proposed hybrid models combine SSA with ANFIS. The goal is to improve the performance of the ANFIS model by first decomposing the time series into a sum of simple components (themselves time series) that are easier for ANFIS to predict, and then combining the predictions of the individual components.

THE HYBRID MODELS

[Figure: flow chart of the hybrid models. Original time series → decomposition with Singular Spectrum Analysis (SSA) → time series components PC1, PC2, PC3, PC4, …, PCL-1, PCL → prediction with ANFIS. Grouped components GC1, GC2, …, GCm → prediction with ANFIS → predicted components PGC1, PGC2, …, PGCm → summation with Singular Spectrum Analysis (SSA) → predicted time series.]
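
The decompose, predict and sum pipeline can be sketched end to end. To keep the example short and runnable, a least-squares autoregressive model stands in for ANFIS; this substitution, the grouping passed in `groups`, the model order, and all function names are assumptions made for illustration and are not the authors' setup.

```python
import numpy as np

def ssa_components(x, window):
    """Elementary SSA components (embedding, SVD, diagonal averaging)."""
    x = np.asarray(x, dtype=float)
    k = x.size - window + 1
    trajectory = np.column_stack([x[j:j + window] for j in range(k)])
    u, s, vt = np.linalg.svd(trajectory, full_matrices=False)
    comps = []
    for i in range(len(s)):
        flipped = (s[i] * np.outer(u[:, i], vt[i]))[::-1]
        comps.append([flipped.diagonal(d).mean() for d in range(-window + 1, k)])
    return np.array(comps)

def fit_ar(series, order):
    """Least-squares autoregressive coefficients (a stand-in for ANFIS)."""
    rows = np.array([series[t - order:t] for t in range(order, len(series))])
    coef, *_ = np.linalg.lstsq(rows, series[order:], rcond=None)
    return coef

def predict_next(series, coef):
    """One-step-ahead prediction from the last len(coef) values."""
    return float(np.dot(series[-len(coef):], coef))

def hybrid_forecast(x, window, groups, order=4):
    """Decompose with SSA, predict each grouped component, sum the predictions."""
    comps = ssa_components(x, window)
    total = 0.0
    for indices in groups:
        grouped = comps[list(indices)].sum(axis=0)  # one grouped component
        total += predict_next(grouped, fit_ar(grouped, order))
    return total

# Synthetic monthly-style series: linear trend plus a period-12 oscillation
t = np.arange(120)
x = 100 + 0.5 * t + 10 * np.sin(2 * np.pi * t / 12)
forecast = hybrid_forecast(x, window=24, groups=[(0, 1, 2, 3)])
true_next = 100 + 0.5 * 120 + 10 * np.sin(2 * np.pi * 120 / 12)
```

Passing several groups, for example one for the trend and one per seasonal pair, predicts each grouped component separately; components left out of every group are treated as noise and discarded.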

(12)


THE TIME SERIES CHARACTERISTICS OF THE LONDON HEATHROW (LHR) AND ATHENS (ATH) AIRPORTS

[Figure: two panels, LHR and ATH.]

(13)


(14)


(15)


COMPARISON OF RESULTS BETWEEN PURE ANFIS AND HYBRID SSA – ANFIS MODELS

(16)


The results of the pure ANFIS model re-emphasise the advantages of using the hybrid models. The pure models did not perform well on either airport, with MAPE of 4.38% for Heathrow and 8.69% for Athens, whereas the hybrid SSA–ANFIS models gave far better predictions, with MAPE below 2% for both airports. In terms of RMSE, the predictions made by the hybrid models were on average 5.3 times better than those of the pure ANFIS model. The coefficient of determination R² also improved by an average of 21% across the two airports.

IMPROVEMENT OF THE FORECASTING ABILITY BY USING THE HYBRID SSA - ANFIS MODEL

Statistics                                    Airport    Pure ANFIS model    Hybrid SSA–ANFIS model
Root Mean Square Error (RMSE)                 Heathrow   335.49              89.68
                                              Athens     112.96              16.26
Mean Absolute Error (MAE)                     Heathrow   263.99              72.27
                                              Athens     73.70               14.32
Mean Absolute Percentage Error (MAPE), %      Heathrow   4.38                1.21
                                              Athens     8.69                1.52
Coefficient of determination, R²              Heathrow   0.77                0.98
                                              Athens     0.85                0.98
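
For reference, the four statistics in the table can be computed as in the sketch below. The slides do not give the formulas, so the usual textbook definitions are assumed here; in particular R² is taken as one minus the ratio of residual to total sum of squares, which may differ from the definition the authors used. The function name and the example numbers are made up.

```python
from math import sqrt

def forecast_metrics(actual, predicted):
    """RMSE, MAE, MAPE (in percent) and R^2 under common textbook definitions."""
    n = len(actual)
    errors = [a - p for a, p in zip(actual, predicted)]
    squared_error_sum = sum(e * e for e in errors)
    mean_actual = sum(actual) / n
    total_sum_of_squares = sum((a - mean_actual) ** 2 for a in actual)
    return {
        "rmse": sqrt(squared_error_sum / n),
        "mae": sum(abs(e) for e in errors) / n,
        "mape": 100.0 * sum(abs(e / a) for e, a in zip(errors, actual)) / n,
        "r2": 1.0 - squared_error_sum / total_sum_of_squares,
    }

# Made-up numbers, used only to exercise the function
metrics = forecast_metrics([100.0, 200.0, 300.0, 400.0],
                           [110.0, 190.0, 310.0, 390.0])
```

As a check on the summary figures, the RMSE ratios in the table are 335.49/89.68 for Heathrow and 112.96/16.26 for Athens, and their mean is roughly 5.3.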

(17)


(18)


• Although econometric methods are currently used to forecast transport demand, the success of time-series forecasting models, especially for short-term demand forecasting, has shifted the research focus to developing methods that improve the forecasting ability of these models. Consequently, specialized statistical models like ARIMA and, more recently, artificial intelligence (AI) methods like ANN and ANFIS have been applied successfully to forecast air transport demand time series.

• Despite the success of AI models, their poor performance on noisy and seasonal time-series data, such as the monthly passenger demand of airports, has created a need for better models that can forecast in the presence of noise and also exploit the seasonality of the data. Methods like seasonal ARIMA have been used to forecast seasonal data, while Singular Spectrum Analysis (SSA) has been used as a noise-removal tool when forecasting noisy data.

(19)


• In this paper, hybrid models that combine SSA and ANFIS have been calibrated to forecast the passenger demand of two international airports, London Heathrow and Athens. The forecast results show that decomposing a time series into simpler components by means of SSA, predicting the future values of the components with any established prediction method, and then summing the predictions using SSA can greatly improve forecasting performance.

• The main reasons for the remarkably improved forecasting achieved by the SSA-hybrid prediction methods are simplicity, since the component time series are simpler and hence easier to predict; exploitation of seasonality, since each seasonal component is predicted separately; and noise removal, since noise in the data is reduced by removing components with no seasonality or no significant contribution.
