CHAPTER 2 LITERATURE REVIEW
2.2 Reliability for Traffic System Users
2.2.1 Prediction Models in Traffic Analysis
Traffic parameter measurements are typically time series data. A fundamental property of time series data is its dependency among adjacent measurements. Analyzing the data generating mechanism and forecasting future outcomes can help traffic engineers manage a traffic system more efficiently. A number of macroscopic traffic parameters such as traffic flows, speeds, and occupancies are basic inputs for congestion
management. For example, traffic flow forecasting will support the development of proactive traffic control strategies in advanced traffic management systems (ATMSs) and the evaluation of these real-time traffic control strategies. The microscopic parameter, individual travel time, is of great importance in advanced traveler information systems (ATISs). A report by Cheslow et al. (1992) about the architecture of intelligent
transportation systems indicated that the ability to make continuous predictions of traffic flows and link travel times for several minutes into the future based on real-time traffic data is a major requirement for providing dynamic traffic control and guidance.
This dissertation aims to predict travel time together with its reliability
information. The parameter of interest in this dissertation is travel time. According to the input data, prediction models can be separated as direct and indirect models. Indirect travel time prediction models derive predicted travel times based on other quantities such as flows and/or speeds (Van Lint 2005). Indirect models are applied due to the limitation of available travel time observations. In contrast, direct models predict travel times using previous travel time observations. Generally, prediction methodologies developed for
other parameters (e.g., volume) can also be applied to travel time. Thus, literatures of prediction models for other traffic parameters are also included in this part.
Generally, the models developed for prediction purposes in traffic engineering can be categorized as statistical models and artificial intelligence models. Statistical models include the historical average model, time-series models, dynamic linear models, Kalman filter, and non-parametric models. Artificial models include various neural network models.
Historic Average Model
A historic average method predicts future travel time for each time interval as the average value of all the past travel times at the corresponding time interval. This method is easy to calculate, and the prediction can be refined continuously by updating the historical average with new available observations added. This model, however, depends heavily on the repeatable nature of the traffic flow and cannot reflect sudden changes such as incidents.
Time-series Models
Commonly used time series models include the moving average (MA) model, exponential smoothing filter, auto-regression (AR) model, and auto-regressive integrated moving average (ARIMA) model. For a time-series dataset, adjacent values are usually related to one another. The MA(1) model represents the relationship between adjacent
values in a time-series dataset by a process that a random error (ππ‘β1) at the previous time
interval (t-1), plus a random error (ππ‘) at the current time interval (t), drives the series to
yield the output for the mean centered series (ππ‘) at current time interval (t), where (ππ‘) is
white noise error term (McCleary and Hay 1980). The MA model assumes a stationary time-series dataset. The exponential smoothing filter is a direct method to forecast time series with no trend, assuming that the average level changes slowly over time.
Exponential smoothing resembles the moving average method, but it gives higher weight to the most recent observation (Bowerman and OβConnell 1979).
In an auto-regression model, the predicted value of the time series is regressed, or expressed as a function of previous observation in the time series. For example, a pilot study by the Northern Region Operation (NRO) of the Virginia Department of
Transportation (VDOT) applied an AR(2) model that assumes the predicted travel time at time t takes a linear, weighted form of the observed travel times in the previous two time intervals (Fei et al. 2011). To analyze the data-generating mechanism with both the auto- regression and moving average processes, an ARMA model is proposed to integrate the two models together with the assumption of the stationary characteristic. To relieve the constraints of this assumption, an ARIMA model predicts future data points based on the trends and variations from the previous data points by modeling the autocorrelation in a time series mathematically (Washington 2011). An ARIMA model is identified by (p, d,
q), where p is the auto-regressive component, d is the integrated component, and q is the
moving average component. The process to develop an ARIMA model consists of model identification, parameter estimation, and validation. Based on the ARIMA modelling philosophy, a seasonal autoregressive integrated moving average (SARIMA) model can account for the seasonality in data. In addition, explanatory variables can be incorporated into an ARIMAX or SARIMA(X) model. Cools et al. (2009) applied ARIMAX and SARIMA(X) models to daily traffic count data to study the impact of holidays on the
variability in daily traffic counts. Xia (2006) studied dynamic travel time prediction based on available data from single-loop detectors and incident reports using a SARIMA model with an embedded adaptive predictor. This method involves multiple-step ahead predictions for flow rate and occupancy in real time. The embedded predictor modifies prediction error based on updated traffic data for every five-minute interval.
One limitation of these linear based time series models (AR, MA, ARMA, and ARIMA) is their inability to deal with large variations and nonlinearity. This usually happens in transportation systems under recurrent and non-recurrent congestions when the system fails to generate continuous and stationary data series. In addition, there is a major difference between the typical time series prediction problems and the task of predicting travel times throughout a day. Most time series prediction models assume that the exogenous factors acting upon the dynamical system either remain constant or can be measured in the model, if they are time-varying (Amani et al. 2011). In a travel time prediction problem, however, the main exogenous factors β the demand and supply on a transportation network β varies widely throughout the day. These factors are typically difficult to quantify due to unpredictable reasons such as work zones, weather, and incidents.
Dynamic Linear Model
ARMA models can be usefully regarded in terms of dynamic linear models, but dynamic linear models can provide flexible framework in treating a non-stationary time series (Petris et al. 2007). A more general class of state-space models can be applied to analyze non-Gaussian and non-linear dynamic systems. Fei et al. (2011) applied a Bayesian dynamic linear model to predict short-term freeway travel time. To estimate
dynamic linear models, the Bayesian approach has both methodological and computational advantages. Bayesian forecasting is a learning process that revises sequentially the state of a priori knowledge based on newly available information. In a good forecasting model, there should be a routine way of learning during phases when predictions and decisions appear adequate, and an exceptional way when they seem unsatisfactory (West and Harrison 1997).
Kalman Filter
The Kalman filter is essentially a set of mathematical equations which can estimate the state of a dynamic process recursively while minimizing the mean of the squared error. To predict travel time, the process can be written into a state-space model by equations 2.1 and 2.2. The model was originally designed in a space tracking setting, where the state equation defines the motion equation for the position or state of a
spacecraft, with the location represented as π₯π‘, and π¦π‘, which reflects the information
observed such as velocity and azimuth (Shumway and Stoffer 2011).
State equation: π₯π‘ = Ξ¦π₯π‘β1+ π€π‘ (2.1)
Observation equation: π¦π‘= π΄π‘π₯π‘+ ππ‘ (2.2)
where:
π₯π‘ = the state at time interval t,
π¦π‘ = the observation at time interval t,
Ξ¦ = the state transition parameter,
π€π‘ , ππ‘ = the white noises.
For travel time prediction, the state equation is an order-one, auto-regression
model, determining the rule for the generation of the average travel time π₯π‘ (i.e., state of
the traffic system) at time interval t from the past state π₯π‘β1. The state transition
parameter is represented by Ξ¦. The variable π¦π‘ is the noisy observation(s) under the
current system state, and π΄π‘ is the observation matrix. Both Ξ¦ and π΄π‘ can be defined in
various versions as required in specific modeling problems. For example, they could be
time-dependent or time-independent. The variables π€π‘ and ππ‘ are assumed to be white
noises. The problem is to produce estimators for the underlying, unobserved π₯π‘, given the
data ππ = {π¦1, β¦ , π¦π }. The Kalman filter provides the solution to this problem with the
advantage that it specifies how to update the filter from π₯π‘β1π‘β1 to π₯
π‘π‘ once a new
observation π¦π‘ is obtained, without having to reprocess the entire data set π¦1, β¦ , π¦π‘
(Shumway and Stoffer 2011). The on-going Kalman filter is composed of predictor equations 2.3 and 2.4 and correction equations 2.5-2.7. Before the observation in time interval t becomes available, the average travel time at interval t is predicted based on
previous observations π¦1, β¦ , π¦π‘β1 by equation 2.3. The error covariance ππ‘π‘β1 is estimated
by equation 2.4. The predictor equations yield a priori estimates for the next time
interval. After the observation in time interval t becomes available, the current state at t is calculated by equation 2.6, accounting for both the a prior estimate and the inconsistency between the predicted observation and the actual observation. This logic estimates the current state (i.e., average travel time) adaptively by adjusting the a priori estimate with
is lower than the actual observation π¦π‘, the a posterior estimate of state π₯π‘π‘ will be
adjusted lower than the a priori estimate π₯π‘π‘β1, and vice versa. This feedback control will
enable a prediction to account for the information from recent observations, which is very important in transition periods with unstable traffic conditions. During such a period, recent observations count for more information than the original trend to predict the system state. π₯π‘π‘β1= Ξ¦π₯π‘β1π‘β1 (2.3) ππ‘π‘β1= Ξ¦π π‘β1π‘β1Ξ¦β²+ π (2.4) πΎπ‘= ππ‘π‘β1π΄π‘β²[π΄π‘ππ‘π‘β1π΄β²π‘+ π ]β1 (2.5) π₯π‘π‘ = π₯π‘π‘β1+ πΎπ‘(π¦π‘β π΄π‘π₯π‘π‘β1) (2.6) ππ‘π‘ = [πΌ β πΎ π‘π΄π‘]ππ‘π‘β1 (2.7) where: π₯π‘π‘β , π
π‘π‘β1 = a priori estimate of state and covariance for time interval t,
π₯π‘β1π‘β1 , π
π‘β1π‘β1 = state and covariance estimates from time interval t-1,
Ξ¦ = transition matrix,
π΄π‘ = matrix relating the state at previous time interval to the current time
interval,
πΎπ‘ = the Kalman gain for time interval t,
π, R = noise covariance matrices,
π₯π‘π‘, ππ‘π‘ = state and covariance estimates at time interval t, and
The first application of the Kalman filter in traffic engineering was for traffic volume prediction (Okutani and Stephanedes 1984). The model based on the most recent prediction error and inputs from multiple links demonstrated a high degree of accuracy. Yang et al. (2004) presented a recursive algorithm based on the Kalman filter to
dynamically predict short-term traffic volume. Kuchipudi and Chien (2003) developed a hybrid model based on the Kalman filter algorithm for dynamic travel time prediction. Xie et al. (2007) improved the performance of the Kalman Filter on volume prediction by adding discrete wavelet decomposition analysis to divide the original data into several approximate and detailed data. In this way, the noise in the original data was removed and the prediction accuracy was increased, compared to the direct Kalman filter model measured by mean absolute percentage errors and root mean square errors.
The standard Kalman filter (equations 2.3-2.7) was designed to estimate the state of a discrete-time controlled process that is governed by a linear stochastic difference equation. The Extended Kalman Filter (EKF), linearizing the current mean and
covariance, can be used to address a non-linear process (Welch and Bishop 1995). One constraint of applying the Kalman filter directly in route travel time prediction is the delay of real-time observations. The filter has to wait until the trip is completed to receive input of new observations.
Nearest Neighbor Model
This non-parametric method is based on the hidden relationship between a large historical database and the current system state. After locating the current system state as a past time neighborhood with similar status, the states of the past systems in this
neighborhood are used to estimate the current state. This method aims for a satisfactory, rather than an optimal result. However, Karlsson and Yakowits (1987) stated that the nearest neighbor approach will result in an asymptotically optimal forecaster, meaning that for an input state vector containing m values, the nearest neighbor will
asymptotically be at least as good as any mth order parametric model. The state vector for travel time prediction can include the travel times, traffic volumes, occupancies, and speeds in previous time intervals. The general methodology of a K nearest neighbor (KNN) model consists of five steps: 1) Build a historical database including traffic patterns such as free-flow, recurrent congestion, and non-recurrent congestion. 2) Define the neighborhood: Two basic approaches can be used to define the neighborhood. The kernal neighborhood has a fixed radius while the nearest-neighbor algorithm has a fixed sample size. 3) Calculate the distance through the absolute value distance or the
Euclidean distance. 4) Find K for the nearest neighbor method. 5) Define the prediction function based on the average of the neighborhoods or the weighted average.
The KNN model has been applied to predict traffic volume and travel time (Davis and Nihan 1991; Bajwa et al. 2004; Robinson and Polak 2005). Kim et al. (2005)
improved the KNN model by adding a traffic flow pattern recognition technique that uses the signs of changes in the past sequences of traffic volume to overcome the memory-less property of the KNN model. This improvement made by considering the sign changes, however, only reflects the qualitative trend. Furthermore, Qiao et al. (2012) developed a modified KNN model with trend adjustment (KNN-T model) so that the traffic trend effects can be included into the model for short-term travel time prediction. The trend adjustment in this study considered travel time trends both qualitatively and
quantitatively in terms of the signs of changes and the magnitudes of changes in travel times, respectively. For the case study using Bluetooth data on Route I-66 in Virginia, the KNN-T model outperformed the historical average method, ARIMA, Kalman filter, and KNN model for both all-day and peak-hour periods, evaluated by absolute percentage errors (MAPE).
Neural Network
Neural networks (NN) are popular in transportation problems mainly because of their ability to perform self-learning, work with multi-dimensional data, deal with a non- linear problem regarding a flexible model structure, and generate good prediction results (Karlaftis and Vlahogianni 2011). The topology generally used for prediction is a basic and fully connected back propagation multilayer perceptron (MLP) structure, consisting of one input layer, one hidden layer, and one output layer. The hidden layer is used to capture nonlinearity. Neural networks essentially train the connection weights as the hidden neurons learn to recognize different features of the total input space. The training is performed iteratively until the squared error between the computed and the desired output over all the training patterns is minimized (Washington et al. 2011). The trained neural network is capable to predict output values for future inputs. This topology was applied in various transportation prediction problems (Clark et al. 1993; Smith and Demetsky 1994; Park and Rilett 1998; Naik 2010).
This conventional NN structure using arbitrary squashing functions was shown to be theoretically able to approximate any measurable function from a finite-dimensional space to another finite-dimensional space, and to any desired degree of accuracy as long as the hidden layer has sufficient hidden neurons (Hornik et al. 1989). In practice,
however, this NN structure encounters the greatest difficulty in approximating functions with the input features that are not linearly separable. Two techniques were proposed to solve this problem associated with the conventional NN: 1) input feature transformation or a spectral basis neural network (Park et al. 1999), and 2) input-partitioning or a modular neural network is used to approximate with a combination of relatively simpler functions instead of one complex function (Park and Rilett 1998; Kisgyorgy and Rilett 2002).
Some researchers integrated two or more methods to generate hybrid models to pursue higher accuracy and efficiency. One example is to combine some clustering techniques with either time-series analysis or neural network models for integrated prediction. Yin et al. (2002) applied a two-module fuzzy-neural model to predict the traffic flows for an urban street network. A fuzzy approach was used to classify the input data into clusters. For each cluster, a conventional neural network approach was used to model the input-output relationship. Zheng et al. (2006) presented a Bayesian combined neural network approach for short-term freeway traffic flow prediction. In these studies, an appropriate method was applied to classify traffic flow patterns first. Then, a neural network model suitable for each traffic pattern was selected for modeling and prediction. It is important to note that a predictor trained for certain patterns (e.g., peak period traffic) will generate deficient prediction when the traffic pattern changes. Therefore, hybrid models were generally found to be more efficient than a singular predictor.
A feed-forward neural network, where the information flows in a unidirectional way from the input layer to the output layer, is not sensitive to previous processes, making it suitable for recognizing spatially and temporally separated patterns. Recurrent
neural networks can deal with the spatiotemporal dynamics through one or more feedback loops to feed predicted output signals or new output observations back to the input neurons. Van Lint et al. presented a recurrent neural network topology derived from a state-space formulation to predict freeway travel times.
Summary
In this dissertation, the requirement for a nonlinear predictor that can work under both stable and unstable state scenarios led to the application of one class of recurrent neural networks β the nonlinear autoregressive with exogenous inputs Model (NARX) neural network β to predict both mean and variance of corridor travel times on an urban arterial road.