VI. Introduction to Time Series – AR(1) Model and Forecasting
a. Introduction to Dependent Observations b. Checking for Independence
c. Autocorrelation d. The AR(1) Model e. Random Walks f. Trend Models
a. Introduction to Dependent Observations
In autoregressive models, we consider observations taken over time.
To denote this, we will index the observations with the letter t rather than the letter i.
Our data will be observations on Y1, Y2, ...Yt, ...where t indexes the day, month, year, or any time interval.
Key new idea:
a. Introduction to Dependent Observations
We will NOT assume that Yt-1 is independent of Yt
Example: Is tomorrow’s temperature independent of today’s?
Suppose y1 ...yT are the temperatures measured daily for several years. Which of the following two predictors would work better:
i. the average of the temperatures from the previous year ii. the temperature on the previous day?
If the readings are iid N(μ, σ2), what would be your prediction for
YT+1?
a. Introduction to Dependent Observations
a. Introduction to Dependent Observations
Monthly US Beer Production (millions of barrels)
Strong
Seasonality
a. Introduction to Dependent Observations
b. Checking for Independence
It is not always easy just to look at the data and decide whether a time series is independent.
So how can we tell?
Plot Yt vs. Yt-1 to check for a relationship or
Plot Yt vs. Yt-s for s = 1, 2, …
Knowing Yt does not help you in predicting Yt+1
b. Checking for Independence
How do we do this in R? – Use the “back” command
18.22 18.19 6 17.62 18.22 5 17.65 17.62 4 15.00 17.65 3 15.19 15.00 2 * 15.19 1 b_prod(t-1) b_prod
Now each row has Y at time t, and Y one
b. Checking for Independence
Now let’s return to the lake data…
First, let’s plot Levelt vs. Levelt-1 Corr = .794
Each point is a pair of adjacent years. e.g. (Level1929, Level1930)
c. Autocorrelation
Time series is about dependence. We use correlation as a measure of dependence.
Although we have only one variable, we can compute the correlation between Yt and Yt-1 or between Yt andYt-2.
The correlations between Y’s at different times are called
autocorrelations.
c. Autocorrelation
We will assume what is known as stationarity.
Roughly speaking this means:
– The time series varies about a fixed mean and has constant
variance
– The dependence between successive observations does
not change over time
Let’s define the autocorrelations for a stationary time series.
t t s t t s
s
t
t t s
cov Y ,Y
cov Y ,Y
Var Y
Var Y
Var Y
c. Autocorrelation
We estimate the theoretical quantities by using sample averages (as always).
The estimated or sample autocorrelations are:
Tt t s t s
s T
2 t
t 1
(Y
Y)(Y
Y)
=
r
c. Autocorrelation
The ACF command in R computes the autocorrelations
There is a strong dependence
between
observations spaced close together in
time (e.g only one or two years apart). As time passes, the
c. Autocorrelation
Let’s look at the autocorrelations for the IID series.
In contrast to the ACF for the
‘level’ series, the sample
autocorrelations are much
c. Autocorrelation
How do we know if the sample autocorrelations are good estimates of the underlying theoretical autocorrelations? and
How do we know if we have enough sample information to reach definitive conclusions?
If all the true autocorrelations are 0, then the standard
deviation of the sample autocorrelations is about 1/sqrt(T).
T
1
r
Err
Std
s
c. Autocorrelation
For the IID series
All of the sample autocorrelations are within 2 standard
deviations of 0 -- no evidence of positive autocorrelation in the data.
For the level series
d. The AR(1) Model
A simple way to model dependence over time is with the “autoregressive model of order 1.”
This is a SLR model of Yt regressed on lagged Yt-1.
t 0 1 t 1 t
AR(1) : Y
Y
What does the model say for the T+1 st observation?
1 T T 1 0 1 T
Y
Y
β
β
ε
d. The AR(1) Model
How should we predict YT+1?
T 1 T
0 1 T
T 1 T
0 1 TE Y |Y
Y
E
|Y
Y
How do we use the AR(1) model? We simply regress Y on lagged Y.
d. The AR(1) Model
d. The AR(1) Model
Now let’s look at the ACF of the residuals…
d. The AR(1) Model
d. The AR(1) Model
Now let’s look at the ACF of the residuals…
d. The AR(1) Model
To gain a better feel for this model, let’s simulate data series from the model with various parameter settings…
The series fluctuates around a mean level with fairly long “runs”. 0
1
0
.8
d. The AR(1) Model
d. The AR(1) Model
Now let’s look at a series generated with a negative slope value…
Because β1 is negative, an above average Y tends to be followed by a below average Y (and vice versa) - hence the jagged
appearance of the plot.
0
1
0
.8
d. The AR(1) Model
d. The AR(1) Model
Now let’s look at a series generated with a slope value of 1…
Wanders around quite a lot! Can exhibit what appears to be trends.
0
1
0
1.0
d. The AR(1) Model
d. The AR(1) Model
Some Intuition on Mean Reversion
We have seen that the slope parameter governs the rate at which the AR(1) model “returns” or “reverts” to the mean level of the series.
Fact for the AR(1) model:
0 t
1
E Y
d. The AR(1) Model
If we subtract μ from both sides of the AR(1) model equation, we can write the model in terms of deviations from the mean.
t 1
t1
t
Y
Y
μ
β
μ
ε
Thus, β1 governs the rate at which you “revert” to the mean level of the series.
e. Random Walks
The case of β1 = 1 deserves special attention because of it's importance in economic data series.
Many economic and business time series display a "random walk character."
A random walk is an AR(1) model with β1 = 1
Random Walk:
t 1
t o
t
Y
e. Random Walks
The intercept, β0 , is called the drift parameter for the random walk. Let's first consider the case of β0 = 0.
This called a random walk with zero drift:
t 1
t
t
Y
Y
ε
e. Random Walks
The random “walk” get its name from the idea of a random walker on the number line. A random walker is someone who has an equal chance of taking a step forward or a step backward. The size of the steps are random as well.
To see this, it is very useful to re-express the random walk in term of increments or steps. Subtract Yt-1 from both sides,
The increments are an random sample (iid collection of rvs)! t
1 t t
t
Y
Y
e. Random Walks
A random walk with zero drift:
– "meanders" around zero with no particular trend.
– can take long "excursions" away from zero that look like
trends. Don’t get fooled!
– but random walk will always return to zero.
Random Walks with Non-zero Drift:
If β0 is positive, we have a random walk with positive drift.
Here the average step size is β0 :
f. Trend Models
Many times we want to allow for shifts in the mean of series over time.
Linear Trend Model:
50 40 30 20 10 30 20 10 0 Index C 3 50 40 30 20 10 30 20 10 0 Index C 3 t 1 0 t
t
f. Trends and Random Walks
f. Random Walks and Trends
Let’s run the regression for the trend fit and look at residual acf. Looks pretty auto-correlated! Trend Model is not
f. Random Walks and Trends
g. Stock Prices and Market Efficiency
g. Stock Prices and Market Efficiency
g. Stock Prices and Market Efficiency
It turns out that a model that fits many stock prices series is a random walk in the log of prices.
log p
tα
log p
t−1
ε
to
log p
t−
log p
t−1
α
ε
tlog 1
Δ
p
tp
t−1⎛
⎝⎜
⎞
g. Stock Prices and Market Efficiency
Thus, if the log of stock prices follows a random walk, the changes in the log of the price are independent.
This has profound implications for the ability to predict future changes in stock prices. This says that stock price changes are completely independent of past changes.
This strongly suggests that any trading strategy that involves the past history of price changes can’t work (momentum
g. Stock Prices and Market Efficiency
What is the economic meaning of this finding?
One possible explanation is that stock prices reflect all available information at the time of trade. Competitive markets provide a sort of information aggregation
mechanism by which information relevant to the stock price (e.g. future profitability of the firm) is incorporated into price.
This idea is often called the weak market efficiency hypothesis. This idea goes back to the fundamental principle of conditional prediction – i.e. forecast errors
h. Example of a Time Series Regression
With time series data, we want to use information available now (time t) to forecast some variable of interest in the
future. In the AR(1) model, we use the current value of y to forecast future values of y.
We might wish to use some other time series to forecast the future value of variable of interest. A nice example of this is the relationship between sales and advertising.
data(persistence)includes data on monthly sales ($)
t 1 t
0 1 th. Example of a Time Series Regression
Peaks at the quarterly frequency.
Let’s model this with lags at frequencies related to
h. Example of a Time Series Regression
We have three lags, a semi annual and an annual effect.
h. Example of a Time Series Regression
Residuals are pretty clean – other than at lag 1, we see little
auto-correlation.
h. Example of a Time Series Regression
h. Example of a Time Series Regression
h. Example of a Time Series Regression
Wrong way: model using only lags of ADVINF!
h. Carry-over Effects of Advertising
In these models, advertising has an immediate effect and then a carry-over effect from the lagged sales terms. These terms can proxy for repeat sales, word-of-mouth, etc.
Consider a simple model:
There is an immediate effect of δ and a carry-over effect of λ.
The carry-over effect means that there is a larger long-run effect of advertising that the short-run effect. Let’s trace the
h. Carry-over Effects of Advertising
t 0 δ
t 1 λ × δ
t 2 λ × λ × δ
M M
t=s λs ×δ
long-ru n: δ(1+λ +λ2+λ3+K)= 1 1−λ ( )δ
h. Carry-over Effects of Advertising
h. Carry-over Effects of Advertising
Let’s write the model down.
Glossary of Symbols
ρs - sth order autocorrelation
Important Equations
t t s t t s
s
t
t t s
cov Y ,Y
cov Y ,Y
Var Y
Var Y
Var Y
Tt t s t s
s T
2 t
t 1
(Y
Y)(Y
Y)
=
r
(Y
Y)
T
1
r
Err
Std
s
Population and Sample Autocorrelations
Important Equations
t 0 1 t 1 t
AR(1) : Y
Y
t 1
t1
t
Y
Y
μ
β
μ
ε
t 1
t o
t
Y
Y
β
ε
definition of AR(1) model
Mean Reversion form of AR(1)
Glossary of R Commands
• acf(): Computes (and by default plots) estimates of the
autocorrelation function
• back(): Computes a lagged version of a time series,
shifting the time base back once.
• diff() : Returns the differences between a value and its
lagged value.
• c(1:30): Generates 30 numbers from 1 to 30 with