# Smoothing

Comment: for the moving average smoothing filter, one does not have enough values of the original time series in the window centred at time $t$ if $t < q + 1$ or $t > n - q$. For example, if $q = 2$ then the full filtered output cannot be computed at $t = 1, 2, n-1, n$.

[Sketch: time axis marked at $t = 1, 2, 3, \ldots, n-1, n$.]

The text uses $x_1$ for times $t < 1$ and $x_n$ for times $t > n$. So initially the value $x_1$ dominates the output, and at the end the value $x_n$ dominates the output.
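The end-point convention described in the comment can be sketched in code. This is a minimal illustration, assuming the usual equal-weight two-sided moving average with $2q + 1$ terms; the function name `moving_average` and the toy data are hypothetical and not taken from the notes.

```python
def moving_average(x, q):
    """Equal-weight two-sided moving average with half-width q.

    Assumption: values before the first observation are taken to be x[0]
    and values after the last observation are taken to be x[-1], so the
    output has the same length as the input.
    """
    n = len(x)
    padded = [x[0]] * q + list(x) + [x[-1]] * q
    width = 2 * q + 1
    # padded[i:i + width] is the window centred on original index i.
    return [sum(padded[i:i + width]) / width for i in range(n)]


x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]   # toy series on a straight line
smoothed = moving_average(x, q=2)
print(smoothed)
```

With $q = 2$ only the two middle outputs use a full window of real observations; the first two are pulled towards $x_1$ and the last two towards $x_n$, which is the domination effect noted in the comment.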

## Method: Exponential Smoothing

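Here we set a parameter $\alpha \in [0, 1]$. Set $\hat{m}_1 = x_1$. Then for $t > 1$,

$$\hat{m}_t = \alpha x_t + (1 - \alpha)\hat{m}_{t-1}.$$

Note that by recursion,

$$\begin{aligned}
\hat{m}_t &= \alpha x_t + (1-\alpha)\hat{m}_{t-1} \\
&= \alpha x_t + (1-\alpha)\bigl(\alpha x_{t-1} + (1-\alpha)\hat{m}_{t-2}\bigr) \\
&= \alpha x_t + \alpha(1-\alpha)x_{t-1} + (1-\alpha)^2\hat{m}_{t-2} \\
&\;\;\vdots \\
&= \alpha x_t + \alpha(1-\alpha)x_{t-1} + \alpha(1-\alpha)^2 x_{t-2} + \cdots + \alpha(1-\alpha)^{t-2}x_2 + (1-\alpha)^{t-1}x_1.
\end{aligned}$$

This is computed for $t = 2, \ldots, n$.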
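The recursion and its unrolled form can be compared numerically. The sketch below uses 0-based list indices, so `x[0]` plays the role of $x_1$; the function names and the toy data are illustrative assumptions.

```python
def exponential_smooth(x, alpha):
    """Exponential smoother: m_hat[0] = x[0], then
    m_hat[t] = alpha * x[t] + (1 - alpha) * m_hat[t - 1]."""
    m_hat = [x[0]]
    for value in x[1:]:
        m_hat.append(alpha * value + (1 - alpha) * m_hat[-1])
    return m_hat


def expanded_form(x, alpha, t):
    """Unrolled recursion at 0-based index t: geometric weights on
    x[t], x[t-1], ..., x[1] plus the leftover weight on x[0]."""
    total = (1 - alpha) ** t * x[0]
    for j in range(t):
        total += alpha * (1 - alpha) ** j * x[t - j]
    return total


x = [3.0, 5.0, 4.0, 6.0, 8.0]   # toy data
m_hat = exponential_smooth(x, alpha=0.5)
print(m_hat)                          # [3.0, 4.0, 4.0, 5.0, 6.5]
print(expanded_form(x, 0.5, t=4))     # 6.5
```

The recursive form needs only the previous output, which is why it is the one normally used in practice; the expanded form is shown only to confirm the algebra.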

If one had observations going infinitely far back into the past, then one could continue the above recursion forever and obtain

$$\hat{m}_t = \alpha x_t + \alpha(1-\alpha)x_{t-1} + \alpha(1-\alpha)^2 x_{t-2} + \cdots = \sum_{j=0}^{\infty}\alpha(1-\alpha)^j x_{t-j} = \sum_{j=0}^{\infty} a_j x_{t-j},$$

where $a_j = \alpha(1-\alpha)^j$, which is a causal linear filter with coefficients $\{a_j\}$. The exponential smoother can be thought of as an approximation to this causal filter.
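One way to see in what sense the finite recursion approximates the infinite filter is to sum the weights. The following worked step is a sketch, assuming $0 < \alpha \le 1$ so that the geometric series converges:

$$\sum_{j=0}^{\infty} a_j = \alpha\sum_{j=0}^{\infty}(1-\alpha)^j = \frac{\alpha}{1-(1-\alpha)} = 1,
\qquad
\sum_{j=0}^{t-2}\alpha(1-\alpha)^j + (1-\alpha)^{t-1} = \bigl(1-(1-\alpha)^{t-1}\bigr) + (1-\alpha)^{t-1} = 1.$$

Both versions are therefore weighted averages of the observations. In the finite version, the weight that the infinite filter would spread over the unobserved past, $(1-\alpha)^{t-1}$, is placed on $x_1$ instead, and it shrinks geometrically as $t$ grows.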

As $\alpha$ increases, the output $\hat{m}_t$ becomes less smooth (when $\alpha = 1$, $\hat{m}_t = x_t$ for all $t$). The choice of $\alpha$ is subjective (as is the choice of $q$ for the moving average smoothing filter), but the same comment applies for the choice of $\alpha$: the output should look reasonably smooth while following the general trend as seen by eye when plotting.

Another qualitative guideline one can use to choose a smoothing parameter ($q$ or $\alpha$) is, in addition to the previous comment, to look at the residuals one obtains by subtracting the estimated trend from the time series. If one can choose the parameter so that the sample ACF of the residuals indicates iid noise, then this is justification that this is a good choice of the parameter. More generally, if the residuals look stationary (e.g., the sample ACF of the residuals looks like the true ACF of some model for a stationary time series that we know of), then this may also be taken as justification for the choice of the parameter.
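The residual check can be sketched as follows. The sample ACF below uses the usual divisor $n$; the $\pm 1.96/\sqrt{n}$ bounds are a commonly used rule of thumb for iid noise and are an assumption here, as the notes do not state a specific criterion. The function names and the toy residuals are hypothetical.

```python
import math


def sample_acf(residuals, max_lag):
    """Sample autocorrelations at lags 0..max_lag (divisor n)."""
    n = len(residuals)
    mean = sum(residuals) / n
    centred = [r - mean for r in residuals]
    gamma0 = sum(c * c for c in centred) / n
    acf = []
    for h in range(max_lag + 1):
        gamma_h = sum(centred[t] * centred[t + h] for t in range(n - h)) / n
        acf.append(gamma_h / gamma0)
    return acf


def lags_outside_bounds(acf, n):
    """Lags h >= 1 whose sample ACF exceeds 1.96 / sqrt(n) in absolute
    value (assumed rule of thumb for iid noise)."""
    bound = 1.96 / math.sqrt(n)
    return [h for h in range(1, len(acf)) if abs(acf[h]) > bound]


# Toy residuals that alternate in sign, so they are clearly not iid.
residuals = [(-1) ** t for t in range(20)]
acf = sample_acf(residuals, max_lag=3)
print(acf)
print(lags_outside_bounds(acf, len(residuals)))
```

In use, `residuals` would be the series minus the smoothed trend for a candidate $q$ or $\alpha$; an empty (or nearly empty) list of flagged lags is consistent with iid noise, though it does not prove it.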

## Trend Elimination by Differencing

### Differencing and Backshift Operators

The differencing operator is denoted by $\nabla$ ("nabla") and it operates as $\nabla X_t = X_t - X_{t-1}$ (first differences). The backshift operator is denoted by $B$ and it operates as $B X_t = X_{t-1}$. A power of one of these operators is the composition of the operator with itself: the operator is applied $k$ times, where $k$ is the power.

For example,

$$\nabla^2 X_t = \nabla(\nabla X_t) = \nabla(X_t - X_{t-1}) = X_t - X_{t-1} - (X_{t-1} - X_{t-2}) = X_t - 2X_{t-1} + X_{t-2},$$

$$B^2 X_t = B(B X_t) = B X_{t-1} = X_{t-2}.$$

We take $B^0 = I$, where $I X_t = X_t$.

The difference operator can be expressed as $\nabla = 1 - B$ (with $1$ standing for the identity $I$), since $(1 - B)X_t = X_t - X_{t-1}$.

Expressions involving the backshift operator $B$ can be manipulated in the same way as polynomials in powers of $B$. For example,

$$\nabla^2 X_t = (1 - B)^2 X_t = (1 - 2B + B^2)X_t = X_t - 2X_{t-1} + X_{t-2}.$$
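The polynomial expansion of $\nabla^2$ can be checked against repeated first differencing. This is a minimal sketch; the helper name `difference` and the toy series are illustrative assumptions.

```python
def difference(x, k=1):
    """Apply the first-difference operator k times.

    Each pass drops one value from the front, because X_{t-1} is not
    available for the first remaining time point.
    """
    for _ in range(k):
        x = [x[t] - x[t - 1] for t in range(1, len(x))]
    return x


x = [2, 7, 1, 8, 2, 8]   # toy series

# Two passes of first differencing ...
twice = difference(x, k=2)
# ... against the expanded form X_t - 2 X_{t-1} + X_{t-2}.
direct = [x[t] - 2 * x[t - 1] + x[t - 2] for t in range(2, len(x))]

print(twice)    # [-11, 13, -13, 12]
print(direct)   # [-11, 13, -13, 12]
```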

For trend elimination, one can show that if one has a time series with a trend and a stationary part, say $X_t = m_t + Y_t$, where $m_t$ is a $k$th degree polynomial and $\{Y_t\}$ is weakly stationary, then $\nabla^k m_t$ is constant and $\nabla^k Y_t$ is weakly stationary.

Example. If $m_t = c_0 + c_1 t$ is linear, then

$$\nabla m_t = c_0 + c_1 t - \bigl(c_0 + c_1(t - 1)\bigr) = c_1.$$

We can state the general case and prove it in the following theorem.

Theorem. If $m_t = c_0 + c_1 t + \cdots + c_k t^k$ is a $k$th degree polynomial, then $\nabla^k m_t = k!\,c_k$.

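Proof. By induction. We just showed that the $k = 1$ case is true (i.e., $\nabla(c_0 + c_1 t) = c_1$). Assume the theorem holds for polynomials of orders $1, 2, \ldots, k-1$. Then

$$\begin{aligned}
\nabla^k\bigl(c_0 + c_1 t + \cdots + c_k t^k\bigr)
&= \nabla\,\nabla^{k-1}\bigl(c_0 + c_1 t + \cdots + c_{k-1}t^{k-1} + c_k t^k\bigr) \\
&= \nabla\bigl((k-1)!\,c_{k-1} + \nabla^{k-1} c_k t^k\bigr) \quad \text{(by the induction hypothesis)} \\
&= \nabla^{k-1}\bigl(\nabla\, c_k t^k\bigr) \quad \text{(the difference of a constant is 0)} \\
&= \nabla^{k-1}\Bigl(c_k t^k - c_k\sum_{i=0}^{k}\binom{k}{i}(-1)^i t^{k-i}\Bigr) \quad \text{(by the Binomial theorem)} \\
&= \nabla^{k-1}\Bigl(k c_k t^{k-1} - c_k\sum_{i=2}^{k}\binom{k}{i}(-1)^i t^{k-i}\Bigr) \\
&= k c_k (k-1)! = c_k\,k!,
\end{aligned}$$

since for $j < k - 1$,

$$\nabla^{k-1} a_j t^j = \nabla^{k-1-j}\,\nabla^{j} a_j t^j = \nabla^{k-1-j} a_j\,j! = 0.$$

Thus, differencing $k$ times will eliminate any polynomial trend of degree $k$ or less, leaving at most the constant $k!\,c_k$.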
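The theorem can be checked numerically on a small case. The sketch below uses integer arithmetic so the comparison is exact; the cubic $m_t = 4 - 3t + 2t^3$ and the helper names are arbitrary illustrative choices.

```python
from math import factorial


def difference(x, k=1):
    """Apply the first-difference operator k times."""
    for _ in range(k):
        x = [x[t] - x[t - 1] for t in range(1, len(x))]
    return x


def polynomial(coeffs, t):
    """Evaluate c_0 + c_1 t + ... + c_k t^k for coeffs = [c_0, ..., c_k]."""
    return sum(c * t ** j for j, c in enumerate(coeffs))


coeffs = [4, -3, 0, 2]              # m_t = 4 - 3t + 2t^3, so k = 3, c_k = 2
k = len(coeffs) - 1
m = [polynomial(coeffs, t) for t in range(1, 11)]

kth_difference = difference(m, k)
print(kth_difference)               # [12, 12, 12, 12, 12, 12, 12]
print(factorial(k) * coeffs[-1])    # 12
print(difference(m, k + 1))         # [0, 0, 0, 0, 0, 0]
```

Differencing once more than the degree removes the remaining constant as well.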

A couple of observations on differencing:

① $\nabla^k$ is a causal linear filtering operation. Indeed,

$$\nabla^k X_t = (1 - B)^k X_t = \Bigl[\sum_{i=0}^{k}\binom{k}{i}1^{k-i}(-B)^i\Bigr]X_t = \sum_{i=0}^{k}\binom{k}{i}(-1)^i B^i X_t = \sum_{i=0}^{k}\binom{k}{i}(-1)^i X_{t-i},$$

which is a causal linear filter with filter coefficients $a_0 = 1$, $a_i = \binom{k}{i}(-1)^i$ for $i = 1, \ldots, k$, and $a_j = 0$ for $j \notin \{0, 1, \ldots, k\}$.
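The filter form of $\nabla^k$ can be compared with repeated differencing. This is a sketch with hypothetical helper names; `math.comb` requires Python 3.8 or later.

```python
from math import comb


def difference(x, k=1):
    """Apply the first-difference operator k times."""
    for _ in range(k):
        x = [x[t] - x[t - 1] for t in range(1, len(x))]
    return x


def difference_filter_coefficients(k):
    """Coefficients a_i = C(k, i) * (-1)^i for i = 0, ..., k."""
    return [(-1) ** i * comb(k, i) for i in range(k + 1)]


def apply_causal_filter(coeffs, x):
    """Compute sum_i a_i * x[t - i] wherever all k past values exist."""
    k = len(coeffs) - 1
    return [sum(a * x[t - i] for i, a in enumerate(coeffs))
            for t in range(k, len(x))]


x = [2, 7, 1, 8, 2, 8]   # toy series
coeffs = difference_filter_coefficients(3)
print(coeffs)                            # [1, -3, 3, -1]
print(apply_causal_filter(coeffs, x))    # [24, -26, 25]
print(difference(x, k=3))                # [24, -26, 25]
```

Only current and past values enter each output, which is what makes the filter causal.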

② Differencing is not a good way to do trend estimation. For example, if $X_t = c_0 + c_1 t + Y_t$, then $\nabla X_t = c_1 + Y_t - Y_{t-1}$ and the linear trend is eliminated. However, the trend estimate would be

$$\hat{m}_t = X_t - \nabla X_t = X_t - (X_t - X_{t-1}) = X_{t-1}.$$

Thus, the trend estimate is just the original time series shifted back in time by one unit, which does follow the general linear trend but is not a good estimate of a line.

So, differencing is useful for removing low frequency components of the time series (such as trends), but it does so in a crude way and removes more than just the trend. If one only cares that what is left over looks stationary, and not about estimating the trend itself, then differencing is fine and is a useful building block for more complicated models. It is a simple operation and is easily reversed for prediction purposes: one does prediction on the differenced data using a predictor based on a model for stationary time series, and then reverses the differencing to get the prediction in the original time scale of the data.

Reversing the differencing: If $D_t = \nabla X_t = X_t - X_{t-1}$, then if we take $X_0 = 0$ so that $D_1 = X_1$, we have

$$X_t = D_t + D_{t-1} + \cdots + D_1.$$

Note that we could also have written $D_t = \nabla X_t = (1 - B)X_t$, so

$$X_t = (1 - B)^{-1}D_t = \Bigl(\sum_{j=0}^{\infty}B^j\Bigr)D_t = D_t + D_{t-1} + \cdots + D_1 + \underbrace{D_0 + D_{-1} + \cdots}_{\text{take to be } 0}$$

if we set $X_t = 0$ for $t \le 0$.
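Reversing the differencing amounts to a cumulative sum. The sketch below follows the convention $X_0 = 0$ so that $D_1 = X_1$; the function names and the toy series are illustrative assumptions.

```python
from itertools import accumulate


def difference_with_zero_start(x):
    """D_t = X_t - X_{t-1} for t = 1, ..., n, taking X_0 = 0 so D_1 = X_1."""
    previous = [0] + list(x[:-1])
    return [xt - xprev for xt, xprev in zip(x, previous)]


def undo_difference(d):
    """X_t = D_t + D_{t-1} + ... + D_1, i.e. the running sum of d."""
    return list(accumulate(d))


x = [2, 7, 1, 8]   # toy series
d = difference_with_zero_start(x)
print(d)                    # [2, 5, -6, 7]
print(undo_difference(d))   # [2, 7, 1, 8]
```

For prediction, one would append the forecasts of the differenced series to `d` and take the running sum again; the extra entries are then forecasts on the original scale.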