Academic year: 2021


Linear Prediction

Say $Y$ and $W_1, \dots, W_n$ are random variables, and we want to predict $Y$ given $W_1, \dots, W_n$. That is, we want a function $f(W_1, \dots, W_n)$ of $W_1, \dots, W_n$ that is in some sense a good predictor of $Y$.

We will restrict our considerations to functions $f$ that are linear in $W_1, \dots, W_n$, i.e., all predictors are of the form

$$a_0 + a_1 W_n + \cdots + a_n W_1.$$

Let $a = (a_1, \dots, a_n)^T$, $\mu_Y = E[Y]$, $\mu_i = E[W_i]$ for $i = 1, \dots, n$, $\mu_W = (\mu_n, \dots, \mu_1)^T$, and $W = (W_n, \dots, W_1)^T$ (so $\mu_W = E[W]$). Also let

$$\gamma = \mathrm{Cov}(Y, W) = \big(\mathrm{Cov}(Y, W_n), \dots, \mathrm{Cov}(Y, W_1)\big)^T$$

and let $\Gamma = \mathrm{Cov}(W)$ be the $n \times n$ covariance matrix of $W$, so that

$$\Gamma_{ij} = \mathrm{Cov}(W_{n+1-i}, W_{n+1-j}), \quad i, j = 1, \dots, n.$$

We will say that the "best" or "optimal" set of coefficients $\{a_0, a_1, \dots, a_n\}$ are those that minimize the mean squared error:

$$E\big[\big(Y - (a_0 + a_1 W_n + \cdots + a_n W_1)\big)^2\big]. \qquad (*)$$

To find the optimal $a_0, a_1, \dots, a_n$ we differentiate $(*)$ with respect to each $a_j$ and set the derivative to 0, then solve for $a_0, a_1, \dots, a_n$.

Differentiate w.r.t. $a_0$:

$$E\big[Y - (a_0 + a_1 W_n + \cdots + a_n W_1)\big] = 0. \qquad (i)$$

Differentiate w.r.t. $a_j$, $j = 1, \dots, n$:

$$E\big[\big(Y - (a_0 + a_1 W_n + \cdots + a_n W_1)\big) W_{n+1-j}\big] = 0. \qquad (ii)$$

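(i) gives

$$E[Y] = a_0 + a_1 E[W_n] + \cdots + a_n E[W_1].$$

So then

$$a_0 = \mu_Y - (a_1 \mu_n + \cdots + a_n \mu_1) = \mu_Y - a^T \mu_W.$$

The predictor now in terms of $a_1, \dots, a_n$ is

$$\mu_Y - a^T \mu_W + a_1 W_n + \cdots + a_n W_1 = \mu_Y - a^T \mu_W + a^T W = \mu_Y + a^T (W - \mu_W).$$

Now, plugging this into (ii) gives

$$E\big[\big(Y - \mu_Y - a^T (W - \mu_W)\big) W_{n+1-j}\big] = 0, \quad j = 1, \dots, n.$$

Since $E\big[Y - \mu_Y - a^T (W - \mu_W)\big] = 0$ (from (i)), we can also write (ii) as

$$E\big[\big(Y - \mu_Y - a^T (W - \mu_W)\big)\big(W_{n+1-j} - \mu_{n+1-j}\big)\big] = 0,$$

which is the same as

$$\mathrm{Cov}(Y, W_{n+1-j}) - a^T \mathrm{Cov}(W, W_{n+1-j}) = 0,$$

where $\mathrm{Cov}(W, W_{n+1-j})$ is a vector of covariances. Equivalently,

$$\mathrm{Cov}(Y, W_{n+1-j}) - \sum_{i=1}^{n} a_i \mathrm{Cov}(W_{n+1-i}, W_{n+1-j}) = 0$$

$$\iff \gamma_j - \sum_{i=1}^{n} a_i \Gamma_{ij} = 0 \iff \gamma_j - \sum_{i=1}^{n} a_i \Gamma_{ji} = 0, \quad j = 1, \dots, n.$$

Since $\Gamma$ is symmetric, $(a^T \Gamma)_j = (\Gamma a)_j$, so $\gamma = \Gamma a$. That is, the optimal $a$ satisfies

$$\Gamma a = \gamma.$$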
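As a minimal numerical sketch of the two steps above (solve $\Gamma a = \gamma$, then set $a_0 = \mu_Y - a^T \mu_W$), the following uses NumPy. The function name `best_linear_predictor` and all numeric moments are hypothetical values chosen for illustration, and $\Gamma$ is assumed to be nonsingular.

```python
import numpy as np

def best_linear_predictor(mu_y, mu_w, Gamma, gamma):
    """Return (a0, a) for the predictor a0 + a^T W, where Gamma a = gamma.

    Assumes Gamma is nonsingular so that the solution is unique.
    """
    a = np.linalg.solve(Gamma, gamma)
    a0 = mu_y - a @ mu_w
    return a0, a

# Hypothetical population moments for W = (W_2, W_1)^T and Y (illustration only).
mu_y = 1.0
mu_w = np.array([0.5, -0.5])
Gamma = np.array([[2.0, 0.6],
                  [0.6, 1.0]])
gamma = np.array([1.0, 0.3])   # equals 0.5 times the first column of Gamma

a0, a = best_linear_predictor(mu_y, mu_w, Gamma, gamma)
print("a0 =", a0, "a =", a)
```

Because `gamma` here is half of the first column of `Gamma`, the solution can also be read off by inspection as $a = (0.5, 0)^T$, which is the same trick used later for the AR(1) example.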

Notation

We let $P(Y \mid W)$ denote the best linear predictor of $Y$ based on $W = (W_n, \dots, W_1)^T$, i.e.,

$$P(Y \mid W) = \mu_Y + a^T (W - \mu_W), \quad \text{where } a \text{ is the solution of } \Gamma a = \gamma.$$

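The minimum mean squared error is

$$E\big[(Y - P(Y \mid W))^2\big] = E\big[\big(Y - (\mu_Y + a^T (W - \mu_W))\big)^2\big] = E\big[\big((Y - \mu_Y) - a^T (W - \mu_W)\big)^2\big]$$

$$= E\big[(Y - \mu_Y)^2\big] + E\big[a^T (W - \mu_W)(W - \mu_W)^T a\big] - 2 a^T E\big[(Y - \mu_Y)(W - \mu_W)\big]$$

$$= \mathrm{Var}(Y) + a^T \Gamma a - 2 a^T \gamma.$$

Since $\Gamma a = \gamma$, this equals

$$\mathrm{Var}(Y) + a^T \gamma - 2 a^T \gamma = \mathrm{Var}(Y) - a^T \gamma.$$

The minimum MSE is $\mathrm{Var}(Y) - a^T \gamma$.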
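The formula $\mathrm{Var}(Y) - a^T \gamma$ can be checked numerically. The sketch below evaluates the MSE of an arbitrary linear predictor $a_0 + a^T W$ from population moments, using the identity $E[(Y - a_0 - a^T W)^2] = \mathrm{Var}(Y) + a^T \Gamma a - 2 a^T \gamma + (\mu_Y - a_0 - a^T \mu_W)^2$, and compares the optimum with randomly perturbed coefficients. The helper name `mse` and all numbers are hypothetical.

```python
import numpy as np

def mse(a0, a, var_y, mu_y, mu_w, Gamma, gamma):
    """MSE of the predictor a0 + a^T W, computed from population moments."""
    bias = mu_y - a0 - a @ mu_w
    return var_y + a @ Gamma @ a - 2 * (a @ gamma) + bias ** 2

# Hypothetical moments (illustration only).
var_y, mu_y = 1.5, 1.0
mu_w = np.array([0.5, -0.5])
Gamma = np.array([[2.0, 0.6],
                  [0.6, 1.0]])
gamma = np.array([1.0, 0.3])

a = np.linalg.solve(Gamma, gamma)        # optimal slope coefficients
a0 = mu_y - a @ mu_w                     # optimal intercept
min_mse = var_y - a @ gamma              # Var(Y) - a^T gamma

# Any perturbation of (a0, a) should not reduce the MSE.
rng = np.random.default_rng(1)
perturbed = [
    mse(a0 + d[0], a + d[1:], var_y, mu_y, mu_w, Gamma, gamma)
    for d in rng.normal(scale=0.5, size=(200, 3))
]
print(min_mse, min(perturbed))
```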

Note that when we are predicting a random variable $Y$ using MSE as the criterion, we are assuming that $\mathrm{Var}(Y) < \infty$; otherwise the MSE would be equal to $\infty$ for any predictor.

Also, note that from (i) and (ii) from the previous lecture, (i) is saying that $P(Y \mid W)$, which is the best linear predictor, gives residuals which are zero mean, and (ii) is saying that

$$\mathrm{Cov}\big(Y - P(Y \mid W), W\big) = 0;$$

that is, the residuals from the best linear predictor $P(Y \mid W)$ are uncorrelated with the predictor variables $W$.
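The two residual properties can be illustrated on simulated data. This is only a sketch: the data-generating model, the sample size, and the seed are arbitrary assumptions, and sample moments stand in for the population moments $\mu_Y$, $\mu_W$, $\Gamma$, $\gamma$. With sample moments the two properties hold up to floating-point error, and the ordering of the columns of `W` does not matter.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 1000

# Hypothetical correlated predictors and response (illustration only).
W = rng.normal(size=(N, 3)) @ np.array([[1.0, 0.5, 0.0],
                                        [0.0, 1.0, 0.3],
                                        [0.0, 0.0, 1.0]])
Y = 2.0 + W @ np.array([1.0, -0.5, 0.25]) + rng.normal(size=N)

# Sample analogues of mu_W, mu_Y, Gamma and gamma.
w_bar, y_bar = W.mean(axis=0), Y.mean()
Gamma_hat = np.cov(W, rowvar=False)                  # ddof=1 by default
gamma_hat = (W - w_bar).T @ (Y - y_bar) / (N - 1)

a = np.linalg.solve(Gamma_hat, gamma_hat)
resid = Y - (y_bar + (W - w_bar) @ a)                # Y - P(Y | W)

resid_mean = resid.mean()
resid_cov_w = (W - w_bar).T @ (resid - resid_mean) / (N - 1)
print(resid_mean, resid_cov_w)
```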

Properties of the Best Linear Predictor

We can treat the predictor $P(Y \mid W)$ as an operator $P(\cdot \mid W)$ acting on random variables, taking them to their best linear prediction based on $W$. As an operator, $P(\cdot \mid W)$ is a linear operator.

① Suppose $\alpha_1$, $\alpha_2$ and $\beta$ are real numbers and $U$ and $V$ are random variables. Then we have the following property:

$$P(\alpha_1 U + \alpha_2 V + \beta \mid W) = \alpha_1 P(U \mid W) + \alpha_2 P(V \mid W) + \beta.$$

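Proof. The LHS is

$$\alpha_1 E[U] + \alpha_2 E[V] + \beta + a^T (W - \mu_W),$$

where $a$ satisfies

$$\Gamma a = \mathrm{Cov}(\alpha_1 U + \alpha_2 V + \beta, W). \qquad (a)$$

On the RHS, $P(U \mid W) = E[U] + (a^{(1)})^T (W - \mu_W)$ and $P(V \mid W) = E[V] + (a^{(2)})^T (W - \mu_W)$, where $a^{(1)}$ satisfies

$$\Gamma a^{(1)} = \mathrm{Cov}(U, W) \qquad (b)$$

and $a^{(2)}$ satisfies

$$\Gamma a^{(2)} = \mathrm{Cov}(V, W), \qquad (c)$$

and the RHS is

$$\alpha_1 E[U] + \alpha_1 (a^{(1)})^T (W - \mu_W) + \alpha_2 E[V] + \alpha_2 (a^{(2)})^T (W - \mu_W) + \beta$$

$$= \alpha_1 E[U] + \alpha_2 E[V] + \beta + \big(\alpha_1 a^{(1)} + \alpha_2 a^{(2)}\big)^T (W - \mu_W).$$

So the LHS will equal the RHS if $a = \alpha_1 a^{(1)} + \alpha_2 a^{(2)}$. So we will show that $\alpha_1 a^{(1)} + \alpha_2 a^{(2)}$ satisfies equation (a):

$$\Gamma\big(\alpha_1 a^{(1)} + \alpha_2 a^{(2)}\big) = \mathrm{Cov}(\alpha_1 U + \alpha_2 V + \beta, W).$$

But

$$\mathrm{Cov}(\alpha_1 U + \alpha_2 V + \beta, W) = \alpha_1 \mathrm{Cov}(U, W) + \alpha_2 \mathrm{Cov}(V, W) + \mathrm{Cov}(\beta, W) = \alpha_1 \mathrm{Cov}(U, W) + \alpha_2 \mathrm{Cov}(V, W),$$

since $\mathrm{Cov}(\beta, W) = 0$ because $\beta$ is a constant. Thus, we want to show

$$\Gamma\big(\alpha_1 a^{(1)} + \alpha_2 a^{(2)}\big) = \alpha_1 \mathrm{Cov}(U, W) + \alpha_2 \mathrm{Cov}(V, W).$$

But from (b) and (c), $\Gamma a^{(1)} = \mathrm{Cov}(U, W)$ and $\Gamma a^{(2)} = \mathrm{Cov}(V, W)$. So we get

$$\Gamma\big(\alpha_1 a^{(1)} + \alpha_2 a^{(2)}\big) = \alpha_1 \Gamma a^{(1)} + \alpha_2 \Gamma a^{(2)} = \alpha_1 \mathrm{Cov}(U, W) + \alpha_2 \mathrm{Cov}(V, W).$$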
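The key step of the proof, that the coefficient vector for $\alpha_1 U + \alpha_2 V + \beta$ is $\alpha_1 a^{(1)} + \alpha_2 a^{(2)}$, can be checked numerically. In this sketch the matrix `Gamma`, the covariance vectors `cov_u` and `cov_v`, and the constants are hypothetical; the constant $\beta$ does not appear because it contributes nothing to the covariance.

```python
import numpy as np

# Hypothetical covariance matrix of W and covariances of U, V with W.
Gamma = np.array([[2.0, 0.6, 0.1],
                  [0.6, 1.0, 0.4],
                  [0.1, 0.4, 1.5]])
cov_u = np.array([0.7, 0.2, -0.3])   # Cov(U, W)
cov_v = np.array([-0.1, 0.5, 0.9])   # Cov(V, W)
alpha1, alpha2 = 2.0, -3.0

a1 = np.linalg.solve(Gamma, cov_u)   # coefficients of P(U | W)
a2 = np.linalg.solve(Gamma, cov_v)   # coefficients of P(V | W)

# Cov(alpha1*U + alpha2*V + beta, W) = alpha1*Cov(U, W) + alpha2*Cov(V, W)
a_combined = np.linalg.solve(Gamma, alpha1 * cov_u + alpha2 * cov_v)

print(np.allclose(a_combined, alpha1 * a1 + alpha2 * a2))
```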

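We can extend this to $\alpha_1 U_1 + \cdots + \alpha_n U_n + \beta$, where $\alpha_1, \dots, \alpha_n, \beta$ are constants and $U_1, \dots, U_n$ are random variables. We have

$$P(\alpha_1 U_1 + \cdots + \alpha_n U_n + \beta \mid W) = \alpha_1 P(U_1 \mid W) + P(\alpha_2 U_2 + \cdots + \alpha_n U_n \mid W) + \beta \quad \text{(by property ①)}$$

$$= \alpha_1 P(U_1 \mid W) + \alpha_2 P(U_2 \mid W) + P(\alpha_3 U_3 + \cdots + \alpha_n U_n \mid W) + \beta \quad \text{(by property ① again)}$$

$$= \alpha_1 P(U_1 \mid W) + \alpha_2 P(U_2 \mid W) + \cdots + \alpha_n P(U_n \mid W) + \beta.$$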

② If the $U_i$'s are equal to the $W_i$'s, then we have

$$P(\alpha_1 W_1 + \cdots + \alpha_n W_n + \beta \mid W) = \alpha_1 W_1 + \cdots + \alpha_n W_n + \beta.$$

By property ①,

$$P(\alpha_1 W_1 + \cdots + \alpha_n W_n + \beta \mid W) = \alpha_1 P(W_1 \mid W) + \cdots + \alpha_n P(W_n \mid W) + \beta.$$

So it is sufficient to show that $P(W_i \mid W) = W_i$ for $i = 1, \dots, n$. But

$$P(W_i \mid W) = E[W_i] + a^T (W - \mu_W),$$

where $a$ satisfies $\Gamma a = \mathrm{Cov}(W_i, W)$. But $\mathrm{Cov}(W_i, W)$ is just the $(n+1-i)$th column of $\Gamma$. So the solution to $\Gamma a = (n+1-i)$th column of $\Gamma$ is

$$a = (0, \dots, 0, 1, 0, \dots, 0)^T,$$

with the 1 in the $(n+1-i)$th component. So

$$P(W_i \mid W) = E[W_i] + (W_i - \mu_i) = W_i.$$
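A quick numerical illustration of this argument: when the right-hand side of $\Gamma a = \gamma$ is the $(n+1-i)$th column of $\Gamma$, the solution is the corresponding unit vector. The matrix below is a hypothetical nonsingular covariance matrix; note the shift between the 1-based index $n+1-i$ in the text and NumPy's 0-based indexing.

```python
import numpy as np

# Hypothetical covariance matrix of W = (W_3, W_2, W_1)^T (illustration only).
Gamma = np.array([[2.0, 0.6, 0.1],
                  [0.6, 1.0, 0.4],
                  [0.1, 0.4, 1.5]])
n = Gamma.shape[0]

solutions = {}
for i in range(1, n + 1):
    k = n + 1 - i                       # 1-based position of W_i inside W
    rhs = Gamma[:, k - 1]               # Cov(W_i, W): the k-th column of Gamma
    solutions[i] = np.linalg.solve(Gamma, rhs)
    print(i, np.round(solutions[i], 12))
```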

③ If $U$ is uncorrelated with $W$, then we have $P(U \mid W) = E[U]$; that is, $P(U \mid W) = E[U]$ if $\mathrm{Cov}(U, W) = 0$, the vector of $n$ zeroes.

We have $P(U \mid W) = E[U] + a^T (W - \mu_W)$, where $a$ satisfies $\Gamma a = \mathrm{Cov}(U, W) = 0$. The solution to this is $a = 0$.

We have discussed the linear predictor $P(\cdot \mid W)$ and its properties. In the context of time series we are mostly interested in forecasting.

Forecasting

In $P(Y \mid W)$, if we take $Y = X_{n+h}$ and $W = (X_n, \dots, X_1)^T$, where $\{X_t\}$ is a stationary time series, then linear prediction in this context is called forecasting. The authors use the notation $P_n X_{n+h}$ to mean

$$P(X_{n+h} \mid X_n, \dots, X_1).$$

Here the subscript $n$ in $P_n$ refers to the number of past values of the time series we are basing the prediction on, and $X_{n+h}$ is the random variable we are trying to predict.

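Examples

One-step prediction of an AR(1), $|\phi| < 1$. We want $P_n X_{n+1} = P(X_{n+1} \mid X_n, \dots, X_1)$.

Method 1 (Direct method). We have

$$\Gamma = \begin{pmatrix} \gamma_X(0) & \gamma_X(1) & \cdots & \gamma_X(n-1) \\ \gamma_X(1) & \gamma_X(0) & \cdots & \gamma_X(n-2) \\ \vdots & \vdots & \ddots & \vdots \\ \gamma_X(n-1) & \gamma_X(n-2) & \cdots & \gamma_X(0) \end{pmatrix} = \frac{\sigma^2}{1 - \phi^2} \begin{pmatrix} 1 & \phi & \cdots & \phi^{n-1} \\ \phi & 1 & \cdots & \phi^{n-2} \\ \vdots & \vdots & \ddots & \vdots \\ \phi^{n-1} & \phi^{n-2} & \cdots & 1 \end{pmatrix}.$$

Also, $\gamma_j = \mathrm{Cov}(X_{n+1}, X_{n+1-j}) = \gamma_X(j) = \dfrac{\sigma^2 \phi^j}{1 - \phi^2}$, so

$$\gamma = \frac{\sigma^2}{1 - \phi^2} \big(\phi, \phi^2, \dots, \phi^n\big)^T.$$

So $\Gamma a = \gamma$ is given by

$$\begin{pmatrix} 1 & \phi & \cdots & \phi^{n-1} \\ \phi & 1 & \cdots & \phi^{n-2} \\ \vdots & \vdots & \ddots & \vdots \\ \phi^{n-1} & \phi^{n-2} & \cdots & 1 \end{pmatrix} \begin{pmatrix} a_1 \\ a_2 \\ \vdots \\ a_n \end{pmatrix} = \begin{pmatrix} \phi \\ \phi^2 \\ \vdots \\ \phi^n \end{pmatrix}.$$

Note that the first column of $\Gamma$ multiplied by $\phi$ is $\gamma$. Therefore, we can see by inspection that the solution is $a = (\phi, 0, \dots, 0)^T$. Then the best linear predictor is

$$P_n X_{n+1} = E[X_{n+1}] + a^T (W - \mu_W).$$

Here, $E[X_i] = 0$ for all $i$, and so we get $P_n X_{n+1} = \phi X_n$ as the best linear predictor of $X_{n+1}$.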
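The direct method can be reproduced numerically. The sketch below builds $\Gamma$ and $\gamma$ from the AR(1) autocovariance function $\gamma_X(h) = \sigma^2 \phi^{|h|} / (1 - \phi^2)$ and solves $\Gamma a = \gamma$. The helper name `ar1_acvf` and the values of `n`, `phi` and `sigma2` are arbitrary choices for illustration.

```python
import numpy as np

def ar1_acvf(h, phi, sigma2):
    """Autocovariance of a causal AR(1) process at lag(s) h."""
    return sigma2 * phi ** np.abs(h) / (1.0 - phi ** 2)

n, phi, sigma2 = 5, 0.6, 2.0           # hypothetical values, |phi| < 1

idx = np.arange(n)
# Gamma_ij = Cov(X_{n+1-i}, X_{n+1-j}) = gamma_X(i - j)
Gamma = ar1_acvf(idx[:, None] - idx[None, :], phi, sigma2)
# gamma_j = Cov(X_{n+1}, X_{n+1-j}) = gamma_X(j), j = 1, ..., n
gamma = ar1_acvf(np.arange(1, n + 1), phi, sigma2)

a = np.linalg.solve(Gamma, gamma)
print(np.round(a, 12))                 # first entry is phi, the rest vanish
```

Only the coefficient on $X_n$ is nonzero, in agreement with $P_n X_{n+1} = \phi X_n$.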

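Method 2 (use the properties of $P(\cdot \mid W)$). We have $X_{n+1} = \phi X_n + Z_{n+1}$, where $\{Z_t\}$ is a zero-mean $\mathrm{WN}(0, \sigma^2)$ process. Therefore,

$$P_n X_{n+1} = P_n(\phi X_n + Z_{n+1})$$

$$= \phi P_n X_n + P_n Z_{n+1} \quad \text{(by linearity of the prediction operator)}$$

$$= \phi X_n + P_n Z_{n+1} \quad \text{(by property ②: } P(W_i \mid W) = W_i\text{)}$$

$$= \phi X_n + 0 \quad \text{(by property ③: } Z_{n+1} \text{ is uncorrelated with } X_n, \dots, X_1\text{)}$$

$$= \phi X_n.$$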

Example: $h$-step prediction of an AR(1) process with $|\phi| < 1$, $h \ge 1$. We want

$$P_n X_{n+h} = P_n(\phi X_{n+h-1} + Z_{n+h}) = \phi P_n X_{n+h-1} = \phi^2 P_n X_{n+h-2} = \cdots = \phi^h P_n X_n = \phi^h X_n.$$
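The $h$-step result can be cross-checked with the direct method by taking $Y = X_{n+h}$, so that $\gamma_j = \mathrm{Cov}(X_{n+h}, X_{n+1-j}) = \gamma_X(h + j - 1)$. The sketch below also evaluates the minimum MSE $\mathrm{Var}(Y) - a^T \gamma$, which for this model works out to $\sigma^2 (1 - \phi^{2h}) / (1 - \phi^2)$. The helper `ar1_acvf` and the parameter values are hypothetical.

```python
import numpy as np

def ar1_acvf(h, phi, sigma2):
    """Autocovariance of a causal AR(1) process at lag(s) h."""
    return sigma2 * phi ** np.abs(h) / (1.0 - phi ** 2)

n, h, phi, sigma2 = 4, 3, 0.6, 2.0     # hypothetical values, |phi| < 1

idx = np.arange(n)
Gamma = ar1_acvf(idx[:, None] - idx[None, :], phi, sigma2)
# gamma_j = Cov(X_{n+h}, X_{n+1-j}) = gamma_X(h + j - 1), j = 1, ..., n
gamma = ar1_acvf(h + np.arange(n), phi, sigma2)

a = np.linalg.solve(Gamma, gamma)
min_mse = ar1_acvf(0, phi, sigma2) - a @ gamma   # Var(Y) - a^T gamma

print(np.round(a, 12), min_mse)
```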

Example: One-step prediction for an AR($p$) process. Let $\{X_t\}$ be a zero-mean, stationary process satisfying

$$X_t = \phi_1 X_{t-1} + \phi_2 X_{t-2} + \cdots + \phi_p X_{t-p} + Z_t,$$

where $\{Z_t\}$ is a zero-mean $\mathrm{WN}(0, \sigma^2)$ process. We wish to predict $X_{n+1}$ in terms of $X_n, \dots, X_1$. That is, we wish to compute (for $n \ge p$, so that $X_n, \dots, X_{n+1-p}$ are all among the predictor variables)

$$P_n X_{n+1} = P_n(\phi_1 X_n + \cdots + \phi_p X_{n+1-p} + Z_{n+1})$$

$$= \phi_1 P_n X_n + \cdots + \phi_p P_n X_{n+1-p} + P_n Z_{n+1}$$

$$= \phi_1 X_n + \cdots + \phi_p X_{n+1-p} + P_n Z_{n+1}.$$

If $\{X_t\}$ is causal (we have not discussed conditions for an AR($p$) process to be causal), then we get that $P_n Z_{n+1} = 0$ and then

$$P_n X_{n+1} = \phi_1 X_n + \cdots + \phi_p X_{n+1-p}.$$
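For a concrete check with $p = 2$, the sketch below builds the autocorrelations of a causal AR(2) from the Yule-Walker recursion $\rho(k) = \phi_1 \rho(k-1) + \phi_2 \rho(k-2)$ with $\rho(0) = 1$ and $\rho(1) = \phi_1 / (1 - \phi_2)$, then solves $\Gamma a = \gamma$ directly. Working with autocorrelations instead of autocovariances is harmless here because the common factor $\gamma_X(0)$ cancels from both sides. The coefficients $\phi_1 = 0.5$, $\phi_2 = 0.3$ are a hypothetical causal choice.

```python
import numpy as np

phi1, phi2 = 0.5, 0.3      # hypothetical causal AR(2) coefficients
n = 6                      # number of past values used, n >= p = 2

# Autocorrelations rho(0), ..., rho(n) from the Yule-Walker recursion.
rho = np.empty(n + 1)
rho[0] = 1.0
rho[1] = phi1 / (1.0 - phi2)
for k in range(2, n + 1):
    rho[k] = phi1 * rho[k - 1] + phi2 * rho[k - 2]

idx = np.arange(n)
Gamma = rho[np.abs(idx[:, None] - idx[None, :])]   # rho(i - j), up to gamma_X(0)
gamma = rho[1:n + 1]                               # rho(j), j = 1, ..., n

a = np.linalg.solve(Gamma, gamma)
print(np.round(a, 12))     # first two entries are phi1, phi2; the rest vanish
```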

Note

If $\{Y_t\}$ is a stationary process with nonzero mean, say $\mu$, then we can write $Y_t = X_t + \mu$, where $\{X_t\}$ is stationary and zero-mean. Then

$$P_n Y_{n+h} = P_n(X_{n+h} + \mu) = P_n X_{n+h} + P_n \mu = P_n X_{n+h} + \mu.$$

Here $P_n Y_{n+h} = P(Y_{n+h} \mid Y_n, \dots, Y_1)$. So above we have that

$$P(Y_{n+h} \mid Y_n, \dots, Y_1) = P(X_{n+h} \mid Y_n, \dots, Y_1) + \mu.$$

However, $P(X_{n+h} \mid Y_n, \dots, Y_1) = P(X_{n+h} \mid X_n, \dots, X_1)$, because the set of all possible linear predictors in terms of $Y_n, \dots, Y_1$ is the same as the set of all linear predictors in terms of $X_n, \dots, X_1$. Therefore,

$$P(Y_{n+h} \mid Y_n, \dots, Y_1) = P(X_{n+h} \mid X_n, \dots, X_1) + \mu.$$

That is, the best linear predictor of $Y_{n+h}$ in terms of $Y_n, \dots, Y_1$ is the best linear predictor of $X_{n+h}$ based on $X_n, \dots, X_1$ plus the mean $\mu$. Thus, it is sufficient to restrict attention to only zero-mean processes when doing prediction.
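As a small illustration of this reduction, suppose (as an assumption made only for this sketch) that the zero-mean part $X_t = Y_t - \mu$ is a causal AR(1) with known $\phi$ and that $\mu$ is known. Then $P_n X_{n+h} = \phi^h X_n$, so the forecast of $Y_{n+h}$ is $\mu + \phi^h (Y_n - \mu)$. The function name `forecast_with_mean` and the numbers are hypothetical.

```python
import numpy as np

def forecast_with_mean(y, mu, phi, h):
    """h-step forecast of Y_{n+h} when Y_t - mu is assumed to be a causal AR(1).

    Subtract the mean, forecast the zero-mean series, then add the mean back.
    """
    x_n = y[-1] - mu                  # last observed value of the zero-mean series
    return mu + phi ** h * x_n        # P_n X_{n+h} + mu

y = np.array([2.0, 4.0, 5.0])         # hypothetical observations Y_1, ..., Y_n
forecast = forecast_with_mean(y, mu=3.0, phi=0.5, h=2)
print(forecast)                       # 3.5
```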
