• No results found

A Multivariate Student’s t Distribution

N/A
N/A
Protected

Academic year: 2020

Share "A Multivariate Student’s t Distribution"

Copied!
8
0
0

Loading.... (view fulltext now)

Full text

(1)

http://dx.doi.org/10.4236/ojs.2016.63040

A Multivariate Student’s

t

-Distribution

Daniel T. Cassidy

Department of Engineering Physics, McMaster University, Hamilton, ON, Canada

Received 29 March 2016; accepted 14 June 2016; published 17 June 2016

Copyright © 2016 by author and Scientific Research Publishing Inc.

This work is licensed under the Creative Commons Attribution International License (CC BY).

http://creativecommons.org/licenses/by/4.0/

Abstract

A multivariate Student’s t-distribution is derived by analogy to the derivation of a multivariate normal (Gaussian) probability density function. This multivariate Student’s t-distribution can have different shape parameters νi for the marginal probability density functions of the

multi-variate distribution. Expressions for the probability density function, for the variances, and for the covariances of the multivariate t-distribution with arbitrary shape parameters for the marginals are given.

Keywords

Multivariate Student’s t, Variance, Covariance, Arbitrary Shape Parameters

1. Introduction

An expression for a multivariate Student’s t-distribution is presented. This expression, which is different in form

than the form that is commonly used, allows the shape parameter ν for each marginal probability density

function (pdf) of the multivariate pdf to be different. The form that is typically used is [1]

(

)

(

)

( )( )

(

[ ]

[ ]

)

( )2 T 1

2

2

1 .

2 π

n n

n

x x

ν ν

ν ν

− + −

Γ +

+ Σ

Γ Σ (1)

This “typical” form attempts to generalize the univariate Student’s t-distribution and is valid when the n

marginal distributions have the same shape parameter ν . The shape of this multivariate t-distribution arises

from the observation that the pdf for

[ ] [ ]

x = y σ is given by Equation (1) when

[ ]

y is distributed as a

multivariate normal distribution with covariance matrix

[ ]

Σ and σ2 is distributed as chi-squared.

The multivariate Student’s t-distribution put forth here is derived from a Cholesky decomposition of the scale

(2)

given in Section 2 to provide background. The multivariate Student’s t-distribution and the variances and covariances for the multivariate t-distribution are given in Section 3. Section 4 is a conclusion.

2. Background Information

2.1. Cholesky Decomposition

A method to produce a multivariate pdf with known scale matrix

[ ]

Σs is presented in this section. For nor-

mally distributed variables, the covariance matrix

[ ]

Σ = Σ

[ ]

s since the scale factor for a normal distribution is

the standard deviation of the distribution. An example with n=4 is used to provide concrete examples.

Consider the transformation

[ ] [ ][ ]

y = M x where

[ ]

y and

[ ]

x are 4 1× column matrices,

[ ]

M is 4 4×

square matrix, and the elements of

[ ]

x are independent random variables. The off-diagonal elements of

[ ]

M

introduce correlations between the elements of

[ ]

y .

1,1

1 1

2,1 2,2

2 2

3,1 3,2 3,3

3 3

4,1 4,2 4,3 4,4

4 4

0 0 0

0 0

0

m

m m

m m m

m m m m

 

   

 

   

 

 =  

 

   

 

   

 

    

y x

y x

y x

y x

(2)

The scale matrix

[ ] [ ][ ]

Σ =s M M T. The covariance matrix

[ ]

Σ has elements Σ =i j, E

{ }

y yi j where

{ }

E y yi j is the expectation of y yi j and Σ = Σi j, j i, . If the

[ ]

x are normally distributed, then

[ ]

[ ] [ ][ ]

T

s M M

Σ = Σ = , where the superscript T indicates a transpose of the matrix. If

[ ]

Σs is known, then

[ ]

M is the Cholesky decomposition of the matrix

[ ]

Σs [2]. For the 4 4× example of Equation (2),

[ ]

2

1,1 1,1 2,1 1,1 3,1 1,1 4,1

2 2

1,1 2,1 2,1 2,2 2,1 3,1 2,2 3,2 2,1 4,1 2,2 4,2

2 2 2

1,1 3,1 2,1 3,1 2,2 3,2 3,1 3,2 3,3 3,1 4,1 3,2 4,2 3,3 4,3

1,1 4,1 2,1 4,1 2,2 4,2 3,1 4,1 3,2 4,2 3,3 s

m m m m m m m

m m m m m m m m m m m m

m m m m m m m m m m m m m m m

m m m m m m m m m m m

+ + +

Σ =

+ + + + +

+ + + 2 2 2 2

4,3 4,1 4,2 4,3 4,4

.

m m m m m

 

 

 

 

 

+ + +

 

 

(3)

From linear algebra, det

(

[ ]

MMT =

)

det

[ ] [ ]

M det M T=

(

det

[ ]

M

)

2

  . For

[ ]

M as defined in Equation (2),

[ ]

1,1 2,2 3,3 4,4

det M =m m m m and

[ ]

2 2 2 2

(

[ ]

)

2

1,1 2,2 3,3 4,4

det Σ =s m m m m = det M whereas

{

}

2 , E

i i σ

Σ = =

i

i i y

y y is the va-

riance of the zero-mean random variable yi and

{ }

2 , E

i j σ

Σ = =

i j

i j y y

y y is the covariance of the zero-mean

random variables yi and yj.

2.2. Multivariate Normal Probability Density Function

To create a multivariate normal pdf, start with the joint pdf fN

( )

[ ]

x for n unit normal, zero mean, independent random variables

[ ]

x :

[ ]

( )

( )

[ ] [ ]

T 2

1

1 1 1 1

exp exp

2 2

2π

n

i

n i

f x x x x

=

   

= =

   

N (4)

where

[ ]

x is an n-row column matrix:

[ ]

xT=

[

x x1, 2,,xn

]

. fN

( )

[ ]

x d dx x1 2dxn gives the probability that the random variables

[ ]

x lie in the interval

[ ] [ ] [ ] [ ]

x < xx + dx .

The requirement for zero mean random variables is not a restriction. If E

{ }

xii, then xi′ =xi−µi is a

zero mean random variable with the same shape and scale parameters as xi.

Use Equation (2) to transform the variables. The Jacobian determinant of the transformation relates the products of the infinitesimals of integration such that

(

)

(

1 2

)

2 1 2

1 2

, , ,

d d d d d d .

, , ,

n

i n n

n

x x x

x x x y y y

y y y

∂ =

 

(3)

The magnitude of the Jacobian determinant of the transformation

[ ] [ ][ ]

y = M x is (Appendix)

(

)

(

11 22

)

[ ]

[ ]

, , , 1 1

, , , det det

n

n s

x x x

y y y M

= =

Σ

 (6)

where the equality det

[ ]

Σ =s

(

det

[ ]

M

)

2 has been used. Since

[ ] [ ][ ]

Σ =s M M T,

[ ]

s 1

( )

[ ]

M T 1

[ ]

M 1

− −

Σ = , and since

[ ] [ ] [ ]

x = M −1 y , the multivariate “z-score”

[ ] [ ]

T

x x becomes

[ ] [ ]

y T

( )

M T 1

[ ] [ ] [ ]

M 1 y yT

[ ] [ ]

s 1 y

= Σ , which equals

[ ] [ ] [ ]

yT Σ −1 y since

[ ] [ ]

Σ = Σs for normally distributed variables.

The result is that the unit normal, independent, multivariate pdf, Equation (4), becomes under the trans- formation Equation (2)

[ ]

( )

( )

[ ]

[ ]

[ ] [ ]

1 T

1 1

exp 2

2π detn s

s

f y = y Σ − y

 

Σ

N (7)

where

[ ]

y is a n-row column matrix:

[ ]

y T =

[

y y1, 2,,yn

]

and

[ ] [ ]

Σ = Σs . For the 4 4× example,

[ ]

1,1

2,1

1,1 2,2 2,2

1

3,2 2,1 3,1 2,2 3,2

1,1 2,2 3,3 2,2 3,3 3,3

4,3 3,2 4,2 3,3 4,3 1

1,4

2,2 3,3 4,4 3,3 4,4 4,4

1

0 0 0

1

0 0

, 1

0

1

m m

m m m

M

m m m m m

m m m m m m

m m m m m

m

m m m m m m

 

 

 

 

 

 

= 

 

 

 

 

(8)

from which

[ ]

s 1

( )

[ ]

M T 1

[ ]

M 1

− −

Σ = can be calculated. In Equation (8),

2,1 3,2 4,3 4,2 2,1 3,3 2,2 3,1 4,3 4,1 2,2 3,3 1

1,4

1,1 2,2 3,3 4,4

.

m m m m m m m m m m m m

m

m m m m

= − − − +

(9)

The denominator in the expression for m1,41

is det

[ ]

M .

3. Multivariate Student’s t Probability Density Function

A similar approach can be used to create a multivariate Student’s t pdf. Assume truncated or effectively truncated

t-distributions, so that moments exist [3][4]. For simplicity, assume that support is

[

µ−bβ µ, +bβ

]

where b

is a positive, large number, β is the scale factor for the distribution, and µ is the location parameter for the distribution. If b is a large number, then a significant portion of the tails of the distribution are included. If

b= ∞ then all of the tails are included.

Start with the joint pdf for n independent, zero-mean (location parameters

[ ]

µ =0) Student’s t pdfs with shape parameters

[ ]

ν , and scale parameters

[ ]

β =1:

[ ] [ ]

(

)

(

(

(

)

)

)

2 ( 1 2)

(

)

1 1

1 2

; 1 ;

2 π

i

n n

i i

i i i

i i i i i

x

f x g x

ν ν

ν ν

ν

ν ν

− +

= =

Γ +  

= + =

Γ

t (10)

with −∞ ≤xi≤ +∞. ft

(

[ ] [ ]

x

)

d dx x1 2dxn gives the probability that a random draw of the column matrix

(4)

Use the transformation of Equation (2) to create a multivariate pdf

[ ] [ ]

(

)

[ ]

(

(

(

)

)

)

( 1 2)

2 1 , 1 1 1 2 1

; 1 .

det 2 π

i

i i j j n

j i

i i i i

m y f y M ν ν ν ν ν ν − + − = =   

Γ + 

= +

Γ

 

 

t (11)

The solution 1 ,1

i i j i j j

x =

=m y− of the transformation Equation (2) was used. The elements of the inverse

matrix

[ ]

M −1, 1 , i j

m− , are given in terms of the mi j, by Equation (8) for the n=4 example. Note that the

shape parameters νi of the constituent distributions need not be the same in the multivariate t-distribution

given by ft

(

[ ] [ ]

y

)

.

[ ] [ ]

(

;

)

d d1 2 d n

ft y ν y yy gives the probability that a random draw of the column matrix

[ ]

y from the

multivariate Student’s t-distribution with shape parameters

[ ]

ν lies in the interval

[ ] [ ] [ ] [ ]

y < yy + dy . From the definition of the exponential function ex =limn→∞

(

1+x n

)

n where e=2.718281828 is Euler’s number, then ( )

(

)

1 2 2 2

lim 1 t exp t 2

ν ν ν − + →∞   + = −  

  (12)

and

[ ] [ ]

(

)

[ ]

[ ]

( )

( )

[ ]

[ ]

[ ] [ ]

[ ]

( )

2 1 , 1 1 2 1 , 1 1 1 T 1 1

lim ; exp 0.5

det 2π 1 exp 0.5 det 2π 1 1 exp 2 2π det . n i

i j j j i

n i i j j n

i j

s n

s

f y m y

M m y M y y f y ν ν − →∞ = = − = = −     = −        = −        

= − Σ

  Σ =

∑∑

t N (13)

In the limit as

[

ν → ∞

]

, the multivariate Student’s t-distribution ft

(

[ ] [ ]

y

)

, Equation (11), becomes a multivariate normal distribution.

3.1. Some Σi j, for the

n

=

4

Example

In this subsection some examples for the variances and covariances of a multivariate Student’s t-distribution

using the n=4 example of Equation (2) are given.

The variance of the random variable y3 is

{ }

2 2

(

[ ] [ ] [ ]

1

)

3 3 ; d d d d4 3 2 1

E y =

∫∫∫∫

y ft My ν y y y y (14)

with the limits of the integrations equal to µibβi and µi+bβi, i=1, 2, 3, 4.

Perform the integrations as listed. The integral over dy4 is unity since only x4 depends on y4 (c.f. Equation

(2)) and ft

(

[ ] [ ]

x

)

factors into a product g1

( ) ( ) ( ) ( )

x g1 2 x2 g3 x3 g4 x4 —see Equation (10). Write

1 1 1 1

4 4,4 4 4,3 3 4,2 2 4,1 1

x =my +m y− +my +m y− (15)

1 4,4 4 4

my µ

= − (16)

where the mi j,1

are the elements of the inverse of matrix

[ ]

M and are as given by

[ ]

M −1, Equation (8), and

4

µ is a constant as far as the integral over y4 is concerned.

Repeat the procedure for the integrals for dy3, dy2, and dy1. These integrals are not equal to unity owing to

the presence of the 2

3

(5)

The variance of the random variable yi for the multivariate Student's t-distribution with support

[

−∞ ∞,

]

and with ν νi = for all i is given by

{ }

2 2 2

,

1 2

i j j j

E σ m ν

ν =

= =

i

i y

y (17)

The expression for σ2

i

y is valid only for ν>2. The expression would be valid for ν ≥1 if the region of

support was

[

µibβ µi, i+bβi

]

rather than

[

−∞ ∞,

]

where βi is a scale factor and b< ∞ [3]-[5]. Note

that the scale factors for the multivariate t-distribution are βi= mi i, .

Truncation or effective truncation of the pdf keeps the moments finite [3]-[5]. For example, the second central

moment for a ν =1 Student’s t-distribution with scale factor β and support

[

µ−bβ µ, +bβ

]

is

(

)

2

(

( )

( )

)

2

arctan

1, , ,

arctan

b b

b

b

µ ν = β =β × − (18)

which is finite provided that b< ∞.

In the interest of brevity, only variances and covariances that were calculated for support of

[

−∞ ∞,

]

will be

discussed. The requirement that ν >i 2 will be understood to be waived if the pdf is truncated or effectively

truncated. It is also to be understood that the variances and covariances as calculated for support of

[

−∞ ∞,

]

provide upper limits for variances and covariances calculated for truncation or effective truncation of the pdf. If the νi are not equal, then for the n=4 example of Equation (2)

{ }

2 1 2 2 2 3 2

3 3,1 3,2 3,3

1 2 3

.

2 2 2

E ν m ν m ν m

ν ν ν

= × + × + ×

− − −

y (19)

The covariance E

{

y y2 3

}

for the ν νi = for all i is given by

{

2 3

}

(

2,1 3,1 2,2 3,2

)

E .

2

m m m m ν

ν

= +

y y (20)

If the νi are not equal, then the covariance E

{

y y2 3

}

{

}

1 2

2 3 2,1 3,1 2,2 3,2

1 2

E .

2 m m 2 m m

ν ν

ν ν

= × + ×

− −

y y (21)

The expression for E

{

y y1 3

}

, which is valid for the νi not equal, is

{

}

1

1 3 1,1 3,1 1

E .

2 m m

ν ν

= ×

y y (22)

The expressions for E

{

y y3 3

}

, E

{

y y2 3

}

, and E

{

y y1 3

}

show a simple pattern for the relationship between

the covariance matrix Σ, the scale matrix

[ ]

Σs Equation (3), and the matrix

[ ]

M Equation (2).

3.2. General Expressions for Σi j,

Given a matrix

[ ]

M that is an n×n square matrix with elements mi j, , an expression for the variance (assuming support

[

−∞ ∞,

]

, ν >i 2 for all i, and in) for the multivariate Student’s t-distribution ft

(

[ ] [ ]

y

)

is

{ }

2 2

, 1

E .

2 i

j j j

j j

m ν ν

=

= ×

i

y (23)

A general expression for the covariance (assuming support

[

−∞ ∞,

]

, ν >i 2 for all i, and in j, ≤n) for the multivariate Student’s t-distribution ft

(

[ ] [ ]

y

)

is

{ }

min ,( )

, , 1

E .

2

i j k

i k j k k k

m m

ν ν =

= ×

i j

y y (24)

If support is

[

µ−bβ µ, +bβ

]

, then the general expressions need to be multiplied by functions that depend

on b and ν . Truncation or effective truncation keeps the moments finite and defined for all ν ≥1 [3]-[5]. The

general expressions for the covariance, Equation (24), yields, when i= j, the general expression for the

variance, Equation (23). The general expression for the variance, Equation (23), is given to emphasize the 2

, j j

(6)

nature of the variance.

Unlike normally distributed random variables, the correlation matrix

[ ]

Σ for random variables that are

distributed as Student’s t is not equal to

[ ][ ]

M M T. For normally distributed variables, the scale parameter β

equals the standard deviation σ. For Student’s t distributed variables, the standard deviation σ does not

equal the scale parameter β. For a Student’s t distribution with shape parameter ν , scale parameter β, and

support

[

−∞ ∞,

]

, 2 2

(

)

2

σ =β ν ν× − . If the region of support for the Student’s t distribution is truncated to

[

µ−bβ µ, +bβ

]

then the variance 2 2

(

)

2

σ <β ν ν× − for all ν ≥2 and is finite for all ν ≥1 [3]-[5].

Given a matrix of the variances and the covariances,

[ ]

Σ , and a column matrix of the shape parameters

[ ]

ν

associated with each variable, the scale matrix

[ ] [ ][ ]

Σ =s M M T would in principle be determined sequentially, starting with m1,1 and m1,2. The shape parameters

[ ]

ν would be obtained from the marginal distributions or from other knowledge.

4. Conclusion

A multivariate Student’s t-distribution is derived by analogy to the derivation for a multivariate normal (or

Gaussian) pdf. The variances and covariances for the multivariate t-distribution are given. It is noteworthy that

the shape parameters

[ ]

ν of the constituent Student’s t-distributions of the multivariate t-distribution, Equation (11), need not be the same.

Acknowledgements

This work was funded by the Natural Science and Engineering Research Council (NSERC) Canada.

References

[1] Kotz, S. and Nadarajah, S. (2004) Multivariate t-Distributions and Their Applications. Cambridge University Press, Cambridge.

[2] Press, W.H., Teukolsky, S.A., Vetterling, W.T. and Flannery, B.P. (1992) Numerical Recipes in FORTRAN: The Art of Scientific Computing. 2nd Edition, Cambridge University Press, Cambridge, 89.

[3] Cassidy, D.T. (2011) Describing n-Day Returns with Student’s t-Distributions. Physica A, 390, 2794-2802.

http://dx.doi.org/10.1016/j.physa.2011.03.019

[4] Cassidy, D.T. (2012) Effective Truncation of a Student’s t-Distribution by Truncation of the Chi Distribution in a Mixing Integral. Open Journal of Statistics, 2, 519-525. http://dx.doi.org/10.4236/ojs.2012.25067

[5] Cassidy, D.T. (2016) Student’s t Increments. Open Journal of Statistics, 6, 156-171.

(7)

Appendix: The Jacobian

The Jacobian determinant is used in physics, mathematics, and statistics. Many of these uses can be traced to the

Jacobian determinate as a measure of the volume of an infinitesimially small, n-dimensional parallelepiped.

1. Volume of a Parallelepiped

The volume of an n-dimensional parallelepiped is given by the absolute value of the determinant of the com-

ponents of the edge vectors that form the parallelepiped.

The area of a parallelogram with edge vectors a and b is a b× .

The volume of a parallelepiped with edge vectors a=

(

a a a1, 2, 3

)

, b=

(

b b b1, 2, 3

)

, and c=

(

c c c1, 2, 3

)

is

given by the determinant

(

)

11 22 33

1 2 3

.

c c c a a a b b b

⋅ × =

c a b (25)

2. Inversion Exists

Assume that there are n functions xi = f q qi

(

1, 2,,qn

)

. The necessary and sufficient condition that the func-

tions can be inverted to find 1

(

)

1, 2, ,

i i n

q = fx xx is that the Jacobian determinant is nonzero, i.e.,

(

)

(

11 22

)

, , ,

0

, , ,

n

n

x x x q q q

≠ ∂

 (26)

where

(

)

(

)

1 1 1

1 2

1 2 2 2 2

1 2 1 2

. . .

, , ,

. . . .

, , ,

. . . .

n n

n n

x x x

q q q

x x x x x x

q q q q q q

∂ ∂ ∂

∂ ∂ ∂

∂ ∂ ∂

∂ ∂ ∂ ∂

 (27)

To simplify the notation, assume that n=3 so that xi= f q q qi

(

1, 2, 3

)

, i=1,, 3. The total differential is

1 2 3

1 2 3

d i d i d i d .

i

f f f

x q q q

q q q

∂ ∂ ∂

= + +

∂ ∂ ∂ (28)

These equations can be put in matrix form

1 1 1 1 2 3

1 1

2 2 2

2 2

1 2 3

3 3

3 3 3 1 2 3

d d

d d .

d d

f f f

q q q

x q

f f f

x q

q q q

x q

f f f

q q q

∂ ∂ ∂ 

 

   

  ∂ ∂ ∂  

=

   

   

   

 

 

(29)

These three equations can be solved for the dqi if the determinant of the 3 3× matrix is non-zero. This is a

standard result from linear algebra. The determinant of the 3 3× matrix is called the Jacobian determinant of

the transformation.

3. Change of Variables

(8)

(

)

(

1 2 3

)

1 2 3 1 2 3 1 2 3

, ,

d d d d d d d .

, ,

x x x

V x x x q q q

q q q

= =

∫∫∫

∫∫∫

∫∫∫

(30)

The absolute value sign is required since the determinant could be negative (i.e., the volume could decrease).

The Jacobian determinant for the inverse transformation (to obtain

[ ]

x as functions of

[ ]

y ) given by Eq-

uation (8) is

1

1,1 1

2,1

2 2

1,1 2,2 2,2

1 2

3,2 2,1 3,1 2,2 3,2

3 3 3

1 2 3 1,1 2,2 3,3 2,2 3,3 3,3

4 4 4 4 1 4,3 3,2 4,2 3,3 4,3

1,4

1 2 3 4 2,2 3,3 4,4 3,3 4,4 4,4

1

0 0 0

0 0 0

1

0 0

0 0

, 1

0 0

1

x

m y

m x x

m m m

y y

m m m m m

x x x

y y y m m m m m m

x x x x m m m m m

m

y y y y m m m m m m

− ∂

∂ ∂

∂ ∂

=

∂ ∂ ∂

∂ ∂ ∂

∂ ∂ ∂ ∂ −

∂ ∂ ∂ ∂

(31)

which equals

[ ]

1,1 2,2 3,3 4,4

1 1 1 1 1

. det

References

Related documents

He has been employed in mire wetland ecology for over 38 years and led more than ten major project of the Chinese Academy of Sciences and the National Fund for Natural

For more information visit the College Policies, Procedures and Guidelines webpage then click on the Academic Administration side tab and search for the document entitled

Figure 6.29: Pearlite interlamellar spacing versus salt bath heat treatment temperatures for the TW steel samples... 158 Figure 6.30: Electrical resistivity values versus

Thus every school must check the school furniture as standard anthropometric measurement which will lead for children a comfortable sitting and increase

Spray Vinyl &amp; Rubber Care on to the Perfect Polishing Cloth and treat the door and boot rubbers to prevent the rubbers sticking to the door frames and to give them a

the first time (3Q: one vessel at Greenland) Overview of present HFISK vessels We have included additional vessel values on the company equal to around NOK 1.3b HFISK

Alternatively, if the distal mGlu2 protomer is necessary for G i/o protein coupling, then the effect of the mGlu2/3 agonist LY379268 on Ca 2+ release stimulated by 5-HT 2A