
Inference For High-Dimensional Repeated Measures and Split-Plot-Designs


(1)


Paavo Sattler and Markus Pauly

Ulm University, Institute of Statistics

EMS 2017 Helsinki

Paavo Sattler (Ulm) High-Dimensional Split-Plot-Designs July 2017

(2)

Introduction and Motivation

Motivating Example: Repeated Measures

Sleep-Laboratory Trial, Jordan et al. (2004)

• Figure 1: Prostaglandin-D-synthase (β-trace) levels in 10 healthy male subjects in a sleep-lab trial under different sleep conditions (normal sleep / sleep deprivation / recovery sleep / REM sleep deprivation) at the time points 24h, 4h, 8h, 12h, 16h, and 20h for 4 consecutive nights and days

(3)

Motivating Example: Repeated Measures

Sleep-Laboratory Trial, Jordan et al. (2004)

• Figure 2: Prostaglandin-D-synthase (β-trace) levels in 10 healthy female subjects in a sleep-lab trial under different sleep conditions (normal sleep / sleep deprivation / recovery sleep / REM sleep deprivation) at the time points 24h, 4h, 8h, 12h, 16h, and 20h for 4 consecutive nights and days


(4)

Introduction and Motivation

Observations

$a = 2$ groups

$n_1 = n_2 = 10$ sample sizes

$d = 24$ time points (dimension) for each observation

⇒ High-dimensional setting

⇒ Classical multivariate procedures (such as Hotelling's $T^2$) not applicable

Factorial structure on repeated measures:

• Factor I: 4 interventions (normal sleep / sleep deprivation / recovery / REM sleep deprivation)

• Factor T: 6 measurements per day (at 24h, 4h, 8h, 12h, 16h, and 20h)

Some questions of interest (null hypotheses)

• $H_0(G)$: No gender effect

• $H_0(T)$: No time effect

• $H_0(GT)$: No interaction effect between gender and time

(5)

Split-Plot-Designs

Split-Plot-Design Model

• $X_{ik} \sim N_d(\mu_i, \Sigma_i)$, $\mu_i \in \mathbb{R}^d$, $\Sigma_i > 0$, independent random vectors

• $i = 1, \dots, a$ independent groups (e.g., $a = 2$) with

• $k = 1, \dots, n_i$ independent subjects/units in group $i$

• $N = \sum_{i=1}^{a} n_i$ total sample size

• Note: Heteroscedasticity allowed: $\Sigma_i \neq \Sigma_j$ for all $i \neq j$.

General linear Hypotheses

• Write $\mu = (\mu_1^\top, \dots, \mu_a^\top)^\top \in \mathbb{R}^{ad}$

  ◦ $H_0(T): T\mu = 0$, $T = T_W \otimes T_S$: proper projection contrast matrix¹

• Some special cases

  ◦ $T_A = P_a \otimes \frac{1}{d} J_d$ - No group effect

  ◦ $T_{AT} = P_a \otimes P_d$ - No interaction effect time×group

$J_d := \mathbf{1}_{d \times d}$, $P_d := I_d - 1/d \cdot J_d$

¹ $T = H^\top (H H^\top)^{-} H$ is unique for a contrast matrix $H$, with $T^\top = T$ and $T^2 = T$.
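The special cases above can be built directly from Kronecker products. The following is a minimal sketch, assuming NumPy and the example values $a = 2$, $d = 24$; the helper names `centering` and `averaging` are illustrative and do not come from the slides.

```python
import numpy as np

def centering(m):
    """P_m = I_m - (1/m) J_m, the centering matrix."""
    return np.eye(m) - np.ones((m, m)) / m

def averaging(m):
    """(1/m) J_m, the averaging matrix."""
    return np.ones((m, m)) / m

a, d = 2, 24  # two groups, 24 time points (values from the sleep-lab example)

T_A = np.kron(centering(a), averaging(d))    # hypothesis: no group effect
T_AT = np.kron(centering(a), centering(d))   # hypothesis: no group x time interaction

for name, T in {"T_A": T_A, "T_AT": T_AT}.items():
    # A projection matrix is symmetric and idempotent.
    assert np.allclose(T, T.T)
    assert np.allclose(T @ T, T)
    # For a projection, the rank equals the trace.
    print(name, T.shape, "rank =", int(round(np.trace(T))))
```

Because both matrices are projections, $T\mu = 0$ holds exactly when $\mu$ lies in the null space of the underlying contrast, which is what the hypotheses require.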


(6)

High-Dimensional Split-Plot-Designs

Test Statistics and Asymptotics

Test Statistic based on

$$Q_N = N \cdot \bar{X}^\top T \bar{X},$$

where $\bar{X} = (\bar{X}_1^\top, \dots, \bar{X}_a^\top)^\top$ denotes the vector of pooled group mean vectors

Asymptotic Frameworks

(i) $a$ fixed and $\min(d, n_1, \dots, n_a) \to \infty$,

(ii) $d$ fixed and $\min(a, n_1, \dots, n_a) \to \infty$,

(iii) or even $\min(a, d, n_1, \dots, n_a) \to \infty$.

Surprising result

$$\widetilde{W}_N = \frac{Q_N - E_{H_0}(Q_N)}{\sqrt{\operatorname{Var}_{H_0}(Q_N)}}$$

behaves similarly under all frameworks, if $H_0$ is fulfilled.
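As an illustration of the statistic $Q_N$, here is a minimal sketch assuming NumPy and simulated standard normal data in place of the sleep-lab measurements; the function name `q_stat` and the data layout (one $n_i \times d$ array per group) are assumptions made for this example.

```python
import numpy as np

def centering(m):
    """P_m = I_m - (1/m) J_m."""
    return np.eye(m) - np.ones((m, m)) / m

def q_stat(groups, T):
    """Q_N = N * xbar^T T xbar, with xbar the stacked group mean vectors."""
    N = sum(len(g) for g in groups)
    xbar = np.concatenate([g.mean(axis=0) for g in groups])
    return N * xbar @ T @ xbar

rng = np.random.default_rng(0)
a, d = 2, 24
n = [10, 10]

# Hypothetical data: one (n_i x d) array per group.
groups = [rng.normal(size=(n_i, d)) for n_i in n]

T_AT = np.kron(centering(a), centering(d))  # interaction hypothesis
Q = q_stat(groups, T_AT)
print("Q_N =", Q)
```

Since $T$ is a projection, $Q_N$ is non-negative, and for $T_{AT}$ it does not change when the same vector is added to every observation.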

(7)

Test Statistics and Asymptotics

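Under $H_0$ it holds

$$Q_N \overset{d}{=} \sum_{\ell=1}^{ad} \lambda_\ell C_\ell, \qquad E(Q_N) = \operatorname{tr}(T V_N), \qquad \operatorname{Var}(Q_N) = 2 \operatorname{tr}\big((T V_N)^2\big)$$

with $C_\ell \overset{i.i.d.}{\sim} \chi^2_1$ and $\lambda_\ell$ are the eigenvalues of $T V_N T$, where $V_N := \bigoplus_{i=1}^{a} \frac{N}{n_i} \Sigma_i$.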
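The moment formulas can be checked numerically from the eigenvalues: the mean of $\sum_\ell \lambda_\ell C_\ell$ is $\sum_\ell \lambda_\ell$ and its variance is $2\sum_\ell \lambda_\ell^2$. The sketch below assumes NumPy; the covariance matrices (AR(1) and compound symmetry) and the sizes are hypothetical choices, not values from the talk.

```python
import numpy as np

def centering(m):
    """P_m = I_m - (1/m) J_m."""
    return np.eye(m) - np.ones((m, m)) / m

def direct_sum(mats):
    """Block-diagonal matrix with the given square blocks."""
    size = sum(m.shape[0] for m in mats)
    out = np.zeros((size, size))
    pos = 0
    for m in mats:
        k = m.shape[0]
        out[pos:pos + k, pos:pos + k] = m
        pos += k
    return out

a, d = 2, 6
n = np.array([10, 15])
N = n.sum()

# Hypothetical group covariances (heteroscedastic on purpose).
idx = np.arange(d)
Sigma_1 = 0.6 ** np.abs(idx[:, None] - idx[None, :])   # AR(1)
Sigma_2 = 0.5 * np.eye(d) + 0.5 * np.ones((d, d))      # compound symmetry

T = np.kron(centering(a), centering(d))
V_N = direct_sum([N / n[0] * Sigma_1, N / n[1] * Sigma_2])

M = T @ V_N @ T
lam = np.linalg.eigvalsh((M + M.T) / 2)  # symmetrize against rounding error

TV = T @ V_N
print("mean:", lam.sum(), "vs tr(T V_N):", np.trace(TV))
print("var: ", 2 * (lam ** 2).sum(), "vs 2 tr((T V_N)^2):", 2 * np.trace(TV @ TV))
```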


(8)

High-Dimensional Split-Plot-Designs

Test Statistics and Asymptotics

Theorem

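Under each of the asymptotic frameworks (i)-(iii) it holds under $H_0(T): T\mu = 0$

(a) $\widetilde{W}_N \xrightarrow{d} U \sim N(0, 1)$ if $\beta_1 \to 0$,

(b) $\widetilde{W}_N \xrightarrow{d} C \sim \frac{\chi^2_1 - 1}{\sqrt{2}}$ if $\beta_1 \to 1$,

where $\beta_1 = \lambda_{\max} \Big/ \sqrt{\sum_{\ell=1}^{ad} \lambda_\ell^2}$ is the largest standardized EV of $T V_N T$, where $V_N := \bigoplus_{i=1}^{a} \frac{N}{n_i} \Sigma_i$.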
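To get a feeling for the two regimes, $\beta_1$ can be computed for simple settings. The sketch below assumes NumPy, $a = 2$ equally sized groups and $\Sigma_i = I_d$ (so that $V_N = a \cdot I_{ad}$); these choices and the function names are illustrative assumptions. In this setting the group-effect matrix $T_A$ has rank one, so $\beta_1 = 1$, while for $T_{AT}$ one obtains $\beta_1 = 1/\sqrt{d-1}$, which tends to 0 as $d$ grows.

```python
import numpy as np

def centering(m):
    """P_m = I_m - (1/m) J_m."""
    return np.eye(m) - np.ones((m, m)) / m

def beta_1(T, V):
    """Largest eigenvalue of T V T divided by the root of the sum of squared eigenvalues."""
    M = T @ V @ T
    lam = np.linalg.eigvalsh((M + M.T) / 2)
    return lam.max() / np.sqrt((lam ** 2).sum())

def betas(d, a=2):
    """beta_1 for T_A and T_AT, assuming Sigma_i = I_d and equal group sizes."""
    V = a * np.eye(a * d)  # N / n_i = a when all n_i are equal
    T_A = np.kron(centering(a), np.ones((d, d)) / d)
    T_AT = np.kron(centering(a), centering(d))
    return beta_1(T_A, V), beta_1(T_AT, V)

for d in (4, 24, 200):
    b_group, b_inter = betas(d)
    print(f"d = {d:3d}: beta_1(T_A) = {b_group:.4f}, beta_1(T_AT) = {b_inter:.4f}")
```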

(9)

Challenges

• $a$, $d$ and all $n_i$ have influence on the EV and $\beta_1$

• Ratio-consistent estimation of $E_{H_0}(Q_N)$, $\operatorname{Var}_{H_0}(Q_N)$ and other traces of certain powers is complicated.

• In particular, the asymptotic frameworks (i)-(iii) have to be taken into account.

• Large numbers are required


(10)

Better Approximation

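Different Approximation

Idea of Pauly et al. (2015):

Approximation with sequence of distributions $K_f$ such that both cases are covered, i.e.

$$K_f = \frac{\chi^2_f - f}{\sqrt{2f}} \xrightarrow{d} \begin{cases} N(0, 1) & \text{if } f \to \infty \\ \dfrac{\chi^2_1 - 1}{\sqrt{2}} & \text{if } f \to 1 \end{cases}$$

$$f_P = \operatorname{tr}^3\big((T V_N)^2\big) \Big/ \operatorname{tr}^2\big((T V_N)^3\big)$$

(even fits skewness)

Advantage: $\tau_P = 1/f_P \to 0 \iff \beta_1 \to 0$ and $\tau_P \to 1 \iff \beta_1 \to 1$.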
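The remark that $f_P$ fits the skewness can be made explicit by matching third standardized moments. The short derivation below is a sketch that relies on the representation $Q_N \overset{d}{=} \sum_\ell \lambda_\ell C_\ell$ under $H_0$ and on the standard cumulants of the $\chi^2_1$ distribution (second cumulant 2, third cumulant 8); the notation $\kappa_3$ and $\operatorname{skew}$ is introduced here and is not used in the slides.

```latex
\begin{align*}
\operatorname{Var}(Q_N) &= 2 \sum_{\ell} \lambda_\ell^2 = 2 \operatorname{tr}\big((T V_N)^2\big),
&
\kappa_3(Q_N) &= 8 \sum_{\ell} \lambda_\ell^3 = 8 \operatorname{tr}\big((T V_N)^3\big), \\
\operatorname{skew}(Q_N) &= \frac{8 \operatorname{tr}\big((T V_N)^3\big)}{\big(2 \operatorname{tr}\big((T V_N)^2\big)\big)^{3/2}},
&
\operatorname{skew}(K_f) &= \sqrt{8 / f}, \\
\operatorname{skew}(Q_N)^2 = \frac{8}{f}
&\iff
f = \frac{\operatorname{tr}^3\big((T V_N)^2\big)}{\operatorname{tr}^2\big((T V_N)^3\big)} = f_P .
\end{align*}
```

Since $K_f$ is already standardized to mean 0 and variance 1, choosing $f = f_P$ matches the first three moments of the standardized statistic.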


(12)

Better Approximation

Proposed Estimators

• Estimators of symmetrized U-statistics-type

⇒ Leads to ratio-consistent estimators for all frameworks (i)-(iii) such that

$$W_N = \frac{Q_N - \widehat{E}_{H_0}(Q_N)}{\sqrt{\widehat{\operatorname{Var}}_{H_0}(Q_N)}}$$

fulfills $W_N - \widetilde{W}_N = o_p(1)$ under (a) and (b).

(13)

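Proposed Estimators for $\tau_P$

Let

$$Z_{(\ell_{1,1},\ell_{1,2},\dots,\ell_{a,2})} := \Big( \sqrt{\tfrac{N}{n_1}} \big(X_{1,\ell_{1,1}} - X_{1,\ell_{1,2}}\big)^\top, \dots, \sqrt{\tfrac{N}{n_a}} \big(X_{a,\ell_{a,1}} - X_{a,\ell_{a,2}}\big)^\top \Big)^\top$$

$$T Z_{(\ell_{1,1},\ell_{1,2},\dots,\ell_{a,2})} \sim N_{ad}\big(0_{ad},\, 2\, T V_N T\big)$$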


(14)

Better Approximation

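Proposed Estimators for $\tau_P$

and

$$\Lambda_1(\ell_{1,1}, \dots, \ell_{6,a}) = Z_{(\ell_{1,1},\ell_{2,1},\dots,\ell_{1,a},\ell_{2,a})}^\top \, T \, Z_{(\ell_{3,1},\ell_{4,1},\dots,\ell_{3,a},\ell_{4,a})},$$

$$\Lambda_2(\ell_{1,1}, \dots, \ell_{6,a}) = Z_{(\ell_{3,1},\ell_{4,1},\dots,\ell_{3,a},\ell_{4,a})}^\top \, T \, Z_{(\ell_{5,1},\ell_{6,1},\dots,\ell_{5,a},\ell_{6,a})},$$

$$\Lambda_3(\ell_{1,1}, \dots, \ell_{6,a}) = Z_{(\ell_{5,1},\ell_{6,1},\dots,\ell_{5,a},\ell_{6,a})}^\top \, T \, Z_{(\ell_{1,1},\ell_{2,1},\dots,\ell_{1,a},\ell_{2,a})},$$

$$E\left( \prod_{m=1}^{3} \Lambda_m(\ell_{1,1}, \dots, \ell_{6,a}) \right) = 8 \operatorname{tr}\big((T V_N)^3\big)$$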
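The expectation can be verified in a few lines. In the sketch below, $Z_1, Z_2, Z_3$ are shorthand (introduced here, not in the slides) for the three $Z$ vectors built from the index pairs $(1,2)$, $(3,4)$ and $(5,6)$; they are independent when all six indices within each group are distinct, have mean zero, and satisfy $E(Z_m Z_m^\top) = 2 V_N$.

```latex
\begin{align*}
\prod_{m=1}^{3} \Lambda_m
  &= Z_1^\top T Z_2 \cdot Z_2^\top T Z_3 \cdot Z_3^\top T Z_1
   = \operatorname{tr}\big( T \, Z_2 Z_2^\top \, T \, Z_3 Z_3^\top \, T \, Z_1 Z_1^\top \big), \\
E\Big( \prod_{m=1}^{3} \Lambda_m \Big)
  &= \operatorname{tr}\big( T \, E(Z_2 Z_2^\top) \, T \, E(Z_3 Z_3^\top) \, T \, E(Z_1 Z_1^\top) \big)
   = \operatorname{tr}\big( T \cdot 2 V_N \cdot T \cdot 2 V_N \cdot T \cdot 2 V_N \big)
   = 8 \operatorname{tr}\big( (T V_N)^3 \big).
\end{align*}
```

The second line uses independence of $Z_1, Z_2, Z_3$ to take the expectation factor by factor. Because each $Z_m$ is a within-group difference, the group means cancel and the identity does not depend on $\mu$.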

(15)

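Proposed Estimators for $\tau_P$

$$C_1 = \frac{ \displaystyle \sum_{\substack{\ell_{1,1},\dots,\ell_{6,1}=1 \\ \ell_{1,1} \neq \dots \neq \ell_{6,1}}}^{n_1} \cdots \sum_{\substack{\ell_{1,a},\dots,\ell_{6,a}=1 \\ \ell_{1,a} \neq \dots \neq \ell_{6,a}}}^{n_a} \Lambda_1(\ell_{1,1},\dots,\ell_{6,a}) \cdot \Lambda_2(\ell_{1,1},\dots,\ell_{6,a}) \cdot \Lambda_3(\ell_{1,1},\dots,\ell_{6,a}) }{ \displaystyle 8 \cdot \prod_{i=1}^{a} \frac{n_i!}{(n_i - 6)!} }$$

• $E(C_1) = \operatorname{tr}\big((T V_N)^3\big)$

• Ratio-consistent in framework (i)-(iii) if $p > 1$ exists with $\min(n_1, \dots, n_a) = O(a^p)$.

• In a similar but even more complicated way, we can find a ratio-consistent estimator $C_2$ in framework (i)-(iii).

• $\prod_{i=1}^{a} \frac{n_i!}{(n_i - 6)!}$ summations

⇒ Subsampling versions $C_1^\star$, $C_2^\star$ of these estimators should be used.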
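The slides do not spell out the subsampling scheme, so the following is only a sketch of one plausible reading, assuming NumPy: instead of summing over all index combinations, draw `B` random tuples of six distinct subjects per group and average $\Lambda_1 \Lambda_2 \Lambda_3 / 8$. The function names, the parameter `B` and the uniform sampling are assumptions of this example, not the authors' definition of $C_1^\star$.

```python
import numpy as np

def centering(m):
    """P_m = I_m - (1/m) J_m."""
    return np.eye(m) - np.ones((m, m)) / m

def z_vector(groups, pairs, N):
    """Stack sqrt(N / n_i) * (X_{i,l1} - X_{i,l2}) over all groups i."""
    parts = [np.sqrt(N / len(g)) * (g[l1] - g[l2])
             for g, (l1, l2) in zip(groups, pairs)]
    return np.concatenate(parts)

def c1_subsample(groups, T, B=500, rng=None):
    """Monte Carlo average of Lambda_1 * Lambda_2 * Lambda_3 / 8 over B random index tuples.

    Each group needs at least six subjects.
    """
    rng = np.random.default_rng() if rng is None else rng
    N = sum(len(g) for g in groups)
    total = 0.0
    for _ in range(B):
        # Six distinct subject indices per group, in random order.
        idx = [rng.choice(len(g), size=6, replace=False) for g in groups]
        z12 = z_vector(groups, [(i[0], i[1]) for i in idx], N)
        z34 = z_vector(groups, [(i[2], i[3]) for i in idx], N)
        z56 = z_vector(groups, [(i[4], i[5]) for i in idx], N)
        total += (z12 @ T @ z34) * (z34 @ T @ z56) * (z56 @ T @ z12)
    return total / (8 * B)

rng = np.random.default_rng(42)
a, d = 2, 5
groups = [rng.normal(size=(20, d)), rng.normal(size=(30, d))]  # hypothetical data
T_AT = np.kron(centering(a), centering(d))
print("subsampled estimate of tr((T V_N)^3):", c1_subsample(groups, T_AT, B=2000, rng=rng))
```

Each draw is uniform over the ordered tuples appearing in the full sum, so conditional on the data the average has expectation $C_1$; the price is additional Monte Carlo variability, which shrinks as `B` grows.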


(16)

Better Approximation

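Different Approximation

Together with other estimators, this results in ratio-consistent estimators $\widehat{\tau}_P$ for $\tau_P$ and $\widehat{f}_P = 1/\widehat{\tau}_P$.

Theorem

In framework (i)-(iii) it holds under $H_0(T): T\mu = 0$:

$$\frac{\chi^2_{\widehat{f}_P} - \widehat{f}_P}{\sqrt{2 \widehat{f}_P}} \xrightarrow{d} \begin{cases} N(0, 1) & \text{if } \tau_P \to 0 \\ \dfrac{\chi^2_1 - 1}{\sqrt{2}} & \text{if } \tau_P \to 1 \end{cases}$$

Faster convergence

No decision between both asymptotic distributions

Moreover: Behaviour of $\beta_1$ checkable via $\widehat{\tau}_P$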
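In practice this means a single critical value from $K_{\widehat{f}_P}$ can be used, whichever limit applies. The sketch below assumes NumPy and approximates the quantile of $K_f$ by Monte Carlo; the function names, the number of draws and the one-sided rejection rule (reject for large $W_N$) are assumptions of this example rather than details given in the slides.

```python
import numpy as np

def k_f_quantile(f, level=0.95, draws=200_000, rng=None):
    """Monte Carlo quantile of K_f = (chi2_f - f) / sqrt(2 f)."""
    rng = np.random.default_rng() if rng is None else rng
    k = (rng.chisquare(f, size=draws) - f) / np.sqrt(2.0 * f)
    return np.quantile(k, level)

def reject(w_n, f_hat, alpha=0.05, rng=None):
    """Assumed decision rule: reject H_0 if W_N exceeds the (1 - alpha) quantile of K_{f_hat}."""
    return w_n > k_f_quantile(f_hat, 1.0 - alpha, rng=rng)

rng = np.random.default_rng(1)
q_small = k_f_quantile(1.0, rng=rng)   # close to (3.8415 - 1) / sqrt(2), roughly 2.01
q_large = k_f_quantile(1e4, rng=rng)   # close to the N(0, 1) quantile, roughly 1.645
print(q_small, q_large)
```

The two quantiles bracket the critical values of the limits (a) and (b), and intermediate values of $\widehat{f}_P$ interpolate between them.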

(17)

Simulations for $\beta_1 \to 0$ in case of $\alpha = 5\%$

$n_1 = 20$ and $n_2 = 30$, runs = 10000


(18)

Summary

Simulations for $\beta_1 \to 1$ in case of $\alpha = 5\%$

(19)

Conclusion

Asymptotic test for framework (i)-(iii)

Estimators for framework (i)-(iii)

Multivariate normal distribution is the main requirement

Extension of existing approaches that require $\beta_1 \to 0$, like Chen and Qin (2010)

Even if $\beta_1 \to \rho \in (0, 1)$ we obtained some results

In the sleep trial $H_0(T)$ can be rejected

(20)

Summary

References

Chen, S. X. and Qin, Y.-L. (2010). A two-sample test for high-dimensional data with applications to gene-set testing. Ann. Statist., 38(2):808-835.

Pauly, M., Ellenberger, D., and Brunner, E. (2015). Analysis of high-dimensional one group repeated measures designs. Statistics, 49:1243-1261.

(21)

Finally

Thanks for your attention!
