Split-Plot-Designs
Paavo Sattler and Markus Pauly
Ulm University Institute of Statistics [email protected]
EMS 2017 Helsinki
Paavo Sattler (Ulm) High-Dimensional Split-Plot-Designs July 2017
Introduction and Motivation
Motivating Example Repeated Measures
Sleep-Laboratory Trial, Jordan et al. (2004)
Figure 1: Prostaglandin-D-synthase (β-trace) levels in 10 healthy male subjects in a sleep-lab trial under different sleep conditions (normal sleep / sleep deprivation / recovery sleep / REM sleep deprivation) at the time points 24h, 4h, 8h, 12h, 16h, and 20h for 4 consecutive nights and days
Figure 2: Prostaglandin-D-synthase (β-trace) levels in 10 healthy female subjects in a sleep-lab trial under different sleep conditions (normal sleep / sleep deprivation / recovery sleep / REM sleep deprivation) at the time points 24h, 4h, 8h, 12h, 16h, and 20h for 4 consecutive nights and days
Observations
$a = 2$ groups
$n_1 = n_2 = 10$ sample sizes
$d = 24$ time points (dimension) for each observation
⇒ High-dimensional setting
⇒ Classical multivariate procedures (such as Hotelling's $T^2$) are not applicable
Factorial structure on repeated measures:
- Factor I: 4 interventions (normal sleep / sleep deprivation / recovery / REM sleep deprivation)
- Factor T: 6 measurements per day (at 24h, 4h, 8h, 12h, 16h, and 20h)
Some questions of interest (null hypotheses):
- $H_0(G)$: No gender effect
- $H_0(T)$: No time effect
- $H_0(GT)$: No interaction effect between gender and time
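Why classical procedures break down in this setting can be seen numerically. The following is a minimal sketch in Python with NumPy; it uses simulated standard normal data as a stand-in (an assumption for illustration, not the trial measurements). With $n_i = 10$ subjects and $d = 24$ time points the sample covariance matrix has rank at most $n_i - 1 = 9$, so it cannot be inverted as Hotelling's $T^2$ would require.

```python
import numpy as np

rng = np.random.default_rng(0)

n, d = 10, 24                      # subjects per group, time points (as in the sleep trial)
X = rng.standard_normal((n, d))    # simulated stand-in data, NOT the trial measurements

S = np.cov(X, rowvar=False)        # d x d sample covariance matrix
rank_S = np.linalg.matrix_rank(S)  # numerical rank of S

print("shape of S:", S.shape)
print("rank of S:", rank_S, "(cannot exceed n - 1 =", n - 1, ")")
```

Because the rank is smaller than $d$, the matrix is singular whatever the data look like.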
Split-Plot-Designs
Split-Plot-Design Model
- $X_{ik} \sim N_d(\mu_i, \Sigma_i)$, $\mu_i \in \mathbb{R}^d$, $\Sigma_i > 0$, independent random vectors
- $i = 1, \dots, a$ independent groups (e.g., $a = 2$) with
- $k = 1, \dots, n_i$ independent subjects/units in group $i$
- $N = \sum_{i=1}^{a} n_i$ total sample size
- Note: Heteroscedasticity is allowed, i.e. $\Sigma_i \neq \Sigma_j$ is permitted for $i \neq j$.
General linear Hypotheses
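- Write $\mu = (\mu_1^\top, \dots, \mu_a^\top)^\top \in \mathbb{R}^{ad}$
  - $H_0(T): T\mu = 0$, where $T = T_W \otimes T_S$ is a proper projection contrast matrix (footnote 1)
- Some special cases
  - $T_A = P_a \otimes \frac{1}{d} J_d$: No group effect
  - $T_{AT} = P_a \otimes P_d$: No interaction effect time × group
Here $J_d := \mathbf{1}_{d \times d}$ is the $d \times d$ matrix of ones and $P_d := I_d - \frac{1}{d} J_d$.
Footnote 1: $T = H^\top (H H^\top)^{-} H$ is unique for a contrast matrix $H$, with $T^\top = T$ and $T^2 = T$.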
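The two special cases can be built directly with Kronecker products. The sketch below (Python with NumPy) constructs $T_A$ and $T_{AT}$ for $a = 2$ and $d = 24$ and checks the projection properties $T^\top = T$ and $T^2 = T$. The helper names `centering` and `is_projection` and the two mean vectors are hypothetical, chosen only for illustration.

```python
import numpy as np

def centering(m):
    """P_m = I_m - J_m / m, the centering matrix."""
    return np.eye(m) - np.ones((m, m)) / m

def is_projection(T):
    """Check symmetry (T' = T) and idempotence (T^2 = T)."""
    return bool(np.allclose(T, T.T) and np.allclose(T @ T, T))

a, d = 2, 24
J_d = np.ones((d, d))
T_A = np.kron(centering(a), J_d / d)        # no group effect
T_AT = np.kron(centering(a), centering(d))  # no group x time interaction

# Hypothetical mean vectors mu = (mu_1', mu_2')'
profile = np.sin(np.linspace(0.0, 2.0 * np.pi, d))
mu_equal = np.concatenate([profile, profile])        # identical groups
mu_shift = np.concatenate([profile, profile + 1.0])  # parallel profiles, shifted level

print("projections:", is_projection(T_A), is_projection(T_AT))
print("identical groups:", np.allclose(T_A @ mu_equal, 0), np.allclose(T_AT @ mu_equal, 0))
print("shifted groups:  ", np.allclose(T_A @ mu_shift, 0), np.allclose(T_AT @ mu_shift, 0))
```

For the shifted profiles the group hypothesis is violated ($T_A \mu \neq 0$) while the interaction hypothesis still holds ($T_{AT} \mu = 0$), because the two profiles are parallel.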
High-Dimensional Split-Plot-Designs
Test Statistics and Asymptotics
Test statistic based on
\[ Q_N = N \cdot \overline{X}^\top T \overline{X}, \]
where $\overline{X} = (\overline{X}_1^\top, \dots, \overline{X}_a^\top)^\top$ denotes the vector of pooled group mean vectors.
Asymptotic Frameworks
(i) $a$ fixed and $\min(d, n_1, \dots, n_a) \to \infty$,
(ii) $d$ fixed and $\min(a, n_1, \dots, n_a) \to \infty$,
(iii) or even $\min(a, d, n_1, \dots, n_a) \to \infty$.
Surprising result
\[ \widetilde{W}_N = \frac{Q_N - E_{H_0}(Q_N)}{\sqrt{\operatorname{Var}_{H_0}(Q_N)}} \]
behaves similarly under all frameworks if $H_0$ is fulfilled.
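Under $H_0$ it holds
\[ Q_N \overset{d}{=} \sum_{\ell=1}^{ad} \lambda_\ell C_\ell, \qquad E(Q_N) = \operatorname{tr}(T V_N), \qquad \operatorname{Var}(Q_N) = 2 \operatorname{tr}\big((T V_N)^2\big), \]
with $C_\ell \overset{i.i.d.}{\sim} \chi^2_1$, where $\lambda_\ell$ are the eigenvalues of $T V_N T$ and $V_N := \bigoplus_{i=1}^{a} \frac{N}{n_i} \Sigma_i$.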
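The two moment formulas can be checked by simulation. The following sketch (Python with NumPy) draws the stacked group means directly from their normal distribution under $H_0$ with $\mu = 0$ and compares the empirical mean and variance of $Q_N$ with $\operatorname{tr}(TV_N)$ and $2\operatorname{tr}((TV_N)^2)$. The group sizes, the dimension, and the randomly generated covariance matrices are hypothetical choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)

a, d = 2, 4                 # hypothetical small design
n = np.array([5, 7])        # hypothetical group sizes
N = n.sum()

def centering(m):
    return np.eye(m) - np.ones((m, m)) / m

T = np.kron(centering(a), centering(d))   # interaction hypothesis T_AT

# Hypothetical positive definite covariance matrices Sigma_1, Sigma_2
Sigmas = []
for _ in range(a):
    A = rng.standard_normal((d, d))
    Sigmas.append(A @ A.T + np.eye(d))

# V_N = direct sum of (N / n_i) * Sigma_i
V_N = np.zeros((a * d, a * d))
for i, Sigma in enumerate(Sigmas):
    V_N[i * d:(i + 1) * d, i * d:(i + 1) * d] = (N / n[i]) * Sigma

TV = T @ V_N
mean_theory = np.trace(TV)
var_theory = 2.0 * np.trace(TV @ TV)

# Under H0 with mu = 0 the stacked group means satisfy Xbar ~ N(0, V_N / N)
reps = 20000
Xbar = rng.multivariate_normal(np.zeros(a * d), V_N / N, size=reps)
Q = N * np.einsum("ri,ij,rj->r", Xbar, T, Xbar)   # Q_N for every replication

print("E(Q_N):   theory", mean_theory, " simulated", Q.mean())
print("Var(Q_N): theory", var_theory, " simulated", Q.var(ddof=1))
```

The simulated values should agree with the theoretical ones up to Monte Carlo error.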
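Theorem
Under each of the asymptotic frameworks (i)-(iii) it holds under $H_0(T): T\mu = 0$
(a) $\widetilde{W}_N \xrightarrow{d} U \sim N(0, 1)$ if $\beta_1 \to 0$,
(b) $\widetilde{W}_N \xrightarrow{d} C \sim \frac{\chi^2_1 - 1}{\sqrt{2}}$ if $\beta_1 \to 1$,
where $\beta_1 = \lambda_{\max} \big/ \sqrt{\sum_{\ell=1}^{ad} \lambda_\ell^2}$ is the largest standardized eigenvalue of $T V_N T$, with $V_N := \bigoplus_{i=1}^{a} \frac{N}{n_i} \Sigma_i$.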
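The quantity $\beta_1$ that separates the two limits is easy to compute once $T$ and $V_N$ are given. The sketch below (Python with NumPy) evaluates it for the two special hypothesis matrices under the illustrative assumption $n_1 = n_2$ and $\Sigma_1 = \Sigma_2 = I_d$, so that $V_N = 2 I_{ad}$. The function name `beta_1` is hypothetical.

```python
import numpy as np

def centering(m):
    return np.eye(m) - np.ones((m, m)) / m

def beta_1(T, V_N):
    """Largest eigenvalue of T V_N T divided by the root of the sum of squared eigenvalues."""
    lam = np.linalg.eigvalsh(T @ V_N @ T)   # T V_N T is symmetric
    return lam.max() / np.sqrt(np.sum(lam ** 2))

a, d = 2, 24
V_N = 2.0 * np.eye(a * d)   # illustrative assumption: n_1 = n_2, Sigma_i = I_d, hence N / n_i = 2

T_A = np.kron(centering(a), np.ones((d, d)) / d)   # group effect
T_AT = np.kron(centering(a), centering(d))         # group x time interaction

b_group = beta_1(T_A, V_N)
b_inter = beta_1(T_AT, V_N)
print("beta_1 for T_A: ", b_group)
print("beta_1 for T_AT:", b_inter)
```

In this toy setting $T_A$ has a single nonzero eigenvalue, so $\beta_1 = 1$, whereas $T_{AT}$ has 23 equal nonzero eigenvalues and $\beta_1 = 1/\sqrt{23} \approx 0.21$. The same data can therefore sit near either limit depending on the hypothesis.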
Challenges
- $a$, $d$ and all $n_i$ have influence on the eigenvalues and $\beta_1$
- Ratio-consistent estimation of $E_{H_0}(Q_N)$, $\operatorname{Var}_{H_0}(Q_N)$ and other traces of certain powers is complicated.
- In particular, the asymptotic frameworks (i)-(iii) have to be taken into account.
- Large numbers are required
Better Approximation
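Different Approximation
Idea of Pauly et al. (2015):
Approximation with a sequence of distributions $K_f$ such that both cases are covered, i.e.
\[ K_f = \frac{\chi^2_f - f}{\sqrt{2f}} \xrightarrow{d} \begin{cases} N(0, 1) & \text{if } f \to \infty \\ \frac{\chi^2_1 - 1}{\sqrt{2}} & \text{if } f \to 1 \end{cases} \]
\[ f_P = \frac{\operatorname{tr}^3\big((T V_N)^2\big)}{\operatorname{tr}^2\big((T V_N)^3\big)} \quad \text{(even fits skewness)} \]
Advantage: $\tau_P = 1/f_P \to 0 \iff \beta_1 \to 0$ and $\tau_P \to 1 \iff \beta_1 \to 1$.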
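The remark that $f_P$ "even fits skewness" can be verified from the cumulants of a weighted sum of $\chi^2_1$ variables. The sketch below (Python with NumPy) writes $f_P$ in terms of the eigenvalues $\lambda_\ell$ of $T V_N T$, using $\operatorname{tr}((TV_N)^k) = \sum_\ell \lambda_\ell^k$, and compares the skewness of $Q_N$ with that of $K_{f_P}$. The eigenvalues and the function names are hypothetical.

```python
import numpy as np

def f_P(lam):
    """f_P = tr^3((T V_N)^2) / tr^2((T V_N)^3), expressed via the eigenvalues of T V_N T."""
    lam = np.asarray(lam, dtype=float)
    return np.sum(lam ** 2) ** 3 / np.sum(lam ** 3) ** 2

def skewness_Q(lam):
    """Skewness of sum_l lam_l * C_l with C_l i.i.d. chi^2_1
    (second cumulant 2 * sum lam^2, third cumulant 8 * sum lam^3)."""
    lam = np.asarray(lam, dtype=float)
    return 8.0 * np.sum(lam ** 3) / (2.0 * np.sum(lam ** 2)) ** 1.5

def skewness_K(f):
    """Skewness of chi^2_f, which standardisation to K_f leaves unchanged: sqrt(8 / f)."""
    return np.sqrt(8.0 / f)

lam = np.array([3.0, 1.5, 1.0, 0.2])   # hypothetical eigenvalues
f = f_P(lam)
print("f_P =", f, " tau_P =", 1.0 / f)
print("skewness of Q_N:", skewness_Q(lam), " skewness of K_f:", skewness_K(f))
```

Two boundary cases are also visible from the formula: a single nonzero eigenvalue gives $f_P = 1$, and $m$ equal eigenvalues give $f_P = m$.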
Proposed Estimators
- Estimators of symmetrized U-statistics type
⇒ Leads to ratio-consistent estimators for all frameworks (i)-(iii) such that
\[ W_N = \frac{Q_N - \widehat{E}_{H_0}(Q_N)}{\sqrt{\widehat{\operatorname{Var}}_{H_0}(Q_N)}} \]
fulfills $W_N - \widetilde{W}_N = o_P(1)$ under (a) and (b).
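Proposed Estimators for $\tau_P$
Let
\[ Z_{(\ell_{1,1}, \ell_{1,2}, \dots, \ell_{a,2})} := \Big( \sqrt{\tfrac{N}{n_1}} \big( X_{1,\ell_{1,1}} - X_{1,\ell_{1,2}} \big)^\top, \dots, \sqrt{\tfrac{N}{n_a}} \big( X_{a,\ell_{a,1}} - X_{a,\ell_{a,2}} \big)^\top \Big)^\top . \]
For $\ell_{i,1} \neq \ell_{i,2}$, $i = 1, \dots, a$, it holds that
\[ T Z_{(\ell_{1,1}, \ell_{1,2}, \dots, \ell_{a,2})} \sim N_{ad}\big( 0_{ad},\, 2\, T V_N T \big) . \]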
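Further, let
\begin{align*}
\Lambda_1(\ell_{1,1}, \dots, \ell_{6,a}) &= Z_{(\ell_{1,1}, \ell_{2,1}, \dots, \ell_{1,a}, \ell_{2,a})}^\top \, T \, Z_{(\ell_{3,1}, \ell_{4,1}, \dots, \ell_{3,a}, \ell_{4,a})}, \\
\Lambda_2(\ell_{1,1}, \dots, \ell_{6,a}) &= Z_{(\ell_{3,1}, \ell_{4,1}, \dots, \ell_{3,a}, \ell_{4,a})}^\top \, T \, Z_{(\ell_{5,1}, \ell_{6,1}, \dots, \ell_{5,a}, \ell_{6,a})}, \\
\Lambda_3(\ell_{1,1}, \dots, \ell_{6,a}) &= Z_{(\ell_{5,1}, \ell_{6,1}, \dots, \ell_{5,a}, \ell_{6,a})}^\top \, T \, Z_{(\ell_{1,1}, \ell_{2,1}, \dots, \ell_{1,a}, \ell_{2,a})}.
\end{align*}
Then
\[ E\Big( \prod_{m=1}^{3} \Lambda_m(\ell_{1,1}, \dots, \ell_{6,a}) \Big) = 8 \operatorname{tr}\big( (T V_N)^3 \big) . \]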
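The expectation of the product follows in a few lines from independence. As shorthand introduced only for this derivation, write $Z_1, Z_2, Z_3$ for the three vectors built from the index pairs $(1,2)$, $(3,4)$ and $(5,6)$. If the six indices within each group are pairwise distinct, the three vectors are independent with $E(Z_j) = 0$ and $E(Z_j Z_j^\top) = 2 V_N$, so that

```latex
\begin{align*}
E\Big( \prod_{m=1}^{3} \Lambda_m \Big)
  &= E\big( Z_1^\top T Z_2 \cdot Z_2^\top T Z_3 \cdot Z_3^\top T Z_1 \big) \\
  &= E \operatorname{tr}\big( T Z_2 Z_2^\top \, T Z_3 Z_3^\top \, T Z_1 Z_1^\top \big) \\
  &= \operatorname{tr}\big( T (2 V_N) \, T (2 V_N) \, T (2 V_N) \big)
   = 8 \operatorname{tr}\big( (T V_N)^3 \big) .
\end{align*}
```

The second line uses the cyclic property of the trace and the third uses independence of $Z_1, Z_2, Z_3$. No assumption on $\mu$ is needed, since the group means cancel in the differences.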
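\[ C_1 = \sum_{\substack{\ell_{1,1}, \dots, \ell_{6,1} = 1 \\ \ell_{1,1} \neq \dots \neq \ell_{6,1}}}^{n_1} \cdots \sum_{\substack{\ell_{1,a}, \dots, \ell_{6,a} = 1 \\ \ell_{1,a} \neq \dots \neq \ell_{6,a}}}^{n_a} \frac{\Lambda_1(\ell_{1,1}, \dots, \ell_{6,a}) \cdot \Lambda_2(\ell_{1,1}, \dots, \ell_{6,a}) \cdot \Lambda_3(\ell_{1,1}, \dots, \ell_{6,a})}{8 \cdot \prod_{i=1}^{a} \frac{n_i!}{(n_i - 6)!}} \]
- $E(C_1) = \operatorname{tr}\big( (T V_N)^3 \big)$
- Ratio-consistent in frameworks (i)-(iii) if there exists $p > 1$ with $\min(n_1, \dots, n_a) = O(a^p)$.
- In a similar but even more complicated way, a ratio-consistent estimator $C_2$ can be found in frameworks (i)-(iii).
- $\prod_{i=1}^{a} \frac{n_i!}{(n_i - 6)!}$ summations are required
⇒ Subsampling versions $C_1^\star, C_2^\star$ of these estimators should be used.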
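Different Approximation
Together with other estimators, this results in a ratio-consistent estimator $\widehat{\tau}_P$ for $\tau_P$, and $\widehat{f}_P = 1/\widehat{\tau}_P$.
Theorem
In frameworks (i)-(iii) it holds under $H_0(T): T\mu = 0$:
\[ \frac{\chi^2_{\widehat{f}_P} - \widehat{f}_P}{\sqrt{2 \widehat{f}_P}} \xrightarrow{d} \begin{cases} N(0, 1) & \text{if } \tau_P \to 0 \\ \frac{\chi^2_1 - 1}{\sqrt{2}} & \text{if } \tau_P \to 1 \end{cases} \]
- Faster convergence
- No decision between the two asymptotic distributions is needed
- Moreover: the behaviour of $\beta_1$ can be checked via $\widehat{\tau}_P$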
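How the critical value moves between the two limits can be illustrated by simulating $K_f$ for different degrees of freedom. The sketch below (Python with NumPy) approximates the 95% quantile by Monte Carlo; in an application the value of `f` would be the estimate $\widehat{f}_P$. The function name `k_f_quantile` and the number of draws are hypothetical, and an exact quantile function such as `scipy.stats.chi2.ppf` could be used instead.

```python
import numpy as np

def k_f_quantile(f, level=0.95, draws=200_000, seed=0):
    """Monte Carlo quantile of K_f = (chi^2_f - f) / sqrt(2 f); f may be non-integer."""
    rng = np.random.default_rng(seed)
    k = (rng.chisquare(f, size=draws) - f) / np.sqrt(2.0 * f)
    return float(np.quantile(k, level))

q_small = k_f_quantile(1.0)        # f = 1, i.e. tau_P = 1
q_large = k_f_quantile(10_000.0)   # large f, i.e. tau_P close to 0
print("95% quantile for f = 1:    ", q_small)
print("95% quantile for f = 10000:", q_large)
```

For $f = 1$ the quantile is close to that of $(\chi^2_1 - 1)/\sqrt{2}$, roughly 2.01, and for large $f$ it approaches the standard normal quantile of about 1.645. A single reference distribution therefore covers both cases of the theorem.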
Simulations for $\beta_1 \to 0$ in case of $\alpha = 5\%$
$n_1 = 20$ and $n_2 = 30$, 10000 simulation runs
Summary
Simulations for $\beta_1 \to 1$ in case of $\alpha = 5\%$
Conclusion
- Asymptotic test for frameworks (i)-(iii)
- Estimators for frameworks (i)-(iii)
- Multivariate normal distribution is the main requirement
- Extension of existing approaches that require $\beta_1 \to 0$, like Chen and Qin (2010)
- Even if $\beta_1 \to \rho \in (0, 1)$, we obtained some results
- In the sleep trial $H_0(T)$ can be rejected
References
Chen, S. X. and Qin, Y.-L. (2010). A two-sample test for high-dimensional data with applications to gene-set testing. Ann. Statist., 38(2):808-835.
Pauly, M., Ellenberger, D., and Brunner, E. (2015). Analysis of high-dimensional one group repeated measures designs. Statistics, 49:1243-1261.
Finally
Thanks for your attention!