Implementation through quadratic programming

CHAPTER 3 : Extended Sensitivity Analysis for Heterogeneous Unmeasured Con-

3.4 Implementation through quadratic programming

The test statistics described in §3.2.3 can be represented as the sum of I independent random variables,ZTq=PI

i=1Ti, whereTi = (qi1+qi2)/2 + (Zi1−Zi2)(qi1−qi2)/2. This

suggests that, under mild regularity conditions, a central limit theorem would be applicable to the distribution of ZTq for any value of π in (3.11) for almost every sample path F_I. One sufficient condition proposed in the special central limit theorem of H´ajek et al. (1999,

§6.1.2) is that, almost surely,

i=1(qi1−qi2)2

max

1≤i≤I(qi1−qi2)

2 → ∞,

which requires that no one term (qi1 −qi2)2 dominates the sum as the number of pairs

increases. (An aside: the central limit theorem in H´ajek et al. (1999, §6.1.2) as originally stated applies to sums of the form PI

i=1aiXi whereXi areiid random variables; however,

the proof can readily be extended to settings where Iσ2 _≤PI

i=1var(Xi)≤Icσ2 forc > 1

our extended sensitivity analysis). Under a normal approximation, the problem of finding the worst-casep-value is equivalent to finding the worst-case deviate.

Recall that a sensitivity analysis is typically conducted only if the null hypothesis is rejected under the assumption of no unmeasured confounding (Γ = ¯Γ = 1), and then proceeds by it- eratively increasing the sensitivity parameters until the test fails to reject. Having proceeded to sensitivity analysis only after rejecting the null under no unmeasured confounding, even with one-sided alternatives we can safely consider rejection or failure to reject for sequen- tially larger values of Γ and ¯Γ based on the minimal squared deviate, an objective function which is preferred for computational reasons alluded to below. Recalling that under (3.11) we condition on F_I and hence treat the vectorq as fixed, minimizing the squared deviate can be expressed as an optimization problem over the unknown probabilities π as

min

π∈Uβ(Γ,Γ)¯

(t−_E_π[ZTq| F_I])2 varπ(ZTq| FI)

, (3.13)

wheret is the observed value of the statistict(Z,F), and the expectation and variance are for the test statistic t(Z,F) under the randomization distribution (3.11) for a given vector

π. Under a normal approximation fort(Z,F), the squared deviate follows aχ2₁ distribution. By the argument of the previous section, we then reject the null at levelαif (3.13) is greater than or equal toG−1(1−2(α−β)) for one-sided alternatives orG−1(1−(α−β)) for two-sided alternatives, whereG−1(p) is thep quantile of aχ2₁ distribution.

The expectation and variance of the contribution of Ti can be expressed as a function of

the unknown vectorπ as

Eπ[Ti | FI] =qTi πi (3.14)

varπ(Ti| FI) =πi(1−πi)(qi1−qi2)2 (3.15)

= (q2_i)Tπi−(qTi πi)2

respectively. Suppose without loss of generality that we are considering a one-sided, greater than alternative and that we rejected the null at (Γ,Γ) = (1¯ ,1), which implies that t ≥

(2I)−1PI i=1

j=1qij (i.e. that the observed value of texceeded its null expectation). Sort

each vectorqi in descending order such thatqi1 ≥qi2. Then, varπ(Ti| FI) = varπ∗(T_i | F_I)

from (3.15), while from (3.14) Eπ[Ti | FI] ≤ Eπ∗[T_i | F_I] = qT

i π∗i and (qi1 +qi2)/2 ≤

Eπ∗[T_i | F_I]. Hence, any feasible solution π0 to (3.13) has an objective value that is no

smaller than that of (π∗)0, as the variance will be the same while, recalling the iterative nature of a sensitivity analysis, the distance (t−_E₍_π∗₎0[ZTq0 | F_I])2 will be smaller than

(t−_E_π0[ZTq0 | F_I])2. Maintaining this ordering of the vectors q_i, we can express our

optimization problem as a function of the maximal probabilitiesπ∗_i.

For any candidate π∗, we reject under a normal approximation with a one-sided, greater than alternative at levelα−β if the corresponding squared deviate exceeds its critical value,

G−1(1−2(α−β)) i.e. ifζ(π∗, α−β) = (t−Eπ∗[ZTq| F_I])2−G−1(1−2(α−β))var_π∗(ZTq| FI)≥0. We writeζ(π∗, α−β) explicitly as a function ofπ∗ as

ζ(π∗, α−β) = (t−qTπ∗)2−G−1(1−2(α−β))

I X

i=1

(q2_i)Tπ_i∗−(qT_i π_i∗)2

If we find that ζ(π∗, α−β) ≥0 for all feasible π∗ ∈ Uβ(Γ,Γ), we can reject the null while¯

asymptotically controlling the size of the extended sensitivity analysis with parameters (Γ,Γ) at¯ α. The function ζ(π∗, α−β) is convex and quadratic in π∗. Meanwhile, we explicitly write the constraints determining membership inU_β(Γ,Γ) as¯

1/2≤π∗_i ≤Γ/(1 + Γ), 1≤i≤I (3.16) I−1 I X i=1 π_i∗ ≤µπ∗+I−1/2Φ−1(1−β){(Γ/(1 + Γ)−µ_π∗) (µ_π∗−1/2)}1/2 (3.17) µπ∗ ≤Γ¯/(1 + ¯Γ). (3.18)

probabilitesπ_i∗. Hence, for fixedµπ∗, the problem min_π∗ζ(π∗, α−β) subject to (3.16) and

(3.17) can be written as a quadratic program. With a one-sided alternative, an asymptotically level-α extended sensitivity analysis with parameters (¯Γ,Γ) simply requires checking whether the solution to that quadratic program is greater than or equal to zero, rejecting the null if so and failing to reject otherwise. For a two-sided alternative, simply replace

ζ(π∗, α−β) with ζ(π∗,(α−β)/2) to control the level of the procedure at α. See Rosen- baum (1992) and Fogarty and Small (2016) for similar formulations of sensitivity analyses as convex programs.

A minor complication is that for small values of I or for small values for β, the right-hand side of (3.17) need not be monotone increasing in µπ∗ if 2¯Γ/(1 + ¯Γ)≥Γ/(1 + Γ) + 1/2, as

decreasingµπ∗ may lead to an increase in the component dependent on the variance bound

which exceeds the corresponding decrease in the additive term µπ∗. To remedy this, one

can simply find the value for µπ∗ over the range [(Γ/(1 + Γ) + 1/2)/2,Γ¯/(1 + ¯Γ)] which

maximizes the right-hand side of (3.17) through a bisection algorithm, and then proceed with the quadratic program using this single value. If 2¯Γ/(1+ ¯Γ)<Γ/(1+Γ)+1/2, the right- hand side of (3.17) is, subject to (3.18), maximized atµπ∗ = ¯Γ/(1 + ¯Γ), so one can proceed

by replacing µπ∗ with ¯Γ/(1 + ¯Γ) and solving the required quadratic program. Importantly,

the method only requires solving a single quadratic program. Quadratic programs can be solved by many free and commercially available solvers; we provide code implementing our method using the R package for the solver Gurobi, which is free for academic use, at the author’s website http://www.raidenhasegawa.com. We also provide options to replace the constraint (3.17), justified by the Central Limit Theorem, with bounds described in Appendix 3.8.1 which are valid for anyIthrough distribution-free concentration inequalities.

In document Essays In Causal Inference: Addressing Bias In Observational And Randomized Studies Through Analysis And Design (Page 65-69)