F Assuming a Known Direction of Effect Impacts Reported Sensitivity

In both examples in Section 6, we perform inference under a host of assumptions on the potential outcomes. As is demonstrated therein, the assumption of a known direction of effect has a particularly strong impact on the corresponding sensitivity analysis. Note that when testing the null of δ = 0 ⇔ ϕ = 1 ⇔ λ = 0 under the assumption of a

direction of effect, the only allocation of r_T, rC that satisfies the null hypothesis is the

allocation of Fisher’s sharp null: rT ij = rCij ∀i, j. This results in testing a simple,

rather than composite, null hypothesis. At Γ = 1, the necessary hypothesis test can be performed using the permutation distribution (or a normal approximation thereof) of the test statistic under Fisher’s sharp null. For Γ > 1 the potential outcomes are still fixed at those of Fisher’s sharp null, but we must consider the possible vectors of unmeasured confounders. Without the assumption of a direction of effect, there are many possible allocations of potential outcomes satisfying this null. This additional flexibility in the optimization problem results in more extreme worst-case allocations for the inference being conducted.

As a simple illustration of why this is the case, consider testing this null with two

pairs of individuals. In strata 1, suppose R11 = R12 = 1, while in strata 2 suppose

R21= R22= 0, where without loss of generality the first individual in each matched set

received the treatment. If we assume a nonnegative treatment effect, rT 12 = 1, since

rC12 = 1. Similarly, rC21 = 0 since rT 21 = 0. Finally, the constraint that the null is

true forces r_C11= 1 and rT 22 = 0. For any Γ, these strata contribute expectation and

variance 0. Without the assumption of a direction of effect, we can also satisfy the null

hypothesis by setting rC11= 1, rT 12 = 0, rC21= 0, rT 22 = 1. Not only would we then

have positive variance contribution from each of these strata at any Γ, but also setting

u1 = [1, 0] and u2 = [0, 1] results in an aggregate expected value of (Γ − 1)/(1 + Γ) ≥ 0.

These choices allow one to find a less significant deviate under no constraints on the direction of effect than is possible under a model with a known direction of effect.

G

Sensitivity Analysis for a Simple Null

While the methodology presented herein was motivated by conducting sensitivity anal- yses for composite null hypotheses with binary outcomes, we note that a simplified version can be used to conduct a sensitivity analysis for a simple null hypothesis for general types of outcome variables without invoking asymptotic separability (Gastwirth

et al., 2000). With a simple null hypothesis, qij are fixed for each individual i and each

strata j. In the notation of Section 4, S represents the number of strata with unique

sets of values for the vector q_i. With continuous outcomes S will often equal I, but

for other types of outcomes there may be repeated strata. For each s, P_s (the number

of possible allocations of potential outcomes within unique set s) equals 1 as both sets of potential outcomes are fixed under a simple null. Hence, the subscript [sp] in our

original formulation can be replaced by a single subscript s, Define µ_sa and ν_sa by

replacing [sp] with s in the notation of Section 4.1 in the manuscript, and make the

analogous substitution of x_sa for x_[sp]a. Let M_s again represent the number of times

stratum s. Define µ = [µ11, .., µS,˜nS−1] and let the analogous definitions hold for ν and x. Finally, note that the constraint that the null must be true in formulation (P1) can

be removed entirely as q_ij are defined under this assumption. A sensitivity analysis at

a given Γ > 1 can be conducted by solving the following optimization problem: minimize x (t − (µ T_x))2_{− κ(ν}T_x) _(P2) subject to ˜ ns−1 X a=1 xsa= Ms ∀s xsa∈ Z ∀s, a xsa≥ 0 ∀s, a

As described in Section 5.2, we can conduct a sensitivity analysis for a given Γ > 1 by

minimizing (P2) with κ = χ2_1,1−α. To find the actual minimal deviate, we can follow

the iterative procedure outlined in Section 5.2 until converging to a stationary κ∗.

The constraint matrix corresponding to the above optimization program is totally unimodular. As a consequence, the polyhedron of the continuous relaxation equals the integer hull (Bertsimas and Tsitsiklis, 1997). Hence, if one were solving an integer linear program, the solution of the continuous relaxation would be guaranteed to be integral. When finding the worst-case deviate we are minimizing a constrained convex quadratic function; as such, the solution need not be at the vertex. Nonetheless, strong formulations of integer quadratic programs are essential for efficiently finding optimal solutions.

G.1 Example: Dropping Out of High School and Cognitive

Achievement

As an exposition of their methodology, Gastwirth et al. (2000) consider conducting a sensitivity analysis for comparing cognitive achievement of US high-school drop-outs with that of non-dropouts; see Rosenbaum (1986) for more details on the study. They conducted inference on 12 drop-outs in the study, where each drop-out was matched to two students who did not drop out, yet were similar on the basis of all other observed covariates. Using an aligned rank test, the test statistic for these 12 matched sets was t = 296, with expectation and variance at Γ = 1 of 222 and 1271, yielding a standardized deviate of 2.07 and approximate one sided p-value of 0.019.

Table 3 of Gastwirth et al. (2000) shows the results of the asymptotically separable algorithm on this data set for Γ = 2. At this strength of unmeasured confounding, the separable algorithm yields a bounding normal deviate with a mean of 257.40 and a variance of 1177.23, resulting in an approximation to the worst-case deviate of 1.125 and a one sided p-value of 0.129. We also explicitly minimized the deviate by solving (P2). This yields a bounding random variable with a mean of 256.60 and a variance of 1228.145, yielding a worst-case deviate of 1.124 and a worst-case p-value of 0.130. Investigating further, the worst-case allocations of u for each strata were in agreement for all of the matched sets except for matched set 11. There, the asymptotically sepa-

rable algorithm chooses u₁₁ = [0, 1, 1], contributing a mean of 24.80 and a variance of

139.76. The correct value for u₁₁ for minimizing the deviate is u₁₁ = [0, 0, 1], which

This demonstrates that for I even moderately large, the asymptotically separable algorithm can produce a bounding random variable that very closely approximates the true upper bound on the p-value. That being said, given our formulation the worst-case deviate can be explicitly found. Furthermore, one need not worry about computation time: for conducting the sensitivity analysis on this problem, an optimal solution was found in 0.15 seconds.

In document Randomization Inference and Sensitivity Analysis for Composite Null Hypotheses With Binary Outcomes in Matched Observational Studies (Page 34-36)