In both examples in Section 6, we perform inference under a host of assumptions on the potential outcomes. As is demonstrated therein, the assumption of a known direction of effect has a particularly strong impact on the corresponding sensitivity analysis. Note that when testing the null of δ = 0 ⇔ ϕ = 1 ⇔ λ = 0 under the assumption of a
direction of effect, the only allocation of rT, rC that satisfies the null hypothesis is the
allocation of Fisher’s sharp null: rT ij = rCij ∀i, j. This results in testing a simple,
rather than composite, null hypothesis. At Γ = 1, the necessary hypothesis test can be performed using the permutation distribution (or a normal approximation thereof) of the test statistic under Fisher’s sharp null. For Γ > 1 the potential outcomes are still fixed at those of Fisher’s sharp null, but we must consider the possible vectors of unmeasured confounders. Without the assumption of a direction of effect, there are many possible allocations of potential outcomes satisfying this null. This additional flexibility in the optimization problem results in more extreme worst-case allocations for the inference being conducted.
As a simple illustration of why this is the case, consider testing this null with two
pairs of individuals. In strata 1, suppose R11 = R12 = 1, while in strata 2 suppose
R21= R22= 0, where without loss of generality the first individual in each matched set
received the treatment. If we assume a nonnegative treatment effect, rT 12 = 1, since
rC12 = 1. Similarly, rC21 = 0 since rT 21 = 0. Finally, the constraint that the null is
true forces rC11= 1 and rT 22 = 0. For any Γ, these strata contribute expectation and
variance 0. Without the assumption of a direction of effect, we can also satisfy the null
hypothesis by setting rC11= 1, rT 12 = 0, rC21= 0, rT 22 = 1. Not only would we then
have positive variance contribution from each of these strata at any Γ, but also setting
u1 = [1, 0] and u2 = [0, 1] results in an aggregate expected value of (Γ − 1)/(1 + Γ) ≥ 0.
These choices allow one to find a less significant deviate under no constraints on the direction of effect than is possible under a model with a known direction of effect.
G
Sensitivity Analysis for a Simple Null
While the methodology presented herein was motivated by conducting sensitivity anal- yses for composite null hypotheses with binary outcomes, we note that a simplified version can be used to conduct a sensitivity analysis for a simple null hypothesis for general types of outcome variables without invoking asymptotic separability (Gastwirth
et al., 2000). With a simple null hypothesis, qij are fixed for each individual i and each
strata j. In the notation of Section 4, S represents the number of strata with unique
sets of values for the vector qi. With continuous outcomes S will often equal I, but
for other types of outcomes there may be repeated strata. For each s, Ps (the number
of possible allocations of potential outcomes within unique set s) equals 1 as both sets of potential outcomes are fixed under a simple null. Hence, the subscript [sp] in our
original formulation can be replaced by a single subscript s, Define µsa and νsa by
replacing [sp] with s in the notation of Section 4.1 in the manuscript, and make the
analogous substitution of xsa for x[sp]a. Let Ms again represent the number of times
stratum s. Define µ = [µ11, .., µS,˜nS−1] and let the analogous definitions hold for ν and x. Finally, note that the constraint that the null must be true in formulation (P1) can
be removed entirely as qij are defined under this assumption. A sensitivity analysis at
a given Γ > 1 can be conducted by solving the following optimization problem: minimize x (t − (µ Tx))2− κ(νTx) (P2) subject to ˜ ns−1 X a=1 xsa= Ms ∀s xsa∈ Z ∀s, a xsa≥ 0 ∀s, a
As described in Section 5.2, we can conduct a sensitivity analysis for a given Γ > 1 by
minimizing (P2) with κ = χ21,1−α. To find the actual minimal deviate, we can follow
the iterative procedure outlined in Section 5.2 until converging to a stationary κ∗.
The constraint matrix corresponding to the above optimization program is totally unimodular. As a consequence, the polyhedron of the continuous relaxation equals the integer hull (Bertsimas and Tsitsiklis, 1997). Hence, if one were solving an integer linear program, the solution of the continuous relaxation would be guaranteed to be integral. When finding the worst-case deviate we are minimizing a constrained convex quadratic function; as such, the solution need not be at the vertex. Nonetheless, strong formulations of integer quadratic programs are essential for efficiently finding optimal solutions.
G.1
Example: Dropping Out of High School and Cognitive
Achievement
As an exposition of their methodology, Gastwirth et al. (2000) consider conducting a sensitivity analysis for comparing cognitive achievement of US high-school drop-outs with that of non-dropouts; see Rosenbaum (1986) for more details on the study. They conducted inference on 12 drop-outs in the study, where each drop-out was matched to two students who did not drop out, yet were similar on the basis of all other observed covariates. Using an aligned rank test, the test statistic for these 12 matched sets was t = 296, with expectation and variance at Γ = 1 of 222 and 1271, yielding a standardized deviate of 2.07 and approximate one sided p-value of 0.019.
Table 3 of Gastwirth et al. (2000) shows the results of the asymptotically separable algorithm on this data set for Γ = 2. At this strength of unmeasured confounding, the separable algorithm yields a bounding normal deviate with a mean of 257.40 and a variance of 1177.23, resulting in an approximation to the worst-case deviate of 1.125 and a one sided p-value of 0.129. We also explicitly minimized the deviate by solving (P2). This yields a bounding random variable with a mean of 256.60 and a variance of 1228.145, yielding a worst-case deviate of 1.124 and a worst-case p-value of 0.130. Investigating further, the worst-case allocations of u for each strata were in agreement for all of the matched sets except for matched set 11. There, the asymptotically sepa-
rable algorithm chooses u11 = [0, 1, 1], contributing a mean of 24.80 and a variance of
139.76. The correct value for u11 for minimizing the deviate is u11 = [0, 0, 1], which
This demonstrates that for I even moderately large, the asymptotically separable algorithm can produce a bounding random variable that very closely approximates the true upper bound on the p-value. That being said, given our formulation the worst-case deviate can be explicitly found. Furthermore, one need not worry about computation time: for conducting the sensitivity analysis on this problem, an optimal solution was found in 0.15 seconds.