In order to build a confidence set for the entire graph structure, we can use the same simulation method on other graphs. For example, we could consider a graph that we call “graph 25” or $\mathcal{G}_{25}$, since it is the HIV graph with 25 randomly chosen edges removed. A picture is available in Figure 6.3.
As discussed in Section 4.2.3, there are a number of different statistics that one might consider. Here, we consider two-sided tests for $W_{\mathcal{G}_{25}}$, $W_{\mathcal{G}_{\mathrm{HIV}}}$, and $W_{\mathcal{G}_{25}} - W_{\mathcal{G}_{\mathrm{HIV}}}$, i.e., the edges-within statistic computed with respect to $\mathcal{G}_{25}$, the edges-within statistic computed with respect to the HIV graph, and the difference of these statistics. The results are shown in Figure 6.4.
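As an illustration of how a two-sided simulation-based $p$-value can be formed, the following is a minimal sketch. It assumes one common convention (twice the smaller empirical tail probability, capped at 1), which is not necessarily the exact definition used in Section 4.2.3; the function name and toy numbers are hypothetical.

```python
def two_sided_p_value(observed, simulated):
    """Two-sided Monte Carlo p-value: twice the smaller empirical tail, capped at 1.

    `observed` is the statistic on the observed infection and `simulated` is a
    list of statistics from simulated infections (both hypothetical inputs).
    """
    n = len(simulated)
    lower = sum(s <= observed for s in simulated) / n  # empirical P(stat <= observed)
    upper = sum(s >= observed for s in simulated) / n  # empirical P(stat >= observed)
    return min(1.0, 2.0 * min(lower, upper))


# Toy simulated statistics, not taken from the HIV analysis.
toy_simulated = [1, 2, 3, 4, 5]
p_central = two_sided_p_value(3, toy_simulated)
p_extreme = two_sided_p_value(6, toy_simulated)
```

An observed value in the middle of the simulated distribution gives a large $p$-value, while a value beyond every simulated statistic gives a $p$-value of zero.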
There are a number of things to observe. First, the statistics $W_{\mathcal{G}_{25}}$ and $W_{\mathcal{G}_{\mathrm{HIV}}}$ lead to confidence sets for $\beta$ of approximately $[1.148, 10]$ and $[1.27347, 10]$, respectively. On the other hand, the difference $W_{\mathcal{G}_{25}} - W_{\mathcal{G}_{\mathrm{HIV}}}$ leads to the confidence interval $[0, 10]$. This is unsurprising, since $\mathcal{G}_{25}$ and $\mathcal{G}_{\mathrm{HIV}}$ are not too different.
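Confidence sets of this kind arise from inverting the test: a value of $\beta$ is retained whenever its $p$-value is at least the chosen level. The sketch below shows that inversion on a grid. The level 0.05, the grid, and the toy $p$-values are assumptions made for illustration and are not taken from the analysis.

```python
def confidence_set(betas, p_values, alpha=0.05):
    """Keep the grid values of beta whose p-value is at least alpha."""
    return [b for b, p in zip(betas, p_values) if p >= alpha]


# Hypothetical grid and p-values, for illustration only.
toy_betas = [0.0, 2.5, 5.0, 7.5, 10.0]
toy_p_values = [0.01, 0.20, 0.35, 0.30, 0.25]
kept = confidence_set(toy_betas, toy_p_values)
interval = (min(kept), max(kept)) if kept else None
```

When the retained values are contiguous on the grid, reporting the smallest and largest gives an interval; an empty list corresponds to the case where no value of $\beta$ is included.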
We can repeat this procedure on a different graph, $\mathcal{G}_{100}$, which has 100 edges removed from the HIV graph uniformly at random. An image may be found in Figure 6.5; note that $\mathcal{G}_{100}$ is a subgraph of $\mathcal{G}_{25}$. The results of the analysis are shown in Figure 6.6.
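Removing edges uniformly at random can be sketched as follows. The edge representation (pairs with the smaller vertex first), the function name, and the toy graph are hypothetical. Drawing the second batch of removals from the already-thinned graph is one way to obtain nested graphs; it is an assumption here, not necessarily the construction used for the HIV graph.

```python
import random


def remove_random_edges(edges, k, seed=None):
    """Return a copy of `edges` with k edges removed uniformly at random."""
    rng = random.Random(seed)
    edge_list = sorted(edges)  # fixed order so that a given seed is reproducible
    removed = set(rng.sample(edge_list, k))
    return set(edge_list) - removed


# Toy example: the complete graph on 6 vertices (15 edges), not the HIV graph.
toy_edges = {(u, v) for u in range(6) for v in range(u + 1, 6)}
g_small = remove_random_edges(toy_edges, 3, seed=0)
# Removing further edges from g_small guarantees the result is a subgraph of it.
g_smaller = remove_random_edges(g_small, 4, seed=1)
```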
The major difference for $\mathcal{G}_{100}$ is that no value of $\beta$ is included in the confidence set for the difference statistic $W_{\mathcal{G}_{100}} - W_{\mathcal{G}_{\mathrm{HIV}}}$. This is likely due to there being many edges in $\mathcal{G}_{\mathrm{HIV}}$ between infected vertices that do not exist when simulating the distribution using $\mathcal{G}_{100}$. In short, for the values of $\beta$ in $[0, 10]$, we have found a graph that is not in the confidence set of graphs given by the observed infection.
Figure 6.3: The graph $\mathcal{G}_{25}$ used in computing the joint confidence set of $(\mathcal{G}_{25}, \beta)$. This graph was formed from the HIV graph by removing 25 edges uniformly at random. As in Figure 2.1, black vertices are infected, white vertices are uninfected, and gray vertices are censored.
[Plot: $p$-value (vertical axis) versus $\beta$ (horizontal axis, 0 to 10); legend for $W$: Difference, G, HIV.]
Figure 6.4: The $p$-values as a function of $\beta$ for the HIV graph using the two-statistic test with different test statistics. The statistics are $W_{\mathcal{G}_{25}}$, $W_{\mathcal{G}_{\mathrm{HIV}}}$, and $W_{\mathcal{G}_{25}} - W_{\mathcal{G}_{\mathrm{HIV}}}$, and the $p$-values for these are given by the black line, the light gray line, and the dark gray line, respectively.
Figure 6.5: The graph $\mathcal{G}_{100}$ used in computing the joint confidence set of $(\mathcal{G}_{100}, \beta)$. This graph was formed from the HIV graph by removing 100 edges uniformly at random. As in Figure 2.1, black vertices are infected, white vertices are uninfected, and gray vertices are censored.
[Plot: $p$-value (vertical axis) versus $\beta$ (horizontal axis, 0 to 10); legend for $W$: Difference, G, HIV.]
Figure 6.6: The $p$-values as a function of $\beta$ for the HIV graph using the two-statistic test with different test statistics. The statistics are $W_{\mathcal{G}_{100}}$, $W_{\mathcal{G}_{\mathrm{HIV}}}$, and $W_{\mathcal{G}_{100}} - W_{\mathcal{G}_{\mathrm{HIV}}}$, and the $p$-values for these are given by the black line, the light gray line, and the dark gray line, respectively.
set for this particular infection. This provides a very general method of graph testing at the expense of requiring many simulations.
Part III
Permutation
7. Theory
In this chapter, we cover the primary contribution of this dissertation: tests based on permutation. Note that we restrict our inquiry to the case where the edges are unweighted, or equivalently $w(u, v) = 1$.
7.1 Permutation-Invariant Statistics
A natural statistic to consider for the purpose of graph testing is the likelihood ratio. For the stochastic spreading model, the likelihood ratio is often difficult to compute and depends on $\beta$ in a nontrivial manner, making the theoretical derivations somewhat challenging. Our main focus will be on a class of statistics that are invariant under a group of permutations, which allows us to perform permutation testing based on symmetries in the graph sets. We first introduce some terminology regarding permutations and group actions, and then introduce a class of invariant statistics that will be central to our analysis.
7.1.1 Permutations and Group Actions
Recall that a graph automorphism of $\mathcal{G} = (\mathcal{V}, \mathcal{E})$ is an element $\pi$ of the permutation group $S_n$ such that $(u, v) \in \mathcal{E}$ if and only if $(\pi(u), \pi(v)) \in \mathcal{E}$. For simple hypotheses, we denote the automorphism groups of $\mathcal{G}_0$ and $\mathcal{G}_1$ by $\Pi_0 = \mathrm{Aut}(\mathcal{G}_0)$ and $\Pi_1 = \mathrm{Aut}(\mathcal{G}_1)$, respectively. We also need to define the action of a permutation on vertices, graphs, and infections. The action of a permutation $\pi$ on a vertex $u$ is simply the image $\pi(u)$. This is easily extended to tuples and subsets of vertices by applying $\pi$ to the underlying vertices. A specific example is the action on edges of the graph:
$$\pi\mathcal{E} = \{(\pi(u), \pi(v)) : (u, v) \in \mathcal{E}\}.$$
The action of $\pi$ on a graph $\mathcal{G} = (\mathcal{V}, \mathcal{E})$ is then defined to be
$$\pi\mathcal{G} := (\pi\mathcal{V}, \pi\mathcal{E}) = (\mathcal{V}, \pi\mathcal{E}).$$
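The definitions of the edge action and of a graph automorphism can be checked by brute force on a small graph. The sketch below is illustrative only: vertices are labeled from 0, edges are treated as undirected, a permutation is stored as a dictionary, and all function names are hypothetical. Enumerating every permutation is feasible only for very small graphs.

```python
from itertools import permutations


def act_on_edges(pi, edges):
    """Image of an undirected edge set under the vertex map pi (a dict)."""
    return {frozenset((pi[u], pi[v])) for u, v in edges}


def is_automorphism(pi, edges):
    """For a bijection pi, preserving edges in both directions means pi(E) == E."""
    return act_on_edges(pi, edges) == {frozenset(e) for e in edges}


def automorphism_group(vertices, edges):
    """List every automorphism by checking all permutations of the vertices."""
    vertices = list(vertices)
    group = []
    for image in permutations(vertices):
        pi = dict(zip(vertices, image))
        if is_automorphism(pi, edges):
            group.append(pi)
    return group


# Toy graphs: a path on three vertices and a triangle.
path = [(0, 1), (1, 2)]
triangle = [(0, 1), (1, 2), (0, 2)]
path_group = automorphism_group(range(3), path)
triangle_group = automorphism_group(range(3), triangle)
```

The path has only the identity and the reversal as automorphisms, whereas every permutation of the triangle's vertices preserves its edge set.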
Another natural extension is to define the action of a set of permutations on a set of graphs: $\Pi\mathrm{G} = \{\pi\mathcal{G} : \pi \in \Pi \text{ and } \mathcal{G} \in \mathrm{G}\}$.
If $\mathrm{G}_i = S_n\{\mathcal{G}_i\}$, we say that hypothesis $i$ corresponds to a hypothesis of a particular graph topology, since all node labelings are included in the set. We also define the action $\Pi\Theta_i = \Pi\mathrm{G}_i \times A_i$. Finally, we define the action of a permutation $\pi$ on an infection $J$:
$$\pi J := \left(J_{\pi^{-1}(1)}, \ldots, J_{\pi^{-1}(n)}\right).$$
In other words, the infection status of the image vertex $\pi(u)$ is the infection status of $u$ under $J$.
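The action on an infection can be written in a few lines. This is a minimal sketch with hypothetical names, using 0-indexed vertices (the text indexes from 1) and string labels for the three infection statuses.

```python
def act_on_infection(pi, infection):
    """Return the permuted infection: entry pi[u] of the result is infection[u].

    `pi` is a list with pi[u] the image of vertex u, so position i of the
    result holds the status of the preimage of i under pi.
    """
    permuted = [None] * len(infection)
    for u, status in enumerate(infection):
        permuted[pi[u]] = status
    return permuted


# Toy example: pi sends 0 -> 1, 1 -> 2, 2 -> 0.
toy_pi = [1, 2, 0]
toy_infection = ["infected", "uninfected", "censored"]
permuted = act_on_infection(toy_pi, toy_infection)
```

Here vertex 0 is infected and is sent to vertex 1, so vertex 1 is infected in the permuted infection.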
7.1.2 Invariant Statistics
The theory presented in our paper applies to the following class of statistics:
Definition 7.1.1. Suppose $\Pi$ is a subgroup of $S_n$. A statistic $S$ is $\Pi$-invariant if $S(J) = S(\pi J)$ for any $J \in \mathrm{I}_{k,c}$ and $\pi \in \Pi$.
In our permutation test, we will compute the edges-within statistic with respect to the graph $\mathcal{G}_1$ appearing in the alternative hypothesis in the case of a simple test, so we reject $H_0$ when $W_1(J) := W_{\mathcal{G}_1}(J)$ exceeds a certain threshold. We derive the invariance of the statistic $W$ under the permutation group $\Pi = \mathrm{Aut}(\mathcal{G})$ in Chapter 8.
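The invariance property can be checked numerically on a toy graph before turning to the derivation. The sketch below assumes that the edges-within statistic counts edges whose endpoints are both infected (its definition appears earlier in the dissertation and is not restated here). The encoding 1 = infected, 0 = uninfected, "*" = censored and all function names are hypothetical.

```python
from itertools import permutations


def edges_within(edges, infection):
    """Count edges with both endpoints infected (assumed form of the statistic)."""
    return sum(1 for u, v in edges if infection[u] == 1 and infection[v] == 1)


def act_on_infection(pi, infection):
    """Entry pi[u] of the result is the status of vertex u."""
    permuted = [None] * len(infection)
    for u, status in enumerate(infection):
        permuted[pi[u]] = status
    return permuted


def automorphisms(n, edges):
    """Yield every permutation of range(n) that maps the edge set onto itself."""
    edge_set = {frozenset(e) for e in edges}
    for pi in permutations(range(n)):
        if {frozenset((pi[u], pi[v])) for u, v in edges} == edge_set:
            yield pi


# Toy example: a 4-cycle with two adjacent infected vertices and one censored vertex.
cycle = [(0, 1), (1, 2), (2, 3), (3, 0)]
toy_infection = [1, 1, 0, "*"]
observed = edges_within(cycle, toy_infection)
group = list(automorphisms(4, cycle))
invariant = all(
    edges_within(cycle, act_on_infection(pi, toy_infection)) == observed
    for pi in group
)
# Swapping vertices 1 and 2 is not an automorphism of the cycle and changes the statistic.
non_automorphism_value = edges_within(cycle, act_on_infection((0, 2, 1, 3), toy_infection))
```

Every automorphism of the cycle leaves the statistic unchanged on this example, while a permutation outside the automorphism group does not, which is the behavior Definition 7.1.1 describes.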