Earlier robustness analyses, such as Sullivan & Swofford (2001), generated datasets using a relatively general model, and then inferred trees using a method that assumed a more restricted class of models. Importantly, there remained commonalities between the generating model and inference model class: they both assumed a single tree topology. This made it possible to test robustness by measuring the proportion of 4-taxon tree topologies recovered correctly.
In our paper, we generated data under a mixture model whose component trees have different topologies, and fed the data into a single-tree inference method. Consequently, it may very well be asked: What parameters are shared by the generating model and the inference model class? Or more plainly: What are we hoping to get?
5.2.1
Does “the” internal edge of a mixture of two trees really exist?
The main difficulty surrounds our use of the term “the internal edge” in describing the mixture of trees A and B on p. 87: it is not clear what single parameter in the generating model the inferred internal edge length can be said to be estimating.
No difficulties arise if we assume that one of the trees in the mixture, called the “true tree”, has a much larger proportion than the other, which we can call the “noise tree”. If the proportion $p_A$ of tree A is large in relation to the proportion $p_B = 1 - p_A$ of tree B, and if sufficiently many sites are available that sampling error is not a concern, then it is reasonable to expect that single-tree ML will infer A’s topology, so the internal branch length inferred is an estimate of A’s internal branch length (and vice versa when $p_B \gg p_A$). But when $p_A \approx p_B$, this interpretation is
5.2.2
Shared parameter values
In order to sensibly discuss the behaviour of inferences made using more-restricted model classes on data generated under more-general models, we need to make precise the notion of when parameter values are “shared” by different models.
Ever-present real-valued parameters like the transition-transversion ratio are a simple case: if all k components of a mixture model have equal values for such a parameter, then we can sensibly describe this collection of parameters $\theta_i$, $1 \le i \le k$, as a single parameter $\theta$ of the overall mixture model, and attempt to infer it using a more specific, single-component model. The resulting estimate, $\hat{\theta}$, can be meaningfully compared with the $\theta_i$, and statements can be made about the estimation procedure regarding the usual parameter-dependent statistical properties like convergence (or lack thereof) and bias.
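The shared-parameter case can be written out as a short worked expression. This is a sketch under assumed notation that does not appear in the text above: $p_i$ for the proportion of component $i$, $T_i$ for its tree, $x$ for a single site pattern, and the usual site-wise form of a mixture likelihood.

```latex
% Assumed site-wise mixture likelihood with k components
\Pr(x \mid \text{mixture}) = \sum_{i=1}^{k} p_i \, \Pr(x \mid T_i, \theta_i),
\qquad \sum_{i=1}^{k} p_i = 1 .

% If \theta_1 = \theta_2 = \dots = \theta_k = \theta, every term depends on the same value:
\Pr(x \mid \text{mixture}) = \sum_{i=1}^{k} p_i \, \Pr(x \mid T_i, \theta) .
```

Under that assumption, $\theta$ is a single well-defined quantity of the generating model, so an estimate $\hat{\theta}$ obtained from a single-component model $\Pr(x \mid T, \hat{\theta})$ has something definite to be compared against.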
On the other hand, the structure of an edge-weighted phylogenetic tree means that it is not immediately clear how, or even whether, the parameters describing one
tree T can be matched up with the parameters describing another tree U. T and
U may in general have different topologies, which potentially makes their respective
sets of parameters prima facie structurally different and thus incommensurable.
5.2.3
The edges of a mixture model
One way to overcome this is to follow the lead of the Hadamard conjugation technique (Hendy & Penny, 1993), and embed the topology-dependent parameter space of a particular tree in a larger, topology-independent space. Recall that an edge in a tree (considered without its length) is defined by a split of taxa X|Y, and that there are $2^{n-1}$ distinct splits on n taxa. This enables us to represent the $2n-3$ edge-length parameters describing any edge-weighted, unrooted binary tree T on n taxa using a vector $s^T$ of $2^{n-1}$ parameters indexed by split: the $2n-3$ elements corresponding to edges present in T are assigned the corresponding length, while the remaining $2^{n-1} - 2n + 3$ elements, which correspond to edges absent from T, are assigned the value 0.² Now that we have a set of parameters that is structurally identical across different tree topologies, we can safely say that two trees T and U share a parameter value whenever $s^T_i = s^U_i$ for some split $i$.
² This describes the situation for an unrooted binary tree; much the same procedure works for
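The embedding described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the bitmask indexing (taxa numbered 0 to n−1, each split indexed by the side that excludes taxon n−1), the function names, and the edge lengths are all assumptions made for the example.

```python
def split_index(side, n):
    """Index of the split side|rest on taxa 0..n-1, in the range 0..2**(n-1)-1."""
    mask = sum(1 << t for t in side)
    if mask & (1 << (n - 1)):   # this side contains the reference taxon n-1,
        mask ^= (1 << n) - 1    # so index the split by the complementary side
    return mask


def split_vector(edges, n):
    """Embed a tree, given as {one side of split: length}, in split-indexed space."""
    s = [0.0] * (1 << (n - 1))  # edges absent from the tree keep the value 0
    for side, length in edges.items():
        s[split_index(side, n)] = length
    return s


def shared_indices(s_t, s_u):
    """Split indices i at which the two vectors hold the same value."""
    return [i for i, (a, b) in enumerate(zip(s_t, s_u)) if a == b]


# Two hypothetical 4-taxon trees that differ only in their internal split.
n = 4
external = {(0,): 0.1, (1,): 0.1, (2,): 0.1, (3,): 0.1}
tree_t = {**external, (0, 1): 0.05}   # internal split 01|23
tree_u = {**external, (0, 2): 0.05}   # internal split 02|13

s_t = split_vector(tree_t, n)
s_u = split_vector(tree_u, n)

print(len(s_t))                   # 8 = 2**(n-1)
print(sum(v > 0 for v in s_t))    # 5 = 2n-3 edges present in tree_t
print(shared_indices(s_t, s_u))   # [0, 1, 2, 4, 6, 7]
```

Read literally, the equality test also matches indices where both entries are 0 (here indices 0 and 6), that is, splits absent from both trees. The entries that are equal and positive (indices 1, 2, 4 and 7) are the four external edges. Exact floating-point equality is used only because the example lengths are entered by hand.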
According to this formulation, the meaning of “the internal edge” of a mixture of trees is not well defined; but whenever all components in the mixture model contain some split A|B as an edge, and this edge has identical length across all components, the meaning of “the edge splitting taxon set A from taxon set B” is well defined, regardless of how the topologies of the components may otherwise differ. All such shared edges can be regarded as edges of the mixture model, capable of being inferred using a single-tree inference method (at least in principle).
It follows that the four external edges of the trees A and B analysed in the paper
are shared parameter values. Likewise, for the 5-taxon analysis, all five external
edges, plus the edge separating taxa 4 and 5 from the rest in the mixture of trees A and B, are genuine shared parameter values. Figure 1B shows that ML estimation of the four external edges in the 4-taxon analysis is indeed biased upward as the mixture approaches an even balance between the two topologies. That such a simple (and, we propose, frequently occurring) effect as a mixture of two trees is enough to distort the results of single-tree ML analysis is a persuasive argument for the use of
network methods like Spectronet (Huber et al., 2002), which are inherently immune
to such problems.
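The criterion for an "edge of the mixture model" (a split present in every component, with identical length in each) can also be checked mechanically. The sketch below is illustrative only: the two 5-taxon topologies, the edge lengths, and the function names are hypothetical and are not taken from the analysis in the paper.

```python
def canonical(side, taxa):
    """Represent a split as an unordered pair of taxon sets, so X|Y equals Y|X."""
    side = frozenset(side)
    return frozenset({side, frozenset(taxa) - side})


def mixture_edges(components, taxa):
    """Splits present in every component tree with the same length in each."""
    keyed = [
        {canonical(side, taxa): length for side, length in comp.items()}
        for comp in components
    ]
    common = set(keyed[0]).intersection(*keyed[1:])
    return {
        split: keyed[0][split]
        for split in common
        if all(k[split] == keyed[0][split] for k in keyed)
    }


# Hypothetical 5-taxon components: both contain the split 45|123,
# but they disagree on the other internal edge.
taxa = {1, 2, 3, 4, 5}
external = {(t,): 0.1 for t in taxa}
tree_a = {**external, (1, 2): 0.05, (4, 5): 0.03}
tree_b = {**external, (1, 3): 0.05, (4, 5): 0.03}

shared = mixture_edges([tree_a, tree_b], taxa)
print(len(shared))                            # 6: five external edges plus 45|123
print(canonical((4, 5), taxa) in shared)      # True
print(canonical((1, 2), taxa) in shared)      # False
```

In this toy case the five external edges and the edge separating taxa 4 and 5 from the rest qualify as edges of the mixture, while the conflicting internal edges do not.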