Distributions of cherries and pitchforks for the Ford model


Gursharn Kaur∗, Kwok Pui Choi†, and Taoyang Wu‡

October 7, 2021

Abstract

Distributional properties of tree shape statistics under random phylogenetic tree models play an important role in investigating the evolutionary forces underlying real-world phylogenies. In this paper, we study two subtree counting statistics, the number of cherries and the number of pitchforks, for Ford’s alpha model, a one-parameter family of random phylogenetic tree models that includes as specific instances both the uniform and the Yule models, two tree models commonly used in phylogenetics. Based on a version of the extended Pólya urn models, in which negative entries are permitted in the replacement matrices, we obtain the strong law of large numbers and the central limit theorem for the joint distribution of these two count statistics under the Ford model. Furthermore, we derive a recursive formula for computing the exact joint distribution of these two statistics, which leads to higher order asymptotic expansions of their marginal and joint moments.

1 Introduction

One important topic in many branches of biology is to understand evolutionary events and forces leading to current biological systems, such as a group of species or strains of a virus.

To this end, evolutionary relationships among the biological system under investigation are typically represented by a phylogenetic tree, that is, a binary tree whose leaves are labelled by the taxon units in the system. As these events and forces, such as rates of speciation and expansion, are often not directly observable [22, 16], one popular approach is to compare empirical shape indices computed from trees inferred from real datasets with those predicted by a null tree growth model [5, 15]. Furthermore, topological tree shapes are also closely related to several fundamental statistics in population genetics [12, 2] and certain important parameters in the dynamics of virus evolution and propagation [9].

∗Biocomplexity Institute, University of Virginia, Charlottesville, USA 22911.

†Department of Statistics and Data Science, and the Department of Mathematics, National University of Singapore, Singapore 117546.

‡School of Computing Sciences, University of East Anglia, Norwich, NR4 7TJ, U.K.

arXiv:2110.02850v1 [math.PR] 6 Oct 2021

One important family of tree shape statistics is the balance indices, such as Colless’ index, Sackin’s index and the number of subtrees (see, e.g. [13] and the references therein). Various properties of these statistics have been established in the past decades for the following two fundamental random phylogenetic tree models: the Yule model (aka the Yule-Harding-Kingman (YHK) model) [24, 11, 18] and the uniform model (aka the proportional to distinguishable arrangements (PDA) model) [21, 6, 25, 8]. However, for phylogenetic trees inferred from real datasets, the Yule or uniform model may not always be a good fit [5], and several general classes of random trees have been proposed for modelling and analysing the observed data, two popular ones being Ford’s alpha model [14] and Aldous’ beta model [1].

In this paper, we confine ourselves to Ford’s alpha model, a one-parameter family of random tree growth models introduced by Daniel J. Ford in his PhD thesis [14]. More precisely, under the Ford model with a fixed parameter 0 ≤ α ≤ 1, a random tree with a given number of leaves is generated such that, at any step in which a tree T_n with n leaves has been constructed from previous steps, a new leaf attaches to any given internal edge of T_n with probability α/(n − α) and to any given leaf edge of T_n with probability (1 − α)/(n − α). The resulting random tree model will be referred to as the Ford model (with parameter α) in this paper; it is also known as the alpha tree model (see, e.g. [10]). Note that the Ford model is a family of random tree models which includes the Yule model with α = 0, the uniform model with α = 1/2, and the Comb model with α = 1.

The tree shape indices studied in this paper are the number of cherries and the number of pitchforks. Here a cherry is a subtree with precisely two leaves and a pitchfork is a subtree with precisely three leaves. The asymptotic properties of the number of cherries were first studied by McKenzie and Steel [21], who showed that the number of cherries is asymptotically normal for the Yule and the uniform models as the number of leaves tends to infinity. Later, similar properties of the number of cherries were extended to the Ford model [14, Theorem 57] and to the Crump-Mode-Jagers branching process [23]. For the number of pitchforks, Rosenberg [24] obtained its mean and variance, and Chang and Fuchs [6] proved that the number of pitchforks is also asymptotically normal for the Yule and the uniform models.

For the joint distribution, Holmgren and Janson [18] showed that it is asymptotically normal for the Yule model. This was recently extended by us to the uniform model based on a uniform version of the extended urn models in which negative entries are permitted in the replacement matrices [7].

In this paper, we establish the strong law of large numbers and the central limit theorem for the joint distribution of cherries and pitchforks under the Ford model (Theorem 3.2) by considering an associated nonuniform urn model (Theorem 3.1). These results are presented in Section 3, following Section 2 in which we collect background concerning the Ford model and limit theorems on uniform urn models. Furthermore, we derive a recurrence formula for computing the exact joint distribution under the Ford model (Theorem 4.2) in Section 4, generalizing the results in [25, 8] for the Yule and the uniform models. This recurrence formula enables us to obtain exact expressions for the mean and variance of the number of cherries and the number of pitchforks, and for their covariance, under the Ford model. In particular, this generalises the exact expressions of the mean and variance of the number of cherries and the number of pitchforks for the Yule and the uniform models [21, 24, 6] and of the number of cherries for the Ford model [14, Theorem 60]. As an application, in Section 5 we obtain higher order expansions of the first and second moments of the joint distributions.

2 Ford Model and Urn Model

In this section, we first introduce the Ford model, which is a one-parameter family of random phylogenetic tree models. Next we present a nonuniform version of the extended urn models associated with the Ford tree model. Finally, we recall certain conditions on the related uniform version of the extended urn model under which the strong law of large numbers and the central limit theorem are obtained.

2.1 Ford model

A rooted binary tree is a finite connected simple graph without cycles that contains a unique vertex of degree 1 designated as the root and all the remaining vertices are of degree 3 (interior vertices) or 1 (leaves). A phylogenetic tree with n leaves is a rooted binary tree whose leaves are bijectively labelled by the elements in {1, . . . , n}. Edges incident with leaves are referred to as pendant edges.

Under the Ford model with parameter 0 ≤ α ≤ 1, a random phylogenetic tree T_n with n leaves is constructed recursively by adding one leaf at a time as follows. Fix a random permutation (x₁, . . . , x_n) of {1, . . . , n}. The initial tree T₂ contains precisely two leaves (i.e., one cherry), which are labelled x₁ and x₂. For the recursive step, given a tree T_m with m leaves constructed so far, choose a random edge in T_m according to the distribution that assigns weight 1 − α to each pendant edge (i.e., those incident with a leaf) and weight α to each of the other edges. The new leaf, labelled x_{m+1}, is attached to a new vertex that subdivides the selected edge. Every addition of a leaf thus replaces the selected edge with two new edges, in addition to creating the new pendant edge. Finally, we let A_n and C_n denote the numbers of pitchforks and cherries in the tree T_n, respectively.
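The growth rule above can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' code: the function names `grow_ford_tree` and `count_pitchforks_cherries` and the dictionary-based tree representation are assumptions made here, and leaf labels are ignored because they do not affect the numbers of cherries and pitchforks.

```python
import random


def grow_ford_tree(n_leaves, alpha, rng=None):
    """Grow a rooted binary tree with n_leaves >= 2 leaves under the Ford model.

    Every node v stands for the edge above it; the edge above the top
    interior node is the root edge. Returns (parent, children) dictionaries.
    """
    rng = rng or random.Random()
    # T_2: node 0 is the interior node below the root edge, 1 and 2 are leaves.
    parent = {0: None, 1: 0, 2: 0}
    children = {0: [1, 2], 1: [], 2: []}
    next_id = 3
    for _ in range(n_leaves - 2):
        nodes = list(parent)
        # Pendant edges get weight 1 - alpha, all other edges get weight alpha.
        weights = [(1 - alpha) if not children[v] else alpha for v in nodes]
        v = rng.choices(nodes, weights=weights, k=1)[0]
        w, x = next_id, next_id + 1  # new interior node and new leaf
        next_id += 2
        p = parent[v]
        parent[w], children[w] = p, [v, x]
        parent[v], parent[x], children[x] = w, w, []
        if p is not None:
            children[p] = [w if c == v else c for c in children[p]]
    return parent, children


def count_pitchforks_cherries(children):
    """Return (A, C): the numbers of pitchforks and cherries in the tree."""
    def is_leaf(v):
        return not children[v]

    def is_cherry(v):
        return len(children[v]) == 2 and all(is_leaf(c) for c in children[v])

    def is_pitchfork(v):
        if len(children[v]) != 2:
            return False
        a, b = children[v]
        return (is_leaf(a) and is_cherry(b)) or (is_leaf(b) and is_cherry(a))

    cherries = sum(1 for v in children if is_cherry(v))
    pitchforks = sum(1 for v in children if is_pitchfork(v))
    return pitchforks, cherries
```

For α = 1 only non-pendant edges can be selected, so every tree with at least three leaves grown by this sketch is a comb with exactly one cherry and one pitchfork.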

2.2 An urn model associated with trees

Consider an urn containing balls of d different colours, where the colours are denoted by the integers {1, 2, . . . , d}. Let U_n = (U_{n,1}, . . . , U_{n,d}) be the configuration vector of length d whose i-th element is the number of balls of colour i at time n, and let U₀ be the initial colour configuration. At every time n ≥ 1, a ball is selected uniformly at random from the urn; if the selected ball has colour i, then it is returned to the urn along with R_{i,j} balls of colour j, for every 1 ≤ j ≤ d. The dynamics of the urn configuration depend on its initial configuration U₀ and the d × d replacement matrix R = (R_{i,j})_{1≤i,j≤d}.

Figure 1: A sample path of the Ford model and the associated trajectory under the urn model. (i) A sample path of the Ford model evolving from T₂ with two leaves to T₆ with six leaves. The labels of the leaves are omitted for simplicity. The type of an edge is indicated by the circled number next to it. For 2 ≤ i ≤ 5, the edge selected in T_i to generate T_{i+1} is highlighted in bold and the associated edge type is indicated in the circled number above the arrow. (ii) The associated urn model with six colours, derived from the types of pendant edges in the trees. In vector form, U₀ = (0, 2, 0, 0, 1, 0), U₁ = (2, 0, 1, 0, 0, 2), U₂ = (0, 4, 0, 0, 2, 1), U₃ = (2, 2, 1, 0, 1, 3), and U₄ = (2, 2, 1, 1, 1, 4).

We study the limiting properties of the numbers of cherries and pitchforks via an equivalent urn process. To this end, we use six different colours and assign one colour to each type of edge of the tree according to the following scheme introduced in [7]: colour 1 for pendant edges of a cherry contained in a pitchfork; colour 2 for pendant edges of a cherry not contained in a pitchfork; colour 3 for pendant edges in a pitchfork but not in any cherry; colour 4 for pendant edges in neither a cherry nor a pitchfork; colour 5 for internal edges adjacent to a cherry that is not in a pitchfork (i.e., those adjacent to colour 2 edges); and colour 6 for all other (necessarily internal) edges (including the one incident with the root). See Fig. 1 for an illustration of the scheme.

Consider an urn with colour configuration at time n given by U_n = (U_{n,1}, . . . , U_{n,6}), where U_{n,i} denotes the number of edges of colour i in the tree at time n, which has precisely n + 2 leaves. Then U₀ = (0, 2, 0, 0, 1, 0), since at the initial time step (n = 0) there is one internal edge and one essential cherry in a rooted tree; see T₂ in Fig. 1. Based on the colouring scheme of the edges, at any time n ≥ 0, we have

\[
(A_{n+2},\, C_{n+2}) = \frac{1}{2}\,\bigl(U_{n,1},\; U_{n,1} + U_{n,2}\bigr), \tag{1}
\]

where A_{n+2} and C_{n+2} are the numbers of pitchforks and cherries in T_{n+2}, respectively. Under the alpha tree model, the corresponding urn process evolves according to the following replacement matrix:

\[
R = \begin{pmatrix}
0 & 0 & 0 & 1 & 0 & 1 \\
2 & -2 & 1 & 0 & -1 & 2 \\
-2 & 4 & -1 & 0 & 2 & -1 \\
0 & 2 & 0 & -1 & 1 & 0 \\
2 & -2 & 1 & 0 & -1 & 2 \\
0 & 0 & 0 & 1 & 0 & 1
\end{pmatrix}.
\]

Let e_i, 1 ≤ i ≤ 6, denote the 6-vector whose i-th component is 1 and whose other components are 0, and let χ_n be the random vector taking value e_i if, at time n, speciation happens at an edge of type i. Thus, we have the following recursion

\[
U_n = U_{n-1} + \chi_n R, \qquad n \ge 1,
\]

where

\[
\mathbb{P}(\chi_n = e_i \mid \mathcal{F}_{n-1}) \;\propto\;
\begin{cases}
(1 - \alpha)\, U_{n-1,i}, & \text{for } i \in \{1, 2, 3, 4\}, \\
\alpha\, U_{n-1,i}, & \text{for } i \in \{5, 6\}.
\end{cases} \tag{2}
\]

Observe that the process (U_n)_{n≥0}, which describes the dynamics of the numbers of cherries and pitchforks, is a nonuniform urn model: the balls are not selected uniformly at random from the urn, in contrast to the classical uniform urn models (see, e.g. [17, Chapter 7]).
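As a rough illustration of the recursion U_n = U_{n−1} + χ_n R with the selection rule (2), the following Python sketch simulates the six-colour urn started from U₀ = (0, 2, 0, 0, 1, 0). The helper names `apply_row`, `ford_urn_step` and `simulate_ford_urn` are hypothetical, and colours are indexed from 0 in the code.

```python
import random

# Replacement matrix R of the six-colour urn (row i: effect of drawing colour i+1).
R = [
    [0, 0, 0, 1, 0, 1],
    [2, -2, 1, 0, -1, 2],
    [-2, 4, -1, 0, 2, -1],
    [0, 2, 0, -1, 1, 0],
    [2, -2, 1, 0, -1, 2],
    [0, 0, 0, 1, 0, 1],
]


def apply_row(u, i):
    """Add row i of R to the configuration u (0-based colour index i)."""
    return [u[j] + R[i][j] for j in range(6)]


def ford_urn_step(u, alpha, rng):
    """One step of the nonuniform urn: pendant colours weighted by 1 - alpha,
    internal colours weighted by alpha, as in (2)."""
    weights = [(1 - alpha) * u[i] if i < 4 else alpha * u[i] for i in range(6)]
    i = rng.choices(range(6), weights=weights, k=1)[0]
    return apply_row(u, i)


def simulate_ford_urn(n_steps, alpha, seed=0):
    """Return U_n after n_steps steps, starting from U_0 = (0, 2, 0, 0, 1, 0)."""
    rng = random.Random(seed)
    u = [0, 2, 0, 0, 1, 0]
    for _ in range(n_steps):
        u = ford_urn_step(u, alpha, rng)
    return u


def pitchforks_and_cherries(u):
    """Recover (A_{n+2}, C_{n+2}) from U_n via relation (1)."""
    return u[0] // 2, (u[0] + u[1]) // 2
```

Applying rows 2, 3, 2 and 1 of R in turn to U₀ reproduces the vectors U₁, . . . , U₄ listed in the caption of Figure 1 (rows 5 and 6 coincide with rows 2 and 1, so the caption alone does not determine which of the two was drawn).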

2.3 Limiting theorems on uniform urn models

In this subsection, we recall the strong laws of large numbers and the central limit theorems on a version of uniform urn models developed in [7], which will be related to the nonuniform urn process in Subsection 2.2 later using the urn coupling idea in [4].

For the classical uniform urn models, it has been shown (see [3]) that U_n/n converges almost surely to the left eigenvector of R corresponding to the maximal eigenvalue, and that asymptotic normality holds with a known limiting variance matrix under certain assumptions on R. The standard assumptions made in urn model theory are that the replacement matrix is irreducible with a constant row sum and that all the off-diagonal elements are non-negative (see, e.g. [20]). In [7], we extended this to the case where off-diagonal elements of the replacement matrix can be negative, provided that the following set of assumptions (A1)–(A4), slightly rephrased from [7], is satisfied. Let diag(a₁, . . . , a_d) denote the diagonal matrix whose diagonal elements are a₁, . . . , a_d.

(A1): Tenable: It is always possible to draw balls and follow the replacement rule.

(A2): Small: All eigenvalues of R are real; the maximal eigenvalue λ₁, called the principal eigenvalue, is positive, and λ₁ > 2λ holds for all other eigenvalues λ of R.

(A3): Strictly balanced: The column vector u₁ = (1, 1, . . . , 1)^T is a right eigenvector of R corresponding to λ₁, and R has a principal left eigenvector v₁ (i.e., a left eigenvector corresponding to λ₁) that is also a probability vector.

(A4): Diagonalisable: There exists an invertible matrix V with real entries whose first row is v₁ such that the first column of V⁻¹ is u₁ and

\[
V R V^{-1} = \operatorname{diag}(\lambda_1, \lambda_2, \dots, \lambda_d) =: \Lambda, \tag{3}
\]

where λ₁ > λ₂ ≥ · · · ≥ λ_d are the eigenvalues of R.

Let N(0, Σ) be the multivariate normal distribution with mean vector 0 = (0, . . . , 0) and covariance matrix Σ. Then we have the following result from [7, Theorems 1 & 2], which can alternatively be derived from [19, Theorems 3.21 & 3.22 and Remark 4.2].

Theorem 2.1. Under assumptions (A1)–(A4), we have

\[
(n\lambda_1)^{-1}\, U_n \xrightarrow{\ \text{a.s.}\ } v_1
\qquad\text{and}\qquad
n^{-1/2}\,(U_n - n\lambda_1 v_1) \xrightarrow{\ d\ } \mathcal{N}(0, \Sigma), \tag{4}
\]

where λ₁ is the principal eigenvalue and v₁ is the principal left eigenvector of R, and

\[
\Sigma = \sum_{i,j=2}^{d} \frac{\lambda_1 \lambda_i \lambda_j\; u_i^{\top} \operatorname{diag}(v_1)\, u_j}{\lambda_1 - \lambda_i - \lambda_j}\; v_i^{\top} v_j, \tag{5}
\]

where v_j is the j-th row of V and u_j the j-th column of V⁻¹ for 2 ≤ j ≤ d.

3 Limit Theorems for the Joint Distribution

In this section, we present the strong laws of large numbers and the central limit theorems on the joint distribution of the number of cherries and that of pitchforks under the Ford model.

3.1 Main convergence results

For later use, we consider the following polynomials in α:

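\[
\begin{aligned}
\varphi_1 &= 8\alpha^3 - 32\alpha^2 + 45\alpha - 23, &\qquad \varphi_4 &= 8\alpha^3 - 40\alpha^2 + 37\alpha + 13, \\
\varphi_2 &= 40\alpha^3 - 164\alpha^2 + 221\alpha - 97, &\qquad \varphi_5 &= 40\alpha^3 - 112\alpha^2 - 31\alpha + 181, \\
\varphi_3 &= 56\alpha^3 - 248\alpha^2 + 367\alpha - 181, &\qquad \varphi_6 &= 8\alpha^3 + 4\alpha^2 - 71\alpha + 71;
\end{aligned} \tag{6}
\]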

and for simplicity of notation, we do not indicate the φ_i’s as functions of α. Moreover, it can be verified directly that φ1, φ2, φ3 < 0 and φ4, φ5, φ6 > 0 for α ∈ (0, 1). Then, we have the following result on the joint asymptotic properties of the urn model process associated with the α-tree model.

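Theorem 3.1. Suppose (U_n)_{n≥0} is the urn process associated with the Ford model with parameter α ∈ (0, 1). Then,

\[
\frac{U_n}{n} \xrightarrow{\ \text{a.s.}\ } v
\qquad\text{and}\qquad
\frac{U_n - n v}{\sqrt{n}} \xrightarrow{\ d\ } \mathcal{N}(0, \Sigma) \tag{7}
\]

as n → ∞, where

\[
v = \frac{1}{2(3 - 2\alpha)}\,\bigl(2(1 - \alpha),\; 2(1 - \alpha),\; 1 - \alpha,\; 1 + \alpha,\; 1 - \alpha,\; 5 - 3\alpha\bigr) \tag{8}
\]

and, with the polynomials φ₁, . . . , φ₆ defined in (6),

\[
\Sigma = \frac{1 - \alpha}{4(3 - 2\alpha)^2 (5 - 4\alpha)(7 - 4\alpha)}
\begin{pmatrix}
-12\varphi_1 & 4\varphi_2 & -6\varphi_1 & -2\varphi_4 & 2\varphi_2 & -2\varphi_2 \\
4\varphi_2 & -4\varphi_3 & 2\varphi_2 & -2\varphi_6 & -2\varphi_3 & 2\varphi_3 \\
-6\varphi_1 & 2\varphi_2 & -3\varphi_1 & -\varphi_4 & \varphi_2 & -\varphi_2 \\
-2\varphi_4 & -2\varphi_6 & -\varphi_4 & \varphi_5 & -\varphi_6 & \varphi_6 \\
2\varphi_2 & -2\varphi_3 & \varphi_2 & -\varphi_6 & -\varphi_3 & \varphi_3 \\
-2\varphi_2 & 2\varphi_3 & -\varphi_2 & \varphi_6 & \varphi_3 & -\varphi_3
\end{pmatrix}. \tag{9}
\]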

The proof of Theorem 3.1 is given at the end of this section.

Remark 1. For later use, we present the limiting results on the urn model using a scaling factor related to the time n (motivated by the fact that the number of leaves in the tree at time n is n + 2). However, the results can readily be rephrased in terms of the proportions of coloured balls in the urn process.

Remark 2. Using the approach outlined in [7], Theorem 3.1 continues to hold for the unrooted α-tree models.

With Theorem 3.1, we are ready to present one of our main results in this paper concerning limit theorems on the joint distribution of the number of cherries C_n and the number of pitchforks A_n under the Ford model.

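Theorem 3.2. Under the Ford model with parameter α ∈ [0, 1], we have

\[
\frac{1}{n}\,(A_n, C_n) \xrightarrow{\ \text{a.s.}\ } (\nu, \mu) := \frac{1 - \alpha}{2(3 - 2\alpha)}\,(1, 2)
\]

and

\[
\frac{(A_n, C_n) - n(\nu, \mu)}{\sqrt{n}} \xrightarrow{\ d\ } \mathcal{N}\bigl((0, 0),\, S\bigr),
\]

where

\[
S = \begin{pmatrix} \tau^2 & \rho \\ \rho & \sigma^2 \end{pmatrix}
= \frac{1 - \alpha}{(3 - 2\alpha)^2 (5 - 4\alpha)}
\begin{pmatrix}
\dfrac{-24\alpha^3 + 96\alpha^2 - 135\alpha + 69}{4(7 - 4\alpha)} & -\dfrac{(2 - \alpha)(1 - 2\alpha)}{2} \\[2ex]
-\dfrac{(2 - \alpha)(1 - 2\alpha)}{2} & 2 - \alpha
\end{pmatrix}. \tag{10}
\]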

Remark 3. We consider special cases of the α-tree model that are commonly studied in phylogenetics. The first two have been established in [7].

1. The uniform model corresponds to α = 1/2, where all edges, internal or leaf, are selected with equal weight, and the limit results hold with

\[
(\nu, \mu) = \frac{1}{8}\,(1, 2)
\qquad\text{and}\qquad
\begin{pmatrix} \tau^2 & \rho \\ \rho & \sigma^2 \end{pmatrix}
= \frac{1}{64} \begin{pmatrix} 3 & 0 \\ 0 & 4 \end{pmatrix}.
\]

2. The Yule model corresponds to α = 0, where only leaf edges are selected, with equal weight, and the limit results hold with

\[
(\nu, \mu) = \frac{1}{6}\,(1, 2)
\qquad\text{and}\qquad
\begin{pmatrix} \tau^2 & \rho \\ \rho & \sigma^2 \end{pmatrix}
= \frac{1}{45} \begin{pmatrix} 69/28 & -1 \\ -1 & 2 \end{pmatrix}.
\]

3. The Comb model corresponds to α = 1, a degenerate case. It is easy to see that (ν, µ) = (0, 0) and τ² = ρ = σ² = 0.

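Proof of Theorem 3.2. First note that the case α = 1 reduces to the degenerate Comb model, and therefore we only consider α ∈ [0, 1). The limiting results for the case α = 0 have been obtained in [7], and they agree with the above results when α = 0. Thus, it is enough to prove the result for α ∈ (0, 1).

By (1), we have (A_n, C_n) = U_nQ with

\[
Q^{\top} = \frac{1}{2}
\begin{pmatrix}
1 & 0 & 0 & 0 & 0 & 0 \\
1 & 1 & 0 & 0 & 0 & 0
\end{pmatrix}. \tag{11}
\]

Since

\[
\frac{U_n}{n} \xrightarrow{\ \text{a.s.}\ } v = \frac{1}{2(3 - 2\alpha)}\,\bigl(2(1 - \alpha),\; 2(1 - \alpha),\; 1 - \alpha,\; 1 + \alpha,\; 1 - \alpha,\; 5 - 3\alpha\bigr), \tag{12}
\]

using the relation from equation (1) we get

\[
\frac{1}{n}\,(A_n, C_n) = \frac{U_n}{n}\, Q \xrightarrow{\ \text{a.s.}\ } v\, Q = \frac{1 - \alpha}{2(3 - 2\alpha)}\,(1, 2).
\]

This concludes the proof of the almost sure convergence. We now prove the central limit theorem and obtain the expression for the limiting variance matrix.

Denoting the covariance matrix Σ by (σ_{i,j}) for 1 ≤ i, j ≤ 6, we consider the matrix

\[
\begin{aligned}
S = Q^{\top} \Sigma\, Q
&= \frac{1}{4}
\begin{pmatrix}
\sigma_{1,1} & \sigma_{1,1} + \sigma_{1,2} \\
\sigma_{1,1} + \sigma_{2,1} & \sigma_{1,1} + \sigma_{2,1} + \sigma_{1,2} + \sigma_{2,2}
\end{pmatrix} \\
&= \frac{1 - \alpha}{16(3 - 2\alpha)^2 (5 - 4\alpha)(7 - 4\alpha)}
\begin{pmatrix}
-12\varphi_1 & -12\varphi_1 + 4\varphi_2 \\
-12\varphi_1 + 4\varphi_2 & -12\varphi_1 + 8\varphi_2 - 4\varphi_3
\end{pmatrix} \\
&= \frac{1 - \alpha}{(3 - 2\alpha)^2 (5 - 4\alpha)}
\begin{pmatrix}
\dfrac{-24\alpha^3 + 96\alpha^2 - 135\alpha + 69}{4(7 - 4\alpha)} & -\dfrac{(2 - \alpha)(1 - 2\alpha)}{2} \\[2ex]
-\dfrac{(2 - \alpha)(1 - 2\alpha)}{2} & 2 - \alpha
\end{pmatrix}.
\end{aligned}
\]

Since (A_n, C_n) = U_nQ, where Q is as defined in (11), we get

\[
\frac{(A_n, C_n) - n(\nu, \mu)}{\sqrt{n}} = \frac{1}{\sqrt{n}}\,(U_n - n v)\, Q \xrightarrow{\ d\ } \mathcal{N}\bigl(0,\, Q^{\top} \Sigma\, Q\bigr) = \mathcal{N}(0, S).
\]

This completes the proof.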
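The algebra behind S = Q^TΣQ can be checked with exact rational arithmetic. The Python sketch below is an illustration only (the function names are hypothetical): it builds the three entries of Σ that enter the computation from φ₁, φ₂, φ₃, forms Q^TΣQ, and compares the result with the closed form in (10).

```python
from fractions import Fraction


def phi123(a):
    """The polynomials phi_1, phi_2, phi_3 from (6), evaluated at alpha = a."""
    p1 = 8 * a**3 - 32 * a**2 + 45 * a - 23
    p2 = 40 * a**3 - 164 * a**2 + 221 * a - 97
    p3 = 56 * a**3 - 248 * a**2 + 367 * a - 181
    return p1, p2, p3


def s_from_sigma(a):
    """(tau^2, rho, sigma^2) computed as Q^T Sigma Q from the entries of (9)."""
    p1, p2, p3 = phi123(a)
    c = (1 - a) / (4 * (3 - 2 * a) ** 2 * (5 - 4 * a) * (7 - 4 * a))
    s11, s12, s22 = c * (-12 * p1), c * (4 * p2), c * (-4 * p3)
    # Q^T = (1/2) [[1, 0, ...], [1, 1, 0, ...]], so only s11, s12, s22 matter.
    return s11 / 4, (s11 + s12) / 4, (s11 + 2 * s12 + s22) / 4


def s_closed_form(a):
    """(tau^2, rho, sigma^2) from the closed form (10)."""
    c = (1 - a) / ((3 - 2 * a) ** 2 * (5 - 4 * a))
    tau2 = c * (-24 * a**3 + 96 * a**2 - 135 * a + 69) / (4 * (7 - 4 * a))
    rho = -c * (2 - a) * (1 - 2 * a) / 2
    sigma2 = c * (2 - a)
    return tau2, rho, sigma2
```

Passing `Fraction` values for α keeps every step exact; at α = 0 and α = 1/2 the output reduces to the Yule and uniform matrices listed in Remark 3.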

We end this subsection with the following results on the behaviour of the first and second moments of the limiting joint distribution of cherries and pitchforks over the parameter region, as indicated by their plots in Figure 2.

Corollary 3.3. (i) For 0 < α < 1, A_n/C_n → 1/2 almost surely as n → ∞. That is, the number of pitchforks is asymptotically equal to the number of essential cherries.

(ii) A_n/n → (1 − α)/(2(3 − 2α)) almost surely, and this limit decreases strictly from 1/6 to 0 as α increases from 0 to 1.

(iii) The limiting variance of A_n/√n, τ², decreases strictly from 23/420 to 0 as α increases from 0 to 1.

(iv) The limiting variance of C_n/√n, σ², increases strictly from 2/45 to 0.0695 over (0, a₀) and decreases from 0.0695 to 0 over (a₀, 1), where a₀ = 0.7339 is the unique root of 19 − 48α + 36α² − 8α³ = 0 in (0, 1).

(v) The limiting covariance of A_n/√n and C_n/√n changes sign from negative to positive at α = 1/2. Specifically, it increases from −1/45 to 0.0225 over (0, a₁) and decreases from 0.0225 to 0 over (a₁, 1), where a₁ = 0.8688 is the unique root of −24α⁴ + 160α³ − 370α² + 358α − 123 = 0 in (0, 1).

Figure 2: Plot of the limiting covariances of the joint distribution of cherries and pitchforks with respect to the parameter α under the Ford model.
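The turning points a₀ and a₁ in Corollary 3.3 can be located numerically. The following Python sketch uses plain bisection on the two polynomials stated in parts (iv) and (v); the helper `bisect_root` and the other function names are assumptions introduced for illustration, and the limiting variance and covariance are taken from (10).

```python
def bisect_root(f, lo, hi, tol=1e-12):
    """Bisection for a sign change of f on [lo, hi]."""
    f_lo = f(lo)
    if f_lo * f(hi) > 0:
        raise ValueError("f must change sign on [lo, hi]")
    while hi - lo > tol:
        mid = (lo + hi) / 2
        f_mid = f(mid)
        if f_lo * f_mid <= 0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return (lo + hi) / 2


def cubic_iv(a):
    """Polynomial whose root in (0, 1) is a_0 (Corollary 3.3 (iv))."""
    return 19 - 48 * a + 36 * a**2 - 8 * a**3


def quartic_v(a):
    """Polynomial whose root in (0, 1) is a_1 (Corollary 3.3 (v))."""
    return -24 * a**4 + 160 * a**3 - 370 * a**2 + 358 * a - 123


def sigma2(a):
    """Limiting variance of C_n / sqrt(n), from (10)."""
    return (1 - a) * (2 - a) / ((3 - 2 * a) ** 2 * (5 - 4 * a))


def rho(a):
    """Limiting covariance of A_n / sqrt(n) and C_n / sqrt(n), from (10)."""
    return -(1 - a) * (2 - a) * (1 - 2 * a) / (2 * (3 - 2 * a) ** 2 * (5 - 4 * a))


a0 = bisect_root(cubic_iv, 0.0, 1.0)
a1 = bisect_root(quartic_v, 0.0, 1.0)
```

Evaluating `sigma2(a0)` and `rho(a1)` then gives the peak values quoted in the corollary, up to rounding.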

3.2 A uniform urn model derived from U_n

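For α ∈ (0, 1), consider the diagonal 6 × 6 matrix T_α = diag(1 − α, 1 − α, 1 − α, 1 − α, α, α) and

\[
\tilde{U}_n := U_n T_\alpha = \bigl((1 - \alpha) U_{n,1},\, \dots,\, (1 - \alpha) U_{n,4},\, \alpha U_{n,5},\, \alpha U_{n,6}\bigr).
\]

Clearly, there is a one-to-one correspondence between U_n and Ũ_n = U_nT_α for α ∈ (0, 1), and therefore it is sufficient to obtain the limiting results for the urn process Ũ_n. Note that the off-diagonal elements of the replacement matrix R_α = RT_α are not all non-negative; therefore we use the limit results from [7] to obtain the convergence results for the urn process Ũ_n.

Theorem 3.4. Suppose α ∈ (0, 1). Then (Ũ_n)_{n≥0} is a uniform urn process with replacement matrix R_α = RT_α and

\[
\frac{\tilde{U}_n}{n} \xrightarrow{\ \text{a.s.}\ } \tilde{v}_1, \tag{13}
\]

where

\[
\tilde{v}_1 = \frac{1}{2(3 - 2\alpha)}\,\bigl(2(1 - \alpha)^2,\; 2(1 - \alpha)^2,\; (1 - \alpha)^2,\; 1 - \alpha^2,\; \alpha(1 - \alpha),\; \alpha(5 - 3\alpha)\bigr) \tag{14}
\]

is the normalized left eigenvector of R_α corresponding to the largest eigenvalue λ₁ = 1. Furthermore,

\[
\frac{\tilde{U}_n - n \tilde{v}_1}{\sqrt{n}} \xrightarrow{\ d\ } \mathcal{N}(0, \tilde{\Sigma}), \tag{15}
\]

where, with the polynomials φ₁, . . . , φ₆ defined in (6) and β = 1 − α,

\[
\tilde{\Sigma} = \frac{\beta}{4(3 - 2\alpha)^2 (5 - 4\alpha)(7 - 4\alpha)}
\begin{pmatrix}
-12\beta^2\varphi_1 & 4\beta^2\varphi_2 & -6\beta^2\varphi_1 & -2\beta^2\varphi_4 & 2\alpha\beta\varphi_2 & -2\alpha\beta\varphi_2 \\
4\beta^2\varphi_2 & -4\beta^2\varphi_3 & 2\beta^2\varphi_2 & -2\beta^2\varphi_6 & -2\alpha\beta\varphi_3 & 2\alpha\beta\varphi_3 \\
-6\beta^2\varphi_1 & 2\beta^2\varphi_2 & -3\beta^2\varphi_1 & -\beta^2\varphi_4 & \alpha\beta\varphi_2 & -\alpha\beta\varphi_2 \\
-2\beta^2\varphi_4 & -2\beta^2\varphi_6 & -\beta^2\varphi_4 & \beta^2\varphi_5 & -\alpha\beta\varphi_6 & \alpha\beta\varphi_6 \\
2\alpha\beta\varphi_2 & -2\alpha\beta\varphi_3 & \alpha\beta\varphi_2 & -\alpha\beta\varphi_6 & -\alpha^2\varphi_3 & \alpha^2\varphi_3 \\
-2\alpha\beta\varphi_2 & 2\alpha\beta\varphi_3 & -\alpha\beta\varphi_2 & \alpha\beta\varphi_6 & \alpha^2\varphi_3 & -\alpha^2\varphi_3
\end{pmatrix}. \tag{16}
\]

Proof of Theorem 3.4. First, observe that at any time n, there are n + 2 pendant edges and n + 1 internal edges in a rooted tree. That is,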

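\[
U_{n,1} + U_{n,2} + U_{n,3} + U_{n,4} = n + 2
\qquad\text{and}\qquad
U_{n,5} + U_{n,6} = n + 1.
\]

This gives

\[
\|\tilde{U}_n\|_1 = (1 - \alpha) \sum_{j=1}^{4} U_{n,j} + \alpha \sum_{j=5}^{6} U_{n,j} = (1 - \alpha)(n + 2) + \alpha(n + 1) = n + 2 - \alpha.
\]

Therefore, from (2) we get

\[
\mathbb{E}[\chi_n \mid \mathcal{F}_{n-1}] = \frac{U_{n-1} T_\alpha}{\|U_{n-1} T_\alpha\|_1} = \frac{U_{n-1} T_\alpha}{n + 1 - \alpha},
\]

and

\[
\mathbb{E}[U_n \mid \mathcal{F}_{n-1}] = U_{n-1} + \mathbb{E}[\chi_n \mid \mathcal{F}_{n-1}]\, R = U_{n-1} + \frac{1}{n + 1 - \alpha}\, U_{n-1} T_\alpha R.
\]

Multiplying both sides by T_α on the right, we get

\[
\mathbb{E}[\tilde{U}_n \mid \mathcal{F}_{n-1}] = \tilde{U}_{n-1} + \left( \frac{\tilde{U}_{n-1}}{\|\tilde{U}_{n-1}\|_1} \right) R\, T_\alpha.
\]

Hence, (Ũ_n)_{n≥0} is a classical uniform urn model with replacement matrix R_α = RT_α.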

Note that (A1) holds because the general Ford’s dynamics on a rooted tree is well defined at every time n, thus the corresponding urn model satisfies the assumption of tenability. That is, it is always possible to draw balls without getting stuck with the replacement rule. Note that R_α is diagonalisable as

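\[
V R_\alpha V^{-1} = \Lambda
\qquad\text{holds with}\qquad
\Lambda = \operatorname{diag}\bigl(1,\, 0,\, 0,\, 0,\, -2(1 - \alpha),\, -(3 - 2\alpha)\bigr),
\]

where, with β = 1 − α,

\[
V^{-1} =
\begin{pmatrix}
1 & \frac{1}{\beta} & 0 & 0 & 1 & 1 - \alpha \\
1 & 0 & \frac{1}{\beta} & 0 & 1 & 3 - \alpha \\
1 & \frac{-2}{\beta} & 0 & \frac{3}{\beta} & \frac{-(2 - \alpha)}{\beta} & -5 + \alpha \\
1 & 0 & 0 & \frac{1}{\beta} & \frac{-(2 - \alpha)}{\beta} & -3 + \alpha \\
1 & 0 & \frac{-2}{\alpha} & \frac{1}{\alpha} & 1 & 3 - \alpha \\
1 & 0 & 0 & \frac{-1}{\alpha} & 1 & 1 - \alpha
\end{pmatrix} \tag{17}
\]

and

\[
V = \frac{1}{2(3 - 2\alpha)}
\begin{pmatrix}
2\beta^2 & 2\beta^2 & \beta^2 & (1 + \alpha)\beta & \alpha\beta & \alpha(5 - 3\alpha) \\
2\beta(1 + \alpha - \alpha^2) & -2\beta^3 & -(2 - \alpha)\beta^2 & (2 - \alpha)\beta^2 & -\alpha\beta^2 & -\alpha\beta(5 - 3\alpha) \\
2\alpha\beta^2 & 2\alpha(2 - \alpha)\beta & \alpha\beta^2 & -\alpha\beta^2 & -\alpha(3 - \alpha)\beta & -3\alpha\beta^2 \\
2\alpha(2 - \alpha)\beta & 2\alpha\beta^2 & \alpha(2 - \alpha)\beta & -\alpha(2 - \alpha)\beta & \alpha^2\beta & -3\alpha(2 - \alpha)\beta \\
2(2 - \alpha)\beta & -2\beta^2 & (2 - \alpha)\beta & -(4 - \alpha)\beta & -\alpha\beta & \alpha\beta \\
-2\beta & 2\beta & -\beta & \beta & \alpha & -\alpha
\end{pmatrix}. \tag{18}
\]

Therefore, R_α satisfies condition (A4). Next, (A2) holds because R_α has eigenvalues

1, 0, 0, 0, −2(1 − α), −(3 − 2α),

which are all real, and the maximal eigenvalue λ₁ = 1 is positive with λ₁ > 2λ for all other eigenvalues λ of R_α. Furthermore, put u_i = V⁻¹e_i^T and v_i = e_iV for 1 ≤ i ≤ 6. Then (A3) follows by noting that u₁ = (1, 1, 1, 1, 1, 1)^T is the principal right eigenvector, and

\[
\tilde{v}_1 = \frac{1}{2(3 - 2\alpha)}\,\bigl(2(1 - \alpha)^2,\; 2(1 - \alpha)^2,\; (1 - \alpha)^2,\; 1 - \alpha^2,\; \alpha(1 - \alpha),\; \alpha(5 - 3\alpha)\bigr)
\]

is the principal left eigenvector.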

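Since all the assumptions (A1)–(A4) are satisfied by the replacement matrix R_α, (13) holds by Theorem 2.1. Furthermore, since λ₁ = 1 and

\[
\tilde{\Sigma} = \sum_{i,j=2}^{6} \frac{\lambda_i \lambda_j\; u_i^{\top} \operatorname{diag}(\tilde{v}_1)\, u_j}{1 - \lambda_i - \lambda_j}\; v_i^{\top} v_j, \tag{19}
\]

it follows from Theorem 2.1 that (15) holds.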
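Some of the spectral claims used in this proof can be verified with exact rational arithmetic. The Python sketch below is a partial check only, not a substitute for the proof. It forms R_α = RT_α, confirms that the vector in (14) is a left eigenvector for eigenvalue 1 and that the all-ones vector is a right eigenvector, and evaluates det(R_α − λI) at the claimed eigenvalues. All function names are hypothetical.

```python
from fractions import Fraction

R = [
    [0, 0, 0, 1, 0, 1],
    [2, -2, 1, 0, -1, 2],
    [-2, 4, -1, 0, 2, -1],
    [0, 2, 0, -1, 1, 0],
    [2, -2, 1, 0, -1, 2],
    [0, 0, 0, 1, 0, 1],
]


def r_alpha(alpha):
    """R_alpha = R T_alpha: column j of R is scaled by the j-th diagonal entry of T_alpha."""
    t = [1 - alpha] * 4 + [alpha] * 2
    return [[R[i][j] * t[j] for j in range(6)] for i in range(6)]


def left_mul(w, m):
    """Row vector times matrix."""
    return [sum(w[i] * m[i][j] for i in range(6)) for j in range(6)]


def principal_left_eigenvector(alpha):
    """The probability vector in (14)."""
    b = 1 - alpha
    w = [2 * b * b, 2 * b * b, b * b, (1 + alpha) * b, alpha * b, alpha * (5 - 3 * alpha)]
    return [x / (2 * (3 - 2 * alpha)) for x in w]


def det(m):
    """Determinant by Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in row] for row in m]
    n, d = len(m), Fraction(1)
    for c in range(n):
        piv = next((r for r in range(c, n) if m[r][c] != 0), None)
        if piv is None:
            return Fraction(0)
        if piv != c:
            m[c], m[piv] = m[piv], m[c]
            d = -d
        d *= m[c][c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for k in range(c, n):
                m[r][k] -= f * m[c][k]
    return d


def char_poly_at(alpha, lam):
    """det(R_alpha - lam * I)."""
    m = r_alpha(alpha)
    return det([[m[i][j] - (lam if i == j else 0) for j in range(6)] for i in range(6)])
```

A full check of (17) and (18) would additionally multiply out V R_α V⁻¹; that is omitted here to keep the sketch short.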

3.3 Proof of Theorem 3.1

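Proof. Observe that Σ_{i=1}^{6} U_{n,i} = 3 + 2n (since 2 balls are added to the urn at every time point); thus the vector of colour proportions is U_n/(3 + 2n). Since α ∈ (0, 1), it follows that T_α is invertible and its inverse is

\[
T_\alpha^{-1} = \frac{1}{\alpha(1 - \alpha)}\,\operatorname{diag}(\alpha,\, \alpha,\, \alpha,\, \alpha,\, 1 - \alpha,\, 1 - \alpha),
\]

which is also a diagonal matrix, and so (T_α⁻¹)^T = T_α⁻¹. Note that U_n = Ũ_nT_α⁻¹, and consider

\[
v = \tilde{v}_1 T_\alpha^{-1} = \frac{1}{2(3 - 2\alpha)}\,\bigl(2(1 - \alpha),\; 2(1 - \alpha),\; 1 - \alpha,\; 1 + \alpha,\; 1 - \alpha,\; 5 - 3\alpha\bigr).
\]

Since Ũ_n/n → ṽ₁ almost surely in view of (13) in Theorem 3.4, we have

\[
\frac{U_n}{n} \xrightarrow{\ \text{a.s.}\ } v, \tag{20}
\]

which concludes the proof of the almost sure convergence in (7).

Consider the covariance matrix Σ̃ for Ũ_n as stated in (16). Then by a straightforward calculation we have

\[
\Sigma = (T_\alpha^{-1})^{\top}\, \tilde{\Sigma}\, T_\alpha^{-1} = T_\alpha^{-1}\, \tilde{\Sigma}\, T_\alpha^{-1}.
\]

Therefore, since

\[
\frac{\tilde{U}_n - n \tilde{v}_1}{\sqrt{n}} \xrightarrow{\ d\ } \mathcal{N}(0, \tilde{\Sigma})
\]

in view of Theorem 3.4, we get

\[
\frac{U_n - n v}{\sqrt{n}} \xrightarrow{\ d\ } \mathcal{N}\bigl(0,\, (T_\alpha^{-1})^{\top}\, \tilde{\Sigma}\, T_\alpha^{-1}\bigr) = \mathcal{N}(0, \Sigma).
\]

This completes the proof.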


4 Exact Distributions

In this section, we present recursion formulas for exact computation of the joint distributions of cherries and pitchforks, their means, variances and covariance for fixed n under the Ford model.

We begin with the following notation. Given a phylogenetic tree T, let E₁(T) be the set of pendant edges that are contained in a pitchfork but not in a cherry; E₂(T) the set of edges in T that are contained in a cherry but not in a pitchfork (note that in our notation a cherry contains three edges); E₃(T) the set of pendant edges that are contained in neither a cherry nor a pitchfork; and E₄(T) = E(T) \ (E₁(T) ∪ E₂(T) ∪ E₃(T)). Thus E(T) decomposes into the disjoint union of these four sets of edges, i.e., E(T) = E₁(T) ⊔ E₂(T) ⊔ E₃(T) ⊔ E₄(T), where ⊔ denotes disjoint union. Let C(T) and A(T) be the numbers of cherries and pitchforks in a tree T. The following result, presented in [25], will be useful later.

Lemma 4.1. Suppose that T is a phylogenetic tree with n leaves. Then we have

\[
E(T) = E_1(T) \sqcup E_2(T) \sqcup E_3(T) \sqcup E_4(T). \tag{21}
\]

In addition, we have |E₁(T)| = A(T), |E₂(T)| = 3(C(T) − A(T)), |E₃(T)| = n − A(T) − 2C(T), and |E₄(T)| = n − 1 + 3A(T) − C(T). Furthermore, suppose that e is an edge in T and T′ = T[e] is the tree obtained from T by attaching a new leaf to e. Then we have

\[
A(T') =
\begin{cases}
A(T) & \text{if } e \in E_3(T) \cup E_4(T), \\
A(T) - 1 & \text{if } e \in E_1(T), \\
A(T) + 1 & \text{if } e \in E_2(T);
\end{cases}
\qquad\text{and}\qquad
C(T') =
\begin{cases}
C(T) & \text{if } e \in E_2(T) \cup E_4(T), \\
C(T) + 1 & \text{if } e \in E_1(T) \cup E_3(T).
\end{cases}
\]

We start with the following result on the exact computation of the joint probability mass function (pmf) of A_n and C_n, which can be regarded as a generalization of the previous results on the Yule model (i.e., α = 0; see [25, Theorem 1]) and the uniform model (i.e., α = 1/2; see [25, Theorem 4]). A related result for unrooted trees is presented in [8].

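Theorem 4.2. For n ≥ 3, 0 ≤ a ≤ n/3 and 1 ≤ b ≤ n/2, under the Ford model with parameter α ∈ [0, 1] we have

\[
\begin{aligned}
\mathbb{P}(A_{n+1} = a,\, C_{n+1} = b)
&= \frac{2a + \alpha(n - a - b - 1)}{n - \alpha}\; \mathbb{P}(A_n = a,\, C_n = b)
 + \frac{(1 - \alpha)(a + 1)}{n - \alpha}\; \mathbb{P}(A_n = a + 1,\, C_n = b - 1) \\
&\quad + \frac{(2 - \alpha)(b - a + 1)}{n - \alpha}\; \mathbb{P}(A_n = a - 1,\, C_n = b)
 + \frac{(1 - \alpha)(n - a - 2b + 2)}{n - \alpha}\; \mathbb{P}(A_n = a,\, C_n = b - 1).
\end{aligned}
\]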

Proof of Theorem 4.2. Fix n ≥ 3, and let T₂, . . . , T_n, T_{n+1} be a sequence of random trees generated by the Ford process, that is, T₂ contains two leaves and T_{i+1} = T_i[e_i] for a random edge e_i in T_i chosen according to the Ford model for 2 ≤ i ≤ n. Then we have

\[
\begin{aligned}
\mathbb{P}(A_{n+1} = a,\, C_{n+1} = b)
&= \mathbb{P}(A(T_{n+1}) = a,\, C(T_{n+1}) = b) \\
&= \sum_{p,q} \mathbb{P}(A(T_{n+1}) = a,\, C(T_{n+1}) = b \mid A(T_n) = p,\, C(T_n) = q)\; \mathbb{P}(A(T_n) = p,\, C(T_n) = q) \\
&= \sum_{p,q} \mathbb{P}(A(T_{n+1}) = a,\, C(T_{n+1}) = b \mid A(T_n) = p,\, C(T_n) = q)\; \mathbb{P}(A_n = p,\, C_n = q),
\end{aligned} \tag{22}
\]

where the equalities follow from the definition of the random variables A_n and C_n and from the law of total probability.

Let e_n be the edge in T_n chosen in the above Ford process for generating T_{n+1}, that is, T_{n+1} = T_n[e_n]. Since Lemma 4.1 implies that

\[
\mathbb{P}(A(T_{n+1}) = a,\, C(T_{n+1}) = b \mid A(T_n) = p,\, C(T_n) = q) = 0 \tag{23}
\]

for (p, q) ∉ {(a, b), (a + 1, b − 1), (a − 1, b), (a, b − 1)}, it suffices to consider the following four cases in the summation in (22): case (i): p = a, q = b; case (ii): p = a + 1, q = b − 1; case (iii): p = a − 1, q = b; and case (iv): p = a, q = b − 1.

Firstly, Lemma 4.1 implies that case (i) occurs if and only if e_n ∈ E₄(T_n). Using Lemma 4.1 again, it follows that E₄(T_n) contains precisely 2A(T_n) pendant edges and (n − 1) + A(T_n) − C(T_n) interior edges. Therefore we have

\[
\mathbb{P}(A(T_{n+1}) = a,\, C(T_{n+1}) = b \mid A(T_n) = a,\, C(T_n) = b)
= \frac{2A(T_n)(1 - \alpha) + \alpha\,(n - 1 + A(T_n) - C(T_n))}{n - \alpha}
= \frac{2a + \alpha(n - a - b - 1)}{n - \alpha}. \tag{24}
\]

Similarly, Lemma 4.1 implies that case (ii) occurs if and only if e_n ∈ E₁(T_n). Using Lemma 4.1 again, it follows that E₁(T_n) contains precisely A(T_n) pendant edges and no interior edges. Therefore we have

\[
\mathbb{P}(A(T_{n+1}) = a,\, C(T_{n+1}) = b \mid A(T_n) = a + 1,\, C(T_n) = b - 1) = \frac{(a + 1)(1 - \alpha)}{n - \alpha}. \tag{25}
\]

Next, Lemma 4.1 implies that case (iii) occurs if and only if e_n ∈ E₂(T_n). Using Lemma 4.1 again, it follows that E₂(T_n) contains precisely 2(C(T_n) − A(T_n)) pendant edges and C(T_n) − A(T_n) interior edges. Thus we have

\[
\mathbb{P}(A(T_{n+1}) = a,\, C(T_{n+1}) = b \mid A(T_n) = a - 1,\, C(T_n) = b)
= \frac{2(b - a + 1)(1 - \alpha) + \alpha(b - a + 1)}{n - \alpha}
= \frac{(2 - \alpha)(b - a + 1)}{n - \alpha}. \tag{26}
\]

Finally, Lemma 4.1 implies that case (iv) occurs if and only if e_n is contained in E₃(T_n). Using Lemma 4.1 again, it follows that E₃(T_n) contains precisely n − A(T_n) − 2C(T_n) pendant edges and no interior edges. Hence, it follows that

\[
\mathbb{P}(A(T_{n+1}) = a,\, C(T_{n+1}) = b \mid A(T_n) = a,\, C(T_n) = b - 1) = \frac{(1 - \alpha)(n - a - 2b + 2)}{n - \alpha}. \tag{27}
\]

Substituting Eq. (24)–(27) into Eq. (22) completes the proof of the theorem.
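The recursion of Theorem 4.2 translates directly into a short dynamic program. The Python sketch below is one possible implementation under stated assumptions: it pushes probability mass forward from each state (A_n, C_n) using the four edge classes of Lemma 4.1 (which is equivalent to the backward form in the theorem), starts from T₃ (a single pitchfork), and uses exact fractions. The function name `joint_pmf` is hypothetical.

```python
from fractions import Fraction


def joint_pmf(n_leaves, alpha):
    """Return {(a, c): P(A_n = a, C_n = c)} for n = n_leaves >= 3."""
    pmf = {(1, 1): Fraction(1)}  # T_3 has one pitchfork and one cherry
    for n in range(3, n_leaves):
        nxt = {}
        for (a, c), p in pmf.items():
            # (next state, total selection weight of the edge class)
            moves = [
                ((a, c), 2 * a + alpha * (n - a - c - 1)),      # e in E_4
                ((a - 1, c + 1), (1 - alpha) * a),               # e in E_1
                ((a + 1, c), (2 - alpha) * (c - a)),             # e in E_2
                ((a, c + 1), (1 - alpha) * (n - a - 2 * c)),     # e in E_3
            ]
            for state, w in moves:
                if w:
                    nxt[state] = nxt.get(state, 0) + p * w / (n - alpha)
        pmf = nxt
    return pmf


def mean_cherries(pmf):
    """E[C_n] computed from a joint pmf."""
    return sum(c * p for (_, c), p in pmf.items())
```

For example, `joint_pmf(4, Fraction(1, 3))` places mass 3/4 on (A₄, C₄) = (1, 1) and 1/4 on (0, 2), matching the probability (1 − α)/(3 − α) of attaching the fourth leaf to the single colour-3 edge of T₃.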

To study the moments of A_n and C_n, we present below a functional recursion form of Theorem 4.2, whose proof is straightforward and hence omitted here.

Theorem 4.3. Let ϕ : N × N → R be an arbitrary function. For n ≥ 3, under the Ford model with parameter α ∈ [0, 1] we have

\[
\begin{aligned}
(n - \alpha)\, \mathbb{E}\,\phi(A_{n+1}, C_{n+1}) = \mathbb{E}\Bigl[
&\bigl(\alpha(n - A_n - C_n - 1) + 2A_n\bigr)\, \phi(A_n, C_n)
 + (1 - \alpha) A_n\, \phi(A_n - 1, C_n + 1) \\
&+ (2 - \alpha)(C_n - A_n)\, \phi(A_n + 1, C_n)
 + (1 - \alpha)(n - A_n - 2C_n)\, \phi(A_n, C_n + 1)
\Bigr].
\end{aligned}
\]

For a fixed integer k, consider the indicator function I_k(x, y) that equals 1 if y = k, and 0 otherwise. Applying Theorem 4.3 with ϕ = I_k yields the following result on the distribution of cherries.

Corollary 4.4. For integers n ≥ 3 and 0 ≤ k ≤ n/2, under the Ford model with parameter α ∈ [0, 1] we have

\[
(n - \alpha)\, \mathbb{P}(C_{n+1} = k) = \bigl[(n - 1)\alpha + 2(1 - \alpha)k\bigr]\, \mathbb{P}(C_n = k) + (1 - \alpha)(n - 2k + 2)\, \mathbb{P}(C_n = k - 1).
\]

For use in the next section, we end this section by writing the recurrence relations for the first and second moments in the following form.

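Corollary 4.5. For n ≥ 3, under the Ford model with parameter α ∈ [0, 1] we have

\[
(n - \alpha)\, \mathbb{E}[C_{n+1}] - (n - 2 + \alpha)\, \mathbb{E}[C_n] = n(1 - \alpha), \tag{28}
\]

\[
(n - \alpha)\, \mathbb{E}[A_{n+1}] - (n - 3 + \alpha)\, \mathbb{E}[A_n] = (2 - \alpha)\, \mathbb{E}[C_n], \tag{29}
\]

\[
(n - \alpha)\, \mathbb{E}[C_{n+1}^2] - (n - 4 + 3\alpha)\, \mathbb{E}[C_n^2] = 2(n - 1)(1 - \alpha)\, \mathbb{E}[C_n] + n(1 - \alpha), \tag{30}
\]

\[
(n - \alpha)\, \mathbb{E}[A_{n+1} C_{n+1}] - (n - 5 + 3\alpha)\, \mathbb{E}[A_n C_n] = (n - 1)(1 - \alpha)\, \mathbb{E}[A_n] + (2 - \alpha)\, \mathbb{E}[C_n^2], \tag{31}
\]

\[
(n - \alpha)\, \mathbb{E}[A_{n+1}^2] - (n - 6 + 3\alpha)\, \mathbb{E}[A_n^2] = 2(2 - \alpha)\, \mathbb{E}[A_n C_n] + (2 - \alpha)\, \mathbb{E}[C_n] - \mathbb{E}[A_n], \tag{32}
\]

with initial conditions E[A₃] = E[C₃] = E[A₃²] = E[C₃²] = E[A₃C₃] = 1.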

Remark 4. Let µ_n = E[C_n] and σ_n² = var(C_n). Substituting E[C_n²] = σ_n² + µ_n² into (30) and applying (28), we obtain the following recurrence relation for σ_n², which was also obtained in Ford’s thesis ([14, Theorem 60]):

\[
(n - \alpha)\, \sigma_{n+1}^2 - (n - 4 + 3\alpha)\, \sigma_n^2
= -\frac{4(1 - \alpha)^2}{n - \alpha}\, \mu_n^2
+ \frac{2(1 - \alpha)\bigl[(1 - 2\alpha)n + \alpha\bigr]}{n - \alpha}\, \mu_n
+ \frac{\alpha(1 - \alpha)\, n(n - 1)}{n - \alpha}.
\]
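The recurrences (28)–(32) can be iterated directly to obtain exact moments for any fixed n. The following Python sketch does this with rational arithmetic, starting from the initial conditions at n = 3; the function name `exact_moments` and the keys of the returned dictionary are assumptions made for illustration.

```python
from fractions import Fraction


def exact_moments(n_leaves, alpha):
    """Iterate (28)-(32) from n = 3 and return means, variances and covariance at n_leaves."""
    ec = ea = ecc = eac = eaa = Fraction(1)  # E[C], E[A], E[C^2], E[AC], E[A^2] at n = 3
    b = 1 - alpha
    for n in range(3, n_leaves):
        d = n - alpha
        new_ec = ((n - 2 + alpha) * ec + n * b) / d                                   # (28)
        new_ea = ((n - 3 + alpha) * ea + (2 - alpha) * ec) / d                        # (29)
        new_ecc = ((n - 4 + 3 * alpha) * ecc + 2 * (n - 1) * b * ec + n * b) / d      # (30)
        new_eac = ((n - 5 + 3 * alpha) * eac + (n - 1) * b * ea + (2 - alpha) * ecc) / d  # (31)
        new_eaa = ((n - 6 + 3 * alpha) * eaa + 2 * (2 - alpha) * eac
                   + (2 - alpha) * ec - ea) / d                                       # (32)
        ec, ea, ecc, eac, eaa = new_ec, new_ea, new_ecc, new_eac, new_eaa
    return {
        "mean_C": ec,
        "mean_A": ea,
        "var_C": ecc - ec**2,
        "cov_AC": eac - ea * ec,
        "var_A": eaa - ea**2,
    }
```

For the Yule model (α = 0) with n = 10 this returns mean_C = 10/3, mean_A = 5/3, var_C = 4/9, cov_AC = −2/9 and var_A = 23/42, which agree with n/3, n/6, 2n/45, −n/45 and 23n/420.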


5 Higher Order Asymptotic Expansion of the Joint Moments

Although the leading terms of the first and second moments of the distributions of cherries and pitchforks, namely E[A_n], E[C_n], var(A_n), var(C_n) and cov(A_n, C_n), can be identified from Theorem 3.2, we derive their higher order expansions in this section for a better understanding of their asymptotic behaviour.

We start with the following result on the first moments. Note that Proposition 5.1 (i) has been obtained in [14].

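Proposition 5.1. Under the Ford model with parameter $\alpha \in [0, 1]$, the following exact expansions hold for $E[C_n]$ and $E[A_n]$.

(i) $$E[C_n] = \frac{1-\alpha}{3-2\alpha}\,n + \frac{\alpha}{2(3-2\alpha)} + x_n,$$ where

$$x_2 = \frac{2-\alpha}{2(3-2\alpha)}, \qquad x_3 = \frac{\alpha}{2(3-2\alpha)}, \qquad x_n = \frac{\alpha}{2(3-2\alpha)}\prod_{i=3}^{n-1}\frac{i-2+\alpha}{i-\alpha}, \quad n \ge 4.$$

Further, as $n \to \infty$,

$$x_n = \frac{\alpha\,\Gamma(3-\alpha)}{2(3-2\alpha)\,\Gamma(1+\alpha)}\,n^{-2(1-\alpha)}(1+o(1)). \tag{33}$$

(ii) $$E[A_n] = \frac{1-\alpha}{2(3-2\alpha)}\,n + \frac{\alpha}{2(3-2\alpha)} + y_n,$$ where

$$y_2 = \frac{\alpha-2}{2(3-2\alpha)}, \qquad y_3 = \frac{1}{2}, \qquad y_n = \frac{1}{2}\prod_{i=3}^{n-1}\frac{i-3+\alpha}{i-\alpha} + \frac{(2-\alpha)\alpha}{2(3-2\alpha)}\cdot\frac{n-3}{n-3+\alpha}\prod_{i=3}^{n-1}\frac{i-2+\alpha}{i-\alpha}, \quad n \ge 4.$$

Further, as $n \to \infty$,

$$y_n = \frac{(2-\alpha)\,\Gamma(3-\alpha)}{2(3-2\alpha)\,\Gamma(\alpha)}\,n^{-2(1-\alpha)}(1+o(1)). \tag{34}$$

Proposition 5.2. Under the Ford model with parameter $\alpha \in [0, 1]$, the following asymptotic expansions hold for $\mathrm{var}(C_n)$, $\mathrm{cov}(A_n, C_n)$ and $\mathrm{var}(A_n)$: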

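(i) $$\mathrm{var}(C_n) = \frac{(1-\alpha)(2-\alpha)}{(3-2\alpha)^2(5-4\alpha)}\,n - \frac{\alpha(1-\alpha)(2-\alpha)}{(3-2\alpha)^2(5-4\alpha)} + O\!\left(n^{-2(1-\alpha)}\right).$$

(ii) $$\mathrm{cov}(A_n, C_n) = -\frac{(1-\alpha)(2-\alpha)(1-2\alpha)}{2(3-2\alpha)^2(5-4\alpha)}\,n - \frac{\alpha(1-\alpha)(2-\alpha)}{(3-2\alpha)^2(5-4\alpha)} + O\!\left(n^{-2(1-\alpha)}\right).$$

(iii) $$\mathrm{var}(A_n) = \frac{(1-\alpha)(69-135\alpha+96\alpha^2-24\alpha^3)}{4(3-2\alpha)^2(5-4\alpha)(7-4\alpha)}\,n + \frac{3\alpha(1-\alpha)(1-2\alpha)(5-3\alpha)}{4(3-2\alpha)^2(5-4\alpha)(7-4\alpha)} + O\!\left(n^{-2(1-\alpha)}\right).$$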

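Remark 5. For large $n$, the sign of $\mathrm{Cov}(A_n, C_n)$ depends on $\alpha$, because the leading coefficient in Proposition 5.2 (ii) contains the factor $-(1-2\alpha)$. Specifically, for $\alpha \in (0, 1/2)$, $A_n$ and $C_n$ are negatively correlated, which is expected; for $\alpha \in (1/2, 1)$, $A_n$ and $C_n$ are positively correlated, which is unexpected.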
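To make the leading terms of Proposition 5.2 and the sign change in Remark 5 concrete, the coefficients of $n$ can be evaluated for particular values of $\alpha$. The helper `leading_coefficients` below is an illustrative name of our own; it only transcribes the three leading coefficients stated in the proposition.

```python
from fractions import Fraction

def leading_coefficients(alpha):
    """Coefficients of n in var(C_n), cov(A_n, C_n) and var(A_n) from Proposition 5.2."""
    common = (3 - 2 * alpha) ** 2 * (5 - 4 * alpha)
    var_c = (1 - alpha) * (2 - alpha) / common
    cov_ac = -(1 - alpha) * (2 - alpha) * (1 - 2 * alpha) / (2 * common)
    var_a = ((1 - alpha) * (69 - 135 * alpha + 96 * alpha ** 2 - 24 * alpha ** 3)
             / (4 * common * (7 - 4 * alpha)))
    return var_c, cov_ac, var_a
```

At $\alpha = 0$ (the Yule model) the three coefficients evaluate to $2/45$, $-1/45$ and $23/420$; at $\alpha = 1/2$ (the uniform model) they evaluate to $1/16$, $0$ and $3/64$, so the covariance is governed by its constant term there.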

5.1 Proofs of Propositions 5.1 and 5.2

We need the lemmas below to prove the two propositions.

Lemma 5.3. Suppose a real sequence $\{X_n, n \ge n_0\}$ satisfies the recursion

$$X_{n+1} = f_n X_n + g_n, \qquad n \ge n_0,$$

where $\{f_n, n \ge n_0\}$ and $\{g_n, n \ge n_0\}$ are sequences such that for every $\ell \ge n_0$, $\bigl|\prod_{i=\ell}^{n} f_i\bigr| \le C(n/\ell)^{-a}$ and $|g_\ell| \le C\ell^{-b}$, for some finite $a$, $b$ and $C > 0$. Then, there exists a finite positive constant $C'$ (which depends on $|X_{n_0}|$ and $C$) such that $|X_n| \le C' n^{-q_{a,b}}$, where $q_{a,b} := \min\{a, b-1\}$.

Proof of Lemma 5.3. It is easy to verify that the solution to the given recursion is

$$X_n = X_{n_0}\prod_{i=n_0}^{n-1} f_i + \sum_{i=n_0}^{n-1} g_i \prod_{j=i+1}^{n-1} f_j, \qquad n \ge n_0.$$

Therefore,

$$|X_n| \le |X_{n_0}|\,\Bigl|\prod_{i=n_0}^{n-1} f_i\Bigr| + \sum_{i=n_0}^{n-1} |g_i|\,\Bigl|\prod_{j=i+1}^{n-1} f_j\Bigr|.$$

Under the assumptions of the lemma,

$$|X_{n_0}|\,\Bigl|\prod_{i=n_0}^{n-1} f_i\Bigr| \le C|X_{n_0}|\,n^{-a} \le C' n^{-a},$$

and

$$\sum_{i=n_0}^{n-1} |g_i|\,\Bigl|\prod_{j=i+1}^{n-1} f_j\Bigr| \le C\sum_{i=n_0}^{n-1} |g_i|\,(n/i)^{-a} \le C^2 n^{-a}\sum_{i=n_0}^{n-1} i^{-b}\,i^{a} \le C' n^{-a}\,n^{-b+a+1} = C' n^{-b+1}.$$

Thus

$$|X_n| \le C'\max\bigl(n^{-a}, n^{-b+1}\bigr) = C' n^{-q_{a,b}}, \qquad \text{where } q_{a,b} = \min\{a, b-1\}.$$

This completes the proof.
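The explicit solution formula used at the start of the proof can be compared with direct iteration of the recursion. The sketch below does this in exact arithmetic; the names `iterate` and `closed_form`, and the particular $f_n$ and $g_n$ in the usage note, are illustrative assumptions rather than objects from the paper.

```python
from fractions import Fraction
from math import prod

def iterate(x0, f, g, n0, n):
    """Apply X_{i+1} = f(i) X_i + g(i) for i = n0, ..., n-1, starting from X_{n0} = x0."""
    x = x0
    for i in range(n0, n):
        x = f(i) * x + g(i)
    return x

def closed_form(x0, f, g, n0, n):
    """Evaluate X_n via the product-sum formula from the proof of Lemma 5.3."""
    head = x0 * prod((f(i) for i in range(n0, n)), start=Fraction(1))
    tail = sum(g(i) * prod((f(j) for j in range(i + 1, n)), start=Fraction(1))
               for i in range(n0, n))
    return head + tail
```

For example, with `f = lambda i: (i - 4 + 3 * a) / (i - a)` for a rational `a` and a hypothetical `g = lambda i: Fraction(1, i ** 3)`, both functions return the same value for any `n >= n0`.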


Lemma 5.4. For finite non-negative integers $l$, $k$ such that $l \ge k$, $m \ge 1$ and $\alpha \in [0, 1]$, there exists a positive constant $K = K(\alpha, l)$ such that

$$\Bigl|\prod_{i=l}^{n-1}\frac{i-k+m\alpha}{i-\alpha}\Bigr| \le K\,(n/l)^{-k+(m+1)\alpha} \qquad \text{for all } 1 \le l \le n-1, \tag{35}$$

and, as $n \to \infty$,

$$\prod_{i=l}^{n-1}\frac{i-k+m\alpha}{i-\alpha} = \frac{\Gamma(l-\alpha)}{\Gamma(l-k+m\alpha)}\,n^{-k+(m+1)\alpha}(1+o(1)). \tag{36}$$

Proof of Lemma 5.4. The bound in (35) follows from Lemma 2 of [7]. We now prove (36).

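Note that we can write

$$\frac{i-k+m\alpha}{i-\alpha} = \frac{\Gamma(i+1-k+m\alpha)\,\Gamma(i-\alpha)}{\Gamma(i-k+m\alpha)\,\Gamma(i+1-\alpha)}.$$

Thus

\begin{align}
\prod_{i=l}^{n-1}\frac{i-k+m\alpha}{i-\alpha} &= \prod_{i=l}^{n-1}\frac{\Gamma(i+1-k+m\alpha)\,\Gamma(i-\alpha)}{\Gamma(i-k+m\alpha)\,\Gamma(i+1-\alpha)} \notag\\
&= \frac{\Gamma(n-k+m\alpha)}{\Gamma(l-k+m\alpha)}\cdot\frac{\Gamma(l-\alpha)}{\Gamma(n-\alpha)} \tag{37}\\
&= \frac{\Gamma(l-\alpha)}{\Gamma(l-k+m\alpha)}\cdot\frac{\Gamma(n+m\alpha)}{\Gamma(n-\alpha)}\prod_{j=1}^{k}\frac{1}{n-j+m\alpha}. \notag
\end{align}

Since $k$ is fixed, as $n \to \infty$,

$$\prod_{j=1}^{k}\frac{1}{n-j+m\alpha} = n^{-k}(1+o(1)). \tag{38}$$

By Stirling's approximation formula, $\Gamma(x) = \sqrt{2\pi}\,x^{x-1/2}e^{-x}(1+o(1))$, we have

\begin{align}
\frac{\Gamma(n+m\alpha)}{\Gamma(n-\alpha)} &= \frac{\sqrt{2\pi}\,(n+m\alpha)^{n+m\alpha-1/2}\,e^{-(n+m\alpha)}}{\sqrt{2\pi}\,(n-\alpha)^{n-\alpha-1/2}\,e^{-(n-\alpha)}}\,(1+o(1)) \notag\\
&= n^{(m+1)\alpha}\,\frac{(1+m\alpha/n)^{n+m\alpha-1/2}}{(1-\alpha/n)^{n-\alpha-1/2}}\,e^{-(m+1)\alpha}\,(1+o(1)) \notag\\
&= n^{(m+1)\alpha}(1+o(1)). \tag{39}
\end{align}

Combining (37), (38) and (39), we get (36).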
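A quick floating-point check of (36) and (37) is sketched below. It compares the product computed term by term with the Gamma-function expression in (37), evaluated through `math.lgamma`, and with the leading term in (36). The three function names are our own, and the code assumes $l - k + m\alpha > 0$ so that all Gamma arguments are positive.

```python
from math import exp, lgamma

def product_direct(l, n, k, m, alpha):
    """Multiply (i - k + m*alpha) / (i - alpha) over i = l, ..., n-1."""
    p = 1.0
    for i in range(l, n):
        p *= (i - k + m * alpha) / (i - alpha)
    return p

def product_gamma(l, n, k, m, alpha):
    """Right-hand side of (37), evaluated through log-gamma for numerical stability."""
    return exp(lgamma(n - k + m * alpha) - lgamma(l - k + m * alpha)
               + lgamma(l - alpha) - lgamma(n - alpha))

def product_asymptotic(l, n, k, m, alpha):
    """Leading term on the right-hand side of (36)."""
    ratio = exp(lgamma(l - alpha) - lgamma(l - k + m * alpha))
    return ratio * n ** (-k + (m + 1) * alpha)
```

For moderately large $n$ the first two functions should agree up to rounding error, while the third approximates them with a relative error that shrinks as $n$ grows.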

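Proof of Proposition 5.1. Recall $\mu_n = E[C_n]$. By Theorem 3.2, $\mu_n = \frac{1-\alpha}{3-2\alpha}\,n + O(1)$. Thus, we write $\mu_n$ as

$$\mu_n = \frac{1-\alpha}{3-2\alpha}\,n + \frac{\alpha}{2(3-2\alpha)} + x_n. \tag{40}$$

For simplicity, the dependence of $\mu_n$ and $x_n$ on $\alpha$ is suppressed.

Since $\mu_2 = \mu_3 = 1$, we get $x_2 = 1 - \frac{4-3\alpha}{2(3-2\alpha)} = \frac{2-\alpha}{2(3-2\alpha)}$ and $x_3 = 1 - \frac{6-5\alpha}{2(3-2\alpha)} = \frac{\alpha}{2(3-2\alpha)}$. Substituting (40) into (28) leads to

$$(n-\alpha)x_{n+1} - (n-2+\alpha)x_n = 0, \qquad n \ge 2,$$

and hence,

$$x_n = \begin{cases} \dfrac{\alpha}{2(3-2\alpha)}\displaystyle\prod_{i=3}^{n-1}\frac{i-2+\alpha}{i-\alpha}, & n \ge 4,\\[2ex] \dfrac{\alpha}{2(3-2\alpha)}, & n = 3,\\[2ex] \dfrac{2-\alpha}{2(3-2\alpha)}, & n = 2. \end{cases}$$

To prove (33), we rewrite $x_n$ as follows:

$$x_n = x_3\prod_{i=3}^{n-1}\frac{i-2+\alpha}{i-\alpha} = x_3\,\frac{\Gamma(3-\alpha)}{\Gamma(1+\alpha)}\cdot\frac{\Gamma(n-2+\alpha)}{\Gamma(n-\alpha)}, \qquad n \ge 4.$$

Applying Lemma 5.4 (with $l = 3$, $k = 2$ and $m = 1$), we obtain (33). Consequently,

$$\mu_n = \frac{1-\alpha}{3-2\alpha}\,n + \frac{\alpha}{2(3-2\alpha)} + \frac{\alpha\,\Gamma(3-\alpha)}{2(3-2\alpha)\,\Gamma(1+\alpha)}\,n^{-2(1-\alpha)}(1+o(1)). \tag{41}$$

This completes the proof of part (i).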

The same method can be used to prove part (ii). Recall $\nu_n = E[A_n]$. By Theorem 3.2, $\nu_n = \frac{1-\alpha}{2(3-2\alpha)}\,n + O(1)$, and we write it as

$$\nu_n = \frac{1-\alpha}{2(3-2\alpha)}\,n + \frac{\alpha}{2(3-2\alpha)} + y_n, \tag{42}$$

where, again, the dependence of $\nu_n$ and $y_n$ on $\alpha$ is suppressed. Substituting (40) and (42) into (29) leads to

$$y_{n+1} = \frac{n-3+\alpha}{n-\alpha}\,y_n + \frac{2-\alpha}{n-\alpha}\,x_n, \qquad n \ge 3.$$

The solution to this recurrence relation is given by

$$y_n = y_3\prod_{i=3}^{n-1}\frac{i-3+\alpha}{i-\alpha} + \sum_{i=3}^{n-1}\frac{2-\alpha}{i-\alpha}\,x_i\prod_{j=i+1}^{n-1}\frac{j-3+\alpha}{j-\alpha}.$$


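Using $y_3 = 1/2$ and the expression for $x_i$ from part (i), we get

\begin{align}
y_n &= y_3\prod_{i=3}^{n-1}\frac{i-3+\alpha}{i-\alpha} + \sum_{i=3}^{n-1}\frac{2-\alpha}{i-\alpha}\cdot\frac{\alpha}{2(3-2\alpha)}\prod_{j=3}^{i-1}\frac{j-2+\alpha}{j-\alpha}\cdot\prod_{j=i+1}^{n-1}\frac{j-3+\alpha}{j-\alpha} \notag\\
&= \frac{1}{2}\prod_{i=3}^{n-1}\frac{i-3+\alpha}{i-\alpha} + \frac{(2-\alpha)\alpha}{2(3-2\alpha)}\sum_{i=3}^{n-1}\left[\prod_{j=i+1}^{n-1}\frac{j-3+\alpha}{j-\alpha}\cdot\frac{1}{i-\alpha}\cdot\prod_{j=3}^{i-1}\frac{j-2+\alpha}{j-\alpha}\right] \notag\\
&= \frac{1}{2}\prod_{i=3}^{n-1}\frac{i-3+\alpha}{i-\alpha} + \frac{(2-\alpha)\alpha}{2(3-2\alpha)}\sum_{i=3}^{n-1}\frac{1}{3-\alpha}\prod_{j=4}^{n-1}\frac{j-3+\alpha}{j-\alpha} \notag\\
&= \frac{1}{2}\prod_{i=3}^{n-1}\frac{i-3+\alpha}{i-\alpha} + \frac{(2-\alpha)\alpha}{2(3-2\alpha)}\cdot\frac{n-3}{3-\alpha}\prod_{j=4}^{n-1}\frac{j-3+\alpha}{j-\alpha}. \notag
\end{align}

In the third line, the numerators $(1+\alpha)\cdots(i-3+\alpha)$ and $(i-2+\alpha)\cdots(n-4+\alpha)$ combine into $\prod_{j=4}^{n-1}(j-3+\alpha)$, so each summand is the same. Since $\prod_{i=3}^{n-1}\frac{i-3+\alpha}{i-\alpha} = \frac{\alpha}{3-\alpha}\prod_{j=4}^{n-1}\frac{j-3+\alpha}{j-\alpha}$, we have, for $n \ge 5$,

$$y_n = \left[\frac{\alpha}{2(3-\alpha)} + \frac{(2-\alpha)\alpha}{2(3-2\alpha)}\cdot\frac{n-3}{3-\alpha}\right]\prod_{j=4}^{n-1}\frac{j-3+\alpha}{j-\alpha}. \tag{43}$$

By Lemma 5.4 (with $l = 4$, $k = 3$ and $m = 1$),

\begin{align}
y_n &= \left[\frac{\alpha}{2(3-\alpha)} + \frac{(2-\alpha)\alpha}{2(3-2\alpha)}\cdot\frac{n-3}{3-\alpha}\right]\frac{\Gamma(4-\alpha)}{\Gamma(1+\alpha)}\,n^{-3+2\alpha}(1+o(1)) \notag\\
&= \frac{(2-\alpha)\alpha}{2(3-2\alpha)}\cdot\frac{\Gamma(3-\alpha)}{\Gamma(1+\alpha)}\,n^{-2+2\alpha}(1+o(1)) \notag\\
&= \frac{(2-\alpha)\,\Gamma(3-\alpha)}{2(3-2\alpha)\,\Gamma(\alpha)}\,n^{-2(1-\alpha)}(1+o(1)) \notag
\end{align}

as $n \to \infty$, where we used $\Gamma(4-\alpha) = (3-\alpha)\Gamma(3-\alpha)$ and $\Gamma(1+\alpha) = \alpha\Gamma(\alpha)$. This completes the proof of part (ii) and hence of the proposition.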
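The exact expressions of Proposition 5.1 can be checked against the recurrences (28) and (29). The following sketch evaluates the closed forms in exact arithmetic; `mean_cherries` and `mean_pitchforks` are hypothetical names, and the code assumes $n \ge 3$ (with $n \ge 4$ when $\alpha = 0$, to avoid dividing by $n - 3 + \alpha = 0$).

```python
from fractions import Fraction
from math import prod

def mean_cherries(n, alpha):
    """E[C_n] from Proposition 5.1 (i); the empty product at n = 3 equals 1."""
    c = alpha / (2 * (3 - 2 * alpha))
    x = c * prod(((i - 2 + alpha) / (i - alpha) for i in range(3, n)),
                 start=Fraction(1))
    return (1 - alpha) / (3 - 2 * alpha) * n + c + x

def mean_pitchforks(n, alpha):
    """E[A_n] from Proposition 5.1 (ii)."""
    c = alpha / (2 * (3 - 2 * alpha))
    p3 = prod(((i - 3 + alpha) / (i - alpha) for i in range(3, n)),
              start=Fraction(1))
    p2 = prod(((i - 2 + alpha) / (i - alpha) for i in range(3, n)),
              start=Fraction(1))
    y = p3 / 2 + (2 - alpha) * c * (n - 3) / (n - 3 + alpha) * p2
    return (1 - alpha) / (2 * (3 - 2 * alpha)) * n + c + y
```

Iterating (28) and (29) from $E[C_3] = E[A_3] = 1$ and comparing with these functions at each $n$ gives an independent check of the algebra in the proof.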

Proof of Proposition 5.2. The method of proof is similar to that of Proposition 5.1.

Recall $\sigma_n^2 = \mathrm{var}(C_n)$. From Theorem 3.2, we have $\sigma_n^2 = \frac{(1-\alpha)(2-\alpha)}{(5-4\alpha)(3-2\alpha)^2}\,n + O(1)$. We first consider $E[C_n^2]$. As

$$E[C_n^2] = \mu_n^2 + \sigma_n^2 = \frac{(1-\alpha)^2}{(3-2\alpha)^2}\,n^2 + O(n),$$

we rewrite it as

$$E[C_n^2] = \frac{(1-\alpha)^2}{(3-2\alpha)^2}\,n^2 + \frac{2(1-\alpha)(1+2\alpha-2\alpha^2)}{(5-4\alpha)(3-2\alpha)^2}\,n - \frac{\alpha(8-17\alpha+8\alpha^2)}{4(5-4\alpha)(3-2\alpha)^2} + z_n, \tag{44}$$

and derive below a recursion on $z_n$. Substituting (44) into (30), after straightforward algebraic simplification, we have

$$(n-\alpha)z_{n+1} - (n-4+3\alpha)z_n = 2(1-\alpha)(n-1)x_n, \qquad n \ge 2.$$

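Since $C_2 = C_3 = 1$, evaluating (44) at $n = 2$ and $n = 3$ gives $z_2 = \frac{3(2-\alpha)(8\alpha^2-21\alpha+14)}{4(3-2\alpha)^2(5-4\alpha)}$ and $z_3 = \frac{40\alpha^3-117\alpha^2+104\alpha-24}{4(3-2\alpha)^2(5-4\alpha)}$. Consequently,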

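$$\sigma_n^2 = \frac{(1-\alpha)(2-\alpha)}{(5-4\alpha)(3-2\alpha)^2}\,n - \frac{\alpha(1-\alpha)(2-\alpha)}{(5-4\alpha)(3-2\alpha)^2} + v_n - x_n^2,$$

where

$$v_n = z_n - \frac{2(1-\alpha)}{3-2\alpha}\,n x_n - \frac{\alpha}{3-2\alpha}\,x_n = z_n - \frac{2(1-\alpha)n+\alpha}{3-2\alpha}\,x_n.$$

Then, for $n \ge 6$,

\begin{align}
(n-\alpha)v_{n+1} &= (n-\alpha)z_{n+1} - \frac{2(1-\alpha)(n+1)+\alpha}{3-2\alpha}\,(n-\alpha)x_{n+1} \notag\\
&= (n-4+3\alpha)z_n + 2(1-\alpha)(n-1)x_n - \frac{2(1-\alpha)(n+1)+\alpha}{3-2\alpha}\,(n-2+\alpha)x_n \notag\\
&= (n-4+3\alpha)v_n + \frac{(n-4+3\alpha)[2(1-\alpha)n+\alpha]}{3-2\alpha}\,x_n + 2(1-\alpha)(n-1)x_n - \frac{2(1-\alpha)(n+1)+\alpha}{3-2\alpha}\,(n-2+\alpha)x_n \notag\\
&= (n-4+3\alpha)v_n - \frac{2(1-\alpha)}{3-2\alpha}\,x_n. \notag
\end{align}

Equivalently,

$$v_{n+1} = \frac{n-4+3\alpha}{n-\alpha}\,v_n - \frac{2(1-\alpha)}{3-2\alpha}\cdot\frac{x_n}{n-\alpha}.$$

Applying Lemma 5.3 with $f_n = \frac{n-4+3\alpha}{n-\alpha}$, $g_n = -\frac{2(1-\alpha)x_n}{(3-2\alpha)(n-\alpha)}$, $a = 4-4\alpha$ (by (35) with $k = 4$ and $m = 3$) and $b = 3-2\alpha$ (by (33)), we get $v_n = O(n^{-2+2\alpha})$. Since $x_n^2 = O(n^{-4(1-\alpha)})$ by (33), this proves part (i) of the proposition.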
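As a numerical illustration of part (i), one can compute $\mathrm{var}(C_n)$ exactly from (28) and (30), subtract the linear part stated in Proposition 5.2 (i), and rescale by $n^{2(1-\alpha)}$; the result should stay bounded. The function `scaled_var_remainder` below is a sketch written for this purpose and is not part of the paper.

```python
from fractions import Fraction

def scaled_var_remainder(n_max, alpha):
    """Return {n: n^(2(1-alpha)) * (var(C_n) - linear part of Prop. 5.2 (i))}."""
    mu, second = Fraction(1), Fraction(1)  # E[C_3] and E[C_3^2]
    slope = (1 - alpha) * (2 - alpha) / ((3 - 2 * alpha) ** 2 * (5 - 4 * alpha))
    out = {}
    for n in range(3, n_max + 1):
        # the linear part is slope * n - alpha * slope = slope * (n - alpha)
        remainder = (second - mu ** 2) - slope * (n - alpha)
        out[n] = float(remainder) * n ** (2 * (1 - float(alpha)))
        d = n - alpha
        mu, second = (((n - 2 + alpha) * mu + n * (1 - alpha)) / d,          # (28)
                      ((n - 4 + 3 * alpha) * second
                       + 2 * (n - 1) * (1 - alpha) * mu + n * (1 - alpha)) / d)  # (30)
    return out
```

For $\alpha = 0$ the remainder vanishes identically from $n = 5$ onwards, since $\mathrm{var}(C_n) = 2n/45$ exactly in that case.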

Part (ii) is proved in a similar fashion. By Theorem 3.2, $\mathrm{Cov}(A_n, C_n) = -\frac{(1-\alpha)(2-\alpha)(1-2\alpha)}{2(3-2\alpha)^2(5-4\alpha)}\,n + O(1)$. Since $E[A_nC_n] = \mathrm{Cov}(A_n, C_n) + \mu_n\nu_n$, with $\mu_n$ and $\nu_n$ found in Proposition 5.1, we write

$$E[A_nC_n] = \frac{(1-\alpha)^2}{2(3-2\alpha)^2}\,n^2 - \frac{(1-\alpha)(4-25\alpha+16\alpha^2)}{4(5-4\alpha)(3-2\alpha)^2}\,n - \frac{\alpha(8-17\alpha+8\alpha^2)}{4(5-4\alpha)(3-2\alpha)^2} + t_n. \tag{45}$$

Combining (31) and (45), $t_n$ satisfies the recursion

$$(n-\alpha)t_{n+1} - (n-5+3\alpha)t_n = (2-\alpha)z_n + (1-\alpha)(n-1)y_n, \qquad n \ge 6. \tag{46}$$

By (40), (42) and (45),

$$\mathrm{Cov}(A_n, C_n) = -\frac{(1-\alpha)(2-\alpha)(1-2\alpha)}{2(5-4\alpha)(3-2\alpha)^2}\,n - \frac{\alpha(1-\alpha)(2-\alpha)}{(5-4\alpha)(3-2\alpha)^2} + w_n - x_ny_n,$$