4.3 Testing between 1 and 2 mixture components
4.3.4 A special case with 2 states
Frydman (2005) presents mixtures of Markov chains that migrate among the discrete states 1, . . . , w, where the wth state is an absorbing state. We consider the simplest case, where we have w = 2 states in total. In the context of modelling credit rating migrations, the first state represents “non-default” and the other represents “default”. As in Frydman (2005) (and widely in practice), we assume the second, “default” state is an absorbing state. The time until a transition from “non-default” to “default” is exponentially distributed. However, since a firm is observed only for a fixed time period, the default may or may not occur within the observation period. Thus, independent observations of n such firms form an iid sample from a censored exponential distribution.
Now, we consider a sample $X_1, X_2, \ldots, X_n$ of iid observations from a 2-component mixture of 2-state continuous-time Markov chains $X_0(t)$ and $X_1(t)$, observed from time 0 to $T$, starting in state 1 and with state 2 being an absorbing state. In this case, we do not require the parameter $d$, since all of our firms begin in the non-default state 1 (i.e. $d = (1, 0)$), and our parameters, which are represented as
\[
s_{i,m} = \frac{\pi_m d_{i,m}}{\sum_{m=1}^{N} \pi_m d_{i,m}} \quad \text{for } i = 1, 2 \text{ and } m = 1, 2,
\]
can be represented by a simple scalar $p$, where
\[
p = P(X(t) = X_1(t) \mid X(0) = 1) = P(X(t) = X_1(t)) \quad \text{and} \quad 1 - p = P(X(t) = X_0(t)).
\]
Note that we do not have multiple jumps for a particular observation. It either stays in state 1 for the entire observation window [0, T] or it jumps once to the
absorbing state 2 within the observation window and remains there until the end of the observation window T. Thus, we can represent an observation x(·) using the time until the first jump out of the initial state with
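\[
x = \begin{cases} T_0 & \text{if } T_0 < T, \\ T & \text{otherwise.} \end{cases}
\]
This is (in effect) the amount of time that a firm is observed in the “non-default” state. Our density for a particular observation $x(\cdot) = x$ is then
\[
f(x) = (1-p)\, g(x \mid \lambda_0) + p\, g(x \mid \lambda_1), \tag{61}
\]
where $g(x \mid \lambda)$ is given by
\[
g(x \mid \lambda) = \begin{cases} \lambda e^{-\lambda x} & \text{if } 0 < x < T, \\ e^{-\lambda T} & \text{if } x = T, \\ 0 & \text{otherwise.} \end{cases} \tag{62}
\]
This is a censored exponential distribution with rate parameter $\lambda$, observed within a finite time window $(0, T]$.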
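The density in (61) and (62) and the sampling scheme described above can be sketched in a few lines of Python. This is only an illustrative sketch: the function names (`censored_exp_density`, `mixture_density`, `sample_observation`) and the parameter values in the usage lines are assumptions made for the example and do not come from Frydman (2005).

```python
import math
import random


def censored_exp_density(x, lam, T):
    """g(x | lam) from (62): density on (0, T) plus a point mass at x = T."""
    if 0.0 < x < T:
        return lam * math.exp(-lam * x)
    if x == T:
        return math.exp(-lam * T)  # probability of no default within [0, T]
    return 0.0


def mixture_density(x, p, lam0, lam1, T):
    """f(x) from (61): two-component mixture with weight p on the rate lam1."""
    return ((1.0 - p) * censored_exp_density(x, lam0, T)
            + p * censored_exp_density(x, lam1, T))


def sample_observation(p, lam0, lam1, T, rng):
    """Draw one observation: pick a component, then censor the jump time at T."""
    lam = lam1 if rng.random() < p else lam0
    t0 = rng.expovariate(lam)  # time of the first jump out of state 1
    return min(t0, T)


# Illustrative (hypothetical) parameter values.
rng = random.Random(0)
sample = [sample_observation(0.3, 1.0, 0.2, 5.0, rng) for _ in range(1000)]
censored_share = sum(1 for x in sample if x == 5.0) / len(sample)
```

Here `censored_share` estimates the mixture's mass at $T$, namely $(1-p)e^{-\lambda_0 T} + p e^{-\lambda_1 T}$.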
For the hypothesis test between 1 and 2 mixture components in the two-state case,
\[
H_0 : X(\cdot) \sim g(x \mid \lambda_0), \qquad H_1 : X(\cdot) \sim (1-p)\, g(x \mid \lambda_1) + p\, g(x \mid \lambda_2), \tag{63}
\]
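we define the log-likelihood ratio for $n$ iid observations $x_1, \ldots, x_n$ to be
\[
\Lambda_n = \sup_{\lambda_1, \lambda_2, p} \sum_{i=1}^{n} \log\left[(1-p)\, g(x_i \mid \lambda_1) + p\, g(x_i \mid \lambda_2)\right] - \sup_{\lambda_0} \sum_{i=1}^{n} \log\left[g(x_i \mid \lambda_0)\right].
\]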
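From (48), we see that in the test between 1 and 2 components we have
\[
\Lambda_n \geq \Lambda_1 + O_p(1), \tag{64}
\]
where
\[
\Lambda_1 = \sup_{\lambda_1, p} \sum_{i=1}^{n} \log\left[\frac{(1-p)\, g(x_i \mid \lambda_0) + p\, g(x_i \mid \lambda_1)}{g(x_i \mid \lambda_0)}\right]. \tag{65}
\]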
If $\Lambda_1$ diverges as the sample size $n \to \infty$, then this implies that $\Lambda_n$ also diverges. We have shown, using Theorem 4.3, that $\Lambda_1$ does in fact diverge. Understanding the rate of divergence of $\Lambda_1$ will give us substantial insight into the rate of divergence of $\Lambda_n$. From (64), we see that if $\Lambda_1$ diverges at a rate $R_{1,n}$, then $\Lambda_n$ diverges at a rate $R_{2,n} \geq R_{1,n}$. For the case where $w = 2$, we can go beyond the rate to find the exact limiting distribution of $\Lambda_1$.
Without loss of generality, we will also assume that the true value of $\lambda_0$ under the null hypothesis is 1. Then we can present the following theorem, with proof to be provided in the following chapter:
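Theorem 4.4. If we let
\[
\Lambda = \sup_{\lambda > 0,\; 0 \leq p \leq 1} \sum_{k=1}^{n} \left\{ \log\left[(1-p)\, g(x_k \mid 1) + p\, g(x_k \mid \lambda)\right] - \log g(x_k \mid 1) \right\},
\]
where $g(x \mid \lambda)$ is given by (62) above, then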
lim
n→∞P
5 Censored exponential mixture detection
In this chapter, we continue with the motivating application of the previous chapter, where we are faced with the problem of modelling the non-homogeneous dynamics of credit rating migrations of firms. We are focussed on different regimes applying to different segments of the population, rather than different regimes over time. The theoretical developments in this chapter contribute towards establishing a proof of Theorem 4.4, which states the exact limiting distribution of the log-likelihood ratio test statistic for the test between 1 and 2 component mixtures of Markov chains, each of which has 2 states, with the second state being an absorbing state.
The challenges of testing between 1 and 2 component mixtures using the likelihood ratio test were explored for location mixtures of normal distributions in Hartigan (1985), which proves that the log-likelihood ratio diverges in probability and conjectures that the rate of divergence is of the order log log n, where n is the sample size. This conjecture was later proven in Bickel and Chernoff (1993) and Liu and Shao (2004). The problem of finding the limiting distribution was addressed for location mixtures of normal distributions in Garel (2005) and for mixtures of gamma distributions in Liu et al. (2003). Although some studies work towards a general solution under particular regularity conditions, such as Dacunha-Castelle and Gassiat (1997) and Liu and Shao (2003), there remains a gap in the theory for our specific problem of testing between 1 and 2 Markov chain mixture components with 2 states, one of which is an absorbing state. Our problem is motivated by a simple case of our practical example from Frydman (2005). Our key result goes beyond the findings of the previous chapter, where we proved that the log-likelihood ratio test statistic diverges to infinity as the sample size n → ∞: here we derive its rate of divergence and exact limiting distribution. We find that this problem can be reframed as a test between a 1 and 2 component mixture of censored exponentials and so is more broadly applicable than just to our Markov chain context.
We follow a similar strategy to Liu et al. (2003) and solve some key theoretical challenges that arise from the fact that our practical application requires that we have censoring (due to the finite observation window on our data). Liu et al. (2003) derive the limiting distribution of the log-likelihood ratio test statistic for testing between 1 and 2 components in a scale mixture of gamma distributions, with the constraint that the scale parameter of the second unknown component is greater than the scale parameter of the first known component. Technical difficulties prevent them from dealing with the two-sided version of the test. The log-likelihood ratio test statistic is shown in Liu et al. (2003) to be asymptotically equivalent to the square of the maximum of a stationary Gaussian process over an interval whose length increases as the logarithm of the sample size. The stationarity of the Gaussian process is crucial to their derivation of the limiting distribution of the statistic. The corresponding process in the censored case is no longer stationary, and so in order to use the same general strategy as Liu et al. (2003) some new tools are required. Such tools are provided by the locally stationary Gaussian process extreme value theory developed by Hüsler (1990). One obstacle to the use of these tools is the potentially difficult verification that a given Gaussian process is indeed in the locally stationary class. Our Lemma 5.9 achieves this for the Gaussian process we consider by showing that certain higher-order derivatives of its correlation function are uniformly controlled.
A happy consequence of the censoring is that we are able to consider the two-sided version of the testing problem. We are able to extend the methods of Liu et al. (2003) to analyse the maximum of the log-likelihood ratio statistic over this extended range, thus removing the rather restrictive one-sided constraint that Liu et al. (2003) are forced to adhere to in the uncensored version of the problem. We then use this result to derive the exact limiting distribution of the log-likelihood
ratio test statistic, thus solving the outstanding practical problem from Frydman (2005). After providing an overview of the testing problem in Section 5.1, we work in Section 5.2 to establish our key results. We then provide the detailed proofs of these results in Section 5.3.
5.1 An overview of the testing problem
Censored exponentials are widely used in practice for modelling time-to-event data where events occur with a constant underlying rate over a given finite time window (0, T]. In the previous chapter we studied a problem that was motivated by the application of modelling credit rating migration dynamics of firms, which involved testing for mixtures of discrete-state Markov chains with an absorbing state observed continuously over a finite time period. In the simplest case, when the Markov chain has 2 states, the time to absorption has a censored exponential distribution. We consider the problem of testing for the existence of a mixture of censored exponentials. Specifically, we study the asymptotics of the log-likelihood ratio test statistic for testing between 1 and 2 mixture components and show that it diverges in probability at a rate of log log n, where n is the sample size.
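Let $X_1, X_2, \ldots, X_n$ be an independent and exponentially distributed sample with rate parameter $\lambda$. Since we are only observing the data from time 0 to $T$, we define $Y_i = \min(X_i, T)$, so that $Y_1, Y_2, \ldots, Y_n$ is an iid sample from a censored exponential distribution. We thus have the cumulative distribution function
\[
G_\lambda(y) = \begin{cases} 0 & \text{if } y < 0, \\ 1 - e^{-\lambda y} & \text{if } 0 \leq y < T, \\ 1 & \text{if } y \geq T. \end{cases}
\]
This distribution has a density
\[
g_\lambda(y) = \begin{cases} \lambda e^{-\lambda y} & \text{if } 0 < y < T, \\ e^{-\lambda T} & \text{if } y = T, \\ 0 & \text{otherwise,} \end{cases}
\]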
with respect to a dominating measure given by the sum of Lebesgue measure on [0,∞) and counting measure on {T}. The expectation operator with respect to this density is given by
\[
E[f(Y_1)] = \int f\, g_\lambda \, d\mu = \int_0^T f(y)\, \lambda e^{-\lambda y}\, dy + f(T)\, e^{-\lambda T}, \tag{66}
\]
where $\mu(A) = L(A) + \mathbf{1}\{T \in A\}$, with $L(\cdot)$ the Lebesgue measure.
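As a quick numerical illustration of (66), the sketch below approximates the integral part with a midpoint rule and adds the point mass at $T$. The helper name `censored_expectation`, the step count and the values of $\lambda$ and $T$ are assumptions chosen for the example. Taking $f \equiv 1$ should return total mass 1, and $f(y) = y$ should return the mean observed time, which integration by parts in (66) gives as $(1 - e^{-\lambda T})/\lambda$.

```python
import math


def censored_expectation(f, lam, T, steps=20000):
    """Approximate E[f(Y)] in (66): integral over (0, T) plus the atom at T."""
    h = T / steps
    integral = 0.0
    for k in range(steps):
        y = (k + 0.5) * h  # midpoint of the k-th subinterval
        integral += f(y) * lam * math.exp(-lam * y)
    return integral * h + f(T) * math.exp(-lam * T)


# Illustrative (hypothetical) parameter values.
lam, T = 1.0, 2.0
total_mass = censored_expectation(lambda y: 1.0, lam, T)
mean_time = censored_expectation(lambda y: y, lam, T)
mean_time_closed_form = (1.0 - math.exp(-lam * T)) / lam
```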
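The log-likelihood of a series of observations $y = (y_1, y_2, \ldots, y_n)$ can thus be written as
\[
L^{(1)}_n(\lambda \mid y, T) = \sum_{i=1}^{n} \log\left( \lambda e^{-\lambda y_i}\, \mathbf{1}\{y_i < T\} + e^{-\lambda T}\, \mathbf{1}\{y_i = T\} \right). \tag{67}
\]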
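The corresponding 2-component mixture distribution, where each observation $y$ has density $(1-p)\, g(y \mid \lambda_0, T) + p\, g(y \mid \lambda, T)$, yields a log-likelihood for $n$ iid observations $y = (y_1, y_2, \ldots, y_n)$ as follows:
\[
L^{(2)}_n(p, \lambda_0, \lambda \mid y, T) = \sum_{i=1}^{n} \log\left( \left[(1-p)\lambda_0 e^{-\lambda_0 y_i} + p \lambda e^{-\lambda y_i}\right] \mathbf{1}\{y_i < T\} + \left[(1-p) e^{-\lambda_0 T} + p e^{-\lambda T}\right] \mathbf{1}\{y_i = T\} \right). \tag{68}
\]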
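The two log-likelihoods (67) and (68) translate directly into code. The following is a minimal sketch; the function names `loglik_one` and `loglik_two` and the toy data are assumptions made for illustration. Observations equal to $T$ are treated as censored.

```python
import math


def loglik_one(lam, ys, T):
    """Single-component log-likelihood (67)."""
    total = 0.0
    for y in ys:
        if y < T:
            total += math.log(lam) - lam * y  # log(lam * exp(-lam * y))
        else:
            total += -lam * T                 # log(exp(-lam * T)), censored at T
    return total


def loglik_two(p, lam0, lam, ys, T):
    """Two-component mixture log-likelihood (68)."""
    total = 0.0
    for y in ys:
        if y < T:
            dens = ((1.0 - p) * lam0 * math.exp(-lam0 * y)
                    + p * lam * math.exp(-lam * y))
        else:
            dens = (1.0 - p) * math.exp(-lam0 * T) + p * math.exp(-lam * T)
        total += math.log(dens)
    return total


# Toy data (hypothetical): two of the five observations are censored at T = 2.
T = 2.0
ys = [0.3, 1.2, 2.0, 0.7, 2.0]
l1 = loglik_one(1.0, ys, T)
l2_at_p0 = loglik_two(0.0, 1.0, 3.0, ys, T)
```

Setting $p = 0$ in (68) recovers (67) with rate $\lambda_0$, and $p = 1$ recovers (67) with rate $\lambda$, which gives a simple consistency check on any implementation.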
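We are interested in the testing problem
\[
H_0 : Y_1 \sim G_{\lambda_0}, \text{ for } \lambda_0 > 0 \text{ known, against} \quad H_1 : Y_1 \sim (1-p)\, G_{\lambda_0} + p\, G_\lambda, \tag{69}
\]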
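where, without loss of generality, we may take $\lambda_0 = 1$. For convenience, we write $g = g_1$. We write the log-likelihood ratio test statistic as
\[
\Lambda_n = \sup_{p, \lambda} L_n(p, \lambda) = \sup_{p, \lambda} \left[ L^{(2)}_n(p, 1, \lambda \mid Y, T) - L^{(1)}_n(1 \mid Y, T) \right] = \sup_{p, \lambda} \sum_{i=1}^{n} \log\left[\frac{(1-p)\, g(Y_i) + p\, g_\lambda(Y_i)}{g(Y_i)}\right] = \sup_{p, \lambda} \sum_{i=1}^{n} \log\left[1 + p Z_i(\lambda)\right], \tag{70}
\]
where
\[
Z_i(\lambda) = \frac{g_\lambda(Y_i)}{g(Y_i)} - 1 = \lambda e^{-(\lambda - 1) Y_i}\, \mathbf{1}\{Y_i < T\} + e^{-(\lambda - 1) T}\, \mathbf{1}\{Y_i = T\} - 1. \tag{71}
\]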
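To make (70) and (71) concrete, the sketch below evaluates $L_n(p, \lambda) = \sum_i \log[1 + p Z_i(\lambda)]$ and approximates the supremum by a crude grid search. The grid search is an assumption made for illustration only: it is not the method used to derive the limiting distribution, and it can only bound $\Lambda_n$ from below. The names `z_value`, `log_lr` and `lrt_statistic_grid`, the grids and the simulated data are all hypothetical. The grid for $p$ stops short of 1 so that $1 + p Z_i(\lambda)$ stays safely positive in floating point.

```python
import math
import random


def z_value(y, lam, T):
    """Z_i(lambda) from (71), with lambda_0 = 1."""
    if y < T:
        return lam * math.exp(-(lam - 1.0) * y) - 1.0
    return math.exp(-(lam - 1.0) * T) - 1.0


def log_lr(p, lam, ys, T):
    """L_n(p, lambda): the sum of log(1 + p * Z_i(lambda)) in (70)."""
    return sum(math.log1p(p * z_value(y, lam, T)) for y in ys)


def lrt_statistic_grid(ys, T, lam_grid, p_grid):
    """Grid approximation (a lower bound) to Lambda_n = sup L_n(p, lambda)."""
    best = 0.0  # p = 0 always gives L_n = 0, so the supremum is non-negative
    for lam in lam_grid:
        zs = [z_value(y, lam, T) for y in ys]
        for p in p_grid:
            best = max(best, sum(math.log1p(p * z) for z in zs))
    return best


# Simulate under the null (rate 1) with hypothetical n = 200 and T = 2.
rng = random.Random(1)
T = 2.0
ys = [min(rng.expovariate(1.0), T) for _ in range(200)]
lam_grid = [0.1 * k for k in range(1, 51)]  # 0.1, 0.2, ..., 5.0
p_grid = [0.02 * k for k in range(0, 50)]   # 0.00, 0.02, ..., 0.98
lambda_n_approx = lrt_statistic_grid(ys, T, lam_grid, p_grid)
```

Since $Z_i(1) = 0$ for every observation, $L_n(p, 1) = 0$ for all $p$. This is the non-identifiability under the null that makes the asymptotics of $\Lambda_n$ non-standard.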
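From (66), we calculate the expected value and variance of $Z_1(\lambda)$ under the single component density as
\[
E\{Z_1(\lambda)\} = E\left[\frac{g_\lambda(Y_1)}{g(Y_1)} - 1\right] = 0
\]
and
\[
\begin{aligned}
\mathrm{Var}\{Z_1(\lambda)\} &= E\left[\left(\lambda e^{-(\lambda - 1) Y_1}\, \mathbf{1}\{Y_1 < T\} + e^{-(\lambda - 1) T}\, \mathbf{1}\{Y_1 = T\} - 1\right)^2\right] \\
&= \int_0^T \lambda^2 e^{-(2\lambda - 1) y}\, dy + e^{-(2\lambda - 1) T} - 1 \\
&= \left(\frac{\lambda^2}{2\lambda - 1} - 1\right)\left(1 - e^{-(2\lambda - 1) T}\right).
\end{aligned} \tag{72}
\]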
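The closed form in (72) can be checked numerically by evaluating the first two moments of $Z_1(\lambda)$ under the null density $g = g_1$ with the same integral-plus-atom decomposition as in (66). The sketch below is only a sanity check under assumed values of $\lambda$ and $T$; the function names and the midpoint rule are illustrative choices. The closed form as written is not defined at $\lambda = 1/2$, where it has to be read as a limit, so that value is avoided here.

```python
import math


def var_z_closed_form(lam, T):
    """Right-hand side of (72); requires lam != 0.5."""
    a = 2.0 * lam - 1.0
    return (lam ** 2 / a - 1.0) * (1.0 - math.exp(-a * T))


def moments_z_numeric(lam, T, steps=20000):
    """E[Z] and E[Z^2] under g_1: midpoint rule on (0, T) plus the atom at T."""
    h = T / steps
    m1 = 0.0
    m2 = 0.0
    for k in range(steps):
        y = (k + 0.5) * h
        z = lam * math.exp(-(lam - 1.0) * y) - 1.0
        w = math.exp(-y) * h  # g_1(y) dy on the continuous part
        m1 += z * w
        m2 += z * z * w
    z_T = math.exp(-(lam - 1.0) * T) - 1.0
    w_T = math.exp(-T)        # mass of g_1 at the censoring point T
    m1 += z_T * w_T
    m2 += z_T * z_T * w_T
    return m1, m2


# Illustrative (hypothetical) parameter values.
lam, T = 2.0, 1.5
mean_z, second_moment_z = moments_z_numeric(lam, T)
var_numeric = second_moment_z - mean_z ** 2
var_exact = var_z_closed_form(lam, T)
```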