interval-censored time-to-event data
2.2 maximum likelihood estimation
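\[
\begin{aligned}
&\times \prod_{j=2}^{k} \bigl\{\exp\bigl(-w^\top\Lambda(y_{k,j-1})e^{z^\top\theta}\bigr) - \exp\bigl(-w^\top\Lambda(y_{k,j})e^{z^\top\theta}\bigr)\bigr\}^{\delta_{k,j}} \times \bigl\{\exp\bigl(-w^\top\Lambda(y_{k,k})e^{z^\top\theta}\bigr)\bigr\}^{\delta_{k,k+1}} \\
&= \delta_{k,1}\bigl\{1-\exp\bigl(-w^\top\Lambda(y_{k,1})e^{z^\top\theta}\bigr)\bigr\} \\
&\quad + \sum_{j=2}^{k} \delta_{k,j}\bigl\{\exp\bigl(-w^\top\Lambda(y_{k,j-1})e^{z^\top\theta}\bigr) - \exp\bigl(-w^\top\Lambda(y_{k,j})e^{z^\top\theta}\bigr)\bigr\} \\
&\quad + \delta_{k,k+1}\exp\bigl(-w^\top\Lambda(y_{k,k})e^{z^\top\theta}\bigr).
\end{aligned}
\tag{2.2}
\]

Let $X_i = (\Delta^i_{K_i}, Y^i_{K_i}, K_i, W_i, Z_i)$, $i = 1,\dots,n$, be $n$ iid observations of $X$ from $(\theta_0,\Lambda_0)$, where $Y^i_{K_i} = (Y^i_{K_i,1},\dots,Y^i_{K_i,K_i})$ and $\Delta^i_{K_i} = (\Delta^i_{K_i,1},\dots,\Delta^i_{K_i,K_i+1})$. The corresponding log-likelihood function is

\[
\begin{aligned}
\log \operatorname{lik}_n(\theta,\Lambda) = \sum_{i=1}^{n} \Bigl[ \, & \Delta^i_{K_i,1}\log\bigl\{1-\exp\bigl(-W_i^\top\Lambda(Y^i_{K_i,1})e^{Z_i^\top\theta}\bigr)\bigr\} \\
& + \sum_{j=2}^{K_i} \Delta^i_{K_i,j}\log\bigl\{\exp\bigl(-W_i^\top\Lambda(Y^i_{K_i,j-1})e^{Z_i^\top\theta}\bigr) - \exp\bigl(-W_i^\top\Lambda(Y^i_{K_i,j})e^{Z_i^\top\theta}\bigr)\bigr\} \\
& - \Delta^i_{K_i,K_i+1} W_i^\top\Lambda(Y^i_{K_i,K_i})e^{Z_i^\top\theta} \Bigr].
\end{aligned}
\tag{2.3}
\]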
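The log-likelihood (2.3) can be evaluated term by term. The following is a minimal sketch, not the author's implementation: the function name `log_lik`, the argument layout, and the representation of $\Lambda$ as a Python callable returning a vector of length $d_w$ are all assumptions made for illustration.

```python
import numpy as np

def log_lik(theta, Lambda, Y, Delta, W, Z):
    """Evaluate the log-likelihood (2.3) (illustrative layout, assumed here).

    theta  : array of shape (d_z,)
    Lambda : callable mapping a time t to an array of shape (d_w,)
    Y      : list of length n; Y[i] holds the K_i ordered inspection times
    Delta  : list of length n; Delta[i] holds K_i + 1 indicators, one equal to 1
    W, Z   : arrays of shape (n, d_w) and (n, d_z)
    """
    total = 0.0
    for y, delta, w, z in zip(Y, Delta, W, Z):
        K = len(y)
        # Cumulative hazard W'Lambda(y_j) exp(Z'theta) at each inspection time.
        cum_haz = np.array([w @ Lambda(t) for t in y]) * np.exp(z @ theta)
        surv = np.exp(-cum_haz)
        if delta[0] == 1:                      # event before the first inspection
            total += np.log1p(-surv[0])
        for j in range(1, K):                  # event between inspections j and j+1
            if delta[j] == 1:
                total += np.log(surv[j - 1] - surv[j])
        if delta[K] == 1:                      # no event by the last inspection
            total -= cum_haz[K - 1]
    return total

# Hypothetical toy data: d_w = d_z = 1, Lambda(t) = t, theta = 0.
toy_Y = [[1.0], [1.0, 2.0], [1.0, 2.0]]
toy_Delta = [[1, 0], [0, 1, 0], [0, 0, 1]]
toy_W = np.ones((3, 1))
toy_Z = np.zeros((3, 1))
toy_value = log_lik(np.zeros(1), lambda t: np.array([t]), toy_Y, toy_Delta, toy_W, toy_Z)
```

The three branches mirror the three terms of (2.3). The sketch does not check that $W^\top\Lambda$ is nondecreasing, so an invalid $\Lambda$ would produce a `nan` in the interval term.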
2.1 remark. The expression in (2.3) reduces to the same likelihood function obtained under the noninformative censoring mechanism from Definition 1.32 (Lawless 2003, p. 65). The stronger requirement in a1 simplifies the derivation of asymptotic properties. It may be motivated by the setting in which individuals are assessed according to a predetermined schedule, with the completion and exact timing of assessments determined by some random process related to $T$ only via $(W,Z)$. ◽
2.2 maximum likelihood estimation
The regression model (2.1) is a valid intensity function provided that $W^\top\Lambda$ is almost surely nondecreasing. Estimating equations derived from the likelihood process (1.4) often dispense with the requirement entirely, but (1.4) applies only to filters. A discrete inspection process necessitates constrained maximization of the likelihood. To address this complication, consider two simplifying assumptions:
a2 The support of $F_W$, $\mathcal{W} \equiv \operatorname{supp}(F_W)$, is a bounded subset of $\mathbb{R}^{d_w}$. In particular, there exist known $w_0, w_1 \in \mathcal{W}$ such that $P(w_0 \le W \le w_1) = 1$.
a3 FW2× ⋯ ×FW
2.2 remark. Condition a3 essentially implies that we need $w^\top\Lambda$ nondecreasing for every $w \in \mathcal{W}$. Condition a2 allows us to ensure monotonicity only in $w^\top\Lambda$, where $w$ is a matrix whose entries are determined from the values in $w_0$ and $w_1$. The choice of $w_0$ and $w_1$ is made relatively straightforward by standardizing any continuous covariates. In general $W$ is appropriately scaled so that the first entry in $\Lambda$, $\Lambda_1$, is a baseline cumulative hazard function. ◽
Not every inspection time contributes information to the likelihood function. As in Groeneboom and Wellner (1992, Part II, Definition 1.1), irrelevant inspections can be discarded to obtain a “thinned” set of observation times.
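2.3 definition. Let $Y_{(1)},\dots,Y_{(m)}$ be the order statistics of
\[
\mathcal{Y} = \bigl\{Y^i_{K_i,j},\ j=1,\dots,K_i,\ i=1,\dots,n : \Delta^i_{K_i,j} + \Delta^i_{K_i,j+1} = 1\bigr\}
\]
and let $(W_{(i)},\Delta_{(i)})$ denote the $(W,\Delta_{K,j})$ corresponding to the $i$th order statistic $Y_{(i)}$. ◽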
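The thinning in Definition 2.3 keeps an inspection time only when it is an endpoint of the subject's censoring interval. A minimal sketch follows, assuming the same hypothetical data layout as a list of per-subject inspection times and indicator vectors; the name `thin_inspections` is illustrative only.

```python
def thin_inspections(Y, Delta, W):
    """Return the ordered thinned times with their indicators and covariates.

    Y[i] holds the K_i inspection times of subject i, Delta[i] the K_i + 1
    interval indicators, and W[i] the covariate vector of subject i.
    """
    kept = []
    for y, delta, w in zip(Y, Delta, W):
        for j, t in enumerate(y):
            # Zero-based j: delta[j] and delta[j + 1] flag the intervals ending
            # and starting at t; their sum is 1 exactly when t is an endpoint
            # of the interval that contains the event.
            if delta[j] + delta[j + 1] == 1:
                kept.append((t, delta[j], w))
    kept.sort(key=lambda row: row[0])  # order statistics Y_(1) <= ... <= Y_(m)
    times = [row[0] for row in kept]
    deltas = [row[1] for row in kept]
    covariates = [row[2] for row in kept]
    return times, deltas, covariates
```

For a subject inspected at times 1, 2, 3 with the event in $(1,2]$, the sketch keeps times 1 and 2 with indicators 0 and 1 and discards time 3.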
If $\Delta_{(1)} = 0$, then the $\Lambda$ maximizing (2.3) should satisfy $\Lambda(Y_{(1)}) = 0$. If $\Delta_{(m)} = 1$, then the maximizing $\Lambda$ satisfies $w^\top\Lambda(Y_{(m)}) = \infty$ for every $w \in \mathcal{W}$ or, in other words, $\Lambda_1(Y_{(m)}) = \infty$. When combined with the remaining observations in Definition 2.3, these cases contribute nothing to the likelihood. So without loss of generality assume that $\Delta_{(1)} = 1$ and $\Delta_{(m)} = 0$.
Let $\Theta$ and $\mathcal{H}$ denote the sets of all possible $\theta$ and $\Lambda$, respectively. In particular $\mathcal{H}$ is the set of all zero-at-time-zero cadlag functions $\{\Lambda\}$ on $[0,\tau]$ with $w^\top\Lambda$ nondecreasing.
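Since $\Delta_{(m)} = 0$ we can (again without loss of generality) assume that each $\Lambda \in \mathcal{H}$ is uniformly bounded with $0 < w^\top\Lambda(\tau) < \infty$. Under conditions a1 to a3 the maximum likelihood estimator $(\hat\theta_n, \hat\Lambda_n)$ is defined by
\[
\log \operatorname{lik}_n(\hat\theta_n, \hat\Lambda_n) = \max_{\theta\in\Theta,\ \Lambda\in\mathcal{H}} \log \operatorname{lik}_n(\theta,\Lambda).
\]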
Since the likelihood depends on $\Lambda$ only through its value at the inspection times, we take $(\hat\theta_n, \hat\Lambda_n)$ as the semiparametric maximum likelihood estimator (SPMLE) that concentrates its distribution function on a subset of $\mathcal{Y}$. The maximal subset can be identified by adapting Turnbull (1976, Lemmas 1 and 2).
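2.4 definition. Let $(L,R]$ denote the censoring interval $(Y_{K,j-1},Y_{K,j}]$ satisfying $\Delta_{K,j} = 1$; that is, $T \in (Y_{K,j-1},Y_{K,j}] = (L,R]$. From the random sample $X_1,\dots,X_n$, let $\mathcal{L} = \{L_1,\dots,L_n\}$ and $\mathcal{R} = \{R_1,\dots,R_n\}$ collect the left- and right-endpoints of the censoring intervals. The set $\mathcal{I} = \bigcup_{j=1}^{d}(s_j,t_j]$ of maximal intersections (Figure 2.3) is given by the set of disjoint intervals whose left- and right-endpoints are selected respectively from $\mathcal{L}$ and $\mathcal{R}$ such that
\[
(s_j,t_j] \cap (L_i,R_i] =
\begin{cases}
(s_j,t_j], & \text{or} \\
\emptyset,
\end{cases}
\]
for every $j = 1,\dots,d$ and $i = 1,\dots,n$. ◽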
[Diagram: four censoring intervals $(L_i,R_i]$, $i = 1,\dots,4$, with the endpoints $s_1, s_2$ and $t_1, t_2$ of the maximal intersections marked.]
figure 2.3 Maximal intersections.
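The maximal intersections of Definition 2.4 can be found with a single pass over the sorted endpoints. The sketch below is one standard way to do this for half-open intervals $(L_i,R_i]$, offered as an illustration rather than as the procedure used in the source; the function name `maximal_intersections` is hypothetical, and every interval is assumed to satisfy $L_i < R_i$.

```python
def maximal_intersections(intervals):
    """Return the maximal intersections (s_j, t_j] of half-open intervals (L_i, R_i].

    intervals : iterable of (L, R) pairs with L < R; R may be float('inf').
    """
    # Tag right endpoints 0 and left endpoints 1 so that, at equal values,
    # a closing "]" is sorted before an opening "(".
    points = sorted(
        [(left, 1) for left, _ in intervals] + [(right, 0) for _, right in intervals]
    )
    result = []
    for (a, tag_a), (b, tag_b) in zip(points, points[1:]):
        # A left endpoint immediately followed by a right endpoint bounds a
        # region that every censoring interval either covers fully or misses.
        if tag_a == 1 and tag_b == 0:
            result.append((a, b))
    return result
```

The tie-breaking rule matters because $(0,2]$ and $(2,4]$ are disjoint: sorting the right endpoint 2 before the left endpoint 2 keeps them as two separate maximal intersections.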
2.5 proposition. $W^\top\hat\Lambda_n$ is almost surely constant outside $\mathcal{I}$. Moreover for fixed $\hat\Lambda_n$ on the boundary of $\mathcal{I}$, the likelihood is invariant to the behaviour of $\hat\Lambda_n$ on the interior of $\mathcal{I}$.
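Proof (cf. Alioum and Commenges 1996, Lemmas 1 and 2). Fix some $(s_{j-1},t_{j-1}]$ and $(s_j,t_j]$ in $\mathcal{I}$. Consider $\bar\Lambda, \tilde\Lambda \in \mathcal{H}$ with $\bar\Lambda$ constant outside $\mathcal{I}$ and $\bar\Lambda = \tilde\Lambda$ except on $(s_{j-1},t_j]$. In particular suppose that $W^\top\tilde\Lambda$ almost surely increases on $(t_{j-1},s_j]$. Then there is some $u_j \in (t_{j-1},s_j]$ such that $u_j > r \in \mathcal{R}$, $u_j < l \in \mathcal{L}$ and one of the following holds almost surely:
\[
\begin{aligned}
W^\top\tilde\Lambda(t_{j-1}) &< W^\top\tilde\Lambda(u_j) = W^\top\bar\Lambda(t_{j-1}), \\
W^\top\bar\Lambda(s_j) &= W^\top\tilde\Lambda(u_j) < W^\top\tilde\Lambda(s_j).
\end{aligned}
\]
This implies that $\operatorname{lik}_n(\theta,\bar\Lambda) > \operatorname{lik}_n(\theta,\tilde\Lambda)$. The last statement follows from the fact that $\operatorname{lik}_n(\theta,\Lambda)$ depends on $\Lambda$ only through its value at the inspection times. ◾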
The maximal set to which $\hat\Lambda_n$ assigns mass is given by $\mathcal{T} = \mathcal{Y} \cap \mathcal{I} = \{t_1,\dots,t_d\}$. Estimation reduces to a finite-dimensional optimization problem with objective function (2.3) continuous on the set of feasible solutions. The SPMLE $(\hat\theta_n, \hat\Lambda_n)$ therefore exists. Uniqueness is established in the following result.
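2.6 proposition. Let $\mathcal{H}_0$ be the set of all possible $\Lambda$ satisfying $\operatorname{lik}_n(\theta,\Lambda) > 0$ for every $\theta$. Then $\operatorname{lik}_n(\theta,\Lambda)$ is log-concave in $\Lambda \in \mathcal{H}_0$ for each $\theta$. Moreover for fixed $\Lambda \in \mathcal{H}_0$, $\operatorname{lik}_n(\theta,\Lambda)$ is log-concave in $\theta$.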
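Proof. The function $g(\theta) = p_{\theta,\Lambda}(x)$ satisfies $g(\theta)g''(\theta) \le g'(\theta)^2$ for each $x$ and any $\Lambda \in \mathcal{H}_0$. Let $e_1,\dots,e_{d_w}$ denote the unit vectors in $\mathbb{R}^{d_w}$. For the $j$th component in $\Lambda \in \mathcal{H}_0$, $j = 1,\dots,d_w$, consider the path $\Lambda + s_j\varphi \in \mathcal{H}$ with $s_j = se_j$, $s$ sufficiently small and $\varphi$ some arbitrary function. It is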
straightforward to show that the second partial derivative of $\log p_{\theta,\Lambda}(x)$ with respect to $s_j$ is bounded above by zero if $x$ corresponds to an interval- or right-censored observation. Moreover if $x$ is left-censored, the second derivative is strictly negative. Since $\Delta_{(1)} = 1$, at least one observation is left-censored, so the log-likelihood is concave in each component of $\Lambda$, holding all remaining entries and $\theta$ fixed. ◾
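The second-derivative claims in the proof can be checked directly. The following worked calculation is a sketch under notation introduced here for convenience, not taken from the source: for an inspection time $y$ write $u(s) = u + sc$ with $u = w^\top\Lambda(y)e^{z^\top\theta}$ and $c = w_j\varphi(y)e^{z^\top\theta}$, and for an interval-censored observation with endpoints $y_1 < y_2$ write $u_k(s) = u_k + sc_k$, $k = 1, 2$, defined analogously.

```latex
% Right-censored: the contribution is linear in s.
\ell(s) = -u(s), \qquad \ell''(s) = 0.

% Left-censored: strictly negative whenever c \neq 0.
\ell(s) = \log\{1 - e^{-u(s)}\}, \qquad
\ell'(s) = \frac{c}{e^{u(s)} - 1}, \qquad
\ell''(s) = -\frac{c^{2} e^{u(s)}}{\{e^{u(s)} - 1\}^{2}} < 0.

% Interval-censored: with f(s) = e^{-u_1(s)} - e^{-u_2(s)} > 0,
\ell(s) = \log f(s), \qquad
\ell''(s) = \frac{f f'' - (f')^{2}}{f^{2}}
          = -\frac{(c_1 - c_2)^{2} e^{-u_1(s) - u_2(s)}}{\{e^{-u_1(s)} - e^{-u_2(s)}\}^{2}} \le 0.
```

The interval-censored case follows by expanding $ff'' - (f')^2$ with $f' = -c_1e^{-u_1} + c_2e^{-u_2}$ and $f'' = c_1^2e^{-u_1} - c_2^2e^{-u_2}$; the squared exponential terms cancel and only the cross term $-(c_1-c_2)^2e^{-u_1-u_2}$ remains.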