maximum likelihood estimation - interval-censored time-to-event data


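\[
\begin{aligned}
&\times \prod_{j=2}^{k} \bigl\{\exp(-w^\top\Lambda(y_{k,j-1})e^{z^\top\theta}) - \exp(-w^\top\Lambda(y_{k,j})e^{z^\top\theta})\bigr\}^{\delta_{k,j}} \times \bigl\{\exp(-w^\top\Lambda(y_{k,k})e^{z^\top\theta})\bigr\}^{\delta_{k,k+1}} \\
&= \delta_{k,1}\bigl\{1 - \exp(-w^\top\Lambda(y_{k,1})e^{z^\top\theta})\bigr\} + \sum_{j=2}^{k} \delta_{k,j}\bigl\{\exp(-w^\top\Lambda(y_{k,j-1})e^{z^\top\theta}) - \exp(-w^\top\Lambda(y_{k,j})e^{z^\top\theta})\bigr\} \\
&\quad + \delta_{k,k+1}\exp(-w^\top\Lambda(y_{k,k})e^{z^\top\theta}).
\end{aligned}
\tag{2.2}
\]

Let $X_i = (\Delta^i_{K_i}, Y^i_{K_i}, K_i, W_i, Z_i)$, $i = 1, \dots, n$, be $n$ iid observations of $X$ from $(\theta_0, \Lambda_0)$, where $Y^i_{K_i} = (Y^i_{K_i,1}, \dots, Y^i_{K_i,K_i})$ and $\Delta^i_{K_i} = (\Delta^i_{K_i,1}, \dots, \Delta^i_{K_i,K_i+1})$. The corresponding log-likelihood function is

\[
\begin{aligned}
\log \operatorname{lik}_n(\theta, \Lambda) = \sum_{i=1}^{n} \Bigl[ & \Delta^i_{K_i,1} \log\bigl\{1 - \exp(-W_i^\top\Lambda(Y^i_{K_i,1})e^{Z_i^\top\theta})\bigr\} \\
& + \sum_{j=2}^{K_i} \Delta^i_{K_i,j} \log\bigl\{\exp(-W_i^\top\Lambda(Y^i_{K_i,j-1})e^{Z_i^\top\theta}) - \exp(-W_i^\top\Lambda(Y^i_{K_i,j})e^{Z_i^\top\theta})\bigr\} \\
& - \Delta^i_{K_i,K_i+1} W_i^\top\Lambda(Y^i_{K_i,K_i})e^{Z_i^\top\theta} \Bigr].
\end{aligned}
\tag{2.3}
\]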
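The log-likelihood in (2.3) can be evaluated term by term, as in the following minimal sketch. It assumes each subject has exactly one nonzero indicator, and it represents $\Lambda$ as a Python callable returning a list; the function names and this interface are hypothetical and chosen only for illustration.

```python
import math

def log_lik_contribution(delta, y, w, z, theta, Lambda):
    """Log-likelihood contribution of one subject, following (2.3).

    delta  : list of K+1 indicators, exactly one of them equal to 1
    y      : list of K increasing inspection times
    w, z   : covariate vectors (lists of floats)
    theta  : regression coefficients, same length as z
    Lambda : callable t -> list of floats, same length as w
             (hypothetical interface, not taken from the text)
    """
    K = len(y)
    risk = math.exp(sum(zi * ti for zi, ti in zip(z, theta)))  # e^{z'theta}

    def cum_haz(t):
        # w'Lambda(t) e^{z'theta}
        return sum(wi * li for wi, li in zip(w, Lambda(t))) * risk

    if delta[0] == 1:
        # Left-censored: event at or before the first inspection.
        return math.log(1.0 - math.exp(-cum_haz(y[0])))
    for j in range(1, K):
        if delta[j] == 1:
            # Interval-censored: event in (y[j-1], y[j]] (0-based indices).
            return math.log(math.exp(-cum_haz(y[j - 1])) - math.exp(-cum_haz(y[j])))
    # Otherwise right-censored: event after the last inspection.
    return -cum_haz(y[K - 1])

def log_lik(data, theta, Lambda):
    """Sum of contributions over subjects; data holds (delta, y, w, z) tuples."""
    return sum(log_lik_contribution(d, y, w, z, theta, Lambda) for d, y, w, z in data)
```

With a single inspection schedule, the three possible outcomes (left-, interval-, and right-censored) have probabilities that sum to one, which gives a quick sanity check on any implementation of this kind.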

2.1 remark. The expression in (2.3) reduces to the same likelihood function obtained under the noninformative censoring mechanism from Definition 1.32 (Lawless 2003, p. 65). The stronger requirement in a1 simplifies the derivation of asymptotic properties. It may be motivated by the setting in which individuals are assessed according to a predetermined schedule, with the completion and exact timing of assessments determined by some random process related to $T$ only via $(W, Z)$. ◽

2.2 maximum likelihood estimation

The regression model (2.1) is a valid intensity function provided that $W^\top\Lambda$ is almost surely nondecreasing. Estimating equations derived from the likelihood process (1.4) often dispense with the requirement entirely, but (1.4) applies only to filters. A discrete inspection process necessitates constrained maximization of the likelihood. To address this complication, consider two simplifying assumptions:

a2 The support of $F_W$, $\mathcal{W} \equiv \operatorname{supp}(F_W)$, is a bounded subset of $\mathbb{R}^{d_w}$. In particular, there exist some known $w_0, w_1 \in \mathcal{W}$ such that $P(w_0 \le W \le w_1) = 1$.

a3 FW2× ⋯ ×F_W

2.2 remark. Condition a3 essentially implies that we need $w^\top\Lambda$ nondecreasing for every $w \in \mathcal{W}$. Condition a2 allows us to ensure monotonicity only in $\mathbf{w}^\top\Lambda$, where $\mathbf{w}$ is a matrix whose entries are determined from the values in $w_0$ and $w_1$. Choosing $w_0$ and $w_1$ is relatively straightforward once any continuous covariates have been standardized. In general $W$ is appropriately scaled so that the first entry in $\Lambda$, $\Lambda_1$, is a baseline cumulative hazard function. ◽
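To illustrate how a2 turns a constraint over all of $\mathcal{W}$ into finitely many checks, the sketch below uses the fact that a linear function of $w$ on the box $\{w : w_0 \le w \le w_1\}$ attains its minimum at a corner. Reading the matrix in Remark 2.2 as the collection of these corner points is an assumption made here, not something the text states, and the function names are hypothetical.

```python
from itertools import product

def box_vertices(w0, w1):
    """All corner points of the box {w : w0 <= w <= w1} (componentwise)."""
    # zip pairs up the lower and upper bound of each coordinate;
    # product then picks one bound per coordinate in every possible way.
    return [list(v) for v in product(*zip(w0, w1))]

def is_monotone_on_box(increments, w0, w1, tol=0.0):
    """Check that w'd >= -tol for every corner w and every increment d.

    increments : list of vectors d = Lambda(t_next) - Lambda(t_prev)
    Because w'd is linear in w, nonnegativity at all corners implies
    nonnegativity for every w in the box.
    """
    corners = box_vertices(w0, w1)
    return all(
        sum(wi * di for wi, di in zip(w, d)) >= -tol
        for d in increments
        for w in corners
    )
```

For example, with an intercept fixed at 1 and one standardized covariate in $[-1, 1]$, one would call `is_monotone_on_box(increments, [1.0, -1.0], [1.0, 1.0])`.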

Not every inspection time contributes information to the likelihood function. As in Groeneboom and Wellner (1992, Part II, Definition 1.1), irrelevant inspections can be discarded to obtain a “thinned” set of observation times.

2.3 definition. Let $Y_{(1)}, \dots, Y_{(m)}$ be the order statistics of

\[
\mathcal{Y} = \{Y^i_{K_i,j},\ j = 1, \dots, K_i,\ i = 1, \dots, n : \Delta^i_{K_i,j} + \Delta^i_{K_i,j+1} = 1\}
\]

and let $(W_{(i)}, \Delta_{(i)})$ denote the $(W, \Delta_{K,j})$ corresponding to the $i$th order statistic $Y_{(i)}$. ◽
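The thinning in Definition 2.3 can be sketched as a simple filter: an inspection time is kept when exactly one of its two neighbouring indicators equals 1, that is, when it is an endpoint of the interval known to contain the event time. The data layout and the function name below are assumptions made for illustration; covariates are omitted for brevity.

```python
def relevant_times(subjects):
    """Return the thinned inspection times, sorted, with their indicators.

    subjects : list of (y, delta) pairs, where y has length K and
               delta has length K+1 with exactly one entry equal to 1
    Returns a list of (time, delta_j, subject_index) sorted by time.
    """
    kept = []
    for i, (y, delta) in enumerate(subjects):
        for j, t in enumerate(y):
            # Keep y_j when Delta_j + Delta_{j+1} = 1.
            if delta[j] + delta[j + 1] == 1:
                kept.append((t, delta[j], i))
    kept.sort(key=lambda rec: rec[0])
    return kept
```

In the returned list, an indicator of 1 marks a right endpoint (the event has occurred by that time) and 0 marks a left endpoint (the event has not yet occurred).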

If $\Delta_{(1)} = 0$, then the $\Lambda$ maximizing (2.3) should satisfy $\Lambda(Y_{(1)}) = 0$. If $\Delta_{(m)} = 1$, then the maximizing $\Lambda$ satisfies $w^\top\Lambda(Y_{(m)}) = \infty$ for every $w \in \mathcal{W}$ or, in other words, $\Lambda_1(Y_{(m)}) = \infty$. When combined with the remaining observations in Definition 2.3, these cases contribute nothing to the likelihood. So without loss of generality assume that $\Delta_{(1)} = 1$ and $\Delta_{(m)} = 0$.

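Let $\Theta$ and $\mathcal{H}$ denote the sets of all possible $\theta$ and $\Lambda$, respectively. In particular, $\mathcal{H}$ is the set of all zero-at-time-zero càdlàg functions $\Lambda$ on $[0, \tau]$ with $w^\top\Lambda$ nondecreasing. Since $\Delta_{(m)} = 0$ we can (again without loss of generality) assume that each $\Lambda \in \mathcal{H}$ is uniformly bounded with $0 < w^\top\Lambda(\tau) < \infty$. Under conditions a1 to a3 the maximum likelihood estimator $(\hat\theta_n, \hat\Lambda_n)$ is defined by

\[
\log \operatorname{lik}_n(\hat\theta_n, \hat\Lambda_n) = \max_{\theta \in \Theta,\ \Lambda \in \mathcal{H}} \log \operatorname{lik}_n(\theta, \Lambda).
\]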

Since the likelihood depends on $\Lambda$ only through its value at the inspection times, we take $(\hat\theta_n, \hat\Lambda_n)$ as the semiparametric maximum likelihood estimator (spmle) that concentrates its distribution function on a subset of $\mathcal{Y}$. The maximal subset can be identified by adapting Turnbull (1976, Lemmas 1 and 2).

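2.4 definition. Let $(L, R]$ denote the censoring interval $(Y_{K,j-1}, Y_{K,j}]$ satisfying $\Delta_{K,j} = 1$; that is, $T \in (Y_{K,j-1}, Y_{K,j}] = (L, R]$. From the random sample $X_1, \dots, X_n$, let $\mathcal{L} = \{L_i\}$ and $\mathcal{R} = \{R_i\}$ collect the left and right endpoints, and let $\mathcal{I} = \{(s_j, t_j] : j = 1, \dots, d\}$ denote the set of maximal intersections (Figure 2.3), given by the set of disjoint intervals whose left and right endpoints are selected respectively from $\mathcal{L}$ and $\mathcal{R}$ such that

\[
(s_j, t_j] \cap (L_i, R_i] =
\begin{cases}
(s_j, t_j], & \text{or} \\
\emptyset,
\end{cases}
\]

for every $j = 1, \dots, d$ and $i = 1, \dots, n$. ◽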

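Figure 2.3 Maximal intersections. [Schematic: four censoring intervals $(L_i, R_i]$, $i = 1, \dots, 4$, with the endpoints $s_1, s_2$ and $t_1, t_2$ of the maximal intersections marked.]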
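The maximal intersections of Definition 2.4 can be found with a single pass over the sorted endpoints, as in the sketch below. This is one common way to compute them and is not necessarily the procedure the author used; the function name and the representation of intervals as `(L, R)` tuples are assumptions. Right-censored observations can be passed with `R = math.inf`.

```python
import math

def maximal_intersections(intervals):
    """Maximal intersections of half-open censoring intervals (L, R].

    intervals : list of (L, R) pairs with L < R; R may be math.inf
    Returns a list of (s, t) pairs, each standing for the interval (s, t].
    """
    # Tag right endpoints with 0 and left endpoints with 1 so that, at
    # equal values, a right endpoint sorts first: (a, b] and (b, c]
    # do not overlap.
    points = [(R, 0) for _, R in intervals] + [(L, 1) for L, _ in intervals]
    points.sort()
    result = []
    for (p, kind_p), (q, kind_q) in zip(points, points[1:]):
        # A left endpoint immediately followed by a right endpoint bounds
        # a region that every censoring interval either contains or misses.
        if kind_p == 1 and kind_q == 0:
            result.append((p, q))
    return result
```

Because no endpoint falls strictly between the two ends of a returned interval, each censoring interval either covers it entirely or is disjoint from it, which is the defining property stated in Definition 2.4.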

2.5 proposition. $W^\top\hat\Lambda_n$ is almost surely constant outside $\mathcal{I}$. Moreover, for fixed $\hat\Lambda_n$ on the boundary of $\mathcal{I}$, the likelihood is invariant to the behaviour of $\hat\Lambda_n$ on the interior of $\mathcal{I}$.

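Proof (cf. Alioum and Commenges 1996, Lemmas 1 and 2). Fix some $(s_{j-1}, t_{j-1}]$ and $(s_j, t_j]$ in $\mathcal{I}$. Consider $\bar\Lambda, \tilde\Lambda \in \mathcal{H}$ with $\bar\Lambda$ constant outside $\mathcal{I}$ and $\bar\Lambda = \tilde\Lambda$ except on $(s_{j-1}, t_j]$. In particular, suppose that $W^\top\tilde\Lambda$ almost surely increases on $(t_{j-1}, s_j]$. Then there is some $u_j \in (t_{j-1}, s_j]$ such that $u_j > r \in \mathcal{R}$, $u_j < l \in \mathcal{L}$, and one of the following holds almost surely:

\[
W^\top\tilde\Lambda(t_{j-1}) < W^\top\tilde\Lambda(u_j) = W^\top\bar\Lambda(t_{j-1}),
\]
\[
W^\top\bar\Lambda(s_j) = W^\top\tilde\Lambda(u_j) < W^\top\tilde\Lambda(s_j).
\]

This implies that $\operatorname{lik}_n(\theta, \bar\Lambda) > \operatorname{lik}_n(\theta, \tilde\Lambda)$. The last statement follows from the fact that $\operatorname{lik}_n(\theta, \Lambda)$ depends on $\Lambda$ only through its value at the inspection times. ◾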

The maximal set to which $\hat\Lambda_n$ assigns mass is given by $\mathcal{T} = \mathcal{Y} \cap \mathcal{I} = \{t_1, \dots, t_d\}$. Estimation reduces to a finite-dimensional optimization problem with objective function (2.3) continuous in the set of feasible solutions. The spmle $(\hat\theta_n, \hat\Lambda_n)$ therefore exists. Uniqueness is established in the following result.

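2.6 proposition. Let $\mathcal{H}_0$ be the set of all possible $\Lambda$ satisfying $\operatorname{lik}_n(\theta, \Lambda) > 0$ for every $\theta$. Then $\operatorname{lik}_n(\theta, \Lambda)$ is log-concave in $\Lambda \in \mathcal{H}_0$ for each $\theta$. Moreover, for fixed $\Lambda \in \mathcal{H}_0$, $\operatorname{lik}_n(\theta, \Lambda)$ is log-concave in $\theta$.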

Proof. The function $g(\theta) = p_{\theta,\Lambda}(x)$ satisfies $g(\theta)g''(\theta) \le g'(\theta)^2$ for each $x$ and any $\Lambda \in \mathcal{H}_0$. Let $e_1, \dots, e_{d_w}$ denote the unit vectors in $\mathbb{R}^{d_w}$. For the $j$th component in $\Lambda \in \mathcal{H}_0$, $j = 1, \dots, d_w$, consider the path $\Lambda + s_j\varphi \in \mathcal{H}$ with $s_j = s e_j$, $s$ sufficiently small and $\varphi$ some arbitrary function. It is straightforward to show that the second partial derivative of $\log p_{\theta,\Lambda}(x)$ with respect to $s_j$ is bounded above by zero if $x$ corresponds to an interval- or right-censored observation. Moreover, if $x$ is left-censored, the second derivative is strictly negative. Since $\Delta_{(1)} = 1$, the log-likelihood is concave in each component of $\Lambda$, holding all remaining entries and $\theta$ fixed. ◾
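The log-concavity in $\theta$ claimed in Proposition 2.6 can be checked numerically in the simplest setting. The sketch below assumes a scalar $\theta$ with $z = 1$ and treats $a = w^\top\Lambda(L)$ and $b = w^\top\Lambda(R)$ as fixed numbers; these simplifications and the function names are assumptions made for illustration. It is a sanity check, not a substitute for the proof.

```python
import math

def log_p(theta, a, b):
    """log of exp(-a e^theta) - exp(-b e^theta) for 0 <= a < b.

    a = 0 corresponds to a left-censored observation and
    b = math.inf to a right-censored one.
    """
    u = math.exp(theta)
    upper = 0.0 if math.isinf(b) else math.exp(-b * u)
    return math.log(math.exp(-a * u) - upper)

def max_second_difference(f, grid):
    """Largest second difference f(x0) - 2 f(x1) + f(x2) over the grid."""
    return max(
        f(x0) - 2.0 * f(x1) + f(x2)
        for x0, x1, x2 in zip(grid, grid[1:], grid[2:])
    )

# Equally spaced grid of theta values on [-2, 2].
grid = [-2.0 + 0.05 * k for k in range(81)]
worst_interval = max_second_difference(lambda t: log_p(t, 0.5, 1.5), grid)
worst_left = max_second_difference(lambda t: log_p(t, 0.0, 1.5), grid)
worst_right = max_second_difference(lambda t: log_p(t, 0.5, math.inf), grid)
```

A concave function has nonpositive second differences on an equally spaced grid, so all three `worst_*` values should be no larger than zero, up to floating-point rounding.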

In document Semiparametric Methods for the Analysis of Progression-Related Endpoints (Page 39-42)