interval-censored time-to-event data

2.4 Computation

Use of standard methods to compute $(\hat\theta_n, \hat\Lambda_n)$ is complicated by the size of the parameter space and the constraints on $\Lambda$. The latter cannot be eliminated through transformation, but can be expressed as a linear inequality. From Proposition 2.6, $\log\operatorname{lik}_n(\theta, \Lambda)$ is concave, so computation of $\hat\Lambda_n$ reduces to quadratic programming (QP). Cheng et al. (2011) recently applied QP to obtain Wellner and Y. Zhang’s (2007) semiparametric estimators from panel count data. They proposed jointly updating estimates for $\theta$ and $\Lambda$ using Pan’s (1999) extension of the iterative convex minorant algorithm (Jongbloed 1998). The approach proposed here is similar, but the quadratic approximation is based on the relatively flexible Lagrangian framework of Dümbgen et al. (2006).

2.4.1 Parameter estimates

Let $\lambda_j = \Lambda(t_j)$, where $t_j$ is the right endpoint of the $j$th maximal intersection from Definition 2.4. By A2, the almost-sure constraints $W^\top \lambda_j \ge 0$ and $W^\top \Lambda(t_j) \le W^\top \Lambda(t_k)$, $j < k$, amount to the inequality $A\lambda \ge 0$, where $\lambda = (\lambda_1^\top, \dots, \lambda_d^\top)^\top$ and $A$ is the block diagonal matrix

$$
A = \begin{bmatrix}
w & 0 & 0 & 0 & \cdots & 0 \\
-w & w & 0 & 0 & \cdots & 0 \\
0 & -w & w & 0 & \cdots & 0 \\
\vdots & & \ddots & \ddots & & \vdots \\
0 & 0 & \cdots & 0 & -w & w
\end{bmatrix},
$$

with $w$ as described in Remark 2.2. In practice the minimum $w_0$ and maximum $w_1$

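For brevity put $\phi = (\theta^\top, \lambda^\top)^\top$ and let $\log\operatorname{lik}_n(\phi) \equiv \log\operatorname{lik}_n(\theta, \lambda)$. Following the results of Section 1.2.4, we specify a computational algorithm by an initial value $\phi^{(0)}$, a candidate step $\eta^{(r)} = (\eta_\theta^\top, \eta_\lambda^\top)^\top$, a line search finding $\phi^{(r+1)} \in \operatorname{seg}(\phi^{(r)}, \phi^{(r)} + \eta^{(r)})$ such that $\log\operatorname{lik}_n(\phi^{(r+1)}) \ge \log\operatorname{lik}_n(\phi^{(r)})$, and a stopping rule $d(\phi^{(r)}, \phi^{(r+1)}) < \varepsilon$.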

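Applying the framework of Dümbgen et al. (2006, Section 3), the candidate step $\eta_\lambda^{(r)}$ for $\lambda^{(r)}$ is based on a quadratic approximation. In particular,

$$
\begin{aligned}
\eta_\lambda^{(r)} &= \arg\max_{\eta_\lambda :\, A(\eta_\lambda + \lambda^{(r)}) \ge 0} \; \nabla_\lambda \log\operatorname{lik}_n(\phi^{(r)})^\top \eta_\lambda + \tfrac{1}{2}\, \eta_\lambda^\top \nabla_\lambda^2 \log\operatorname{lik}_n(\phi^{(r)})\, \eta_\lambda \\
&\approx \arg\max_{\lambda :\, A\lambda \ge 0} \bigl\{ \log\operatorname{lik}_n(\theta^{(r)}, \lambda) - \log\operatorname{lik}_n(\phi^{(r)}) \bigr\} - \lambda^{(r)}.
\end{aligned}
\tag{2.13}
$$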
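As an illustration of the quadratic step in (2.13), the following Python sketch treats a deliberately simplified case that is not the general setting of this section: a scalar $w = 1$, so that $A\lambda \ge 0$ reads $0 \le \lambda_1 \le \dots \le \lambda_m$, and a Hessian replaced by its diagonal. Under those assumptions the quadratic program reduces to a weighted isotonic regression, which is solved here with the pool-adjacent-violators algorithm instead of a general QP solver. The function names and the toy gradient and Hessian values are hypothetical.

```python
def pava(y, w):
    """Weighted least-squares fit of a nondecreasing sequence to y."""
    blocks = []  # each block holds [weighted mean, total weight, count]
    for yi, wi in zip(y, w):
        blocks.append([yi, wi, 1])
        # Pool backwards while adjacent block means violate monotonicity.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            m2, w2, n2 = blocks.pop()
            m1, w1, n1 = blocks.pop()
            wt = w1 + w2
            blocks.append([(w1 * m1 + w2 * m2) / wt, wt, n1 + n2])
    fit = []
    for mean, _, count in blocks:
        fit.extend([mean] * count)
    return fit


def qp_step_diagonal(lam, grad, neg_hess_diag):
    """Candidate step for lam subject to 0 <= lam_1 <= ... <= lam_m.

    Assumption: the Hessian is -diag(neg_hess_diag) with positive entries,
    so the quadratic objective separates by coordinate.
    """
    # Unconstrained Newton target for each coordinate.
    target = [l + g / h for l, g, h in zip(lam, grad, neg_hess_diag)]
    # Project onto the monotone cone, then clip at zero.
    new_lam = [max(v, 0.0) for v in pava(target, neg_hess_diag)]
    return [n - l for n, l in zip(new_lam, lam)]


# Hypothetical current iterate, gradient and (negated) Hessian diagonal.
lam = [0.2, 0.4, 0.6]
grad = [1.0, -1.0, 0.5]
neg_hess_diag = [2.0, 2.0, 2.0]

eta = qp_step_diagonal(lam, grad, neg_hess_diag)
new_lam = [l + e for l, e in zip(lam, eta)]
```

In this toy case the unconstrained targets for the first two coordinates are out of order, so they are pooled to a common value and the updated vector remains feasible. With a full Hessian or a vector-valued $w$ a general QP solver would be needed.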

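$\theta^{(r)}$ is updated via the Newton-Raphson step

$$
\eta_\theta^{(r)} = -\bigl[\nabla_\theta^2 \log\operatorname{lik}_n(\phi^{(r)})\bigr]^{-1} \nabla_\theta \log\operatorname{lik}_n(\phi^{(r)}). \tag{2.14}
$$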
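A minimal Python sketch of a step of the form (2.14), assuming a two-dimensional $\theta$ so that the linear system can be solved by Cramer's rule. The concave toy objective and the names `newton_step_2d`, `toy_gradient` and `toy_hessian` are hypothetical stand-ins for the log-likelihood derivatives.

```python
def newton_step_2d(grad, hess):
    """Return -hess^{-1} grad for a 2x2 Hessian, using Cramer's rule."""
    (a, b), (c, d) = hess
    det = a * d - b * c
    if det == 0:
        raise ValueError("singular Hessian")
    g0, g1 = grad
    # Solve hess @ x = grad, then negate to obtain the step.
    x0 = (d * g0 - b * g1) / det
    x1 = (a * g1 - c * g0) / det
    return [-x0, -x1]


def toy_gradient(theta):
    # Gradient of f(theta) = -t0**2 - t1**2 - t0*t1 + t0 (strictly concave).
    t0, t1 = theta
    return [-2.0 * t0 - t1 + 1.0, -2.0 * t1 - t0]


def toy_hessian(theta):
    # Constant Hessian of the toy objective.
    return [[-2.0, -1.0], [-1.0, -2.0]]


theta = [0.0, 0.0]
eta = newton_step_2d(toy_gradient(theta), toy_hessian(theta))
theta_new = [t + e for t, e in zip(theta, eta)]
```

Because the toy objective is quadratic, a single step lands on its maximizer. For a general log-likelihood the step is only a candidate and is passed to the line search.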

Following Jongbloed (1998), overshoot is avoided using the step-halving line search, based on a variant of Armijo’s (1966) rule. It is given by

$$
\phi^{(r+1)} = \phi^{(r)} + \eta^{(r)}/2^j, \tag{2.15}
$$

where $j$ is the smallest nonnegative integer satisfying

$$
\log\operatorname{lik}_n(\phi^{(r)}) - \log\operatorname{lik}_n(\phi^{(r)} + \eta^{(r)}/2^j) \le \alpha \nabla_\phi \log\operatorname{lik}_n(\phi^{(r)})^\top \eta^{(r)}/2^j.
$$

Here $\alpha$ is a fixed parameter set to some positive value less than the step factor: $0 < \alpha < 1/2$. Its value can affect the number of iterations needed to achieve the stopping rule, but is otherwise inconsequential (Fletcher 1987, p. 30).

2.15 Algorithm. Set $r := 0$, $\theta^{(0)} = 0$ and $\lambda_j^{(0)} = (t_j/\tau, 0_{d_w - 1}^\top)^\top$. Let $\eta^{(r)}$ be the candidate step with components given by (2.14) and (2.13), and let $\phi^{(r+1)}$ be the result of the line search (2.15). If

$$
\|\phi^{(r+1)} - \phi^{(r)}\|_\infty \le \varepsilon, \tag{2.16}
$$

for small positive value $\varepsilon$, then stop. Otherwise, put $r := r + 1$. ◽
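The overall iteration can be sketched in Python as follows. This is an outline under stated assumptions and not the implementation used here: the candidate step is supplied by the caller (standing in for (2.13) and (2.14)), the acceptance test is written in the usual sufficient-increase form for a maximization problem, which is an assumption about the intended sign convention, and the one-dimensional toy objective $1 - \sqrt{1 + x^2}$ is hypothetical. It was chosen because a full Newton step overshoots from the starting value $x = 2$.

```python
import math


def step_halving(loglik, phi, eta, grad, alpha=0.25, max_halvings=30):
    """Return phi + eta / 2**j for the first j giving sufficient increase."""
    slope = sum(g * e for g, e in zip(grad, eta))
    base = loglik(phi)
    for j in range(max_halvings + 1):
        t = 0.5 ** j
        cand = [p + t * e for p, e in zip(phi, eta)]
        if loglik(cand) - base >= alpha * t * slope:
            return cand
    return list(phi)  # no acceptable step found; keep the current value


def maximize(loglik, gradient, candidate_step, phi0, eps=1e-8, max_iter=200):
    """Iterate candidate step, line search and sup-norm stopping rule."""
    phi = list(phi0)
    for r in range(max_iter):
        grad = gradient(phi)
        eta = candidate_step(phi, grad)
        new_phi = step_halving(loglik, phi, eta, grad)
        # Stop when the largest coordinate change is at most eps.
        if max(abs(a - b) for a, b in zip(new_phi, phi)) <= eps:
            return new_phi, r + 1
        phi = new_phi
    return phi, max_iter


def toy_loglik(phi):
    # 1 - sqrt(1 + x^2), written in a form that stays accurate near x = 0.
    x = phi[0]
    return -x * x / (1.0 + math.sqrt(1.0 + x * x))


def toy_gradient(phi):
    x = phi[0]
    return [-x / math.sqrt(1.0 + x * x)]


def toy_newton_step(phi, grad):
    # Newton step -f'/f'' for the toy objective; it overshoots when |x| > 1.
    x = phi[0]
    return [-x * (1.0 + x * x)]


start = [2.0]
first = step_halving(toy_loglik, start, toy_newton_step(start, None),
                     toy_gradient(start))
phi_hat, n_iter = maximize(toy_loglik, toy_gradient, toy_newton_step, start)
```

From the starting value the full Newton step and its first halving both decrease the objective, so the line search shortens the step before accepting it. Later iterations accept the full step.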

Convergence of Algorithm 2.15 to the maximum likelihood estimator follows from Propositions 1.27 and 2.6. Alternative convergence criteria to (2.16) can be based on the characterization of the SPMLE implied by Proposition 1.27:

$$
\bigl|\nabla_\phi \log\operatorname{lik}_n(\phi^{(r)})^\top \phi^{(r)}\bigr| \le \varepsilon. \tag{2.17}
$$
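To see why a criterion of the form (2.17) is stated in terms of the inner product and not the gradient alone, consider the following Python sketch. It uses a hypothetical two-parameter concave objective maximized over the nonnegative orthant, which is a cone like $\{\lambda : A\lambda \ge 0\}$. At a constrained maximizer on the boundary the gradient need not vanish, but its inner product with the maximizer does.

```python
def inner_product_converged(grad, phi, eps=1e-8):
    """Check a criterion of the form |grad' phi| <= eps."""
    return abs(sum(g * p for g, p in zip(grad, phi))) <= eps


def toy_gradient(lam):
    # Gradient of f(lam) = -(lam[0] + 1)**2 - (lam[1] - 2)**2.
    return [-2.0 * (lam[0] + 1.0), -2.0 * (lam[1] - 2.0)]


# Over lam >= 0 the maximizer is (0, 2); its first coordinate is on the boundary.
lam_hat = [0.0, 2.0]
grad_hat = toy_gradient(lam_hat)  # first entry is -2, so the gradient is not zero
at_optimum = inner_product_converged(grad_hat, lam_hat)

# A feasible point that is not the maximizer fails the check.
lam_other = [0.5, 1.0]
elsewhere = inner_product_converged(toy_gradient(lam_other), lam_other)
```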

Constrained Newton methods generally require many more iterations than the standard Newton-Raphson algorithm. Computing time is largely determined by processing power and the software used to carry out QP. The C routines available with IBM’s (2012) CPLEX Optimization Studio offer a reasonably fast solution.

2.4.2 Variance estimates

The variance estimator for $\hat\theta_n$ given by (2.12) is based on the curvature of the profile log-likelihood. This requires repeated evaluation of the profile log-likelihood

$$
\log\operatorname{plik}_n(\theta) = \sup_{\lambda :\, A\lambda \ge 0} \log\operatorname{lik}_n(\theta, \lambda),
$$

by fixing $\theta^{(r)}$ at $\theta$ in Algorithm 2.15. Since we need to approximate only the value of the profile likelihood and not the profile maximizer, the stopping rule (2.16) is replaced by

$$
\left| 1 - \frac{\log\operatorname{lik}_n(\theta, \lambda^{(r+1)})}{\log\operatorname{lik}_n(\theta, \lambda^{(r)})} \right| \le \varepsilon.
$$

This can reduce the computation time considerably since the log-likelihood often converges faster than $\lambda^{(r)}$.
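The difference between the two stopping rules can be illustrated with a short Python sketch. The sequence of iterates and the toy objective are hypothetical, chosen so that the objective is flat near its maximizer. The tolerance is also an arbitrary choice.

```python
def relative_change_small(ll_new, ll_old, eps=1e-6):
    """Relative-change rule |1 - ll_new / ll_old| <= eps (ll_old must be nonzero)."""
    return abs(1.0 - ll_new / ll_old) <= eps


def toy_loglik(lam):
    # Flat near the maximizer lam = 1, with a value far from zero.
    return -100.0 - (lam - 1.0) ** 4


# Hypothetical iterates approaching the maximizer geometrically.
iterates = [1.0 + 0.5 ** r for r in range(40)]

stop_loglik = None   # first r at which the relative-change rule holds
stop_iterate = None  # first r at which the change in the iterate is small
for r in range(len(iterates) - 1):
    old, new = iterates[r], iterates[r + 1]
    if stop_loglik is None and relative_change_small(toy_loglik(new), toy_loglik(old)):
        stop_loglik = r
    if stop_iterate is None and abs(new - old) <= 1e-6:
        stop_iterate = r
```

In this toy sequence the rule based on the objective value is met many iterations before the rule based on the iterates, which is the behaviour the replacement rule is meant to exploit.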

The tuning parameter $\rho_n$ in (2.12) determines the values around $\hat\theta_n$ used to assess the curvature of the profile log-likelihood. Standard practice calls for a scalar value $\rho_n \propto n^{-1/2}$ with proportionality constant chosen empirically. Some informal experimentation suggests that variance estimates are not highly sensitive to the choice of $\rho_n$, particularly with larger sample sizes and frequent inspections. This also seems apparent in numerical studies from Zeng et al. (2006). However, for the sake of convenience, a data-driven selection method is desirable. Borrowing methods from numerical differentiation, we adopt the matrix form of $\rho_n$ and reduce the choice to specifying broad parameters describing the magnitude of $\theta$.
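The sensitivity to $\rho_n$ can be explored with a sketch like the following. It assumes a scalar $\theta$ and a central second difference of the profile log-likelihood, which may differ in detail from (2.12). The toy model is an exponential sample with no nuisance parameter, so its log-likelihood plays the role of the profile log-likelihood. All names and numerical values are hypothetical.

```python
import math


def curvature(plik, theta_hat, rho):
    """Negative central second difference of plik at theta_hat with step rho."""
    return -(plik(theta_hat + rho) - 2.0 * plik(theta_hat)
             + plik(theta_hat - rho)) / rho ** 2


n, xbar = 100, 0.5  # hypothetical sample size and sample mean


def toy_plik(theta):
    # Exponential log-likelihood with rate theta.
    return n * (math.log(theta) - theta * xbar)


theta_hat = 1.0 / xbar
exact = n / theta_hat ** 2  # analytic curvature at the maximizer

# Steps rho_n = c / sqrt(n) for a few proportionality constants c.
estimates = {c: curvature(toy_plik, theta_hat, c / math.sqrt(n))
             for c in (0.5, 1.0, 2.0)}
variances = {c: 1.0 / v for c, v in estimates.items()}
```

In this toy example the curvature estimates change little over a fourfold range of the proportionality constant, in line with the informal observation above.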

Let $f : \mathbb{R} \to \mathbb{R}$ be a continuously differentiable function. In the finite-difference approximation

$$
f'(x) \approx \frac{f(x + \rho) - f(x)}{\rho},
$$

it is standard practice to select $\rho \sim \sqrt{\epsilon}\,\operatorname{curv}(x)$, where $\epsilon$ is the error in evaluating $f$ and $\operatorname{curv} = \sqrt{f/f''}$ is the “curvature scale” of $f$. This choice is a minimizer of the truncation error $\rho f''$ in the above first-order approximation, plus the “round-off” error $\epsilon\,|f(x)/\rho|$ (Press et al. 2007, Section 5.7). When little is known about $f''$ one can simply set $\rho \sim \sqrt{\epsilon}\,x$ or, for $x$ close to zero,

$$
\rho \sim \sqrt{\epsilon}\,\operatorname{sign}(x)\max(|x|, \operatorname{typ} x),
$$

where $\operatorname{typ} x$ is a typical absolute value for $x$ (Dennis and Schnabel 1996, p. 98).

In (2.12) the curvature of the profile log-likelihood is evaluated with a second-order finite difference approximation. The corresponding curvature scale is based on the ratio of the profile log-likelihood and its third derivative, which can be evaluated
