Change of response variable - Quantile based estimation of treatment effects in censored data

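Thus far we have estimated $\mu$ and $\sigma$ by regressing $\hat F_2^{-1}(u)$ on $\hat F_1^{-1}(u)$ by analogy with the relation

$$Y \stackrel{d}{=} \mu + \sigma X.$$

However, if we reparameterize the model as

$$X \stackrel{d}{=} \nu + \tau Y$$

then we would estimate $\nu$ and $\tau$ by regressing $\hat F_1^{-1}(u)$ on $\hat F_2^{-1}(u)$. Since $\tau = 1/\sigma$ and $\nu = -\mu/\sigma$, logical consistency would require that

$$\hat\tau = \frac{1}{\hat\sigma} \quad \text{and} \quad \hat\nu = -\frac{\hat\mu}{\hat\sigma}. \tag{3.22}$$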

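However, from regression theory we know that (3.22) does not hold in general, because the line fitted by regressing one variable on another is not the inverse of the line fitted in the opposite direction. We consider the following two cases: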

Case 1 F₂⁻¹ = µ + σF₁⁻¹

Case 2 F₁⁻¹ = ν + τ F₂⁻¹. (3.23)

In case 1, µ and σ were estimated by GLS. In case 2, ν and τ were estimated by GLS and then converted to estimates of µ and σ using (3.22). The sample sizes were 50, 250 and 500, with 1000 repetitions. F₁ was a standard normal distribution, F₂ was normal (µ, σ) and k was set to 8. The same setup as in (3.20) was used to estimate the density function. The continuous EDF (2.3) was used to estimate F₁(x) and F₂(x) and hence F₁⁻¹(u) and F₂⁻¹(u). The mean of the estimates over the 1000 repetitions is given in Table 3.14. As can be seen, the differences between the two sets of estimates are very minor and thus of little or no practical consequence.
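The relation between the two regression directions can be illustrated with a small sketch. It is not the simulation described above: ordinary least squares replaces GLS, `np.quantile` stands in for the continuous EDF, and the regression points 0.1 to 0.8 are an assumed choice. For least squares the product of the two fitted slopes equals the squared correlation, so the estimates converted through (3.22) differ from the direct ones only to the extent that this correlation is below one.

```python
import numpy as np

rng = np.random.default_rng(0)
mu, sigma, n = 5.0, 5.0, 250            # one configuration from Table 3.14
x = rng.normal(0.0, 1.0, n)             # sample from F1 (standard normal)
y = rng.normal(mu, sigma, n)            # sample from F2

u = np.linspace(0.1, 0.8, 8)            # assumed regression points (k = 8)
q1 = np.quantile(x, u)                  # stand-in for the continuous EDF quantiles
q2 = np.quantile(y, u)

# Case 1: regress q2 on q1; np.polyfit returns (slope, intercept).
sigma_1, mu_1 = np.polyfit(q1, q2, 1)

# Case 2: regress q1 on q2, then convert to (mu, sigma) with (3.22).
tau, nu = np.polyfit(q2, q1, 1)
sigma_2, mu_2 = 1.0 / tau, -nu / tau

# For OLS the product of the two slopes is the squared correlation.
r2 = np.corrcoef(q1, q2)[0, 1] ** 2
print(sigma_1, sigma_2, sigma_1 * tau, r2)
```

Because the two sets of sample quantiles are almost perfectly linearly related, `r2` is close to one and the two cases give nearly the same estimates, which is the pattern reported in Table 3.14.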

Table 3.14: Estimates of µ and σ in cases 1 and 2 from (3.23)

µ σ n µ̂ (Case 1) σ̂ (Case 1) µ̂ (Case 2) σ̂ (Case 2)

0 1 50 0.00 1.00 0.00 1.00

0 5 50 -0.05 4.97 -0.03 4.99

5 1 50 5.01 1.00 4.99 1.00

5 5 50 5.05 4.99 5.06 4.99

0 1 250 0.00 1.00 0.00 1.00

0 5 250 -0.01 5.00 -0.01 5.00

5 1 250 5.00 1.00 5.01 1.00

5 5 250 4.99 4.99 5.00 4.99

0 1 500 0.00 1.00 0.00 1.00

0 5 500 0.01 5.01 0.01 5.01

5 1 500 5.00 1.00 5.00 1.00

5 5 500 5.00 4.99 5.01 4.99

Chapter 4 A semi-parametric regression method for censored data

The scenario investigated in Chapter 3 will now be extended to the case where censoring of the observations can occur. This is discussed in Hsieh [10] and we will follow the methodology set out there. The following notation will be used when the data are censored. The data from the two groups will still be denoted by $X_1 = \{X_{1,1}, \ldots, X_{1,n_1}\}$ and $X_2 = \{X_{2,1}, \ldots, X_{2,n_2}\}$, with CDFs $F_1$ and $F_2$ respectively.

The censoring observations will be $C_1 = \{C_{1,1}, \ldots, C_{1,n_1}\}$ and $C_2 = \{C_{2,1}, \ldots, C_{2,n_2}\}$, with survival functions $K_1$ and $K_2$ respectively. The observed data will then be denoted by $\tilde X_1 = \{\tilde X_{1,1}, \ldots, \tilde X_{1,n_1}\}$ and $\tilde X_2 = \{\tilde X_{2,1}, \ldots, \tilde X_{2,n_2}\}$, where $\tilde X_{1,i} = X_{1,i} \wedge C_{1,i}$ and $\tilde X_{2,i} = X_{2,i} \wedge C_{2,i}$. The observed censoring indicators will be $\delta_{1,i} = I(X_{1,i} \le C_{1,i})$ and $\delta_{2,i} = I(X_{2,i} \le C_{2,i})$. Denote the observed data, ordered in their first component, by $(\tilde X_{1,(i)}, \delta_{1,(i)})$, $i = 1, \ldots, n_1$, and $(\tilde X_{2,(j)}, \delta_{2,(j)})$, $j = 1, \ldots, n_2$, with $\tilde X_{1,(1)} < \cdots < \tilde X_{1,(n_1)}$ and $\tilde X_{2,(1)} < \cdots < \tilde X_{2,(n_2)}$.
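As a concrete illustration of this notation, the following sketch builds the observed pairs for one group. The exponential distributions, the sample size and the variable names are illustrative assumptions only, chosen to make the example runnable.

```python
import numpy as np

rng = np.random.default_rng(1)
n1 = 10
x1 = rng.exponential(scale=1.0, size=n1)   # latent lifetimes X_{1,i} (assumed exp(1))
c1 = rng.exponential(scale=4.0, size=n1)   # censoring times C_{1,i} (assumed exp(4))

x1_obs = np.minimum(x1, c1)                # observed value: minimum of X and C
delta1 = (x1 <= c1).astype(int)            # indicator: 1 = uncensored, 0 = censored

# Order the pairs by their first component, carrying the indicators along.
order = np.argsort(x1_obs)
x1_sorted = x1_obs[order]
delta1_sorted = delta1[order]

for value, d in zip(x1_sorted, delta1_sorted):
    print(f"{value:.3f}  {d}")
```

Only `x1_sorted` and `delta1_sorted` would be available to the analyst; the latent `x1` and `c1` are not observed separately.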

The extension to the censored case is based on the following model:

$$T_2 = T_1^{1/\gamma}\,\lambda, \tag{4.1}$$

where $T_1$ and $T_2$ are random variables representing the lifetimes in the two treatment groups and $\gamma, \lambda > 0$. Taking logarithms in (4.1) gives

$$\log T_2 = \frac{1}{\gamma} \log T_1 + \log \lambda. \tag{4.2}$$

Set $\mu = \log \lambda$, $\sigma = 1/\gamma$, $X_1 = \log T_1$ and $X_2 = \log T_2$. Then (4.2) becomes

$$X_2 = \mu + \sigma X_1,$$

which is the model considered in Chapter 3. Following the methodology set out there, we postulate a heteroscedastic regression model

$$\hat F_2^{-1}(u) = \mu + \sigma \hat F_1^{-1}(u) + \epsilon(u), \tag{4.3}$$

where now $\hat F_1$ and $\hat F_2$ are Kaplan-Meier estimators of $F_1$ and $F_2$, i.e. for $j = 1, 2$,

$$\hat F_j(t) = 1 - \prod_{\{i:\, \tilde X_{j,(i)} \le t\}} \left( \frac{n_j - i}{n_j - i + 1} \right)^{\delta_{j,(i)}}, \tag{4.4}$$

where $t \ge 0$. Monte Carlo simulations will be undertaken to investigate the relative effectiveness of the GLS and OLS estimators of µ and σ.
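The product in (4.4) can be evaluated directly from the ordered observations and their indicators. The sketch below is a minimal step-function version under the assumption of no ties; it is not the continuous version of the estimator referred to elsewhere in the text, and the function name `kaplan_meier_cdf` is hypothetical.

```python
import numpy as np

def kaplan_meier_cdf(obs, delta):
    """Return the ordered observations and F_hat at each of them, following (4.4).

    obs   : observed values min(X, C), assumed to be distinct
    delta : 1 if the observation is uncensored, 0 if censored
    """
    obs = np.asarray(obs, dtype=float)
    delta = np.asarray(delta, dtype=int)
    order = np.argsort(obs, kind="stable")
    obs, delta = obs[order], delta[order]
    n = obs.size
    i = np.arange(1, n + 1)
    # Each factor is (n - i) / (n - i + 1) for an uncensored point and 1 otherwise.
    factors = ((n - i) / (n - i + 1.0)) ** delta
    return obs, 1.0 - np.cumprod(factors)

times, cdf = kaplan_meier_cdf([1.0, 2.0, 3.0, 4.0], [1, 0, 1, 1])
print(times, cdf)
```

With every indicator equal to one the product telescopes to $(n_j - i)/n_j$, so the estimator reduces to the ordinary empirical distribution function.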

4.1 The regression setup

Expanding (4.3) with $0 \le u_1 \le \ldots \le u_k \le 1$, we have

$$\begin{aligned}
\hat F_2^{-1}(u_1) &= \mu + \sigma \hat F_1^{-1}(u_1) + \epsilon(u_1) \\
&\;\;\vdots \\
\hat F_2^{-1}(u_k) &= \mu + \sigma \hat F_1^{-1}(u_k) + \epsilon(u_k),
\end{aligned} \tag{4.5}$$

an ordinary simple regression setup with response variable $\hat F_2^{-1}(u)$ and predictor variable $\hat F_1^{-1}(u)$. Hsieh [10] looks at both the OLS and GLS cases. He also shows that the GLS method is asymptotically efficient. The expressions for the asymptotic variances are complicated and will not be dealt with here. We will be focusing on the simulation results.
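The difference between the OLS and GLS fits of (4.5) can be sketched as follows. The covariance matrix `Sigma` used here is only a placeholder (a Brownian-bridge covariance on the regression points), not the matrix derived by Hsieh [10], and the function names are hypothetical.

```python
import numpy as np

def ols_fit(X, y):
    """Ordinary least squares: solve (X'X) b = X'y."""
    return np.linalg.solve(X.T @ X, X.T @ y)

def gls_fit(X, y, Sigma):
    """Generalized least squares: solve (X' S^-1 X) b = X' S^-1 y."""
    Si_X = np.linalg.solve(Sigma, X)
    Si_y = np.linalg.solve(Sigma, y)
    return np.linalg.solve(X.T @ Si_X, X.T @ Si_y)

u = np.linspace(0.1, 0.8, 8)                      # regression points
q1 = -np.log(1.0 - u)                             # exp(1) quantiles, stand-in for F1_hat^{-1}(u)
X = np.column_stack([np.ones_like(u), q1])        # design matrix with intercept
y = 0.5 + 2.0 * q1                                # noise-free response: mu = 0.5, sigma = 2

Sigma = np.minimum.outer(u, u) - np.outer(u, u)   # placeholder covariance (assumption)

beta_ols = ols_fit(X, y)
beta_gls = gls_fit(X, y, Sigma)
print(beta_ols, beta_gls)
```

With a noise-free response both fits recover (µ, σ) exactly; the two estimators differ only once the errors ϵ(u) are present, and GLS weights the regression points according to the assumed covariance structure.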

From (9) and (10) of Hsieh [10], $\hat F_1^{-1}$ and $\hat F_2^{-1}$ can be represented in terms of two generalized Kiefer processes with covariance functions

$$\Lambda_i = D_{(1-u)}\, C^{-1} D_{(C\gamma^{(i)})} (C^T)^{-1} D_{(1-u)}. \tag{4.6}$$

In (4.6), $D_g$ represents a diagonal matrix with main diagonal vector $g$, and the matrix $C$ is the linear operator such that $Cu = (u_1, u_2 - u_1, \ldots, u_k - u_{k-1})^T$ and can be estimated by [...]

(Recall the definition of $D^{-1}_{f_1^\star(\hat F_1^{-1}(u))}$ from (3.12).) The covariance matrix is $\operatorname{cov}(\hat\beta) = \sigma^2 (X^T \Sigma_\star^{-1} X)^{-1}$.

The asymptotic covariance matrix of β can be found in [10], page 2713. The OLS [...]

A number of cases were investigated by Monte Carlo simulation. Table 4.1 shows the distributions that were used. F₁ and K₁ denote the CDF and survival function of X₁ and its associated censoring variable C₁ respectively. The X₂ data are distributed as µ + σX₁ with censoring variable C₂, which has survival function K₂. Table 4.1 lists the three cases that will be considered. The exponential distribution, denoted by exp(λ), has the density function

$$f(x; \lambda) = \frac{e^{-x/\lambda}}{\lambda}.$$

The lognormal distribution, denoted by lognormal(a, b), has the density function f (x : a, b) = 1 In addition we deﬁne

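$$\varphi_1 = \frac{1}{n_1} \sum_{i=1}^{n_1} \delta_{1,i} \quad \text{and} \quad \varphi_2 = \frac{1}{n_2} \sum_{i=1}^{n_2} \delta_{2,i},$$

which are the observed proportions of uncensored observations.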
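The proportions of uncensored observations can be checked by simulation. The sketch below uses the Case 1 distributions with µ = σ = 0.5 and takes the parameter of exp(λ) to be the mean, which matches NumPy's `scale` argument; the large sample size is an assumption made only so that the proportions settle close to their expected values.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 200_000                                  # large n: proportions near their expectations
mu, sigma = 0.5, 0.5

x1 = rng.exponential(scale=1.0, size=n)      # X1 ~ exp(1)
c1 = rng.exponential(scale=4.0, size=n)      # C1 ~ exp(4)
x2 = mu + sigma * rng.exponential(scale=1.0, size=n)   # X2 distributed as mu + sigma * X1
c2 = rng.exponential(scale=8.0, size=n)      # C2 ~ exp(8)

phi1 = np.mean(x1 <= c1)                     # proportion uncensored, group 1
phi2 = np.mean(x2 <= c2)                     # proportion uncensored, group 2
print(phi1, phi2)
```

For two independent exponentials with means 1 and 4, the probability that the lifetime does not exceed the censoring time is 4/(4 + 1) = 0.8, so `phi1` should be close to that value.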

Case 1 uses the same simulation setup as Hsieh [10]. Our censoring proportions matched his for the first group but not for the second. We could not resolve this discrepancy, so a direct comparison with his results was not possible.

Table 4.1: Distributions of variables X₁, C₁ and C₂

Case X₁ C₁ C₂

1 exp(1) exp(4) exp(8)

2 log(exp(1)) exp(2) exp(4)

3 lognormal(1, 1) lognormal(1.5, 1) lognormal(2, 1)

4.2.1 Bias of the estimators

Table 4.2 gives the bias results for the cases outlined in Table 4.1 for various values of µ and σ. There were 8 regression points, evenly spaced from 0.1 to 0.8. The simulations were run 1000 times with the sample size set to 50 for both groups. The continuous version of the Kaplan-Meier estimator (2.16) was used to estimate F₁(t) and F₂(t) and hence F₁⁻¹(u) and F₂⁻¹(u). The bias is calculated simply as the arithmetic mean of the estimates over the simulations ($\bar{\hat\mu}$ and $\bar{\hat\sigma}$ for GLS, along with $\bar{\tilde\mu}$ and $\bar{\tilde\sigma}$ for OLS) minus the true value, that is,

$$\hat\mu_{\text{bias}} = \bar{\hat\mu} - \mu \quad \text{and} \quad \hat\sigma_{\text{bias}} = \bar{\hat\sigma} - \sigma$$

for GLS and

$$\tilde\mu_{\text{bias}} = \bar{\tilde\mu} - \mu \quad \text{and} \quad \tilde\sigma_{\text{bias}} = \bar{\tilde\sigma} - \sigma$$

for OLS. There is hardly any bias for any of the cases. The OLS method has less of a bias for µ than the GLS method, though $\hat\mu_{\text{bias}}$ is still negligible.

4.2.2 Variance of the estimators

Table 4.3 gives the variance of the estimators for case 1 for the same simulation setup as used in the simulations to test the bias of the estimators. The variance was used rather than the mean squared error due to the negligible bias. The sample sizes were equal (n₁ = n₂ = n). The finite sample efficiencies are given by

$$e(\tilde\mu : \hat\mu) = \frac{\operatorname{var}(\hat\mu)}{\operatorname{var}(\tilde\mu)} \quad \text{and} \quad e(\tilde\sigma : \hat\sigma) = \frac{\operatorname{var}(\hat\sigma)}{\operatorname{var}(\tilde\sigma)}.$$
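Given the estimates from the repetitions of a simulation, the bias, variance and finite sample efficiency can be computed along the following lines. The helper `summarize` and the use of the unbiased sample variance (`ddof=1`) are assumptions for illustration; the text does not state which variance divisor was used, and the input numbers are made up.

```python
import numpy as np

def summarize(gls_estimates, ols_estimates, true_value):
    """Bias, variance and efficiency var(GLS) / var(OLS) over simulation repetitions."""
    gls_estimates = np.asarray(gls_estimates, dtype=float)
    ols_estimates = np.asarray(ols_estimates, dtype=float)
    var_gls = gls_estimates.var(ddof=1)      # assumed divisor: repetitions - 1
    var_ols = ols_estimates.var(ddof=1)
    return {
        "bias_gls": gls_estimates.mean() - true_value,
        "bias_ols": ols_estimates.mean() - true_value,
        "var_gls": var_gls,
        "var_ols": var_ols,
        "efficiency": var_gls / var_ols,
    }

# Toy input: two repetitions per method, true parameter value 1.
result = summarize([0.9, 1.1], [0.8, 1.2], true_value=1.0)
print(result)
```

An efficiency below one means the GLS estimator has the smaller variance, and a value above one favours OLS.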

Table 4.2: Bias of the estimators, for the cases outlined in Table 4.1

Case µ σ φ₁ φ₂ µ̂_bias µ̃_bias σ̂_bias σ̃_bias

1 0.5 0.5 0.80 0.88 0.01 0.00 0.02 0.02

1 1 1 0.80 0.78 0.01 0.01 0.08 0.09

1 2 1.5 0.80 0.66 0.02 0.00 0.20 0.23

2 0.5 0.5 0.91 0.91 -0.03 0.01 0.00 0.00

2 1 1 0.91 0.84 -0.06 -0.02 0.00 0.00

2 2 1.5 0.91 0.72 -0.09 0.03 0.00 -0.01

3 0.5 0.5 0.85 0.95 0.00 0.00 0.02 0.02

3 1 1 0.85 0.86 0.02 0.00 0.02 0.03

3 2 1.5 0.85 0.72 0.02 0.00 0.03 0.04

The results show that the GLS method outperforms the OLS method, by a large margin for µ (efficiencies of about 0.37 to 0.40 in Table 4.3) and more modestly for σ (0.86 to 0.95). The variance is low for the estimation of both µ and σ under both the OLS and GLS methods. The only notably higher variance was in the estimation of σ when it is greater than 1 and the sample size is small.

Table 4.3: Variance of estimators for Case 1 from Table 4.1

µ σ n φ₁ φ₂ var(µ̂) var(µ̃) var(σ̂) var(σ̃) e(µ̃ : µ̂) e(σ̃ : σ̂)

0.5 0.5 50 0.80 0.89 0.00 0.00 0.02 0.02 0.38 0.86

1 1 50 0.80 0.78 0.01 0.02 0.08 0.09 0.38 0.91

2 1.5 50 0.80 0.66 0.02 0.04 0.21 0.23 0.39 0.90

0.5 0.5 100 0.80 0.88 0.00 0.00 0.01 0.01 0.40 0.95

1 1 100 0.80 0.78 0.00 0.01 0.04 0.05 0.39 0.87

2 1.5 100 0.80 0.66 0.01 0.02 0.11 0.12 0.37 0.87

Table 4.4 provides the variance of the estimators for case 2. The variances are larger than in case 1 but are still fairly low. There is little difference between OLS and GLS, with OLS giving a better estimation of µ and GLS giving a better estimation of σ.

Table 4.4: Variance of estimators for Case 2 from Table 4.1

µ σ n φ₁ φ₂ var(µ̂) var(µ̃) var(σ̂) var(σ̃) e(µ̃ : µ̂) e(σ̃ : σ̂)

0.5 0.5 50 0.91 0.91 0.02 0.02 0.01 0.01 1.05 0.96

1 1 50 0.91 0.84 0.06 0.06 0.05 0.06 1.06 0.95

2 1.5 50 0.91 0.72 0.15 0.13 0.13 0.13 1.10 1.03

0.5 0.5 100 0.91 0.92 0.01 0.01 0.01 0.01 1.00 0.94

1 1 100 0.91 0.84 0.03 0.03 0.03 0.03 1.03 0.93

2 1.5 100 0.91 0.72 0.07 0.06 0.06 0.06 1.05 0.94

The variance results for case 3 are given in Table 4.5. They are similar to case 1 in that the GLS gives a much more accurate estimation than OLS. The variances were all low except for values of σ greater than 1 with small sample sizes.

Table 4.5: Variance of estimators for Case 3 from Table 4.1

µ σ n φ₁ φ₂ var(µ̂) var(µ̃) var(σ̂) var(σ̃) e(µ̃ : µ̂) e(σ̃ : σ̂)

0.5 0.5 50 0.85 0.95 0.01 0.01 0.03 0.03 0.45 0.83

1 1 50 0.85 0.86 0.02 0.04 0.09 0.11 0.42 0.77

2 1.5 50 0.85 0.72 0.04 0.10 0.22 0.28 0.39 0.79

0.5 0.5 100 0.85 0.95 0.00 0.01 0.01 0.01 0.40 0.75

1 1 100 0.85 0.86 0.01 0.02 0.05 0.06 0.40 0.77

2 1.5 100 0.85 0.72 0.02 0.05 0.11 0.13 0.37 0.79

There was a problem with the estimation due to the method of choosing the regression points, u. The Kaplan-Meier estimator is only defined on a limited interval, from 0 up to the last uncensored observation. Beyond the last uncensored observation there is no further information that the product-limit estimator can use, since it has its jumps at the uncensored values. Thus, when there is heavy censoring, the value of the estimated CDF at the last uncensored observation may be lower than the highest regression point. Figure 4.1 illustrates this point. In this case F₁(x) is a standard exponential distribution and C₁ is uniformly distributed on [0, 2.2], chosen so that approximately 40 percent of the data are censored, with n = 100. Here F̂₁(x) only reaches 0.89. If there were a regression point at 0.9, for instance, the corresponding quantile F̂₁⁻¹(0.9) would be undefined. To counteract this, it is suggested that the largest regression point be less than or equal to the value of the estimated CDF at the last uncensored observation.
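The restriction on the regression points can be sketched in code. The example below mimics the setting of Figure 4.1 (standard exponential lifetimes, uniform censoring on [0, 2.2], n = 100) and uses the step-function form of (4.4), assuming no ties; the seed and variable names are arbitrary, so the attained level will not reproduce the 0.89 of the figure exactly.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 100
x = rng.exponential(scale=1.0, size=n)       # lifetimes, standard exponential
c = rng.uniform(0.0, 2.2, size=n)            # censoring times, uniform on [0, 2.2]
obs = np.minimum(x, c)
delta = (x <= c).astype(int)                 # 1 = uncensored

# Kaplan-Meier CDF at the ordered observations, as in (4.4).
order = np.argsort(obs)
d = delta[order]
i = np.arange(1, n + 1)
km = 1.0 - np.cumprod(((n - i) / (n - i + 1.0)) ** d)

u_max = km[d == 1].max()                     # level reached at the last uncensored point
u = np.linspace(0.1, 0.8, 8)                 # candidate regression points
usable = u[u <= u_max]                       # keep only points where a quantile exists

print("proportion censored:", np.mean(delta == 0))
print("highest attainable level:", u_max)
print("usable regression points:", usable)
```

Because the estimator is flat after the last uncensored observation, `u_max` is also the largest value the estimated CDF takes, and any regression point above it has no corresponding quantile.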

Figure 4.1: The estimated CDF for a standard exponential distribution with 40% censoring. Horizontal axis: x (0 to 2); vertical axis: F₁(x) (0 to 1).

Chapter 5 Nonparametric confidence bands for the quantile comparison function with censored data

This chapter is concerned with a fully nonparametric model in which two independent data sets, $X_1 = \{X_{1,1}, \ldots, X_{1,n_1}\}$ (> 0) and $X_2 = \{X_{2,1}, \ldots, X_{2,n_2}\}$ (> 0), come from two unspecified survival functions, $S_1$ and $S_2$ respectively. Our objective is to construct a confidence band for the quantile comparison function, $q = S_2^{-1}(S_1)$, and our only assumptions are that both $S_1$ and $S_2$ are continuous and strictly decreasing.

Doksum and Sievers [4] considered this type of problem for complete data. The additional factor that we wish to incorporate into the analysis is to allow incomplete data. Lu, Wells and Tiwari [15] used a bootstrap method that allowed for censored data. This section will look at a method that does not require use of a bootstrap.

The data are censored and we do not observe $X_1$ and $X_2$ but rather $\tilde X_1 = \{X_{1,i} \wedge C_{1,i},\ i = 1, \ldots, n_1\}$ and $\tilde X_2 = \{X_{2,i} \wedge C_{2,i},\ i = 1, \ldots, n_2\}$, where the $C_{1,i}$ and $C_{2,i}$ are independent observations from continuous distributions with survival functions $K_1$ and $K_2$ respectively. The observed data therefore come from distributions with survival functions, for $t \ge 0$,

H(t) = P [ ˜X_1,i > t] = S₁(t)K₁(t)

and

J (t) = P [ ˜X_2,i > t] = S₂(t)K₂(t).
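The product form of these survival functions follows from the independence of the lifetimes and the censoring times, and it can be checked numerically. The exponential choices for S₁ and K₁, the evaluation point t and the sample size below are illustrative assumptions, not part of the model.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 200_000
x = rng.exponential(scale=1.0, size=n)       # lifetimes with S1(t) = exp(-t)
c = rng.exponential(scale=4.0, size=n)       # censoring times with K1(t) = exp(-t / 4)
obs = np.minimum(x, c)                       # observed values

t = 0.7
empirical = np.mean(obs > t)                 # Monte Carlo estimate of H(t)
product = np.exp(-t) * np.exp(-t / 4.0)      # S1(t) * K1(t)
print(empirical, product)
```

The observed value exceeds t exactly when both the lifetime and the censoring time exceed t, which is why the two printed numbers agree up to simulation error.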

5.1 Asymptotic representation of the quantile comparison function

First, suppose there is no censoring present. Set

n_⋆ =

From Potgieter [16], Section 2.3, we see that

$$\sqrt{n_\star}\,\bigl(\hat q(t) - q(t)\bigr) = \frac{\sqrt{n_\star}\,\bigl(\hat S_1(t) - \hat S_2(q(t))\bigr)}{f_2(q(t))} + o_p(1). \tag{5.2}$$

(Since Potgieter [16] is possibly not freely available, we reproduce his derivation in Appendix B.) In principle, therefore, confidence bands for q could be obtained using the asymptotic distribution of the first term on the right-hand side of (5.2). However, this would require the estimation of f₂(q(t)), which is extremely variable where f₂(q(t)) is close to zero. The situation here is analogous to the estimation of a single quantile discussed in Section 21.8, page 309, of van der Vaart [18]. As pointed out there, it is simpler to base the estimation on the numerator in (5.2) alone.

Our conﬁdence band for q will be I := where C is chosen so that

P (q(t)∈ I|S1, S₂) = 1− α.

Notice that

Sˆ2(q(t)) =1− 1

where the U_1,i and U_2,i are uniformly distributed between zero and one. Therefore the distributions of ˆS₂(q(t)) and ˆS₁(t) are independent of F₂. Thus, we may assume

Thus far we have not taken account of any censoring. If censoring is indeed present then it seems natural to take ˆS1 and ˆS2 in (5.3) and (5.4) to be the respective Kaplan-Meier estimators.

We showed in Section 2.3.1 - see (2.20) and (2.21) - for i = 1, 2 that ¯Z_i(u), 0 ≤ u≤ 1 is a path-continuous Gaussian process, with zero mean and covariance function

cov( ¯Z_i(u₁), ¯Z_i(u₂)) = u₁u₂

∫ u1∧u2

w²× Ki(S_i⁻¹(w)). Notice that from (3.7)

√n_⋆ =

converges in distribution to a zero mean Gaussian process B which has covariance function [...] approximate the distribution of the latter random variable we use the following Lemma from Lombard [14].

Lemma 5.1. Let ˆB(u), 0 < u < 1, be a path-continuous Gaussian process with covariance function c(u₁, u₂). Suppose there exist continuous functions ν(u) > 0 and θ(u) such that

c(u, u + ϵ) = ν(u)− θ(u)ϵ + o(ϵ)

as ϵ→ 0. Set

Notice that in (5.8) |θ(u)| is incorrectly given as θ(u) in Lombard [14].

Now set

so that Lemma 5.1 is applicable. Since the survival functions K_i and S_i appearing in the expression for θ(u) are unknown, we replace them by consistent estimates made from the data. Then, from the convergence in distribution of B_n_⋆ to ˆB together with (5.8), we have the approximation

P (√ for ”large” b and for ”large” n₁ and n₂. The righthand side of (5.9) provides an approximate p-value for a test of the hypothesis S₁ ≡ S2. Thus, in order to ﬁnd an approximate 100(1− α)% simultaneous conﬁdence band for q, we must solve the equation computation of the integral by numerical integration. These issues are dealt with in the next section.
