Abstract
Prevalent cohort studies in medical research often give rise to length-biased survival data that require special treatment. The recently proposed varying-coefficient partially linear transformation (VCPLT) model gives a more dynamic picture of the effects of covariates on survival times than the well-known partially linear transformation (PLT) model, because it allows flexible interactions between the covariates. However, no existing analysis of the VCPLT model has considered length-biased sampling. In this paper, we consider the VCPLT model when the data are length-biased and right-censored, thereby extending the reach of this flexible and powerful tool. We develop a martingale estimating function-based approach to the estimation of this model, provide theoretical underpinnings, evaluate finite-sample performance via simulations, and showcase its practical appeal via an empirical application using data from two HIV vaccine clinical trials conducted by the U.S. National Institute of Allergy and Infectious Diseases.
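To illustrate why length-biased data require special treatment, the following minimal Python sketch (not part of the analysis in this paper) draws a hypothetical population of Exp(1) survival times and then selects subjects with probability proportional to their survival time, which is the defining feature of length-biased sampling. The function name, the exponential population and the sample sizes are assumptions made purely for illustration.

```python
import numpy as np

def length_biased_sample(population, size, rng):
    """Draw `size` values from `population` with probability proportional to the value."""
    population = np.asarray(population, dtype=float)
    weights = population / population.sum()  # selection probability proportional to T
    return rng.choice(population, size=size, replace=True, p=weights)

rng = np.random.default_rng(0)
# Hypothetical population of survival times, T ~ Exp(1), so E(T) = 1 and E(T^2) = 2.
population = rng.exponential(scale=1.0, size=200_000)
biased = length_biased_sample(population, size=50_000, rng=rng)

population_mean = population.mean()  # close to E(T) = 1
biased_mean = biased.mean()          # close to E(T^2) / E(T) = 2
print(f"population mean: {population_mean:.3f}, length-biased mean: {biased_mean:.3f}")
```

In this toy setting the length-biased sample mean is roughly twice the population mean, so a naive analysis that ignores the sampling scheme would overstate typical survival times.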
Funding source: Emory University
Award Identifier / Grant number: Unassigned
Acknowledgements
We thank the editor Prof. Michael Rosenblum and the referees for their helpful comments. All remaining errors are ours.
- Author contribution: All the authors have accepted responsibility for the entire content of this submitted manuscript and approved submission.
- Research funding: Part of this work was carried out when Wei Zhao was visiting Emory University. Wan’s work was supported by the Hong Kong Research Grants Council (Grant No. 11500419) and the National Natural Science Foundation of China (No. 71973116). Zhou’s work was supported by the Key Program of the National Natural Science Foundation of China (Grant No. 71931004) and the National Key R&D Program of China (Grant Nos. 2021YFA1000100 and 2021YFA1000101).
- Conflict of interest statement: The authors declare no conflicts of interest regarding this article.
Let ‖⋅‖ denote the
Our proof of Theorem 1 requires the following lemma:
Lemma 1
Assume that Conditions (C1)–(C6) hold, and h → 0 and nh → ∞ as n → ∞. Also, assume that the matrix A_z (defined in the proof) is finite and non-degenerate for any z. Then the one-step estimators
Proof of Lemma 1
Let
where →a.s. denotes almost sure convergence. Let
Then
We divide our proof into three steps. First, by the law of large numbers and Lemma 1 in Cai et al. [25], we have
where g(z) is the density function of Z. Thus, B_n contains 0 with probability approaching 1.
Second, let
where k_2 = ∫ x² k(x) dx. By techniques similar to those used above,
in probability. Using methods similar to those of Lu and Zhang [17] and the uniform law of large numbers, we obtain
as n → ∞, ϵ → ∞ and ϵ_z → ∞.
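As a side illustration of the kernel constant k_2 = ∫ x² k(x) dx appearing in the second step, the sketch below evaluates it numerically for two common kernels. The choice of the Epanechnikov and Gaussian kernels, the function names and the integration limits are assumptions for illustration only; the kernel actually used is governed by the conditions of the paper.

```python
import numpy as np

def kernel_second_moment(kernel, lower, upper, n_nodes=64):
    """Approximate the integral of x^2 * kernel(x) over [lower, upper]
    by Gauss-Legendre quadrature."""
    nodes, weights = np.polynomial.legendre.leggauss(n_nodes)
    half = 0.5 * (upper - lower)
    x = half * nodes + 0.5 * (upper + lower)  # map nodes from [-1, 1] to [lower, upper]
    return float(half * np.sum(weights * x**2 * kernel(x)))

def epanechnikov(x):
    # Compactly supported kernel on [-1, 1].
    return 0.75 * (1.0 - x**2) * (np.abs(x) <= 1.0)

def gaussian(x):
    # Standard normal density; its tails are truncated by the integration limits.
    return np.exp(-0.5 * x**2) / np.sqrt(2.0 * np.pi)

k2_epanechnikov = kernel_second_moment(epanechnikov, -1.0, 1.0)
k2_gaussian = kernel_second_moment(gaussian, -10.0, 10.0, n_nodes=128)
print(k2_epanechnikov, k2_gaussian)
```

For the Epanechnikov kernel the exact value is 1/5, and for the standard Gaussian kernel it is 1, so the numerical values should agree with these up to quadrature error.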
Third, as A_z is finite and non-degenerate by assumption, the map
Proof of Theorem 1
We need to prove the local consistency of the estimators
Differentiating (11) with respect to t, we have
As (12) is a Cauchy problem, it has a unique solution. Moreover, it is readily seen from Eq. (4) that H_0, the true transformation function, satisfies (12). Hence, by Helly’s Lemma,
Next, we prove the asymptotic normality of
If (13) holds, then Theorem 1 follows from the martingale central limit theorem. In order to prove (13), we first give the representation of
From Eqs. (4) and (5), we have
By some tedious calculations, results from empirical process theory for Z-estimators, and the definition of
Then it follows, for t ∈ (0, τ], that
Note that for any t ∈ (0, τ],
for the parameter β. Differentiating (16) with respect to β on both sides yields
Following arguments similar to those of Chen et al. [19] and Lu and Zhang [17], it can be shown that
Next we give the representation of
Denote
Taking the derivative of U_1(β, f) with respect to β, setting β = β_0 and f = f_0, and using the law of large numbers and (17), we obtain
Let
Then we have
where
where
However, as
Taking the derivative with respect to β on both sides of (20), applying the law of large numbers and using arguments similar to those in Step 2, we obtain
Multiplying
Substituting (22) into E_1(z) leads to
where
where
Moreover,
Thus, if we define
and
then we have
Hence, combining (19), (23)–(25), we get
which yields
and
By the definition of
By the Taylor-series expansion and derivations analogous to Step 1,
Let
By arguments similar to those used in Steps 2 and 3, we have
However, because
we obtain
Substituting (26) into (28) yields
Define
Then
Let
On the other hand, substituting (26) into (29), we have
Let
Then by the law of large numbers, we obtain
By the assumption of Theorem 1, and from (31) and (32),
where
On the other hand, by the Taylor-series expansion and the definition of
and
Substituting (34) and (35) into (33) leads to (13). This completes the proof of Theorem 1.
To prove Theorem 2, we need the following lemma:
Lemma 2
Assume that Conditions (C1)–(C8) hold. Then
where
Remark 5
By using integration by parts, we can write (36) as a Fredholm integral equation of the second kind with the kernel
In order for (36) to yield a unique solution, we assume that
By arguments analogous to those of Lu and Zhang [17], we can construct the following solution to (36):
where b(t, s) is the unique solution to
Thus, given condition (37),
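As a numerical aside to Remark 5, a Fredholm integral equation of the second kind, f(t) = g(t) + λ ∫ K(t, s) f(s) ds, can be solved approximately by replacing the integral with a quadrature rule (the Nyström approach). The sketch below is a generic illustration under stated assumptions, not the procedure used in this paper: the kernel and forcing function of (36) are not reproduced here, so the kernel K(t, s) = ts, the function g(t) = t, the interval [0, 1] and all function names are hypothetical.

```python
import numpy as np

def solve_fredholm_second_kind(kernel, g, lower, upper, lam=1.0, n_nodes=32):
    """Nystrom solution of f(t) = g(t) + lam * int_lower^upper kernel(t, s) f(s) ds."""
    nodes, weights = np.polynomial.legendre.leggauss(n_nodes)
    half = 0.5 * (upper - lower)
    s = half * nodes + 0.5 * (upper + lower)  # quadrature nodes on [lower, upper]
    w = half * weights                        # rescaled quadrature weights

    # Discretised system: (I - lam * K W) f = g at the quadrature nodes.
    K = kernel(s[:, None], s[None, :])
    system = np.eye(n_nodes) - lam * K * w[None, :]
    f_nodes = np.linalg.solve(system, g(s))

    def f(t):
        # Nystrom interpolation formula extends the solution to arbitrary t.
        t = np.atleast_1d(np.asarray(t, dtype=float))
        return g(t) + lam * (kernel(t[:, None], s[None, :]) * w[None, :]) @ f_nodes

    return f

# Hypothetical test problem with a known solution:
# f(t) = t + int_0^1 t*s*f(s) ds has the exact solution f(t) = 1.5 * t.
f_hat = solve_fredholm_second_kind(lambda t, s: t * s, lambda t: t, 0.0, 1.0)
values = f_hat([0.0, 0.4, 1.0])
print(values)
```

The linear system is uniquely solvable only when I − λKW is non-singular, which is the discrete counterpart of a uniqueness condition such as (37).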
Proof of Lemma 2
From (30), we have
Let
By (13) and arguments similar to (25), we have
where
This completes the proof.
Proof of Theorem 2
By the Taylor-series expansion, Lemma 2 and (13), we have
where
This yields
which can be shown to converge weakly to a mean-zero Gaussian process by the functional central limit theorem. This completes the proof.
Proof of Theorem 3
By Theorems 1 and 2, and applying the Taylor-series expansion, we have
However,
where
where
Substituting (4) into
By the martingale central limit theorem, we have
where
Let
Expanding
where
Substituting these two equations into (44), we get
Plugging (45) into the definition of I_2, and by nonparametric techniques, we obtain
where
The proof of Theorem 3 is completed by combining the results of (42), (43) and (46) and applying Slutsky's theorem.
References
1. Turnbull, B. The empirical distribution function with arbitrarily grouped, censored and truncated data. J Roy Stat Soc 1976;38:290–5. https://doi.org/10.1111/j.2517-6161.1976.tb01597.x.
2. Lagakos, S, Barraj, L, De Gruttola, V. Nonparametric analysis of truncated survival data with applications to AIDS. Biometrika 1988;75:515–23. https://doi.org/10.1093/biomet/75.3.515.
3. Wang, M. Nonparametric estimation from cross-sectional survival data. J Am Stat Assoc 1991;86:343–54. https://doi.org/10.1080/01621459.1991.10475011.
4. Asgharian, M, M’Lan, C, Wolfson, D. Length-biased sampling with right censoring: an unconditional approach. J Am Stat Assoc 2002;97:207–9. https://doi.org/10.1198/016214502753479347.
5. Asgharian, M, Wolfson, D. Asymptotic behaviour of the unconditional NPMLE of the length-biased survival function from right censored prevalent cohort data. Ann Stat 2005;33:2109–31. https://doi.org/10.1214/009053605000000372.
6. Gill, R, Vardi, Y, Wellner, J. Large sample theory of empirical distributions in biased sampling models. Ann Stat 1988;16:1069–112. https://doi.org/10.1214/aos/1176350948.
7. Luo, X, Tsai, W. Nonparametric estimation for right-censored length-biased data: a pseudo-partial likelihood approach. Biometrika 2009;96:873–86. https://doi.org/10.1093/biomet/asp064.
8. Vardi, Y. Nonparametric estimation in the presence of length bias. Ann Stat 1982;10:616–20. https://doi.org/10.1214/aos/1176345802.
9. Vardi, Y. Empirical distribution in selection bias models. Ann Stat 1985;13:178–203. https://doi.org/10.1214/aos/1176346585.
10. Ning, J, Qin, J, Shen, Y. Semiparametric accelerated failure time model for length-biased data with application to dementia study. Stat Sin 2014;24:313–33. https://doi.org/10.5705/ss.2011.197.
11. Tsai, W. Pseudo-partial likelihood for proportional hazards models with biased-sampling data. Biometrika 2009;96:601–15. https://doi.org/10.1093/biomet/asp026.
12. Wang, M. Hazards regression analysis for length-biased data. Biometrika 1996;83:343–54. https://doi.org/10.1093/biomet/83.2.343.
13. Shen, Y, Ning, J, Qin, J. Analyzing length-biased data with semiparametric transformation and accelerated failure time models. J Am Stat Assoc 2009;104:1192–202. https://doi.org/10.1198/jasa.2009.tm08614.
14. Cheng, Y, Huang, C. Combined estimating equation approaches for semiparametric transformation models with length-biased survival data. Biometrics 2014;70:608–18. https://doi.org/10.1111/biom.12170.
15. Cheng, S, Wei, L, Ying, Z. Analysis of transformation models with censored data. Biometrika 1995;82:835–45. https://doi.org/10.1093/biomet/82.4.835.
16. Wei, W, Wan, ATK, Zhou, Y. Partially linear transformation model for length-biased and right-censored data. J Nonparametric Stat 2018;30:332–67. https://doi.org/10.1080/10485252.2018.1424335.
17. Lu, W, Zhang, H. On estimation of partially linear transformation models. J Am Stat Assoc 2010;105:683–91. https://doi.org/10.1198/jasa.2010.tm09302.
18. Qiu, Z, Zhou, Y. Partially linear transformation models with varying coefficients for multivariate failure time data. J Multivariate Anal 2015;142:144–66. https://doi.org/10.1016/j.jmva.2015.08.008.
19. Chen, K, Jin, Z, Ying, Z. Semiparametric analysis of transformation models with censored data. Biometrika 2002;89:659–68. https://doi.org/10.1093/biomet/89.3.659.
20. Ning, J, Qin, J, Shen, Y. Buckley-James-type estimator with right-censored and length-biased data. Biometrics 2011;67:1369–78. https://doi.org/10.1111/j.1541-0420.2011.01568.x.
21. Wang, H, Wang, L. Quantile regression analysis of length-biased survival data. Statistics 2014;3:31–47. https://doi.org/10.1002/sta4.42.
22. Chen, X, Wan, ATK, Zhou, Y. A quantile varying-coefficient regression approach to length-biased data modeling. Electronic Journal of Statistics 2014;8:2514–40. https://doi.org/10.1214/14-ejs959.
23. Huang, C, Qin, J. Composite partial likelihood estimation under length-biased sampling, with application to a prevalent cohort study of dementia. J Am Stat Assoc 2012;107:946–57. https://doi.org/10.1080/01621459.2012.682544.
24. Carroll, R, Fan, J, Gijbels, I, Wand, M. Generalized partially linear single-index models. J Am Stat Assoc 1997;92:477–89. https://doi.org/10.1080/01621459.1997.10474001.
25. Cai, J, Fan, J, Jiang, J, Zhou, H. Partially linear hazard regression for multivariate survival data. J Am Stat Assoc 2007;102:538–51. https://doi.org/10.1198/016214506000001374.
26. Press, W, Flannery, B, Teukolsky, S, Vetterling, W. Fredholm equations of the second kind. In: Numerical Recipes in FORTRAN: The Art of Scientific Computing, 2nd ed. Cambridge, New York; 1992. 782–5 pp. chap. 18.
27. Gross, S, Lai, T. Bootstrap methods for truncated and censored data. Stat Sin 1996;6:509–30.
28. Dabrowska, D, Doksum, K. Estimation and testing in the two-sample generalized odds-rate model. J Am Stat Assoc 1988;83:744–9. https://doi.org/10.1080/01621459.1988.10478657.
29. Kim, J, Lu, W, Sit, T, Ying, Z. A unified approach to semiparametric transformation models under general biased sampling schemes. J Am Stat Assoc 2013;108:217–27. https://doi.org/10.1080/01621459.2012.746073.
© 2022 Walter de Gruyter GmbH, Berlin/Boston