Generalized mean residual life models for case-cohort and nested case-control studies

Jin, Peng; Zeleniuch-Jacquotte, Anne; Liu, Mengling

doi:10.1007/s10985-020-09499-w

Published: 11 June 2020

Volume 26, pages 789–819, (2020)

Lifetime Data Analysis

Abstract

Mean residual life (MRL) is the remaining life expectancy of a subject who has survived to a certain time point, and it can be used as an alternative to the hazard function for characterizing the distribution of a time-to-event variable. Inference and application of MRL models have primarily focused on full-cohort studies. In practice, case-cohort and nested case-control designs are commonly used within large cohorts that have long follow-up and study rare diseases, particularly when studying costly molecular biomarkers. They enable the same prospective inference as the full-cohort design, with significant cost savings. In this paper, we study the modeling and inference of a family of generalized MRL models under case-cohort and nested case-control designs. Building on the idea of inverse selection probability weighting, we construct weighted estimating equations to estimate the regression parameters and the baseline MRL function. Asymptotic properties of the proposed estimators are established, and finite-sample performance is evaluated by extensive numerical simulations. An application to the New York University Women’s Health Study is presented to illustrate the proposed models and to demonstrate a model diagnostic method that guides practical implementation.

References

Cai T, Zheng Y (2013) Resampling procedures for making inference under nested case-control studies. J Am Stat Assoc 108(504):1532–1544
Chen YQ (2007) Additive expectancy regression. J Am Stat Assoc 102(477):153–166
Chen YQ, Cheng S (2005) Semiparametric regression analysis of mean residual life with censored survival data. Biometrika 92(1):19–29
Chen YQ, Cheng S (2006) Linear life expectancy regression with censored data. Biometrika 93(2):303–313
Chen K, Lo SH (1999) Case-cohort and case-control analysis with Cox’s model. Biometrika 86(4):755–764
Chen YQ, Jewell NP, Lei X, Cheng SC (2005) Semiparametric estimation of proportional mean residual life model in presence of censoring. Biometrics 61(1):170–178
Clendenen TV, Ge W, Koenig KL, Axelsson T, Liu M, Afanasyeva Y, Andersson A, Arslan AA, Chen Y, Hallmans G, Lenner P, Kirchhoff T, Lundin E, Shore RE, Sund M, Zeleniuch-Jacquotte A (2015) Genetic polymorphisms in vitamin D metabolism and signaling genes and risk of breast cancer: a nested case-control study. PLoS ONE 10(10):e0140478
Efron B (1979) Bootstrap methods: another look at the Jackknife. Ann Stat 7(1):1–26
Fleming TR, Harrington DP (1991) Counting processes and survival analysis. John Wiley & Sons, Inc., New York
Gail MH, Brinton LA, Byar DP, Corle DK, Green SB, Schairer C, Mulvihill JJ (1989) Projecting individualized probabilities of developing breast cancer for white females who are being examined annually. J Natl Cancer Inst 81(24):1879–1886
Ge W, Clendenen TV, Afanasyeva Y, Koenig KL, Agnoli C, Brinton LA, Dorgan JF, Eliassen AH, Falk RT, Hallmans G, Hankinson SE, Hoffman-Bolton J, Key TJ, Krogh V, Nichols HB, Sandler DP, Schoemaker MJ, Sluss PM, Sund M, Swerdlow AJ, Visvanathan K, Liu M, Zeleniuch-Jacquotte A (2018) Circulating anti-Müllerian hormone and breast cancer risk: a study in ten prospective cohorts. Int J Cancer 142(11):2215–2226
James IR (1986) On estimating equations with censored data. Biometrika 73:35–42
Kupper LL, McMichael AJ, Spirtas R (1975) A hybrid epidemiologic study design useful in estimating relative risk. J Am Stat Assoc 70(351a):524–528. https://doi.org/10.1080/01621459.1975.10482466
Lin DY, Wei LJ, Ying Z (1993) Checking the Cox model with cumulative sums of Martingale-based residuals. Biometrika 80(3):557–572
Liu M, Lu W, Shore RE, Zeleniuch-Jacquotte A (2010a) Cox regression model with time-varying coefficients in nested case-control studies. Biostatistics 11(4):693–706
Liu M, Lu W, Tseng Ch (2010b) Cox regression in nested case-control studies with auxiliary covariates. Biometrics 66(2):374–381
Lu W, Liu M (2012) On estimation of linear transformation models with nested case-control sampling. Lifetime Data Anal 18(1):80–93
Lu W, Tsiatis AA (2006) Semiparametric transformation models for the case-cohort study. Biometrika 93(1):207–214
Ma H, Shi J, Zhou Y (2017) Proportional mean residual life model with censored survival data under case-cohort design. arXiv:1708.01634 [math, stat]
Maguluri G, Zhang CH (1994) Estimation in the mean residual life regression model. J R Stat Soc Ser B Methodol 56(3):477–489
Oakes D, Dasu T (1990) A note on residual life. Biometrika 77(2):409–410
Prentice RL (1986) A case-cohort design for epidemiologic cohort studies and disease prevention trials. Biometrika 73(1):1–11
Reid N (1981) Influence functions for censored data. Ann Stat 9(1):78–92
Samuelsen S (1997) A pseudolikelihood approach to analysis of nested case-control studies. Biometrika 84(2):379–394
Scarmo S, Afanasyeva Y, Lenner P, Koenig KL, Horst RL, Clendenen TV, Arslan AA, Chen Y, Hallmans G, Lundin E, Rinaldi S, Toniolo P, Shore RE, Zeleniuch-Jacquotte A (2013) Circulating levels of 25-hydroxyvitamin D and risk of breast cancer: a nested case-control study. Breast Cancer Res 15(1):R15
Scheike TH, Juul A (2004) Maximum likelihood estimation for Cox’s regression model under nested case-control sampling. Biostatistics 5(2):193–206
Sun L, Zhang Z (2009) A class of transformed mean residual life models with censored survival data. J Am Stat Assoc 104(486):803–815
Sun L, Song X, Zhang Z (2012) Mean residual life models with time-dependent coefficients under right censoring. Biometrika 99(1):185–197
Thomas DC (1977) Addendum to methods of cohort analysis: appraisal by application to asbestos mining by Liddell, F. D. K., McDonald, J. C., and Thomas, D. C. J R Stat Soc Ser A Gen 140(4):483–485
Yang G, Zhou Y (2014) Semiparametric varying-coefficient study of mean residual life models. J Multivar Anal 128:226–238
Zhang LX (2000) A functional central limit theorem for asymptotically negatively dependent random fields. Acta Math Hungar 86(3):237–259

Acknowledgements

The authors would like to thank the Associate Editor and two referees for their valuable suggestions. The research was partially supported by NIH Grant R01CA178949.

Author information

Authors and Affiliations

Department of Population Health, New York University School of Medicine, New York, NY, 10016, USA
Peng Jin, Anne Zeleniuch-Jacquotte & Mengling Liu
Department of Environmental Health, New York University School of Medicine, New York, NY, 10016, USA
Anne Zeleniuch-Jacquotte & Mengling Liu

Corresponding author

Correspondence to Mengling Liu.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix

1.1 Simulation study

We conducted numerical simulations to compare the inverse probability of censoring weighting (IPCW) estimator and the QPS estimator in a full cohort of 1000 subjects, with censoring rates of approximately 10%, 30% and 80%. A total of 500 simulations were conducted for each of the proportional and additive MRL models. We report the bias, the standard deviation (SD) of the estimates, the average of the estimated standard errors (SE) and the coverage probability (CP) of the 95% Wald-type confidence intervals (see Table 6). The SEs were calculated using the standard bootstrap. The two estimators performed similarly when the censoring rate was low: the biases were small, the means of the estimated SEs were close to the empirical SDs of the parameter estimates, and the 95% Wald-type confidence intervals had proper coverage. However, when the censoring rate was 80%, the performance of the IPCW estimator deteriorated: its SEs were underestimated, which led to low coverage probabilities.
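The four summary measures reported above (bias, SD, SE and CP) can be computed from replicated estimates as in the following minimal sketch. The function name `summarize_simulation` and the toy inputs are hypothetical and are not taken from the paper; the replicates here are drawn from a normal distribution purely for illustration, not from an MRL model.

```python
import numpy as np

def summarize_simulation(estimates, std_errors, true_value, z=1.96):
    """Summarize replicated estimates of one scalar parameter.

    estimates  : point estimates, one per simulated data set
    std_errors : estimated standard errors, one per simulated data set
    true_value : the parameter value used to generate the data
    """
    estimates = np.asarray(estimates, dtype=float)
    std_errors = np.asarray(std_errors, dtype=float)
    lower = estimates - z * std_errors          # Wald-type interval limits
    upper = estimates + z * std_errors
    covered = (lower <= true_value) & (true_value <= upper)
    return {
        "bias": estimates.mean() - true_value,  # mean estimate minus truth
        "sd": estimates.std(ddof=1),            # empirical SD of the estimates
        "se": std_errors.mean(),                # average estimated SE
        "cp": covered.mean(),                   # empirical coverage probability
    }

# Toy illustration (assumed inputs): 500 replicates with a correct SE of 0.1.
rng = np.random.default_rng(0)
true_beta = 0.5
est = true_beta + rng.normal(scale=0.1, size=500)
se = np.full(500, 0.1)
summary = summarize_simulation(est, se, true_beta)
print(summary)
```

When the average SE falls well below the empirical SD, the Wald intervals are too narrow and CP drops below the nominal level, which is the pattern described for IPCW under 80% censoring.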

Table 6 Comparison between IPCW estimator and QPS estimator

1.2 Regularity conditions

Let $u_{\tilde{\varvec{Z}}}(t;\varvec{\beta }), u_{\breve{\varvec{Z}}}(t;\varvec{\beta })$ and $u_{\bar{\varvec{Z}}}(t;\varvec{\beta })$ be the limits of $\tilde{\varvec{Z}}(t;\varvec{\beta }), \breve{\varvec{Z}}(t;\varvec{\beta })$ and $\bar{\varvec{Z}}(t;\varvec{\beta })$, respectively. We assume the following regularity conditions:

(C1) $\sup \mathrm{supp}(F) \le \sup \mathrm{supp}(G)$, where $F(\cdot )$ and $G(\cdot )$ are the distribution functions of $T$ and $C$, respectively;
(C2) $\varvec{Z}$ is bounded;
(C3) $m_*(t)$ is continuously differentiable on $[0,\tau ]$;
(C4) $A=\int _0^\tau E[(\varvec{Z}-u_{\bar{\varvec{Z}}}(t;\varvec{\beta }_*)) (\varvec{Z}-u_{\breve{\varvec{Z}}}(t;\varvec{\beta }_*))' (\dot{g}\{m_*(t)+\varvec{\beta }_*'\varvec{Z}\}dN(t) -Y(t)d\dot{g}\{m_*(t)+\varvec{\beta }_*'\varvec{Z}\})]$ is nonsingular.

Proof of Theorem 1(i)

We first establish the consistency of the estimators $\hat{m}_0(t)$ and $\hat{\varvec{\beta }}$. Condition (C3) implies that $m_0(t)$ is of bounded variation on $[0,\tau ]$. Define $\mathcal {B}=\{\varvec{\beta }:||\varvec{\beta }-\varvec{\beta }_*||\le \epsilon \}$ for any $\epsilon >0$, and note that $E(w_i|\mathcal {F})=1$. By the strong law of large numbers and the fact that $E\{w_idM_i(t)\}=0$, for large $n$, $t\in [0,\tau ]$, $\varvec{\beta }\in \mathcal {B}$, and sufficiently large $\theta $,

$$\begin{aligned}&\frac{1}{n}\sum _{i=1}^nw_i \left[ dN_i(t)-Y_i(t)\frac{dg\{m_0(t)+\theta +\varvec{\beta }'\varvec{Z}_i\}+dt}{g\{m_0(t)+\theta +\varvec{\beta }'\varvec{Z}_i\}}\right] <0, \end{aligned} \tag{11}$$

$$\begin{aligned}&\frac{1}{n}\sum _{i=1}^nw_i \left[ dN_i(t)-Y_i(t)\frac{dg\{m_0(t)-\theta +\varvec{\beta }'\varvec{Z}_i\}+dt}{g\{m_0(t)-\theta +\varvec{\beta }'\varvec{Z}_i\}}\right] >0. \end{aligned} \tag{12}$$

By (11), (12), and the monotonicity and continuity of the function $g$, for any $t\in [0,\tau ]$ and $\varvec{\beta }\in \mathcal {B}$, there exists a unique $\hat{m}_0(t;\varvec{\beta })$ that satisfies

$$\begin{aligned} \frac{1}{n}\sum _{i=1}^nw_i\left[ dN_i(t)-Y_i(t)\frac{dg\{\hat{m}_0(t;\varvec{\beta })+\varvec{\beta }'\varvec{Z}_i\}+dt}{g\{\hat{m}_0(t;\varvec{\beta })+\varvec{\beta }'\varvec{Z}_i\}}\right] =0. \end{aligned} \tag{13}$$
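The existence-and-uniqueness step is a bracketing argument: a continuous, strictly monotone function that is positive at one end of an interval and negative at the other has exactly one root in between. The sketch below illustrates this with bisection on a toy weighted score. The function `toy_score`, the weights and the data are hypothetical stand-ins chosen so that a closed-form root exists; they are not the paper's estimating equation (13).

```python
import math

def bisect_root(h, lo, hi, tol=1e-10, max_iter=200):
    """Root of a continuous, strictly decreasing h with h(lo) > 0 > h(hi)."""
    if not (h(lo) > 0 > h(hi)):
        raise ValueError("root is not bracketed: need h(lo) > 0 > h(hi)")
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        if h(mid) > 0:
            lo = mid   # root lies to the right of mid
        else:
            hi = mid   # root lies to the left of (or at) mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)

# Hypothetical toy data: sampling weights, covariates, responses.
w = [1.0, 2.5, 1.0, 4.0]
z = [0.0, 1.0, 0.5, -1.0]
y = [1.2, 3.1, 0.7, 0.4]
beta = 0.3

def toy_score(m):
    """Weighted score, strictly decreasing in the intercept-like argument m."""
    return sum(wi * (yi - math.exp(m + beta * zi)) for wi, zi, yi in zip(w, z, y))

m_hat = bisect_root(toy_score, lo=-10.0, hi=10.0)

# For this toy score the root is available in closed form, which checks the search.
closed_form = math.log(
    sum(wi * yi for wi, yi in zip(w, y))
    / sum(wi * math.exp(beta * zi) for wi, zi in zip(w, z))
)
print(m_hat, closed_form)
```

The sign pattern mirrors (11) and (12): the toy score is positive for a sufficiently negative shift and negative for a sufficiently positive one.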

Note that (11) and (12) hold for any $\theta >0$ if and only if $\varvec{\beta }=\varvec{\beta }_*$. Hence $\hat{m}_0(t;\varvec{\beta })$ converges to $m_0(t;\varvec{\beta })$ uniformly in $t\in [0,\tau ]$ and in $\varvec{\beta }$ over a compact set that contains the true parameter $\varvec{\beta }_*$, and $m_0(t;\varvec{\beta }_*) = m_*(t)$. Thus, to prove the existence and uniqueness of $\hat{\varvec{\beta }}$ and $\hat{m}_0(t)$, it suffices to show that there exists a unique solution to $U(\varvec{\beta })=0$. Taking the derivative of (13) with respect to $\varvec{\beta }$, we have

$$\begin{aligned}&\frac{d\hat{m}_0(t)}{d\varvec{\beta }}\frac{\sum _{i=1}^nw_i[\dot{g}\{\hat{m}_0(t)+\varvec{\beta }'\varvec{Z}_i\}dN_i(t)-Y_i(t)d\dot{g}\{\hat{m}_0(t)+\varvec{\beta }'\varvec{Z}_i\}]}{\sum _{i=1}^nw_iY_i(t)\dot{g}\{\hat{m}_0(t)+\varvec{\beta }'\varvec{Z}_i\}} - d(\frac{d\hat{m}_0(t)}{d\varvec{\beta }}) \\&\quad = - \frac{\sum _{i=1}^nw_i\varvec{Z}_i[\dot{g}\{\hat{m}_0(t)+\varvec{\beta }'\varvec{Z}_i\}dN_i(t)-Y_i(t)d\dot{g}\{\hat{m}_0(t)+\varvec{\beta }'\varvec{Z}_i\}]}{\sum _{i=1}^nw_iY_i(t)\dot{g}\{\hat{m}_0(t)+\varvec{\beta }'\varvec{Z}_i\}}, \end{aligned}$$

which is a first-order linear ordinary differential equation in $d\hat{m}_0(t)/d\varvec{\beta }$. The solution is

$$\begin{aligned} \frac{d\hat{m}_0(t)}{d\varvec{\beta }}= \frac{d\hat{m}_0(t;\varvec{\beta })}{d\varvec{\beta }}=-\frac{1}{K(t;\varvec{\beta })}\int _t^\tau K(u;\varvec{\beta })Q(u;\varvec{\beta }) \equiv -\breve{\varvec{Z}}(t;\varvec{\beta }), \end{aligned}$$

where

$$\begin{aligned} K(t;\varvec{\beta })= & {} \exp \left\{ -\int _0^t\frac{\sum _{i=1}^nw_i[\dot{g} \{\hat{m}_0(s)+\varvec{\beta }'\varvec{Z}_i\}dN_i(s)-Y_i(s)d\dot{g} \{\hat{m}_0(s)+\varvec{\beta }'\varvec{Z}_i\}]}{\sum _{i=1}^nw_iY_i(s) \dot{g}\{\hat{m}_0(s)+\varvec{\beta }'\varvec{Z}_i\}}\right\} ,\\ Q(t;\varvec{\beta })= & {} \frac{\sum _{i=1}^nw_i\varvec{Z}_i[\dot{g}\{\hat{m}_0(t)+ \varvec{\beta }'\varvec{Z}_i\}dN_i(t)-Y_i(t)d\dot{g} \{\hat{m}_0(t)+\varvec{\beta }'\varvec{Z}_i\}]}{\sum _{i=1}^nw_iY_i(t)\dot{g} \{\hat{m}_0(t)+\varvec{\beta }'\varvec{Z}_i\}}. \end{aligned}$$
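The form of this solution can be recovered by a standard integrating-factor argument, sketched here heuristically. The shorthand $y(t)=d\hat{m}_0(t)/d\varvec{\beta }$ and $a(t;\varvec{\beta })$ for the ratio multiplying $d\hat{m}_0(t)/d\varvec{\beta }$ in the differential equation is introduced only for this sketch, so that $K(t;\varvec{\beta })=\exp \{-\int _0^t a(s;\varvec{\beta })\}$. The terminal condition $y(\tau )=0$ and the treatment of the differentials as if the integrators were continuous are assumptions that are not stated in this appendix.

$$\begin{aligned}&y(t)\,a(t;\varvec{\beta })-dy(t)=-Q(t;\varvec{\beta }) \quad \Longrightarrow \quad dy(t)=y(t)\,a(t;\varvec{\beta })+Q(t;\varvec{\beta }), \\&d\{K(t;\varvec{\beta })y(t)\}=K(t;\varvec{\beta })\{dy(t)-y(t)\,a(t;\varvec{\beta })\}=K(t;\varvec{\beta })Q(t;\varvec{\beta }), \\&K(\tau ;\varvec{\beta })y(\tau )-K(t;\varvec{\beta })y(t)=\int _t^\tau K(u;\varvec{\beta })Q(u;\varvec{\beta }), \\&y(t)=-\frac{1}{K(t;\varvec{\beta })}\int _t^\tau K(u;\varvec{\beta })Q(u;\varvec{\beta }) \quad \text {when } y(\tau )=0. \end{aligned}$$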

Let $\hat{A}(\varvec{\beta }_*) \doteq dU(\varvec{\beta })/d\varvec{\beta }|_{\varvec{\beta }=\varvec{\beta }_*}$. We have

$$\begin{aligned} \hat{A}(\varvec{\beta }_*)\doteq & {} \frac{1}{n}\sum _{i=1}^n\int _0^\tau w_i[\varvec{Z}_i-\varvec{\bar{Z}}(t;\varvec{\beta }_*)][\varvec{Z}_i-\varvec{\breve{Z}}(t;\varvec{\beta }_*)]'[\dot{g}\{m_*(t)+\varvec{\beta }_*'\varvec{Z}_i\}dN_i(t)\\&\quad -Y_i(t)d\dot{g}\{m_*(t)+\varvec{\beta }_*'\varvec{Z}_i\}]\\= & {} \frac{1}{n}\sum _{i=1}^n\int _0^\tau w_i[\varvec{Z}_i-u_{\bar{\varvec{Z}}}(t;\varvec{\beta }_*)][\varvec{Z}_i-u_{\breve{\varvec{Z}}}(t;\varvec{\beta }_*)]'[\dot{g}\{m_*(t)+\varvec{\beta }_*'\varvec{Z}_i\}dN_i(t)\\&\quad -Y_i(t)d\dot{g}\{m_*(t)+\varvec{\beta }_*'\varvec{Z}_i\}] +o_p(1)\\= & {} A+o_p(1) \end{aligned}$$

Thus, $\hat{A}(\varvec{\beta }_*)$ converges in probability to a nonrandom matrix $A$. It is easy to check that $U(\varvec{\beta }_*)\rightarrow 0$ almost surely, and $A$ is nonsingular by (C4). The convergence of $\hat{A}(\varvec{\beta }_*)$ and the continuity of ${A}(\varvec{\beta })$ imply that there is a small neighborhood of $\varvec{\beta }_*$ in which $\hat{A}(\varvec{\beta })$ is nonsingular when $n$ is large enough. Therefore, it follows from the inverse function theorem that, within a small neighborhood of $\varvec{\beta }_*$, there exists a unique solution $\hat{\varvec{\beta }}$ to $U(\varvec{\beta })=0$ for sufficiently large $n$. Thus, there exist unique estimators $\hat{\varvec{\beta }}$ and $\hat{m}_0(t)$. Since $\hat{\varvec{\beta }}$ is strongly consistent for $\varvec{\beta }_*$, it follows from the uniform convergence of $\hat{m}_0(t;\varvec{\beta })$ to $m_0(t;\varvec{\beta })$ that $\hat{m}_0(t)\doteq \hat{m}_0(t;\hat{\varvec{\beta }}) \rightarrow m_0(t;\varvec{\beta }_*) = m_*(t)$ almost surely on $[0,\tau ]$.

Proof of Theorem 1(ii)

In this section, we first prove Theorem 1(ii) under the case-cohort (CC) design, and then under the nested case-control (NCC) design, following the proof in Lu and Liu (2012). We know from Eq. (8) that

$$\begin{aligned}&\frac{1}{n}\sum _{i=1}^n w_i\left[ g\{m_0(t)+\varvec{\beta }^{'}\varvec{Z}_i\}dN_i(t)-Y_i(t)d[g\{m_0(t)+\varvec{\beta }^{'}\varvec{Z}_i\}+t]\right] \\&\quad = \frac{1}{n}\sum _{i=1}^nw_ig\{m_0(t)+\varvec{\beta }'\varvec{Z}_i\}dM_i(t), \\&\frac{1}{n}\sum _{i=1}^n w_i\left[ g\{\hat{m}_0(t)+\varvec{\beta }^{'}\varvec{Z}_i\}dN_i(t)-Y_i(t)d[g\{\hat{m}_0(t)+\varvec{\beta }^{'}\varvec{Z}_i\}+t]\right] =0. \end{aligned}$$

Subtracting the first equation from the second and applying a Taylor expansion, we have

$$\begin{aligned}&\frac{1}{n}\sum _{i=1}^nw_i\dot{g}\{m_0(t)+\varvec{\beta }'\varvec{Z}_i \}[\hat{m}_0(t)-m_0(t)]dN_i(t) \\&\qquad -\frac{1}{n}\sum _{i=1}^nw_iY_i(t)d\dot{g}\{m_0(t)+\varvec{\beta }'\varvec{Z}_i \}[\hat{m}_0(t)-m_0(t)] \\&\qquad -\frac{1}{n}\sum _{i=1}^nw_iY_i(t)\dot{g}\{m_0(t)+\varvec{\beta }'\varvec{Z}_i \}[d\hat{m}_0(t)-dm_0(t)] \\&\quad =-\frac{1}{n}\sum _{i=1}^nw_ig\{m_0(t)+\varvec{\beta }'\varvec{Z}_i \}dM_i(t). \end{aligned}$$

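Hence, dividing by $\frac{1}{n}\sum _{i=1}^nw_iY_i(t)\dot{g}\{m_0(t)+\varvec{\beta }'\varvec{Z}_i\}$ and solving the resulting first-order linear ordinary differential equation, we obtain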

$$\begin{aligned}&[\hat{m}_0(t)-m_0(t)]\frac{\sum _{i=1}^nw_i[\dot{g}\{m_0(t)+\varvec{\beta }'\varvec{Z}_i \}dN_i(t)-Y_i(t)d\dot{g}\{m_0(t)+\varvec{\beta }'\varvec{Z}_i \}]}{\sum _{i=1}^nw_iY_i(t)\dot{g}\{m_0(t)+\varvec{\beta }'\varvec{Z}_i \}} \\&\qquad -[d\hat{m}_0(t)-dm_0(t)] =-\frac{\sum _{i=1}^nw_ig\{m_0(t)+\varvec{\beta }'\varvec{Z}_i \}dM_i(t)}{\sum _{i=1}^nw_iY_i(t)\dot{g}\{m_0(t)+\varvec{\beta }'\varvec{Z}_i \}}, \\&\hat{m}_0(t)-m_0(t) = -\frac{1}{K(t;\varvec{\beta })}\int _t^\tau K(u;\varvec{\beta })\frac{\sum _{i=1}^nw_ig\{m_0(u)+\varvec{\beta }'\varvec{Z}_i \}dM_i(u)}{\sum _{i=1}^nw_iY_i(u)\dot{g}\{m_0(u)+\varvec{\beta }'\varvec{Z}_i \}}. \end{aligned}$$

Let $U(\varvec{\beta }_*) \doteq U(\varvec{\beta }_*,\hat{m}_0(t;\varvec{\beta }_*))$ and we have

$$\begin{aligned} U(\varvec{\beta }_*,m_*(t))= & {} \frac{1}{n}\sum _{i=1}^n\int _0^{\tau }w_i\varvec{Z}_i[g\{m_*(t)+\varvec{\beta }_*'\varvec{Z}_i\}dN_i(t) \\&-Y_i(t)dg\{m_*(t)+\varvec{\beta }_*'\varvec{Z}_i\}-Y_i(t)dt]. \end{aligned}$$

Applying a Taylor expansion again to $U(\varvec{\beta }_*,\hat{m}_0(t;\varvec{\beta }_*)) - U(\varvec{\beta }_*,m_*(\cdot ))$, we have

$$\begin{aligned}&U(\varvec{\beta }_*,\hat{m}_0(t;\varvec{\beta }_*)) - U(\varvec{\beta }_*,m_*(\cdot )) \\&\quad = \frac{1}{n}\sum _{i=1}^n\int _0^\tau w_i\varvec{Z}_i[\hat{m}_0(t)-m_0(t)]\dot{g}\{m_*(t)+\varvec{\beta }_*'\varvec{Z}_i\}dN_i(t) \\&\qquad -w_i\varvec{Z}_iY_i(t)[\hat{m}_0(t)-m_0(t)]d\dot{g}\{m_*(t)+\varvec{\beta }_*'\varvec{Z}_i\} \\&\qquad -w_i\varvec{Z}_iY_i(t)[d\hat{m}_0(t)-dm_0(t)]\dot{g}\{m_*(t)+\varvec{\beta }_*'\varvec{Z}_i\} \\&\quad =\frac{1}{n}\sum _{i=1}^n\int _0^\tau w_i[\varvec{Z}_i-\bar{\varvec{Z}}(t;\varvec{\beta }_*)][\hat{m}_0(t)-m_0(t)][\dot{g}\{m_*(t)+\varvec{\beta }_*'\varvec{Z}_i\}dN_i(t) \\&\qquad -Y_i(t)d\dot{g}\{m_*(t)+\varvec{\beta }_*'\varvec{Z}_i\}] - w_i\bar{\varvec{Z}}(t;\varvec{\beta }_*)g\{m_*(t)+\varvec{\beta }_*'\varvec{Z}_i\}dM_i(t) \\&\quad =-\frac{1}{n}\sum _{i=1}^n\int _0^\tau w_i[\tilde{\varvec{Z}}(t;\varvec{\beta }_*)+\bar{\varvec{Z}}(t;\varvec{\beta }_*)]g\{m_*(t)+\varvec{\beta }_*'\varvec{Z}_i\}dM_i(t). \end{aligned}$$

Thus,

$$\begin{aligned}&\sqrt{n}U(\varvec{\beta }_*) = \sqrt{n}U(\varvec{\beta }_*,\hat{m}_0(t;\varvec{\beta }_*)) \\&\quad = \sqrt{n}U(\varvec{\beta }_*,m_*)+\sqrt{n}[U(\varvec{\beta }_*,\hat{m}_0(t;\varvec{\beta }_*))-U(\varvec{\beta }_*,m_*)] \\&\quad =\frac{1}{\sqrt{n}}\sum _{i=1}^n\int _0^\tau w_i[\varvec{Z}_i-\bar{\varvec{Z}}(t;\varvec{\beta }_*)-\tilde{\varvec{Z}}(t;\varvec{\beta }_*)]g\{m_*(t)+\varvec{\beta }_*'\varvec{Z}_i\}dM_i(t;\varvec{\beta }_*,m_*) \\&\quad =\frac{1}{\sqrt{n}}\sum _{i=1}^n\int _0^\tau w_i[\varvec{Z}_i-u_{\bar{\varvec{Z}}}(t;\varvec{\beta }_*)-u_{\tilde{\varvec{Z}}}(t;\varvec{\beta }_*)]g\{m_*(t)+\varvec{\beta }_*'\varvec{Z}_i\}dM_i(t;\varvec{\beta }_*,m_*)\\&\qquad + o_p(1) \\&\quad =\frac{1}{\sqrt{n}}\sum _{i=1}^n\int _0^\tau [\varvec{Z}_i-u_{\bar{\varvec{Z}}}(t;\varvec{\beta }_*)-u_{\tilde{\varvec{Z}}}(t;\varvec{\beta }_*)]g\{m_*(t)+\varvec{\beta }_*'\varvec{Z}_i\}dM_i(t;\varvec{\beta }_*,m_*)\\&\qquad +\frac{1}{\sqrt{n}}\sum _{i=1}^n\int _0^\tau (w_i-1)[\varvec{Z}_i-u_{\bar{\varvec{Z}}}(t;\varvec{\beta }_*)-u_{\tilde{\varvec{Z}}}(t;\varvec{\beta }_*)] \\&\qquad \quad g\{m_*(t)+\varvec{\beta }_*'\varvec{Z}_i\}dM_i(t;\varvec{\beta }_*,m_*) + o_p(1)\\&\quad =\frac{1}{\sqrt{n}}\sum _{i=1}^n\int _0^\tau [\varvec{Z}_i-u_{\bar{\varvec{Z}}}(t;\varvec{\beta }_*)-u_{\tilde{\varvec{Z}}}(t;\varvec{\beta }_*)]g\{m_*(t)+\varvec{\beta }_*'\varvec{Z}_i\}dM_i(t;\varvec{\beta }_*,m_*)\\&\qquad -\frac{1}{\sqrt{n}}\sum _{i=1}^n\int _0^\tau (1-\delta _i)(1-\gamma _i/p_{0i})[\varvec{Z}_i-u_{\bar{\varvec{Z}}}(t;\varvec{\beta }_*)-u_{\tilde{\varvec{Z}}}(t;\varvec{\beta }_*)] \\&\qquad \quad g\{m_*(t)+\varvec{\beta }_*'\varvec{Z}_i\}dM_i(t;\varvec{\beta }_*,m_*) + o_p(1). \end{aligned}$$

As defined in the main text, let $\mathcal {G}_i$ be the $\sigma $-field generated by $\{\tilde{T}_i,\delta _i,\varvec{Z}_i, i=1,\ldots ,n\}$ and $\mathcal {F}_i$ be the $\sigma $-field generated by $\{\tilde{T}_i,\delta _i, i=1,\ldots ,n\}$. We denote

$$\begin{aligned} \eta _i = (1-\delta _i)\int _0^\tau [\varvec{Z}_i-u_{\bar{\varvec{Z}}}(t;\varvec{\beta }_*)- u_{\tilde{\varvec{Z}}}(t;\varvec{\beta }_*)]g\{m_*(t)+\varvec{\beta }_*' \varvec{Z}_i\}dM_i(t;\varvec{\beta }_*,m_*). \end{aligned}$$

It is evident that $E(1-\gamma _i/p_{0i}|\mathcal {F}_i)=0$ and $E\{\eta _i(1-\gamma _i/p_{0i})\}=E\{\eta _i E (1-\gamma _i/p_{0i}|\mathcal {F}_i)\}=0$. Following the proof in Lu and Tsiatis (2006), $\mathrm{var}\{\eta _i(1-\gamma _i/p_{0i})\}= E \{\eta _i^{\otimes 2}(1-p_{0i})/p_{0i}\}- E [\eta _i(1-p_{0i})/p_{0i}]^{\otimes 2} = \Sigma _2$. Conditional on $\mathcal {F}_i$, the terms $\{\eta _i(1-\gamma _i/p_{0i}),i=1,\ldots ,n\}$ and the first term of $\sqrt{n}U(\varvec{\beta }_*)$ are uncorrelated. Therefore, $\sqrt{n}U(\varvec{\beta }_*)$ is asymptotically normal with mean zero and variance-covariance matrix $\Sigma =\Sigma _1+\Sigma _2$. By a Taylor expansion and the consistency of $\hat{\varvec{\beta }}$, it follows that

$$\begin{aligned}&\sqrt{n}(\hat{\varvec{\beta }}-\varvec{\beta }_*) \\&\quad = -A^{-1}\sqrt{n}U(\varvec{\beta }_*) +o_p(1) \\&\quad = -A^{-1}\frac{1}{\sqrt{n}}\sum _{i=1}^n\int _0^\tau w_i[\varvec{Z}_i-u_{\bar{\varvec{Z}}}(t;\varvec{\beta }_*)-u_{\tilde{\varvec{Z}}}(t;\varvec{\beta }_*)]\\&\qquad g\{m_*(t)+\varvec{\beta }_*'\varvec{Z}_i\}dM_i(t;\varvec{\beta }_*,m_*)+o_p(1), \end{aligned}$$

thus $\sqrt{n}(\hat{\varvec{\beta }}-\varvec{\beta }_*)\rightarrow N\{0, A^{-1}\Sigma (A^{-1})^{'}\}$ in distribution.
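The limiting covariance $A^{-1}\Sigma (A^{-1})'$ with $\Sigma =\Sigma _1+\Sigma _2$ is a sandwich form. The following minimal sketch shows how such a covariance and the corresponding Wald-type intervals could be computed once estimates of $A$, $\Sigma _1$ and $\Sigma _2$ are available. The function names and the toy matrices are hypothetical; how the paper estimates these quantities is not reproduced here.

```python
import numpy as np

def sandwich_covariance(A, Sigma1, Sigma2):
    """Return A^{-1} (Sigma1 + Sigma2) (A^{-1})' for sqrt(n) * (beta_hat - beta)."""
    A_inv = np.linalg.inv(np.asarray(A, dtype=float))
    Sigma = np.asarray(Sigma1, dtype=float) + np.asarray(Sigma2, dtype=float)
    return A_inv @ Sigma @ A_inv.T

def wald_intervals(beta_hat, cov_sqrt_n, n, z=1.96):
    """Wald-type intervals; cov_sqrt_n is the covariance on the sqrt(n) scale."""
    beta_hat = np.asarray(beta_hat, dtype=float)
    se = np.sqrt(np.diag(cov_sqrt_n) / n)   # standard errors of beta_hat itself
    return np.column_stack([beta_hat - z * se, beta_hat + z * se])

# Toy inputs (assumed values, for illustration only).
A_hat = np.array([[2.0, 0.0], [0.0, 4.0]])
Sigma1_hat = np.eye(2)   # full-cohort (martingale) part
Sigma2_hat = np.eye(2)   # extra variability from subcohort or control sampling
cov = sandwich_covariance(A_hat, Sigma1_hat, Sigma2_hat)
ci = wald_intervals([1.0, -0.5], cov, n=200)
print(cov)
print(ci)
```

Because $\Sigma _2$ is added to $\Sigma _1$, the sampled designs can never have a smaller asymptotic variance than the full cohort under this representation.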

Under the nested case-control design, the asymptotic distribution of $\hat{\varvec{\beta }}$ is more difficult to derive because NCC sampling is a dynamic process: the probability of being selected as a control is neither constant nor independent across subjects. We therefore use the central limit theorem for asymptotically negatively dependent random variables (Zhang 2000), which was also used in the proof of Lu and Liu (2012). We start from the asymptotic representation

$$\begin{aligned} U(\varvec{\beta }_*)= & {} \frac{1}{n}\sum _{i=1}^n\int _0^\tau w_i[\varvec{Z}_i-u_{\bar{\varvec{Z}}}(t;\varvec{\beta }_*)-u_{\tilde{\varvec{Z}}}(t;\varvec{\beta }_*)] \\&g\{m_*(t)+\varvec{\beta }_*'\varvec{Z}_i\}dM_i(t;\varvec{\beta }_*,m_*) +o_p(\frac{1}{\sqrt{n}}) \\\equiv & {} U_1(\varvec{\beta }_*) + U_2(\varvec{\beta }_*) +o_p(\frac{1}{\sqrt{n}}). \end{aligned}$$

By the martingale central limit theorem, $\sqrt{n}U_1(\varvec{\beta }_*) = \frac{1}{\sqrt{n}}\sum _{i=1}^n\int _0^\tau [\varvec{Z}_i-u_{\bar{\varvec{Z}}}(t;\varvec{\beta }_*)-u_{\tilde{\varvec{Z}}}(t;\varvec{\beta }_*)]g\{m_*(t)+\varvec{\beta }_*'\varvec{Z}_i\}dM_i(t;\varvec{\beta }_*,m_*) \rightarrow N(0,\Sigma _1)$ in distribution as $n \rightarrow \infty $. Since $E(w_i-1|\mathcal {F}_i)=0$, it is evident that $U_1(\varvec{\beta }_*)$ and $U_2(\varvec{\beta }_*)$ are uncorrelated. However, because of the NCC sampling scheme, $w_i$ and $w_j$ ($i \ne j$) are correlated even after conditioning on $\mathcal {F}$. Since $(w_i-1)^2 = (1-\delta _i)(\gamma _i-p_{0i})^2/p_{0i}^2$, we have $E\{(w_i-1)^2|\mathcal {F}\} = (1-\delta _i)(1-p_{0i})/p_{0i}$. Thus, the conditional variance of $\sqrt{n}U_2(\varvec{\beta }_*)$ can be written as

$$\begin{aligned}&\frac{1}{n}\sum _{i=1}^n\frac{1-p_{0i}}{p_{0i}}(1-\delta _i) \left[ \int _0^\tau [\varvec{Z}_i-u_{\bar{\varvec{Z}}}(t;\varvec{\beta }_*)-u_{\tilde{\varvec{Z}}}(t;\varvec{\beta }_*)]g\{m_*(t)+\varvec{\beta }_*'\varvec{Z}_i\}dM_i(t)\right] ^{\otimes 2} \\&\qquad +\frac{1}{n}\sum _{i\ne j}E\{(\frac{\gamma _i}{p_{0i}}-1)(\frac{\gamma _j}{p_{0j}}-1)|\mathcal {F}\} \\&\qquad \times (1-\delta _i)(1-\delta _j)\int _0^\tau [\varvec{Z}_i-u_{\bar{\varvec{Z}}}(t;\varvec{\beta }_*)-u_{\tilde{\varvec{Z}}}(t;\varvec{\beta }_*)]g\{m_*(t)+\varvec{\beta }_*'\varvec{Z}_i\}dM_i(t) \\&\qquad \quad \left[ \int _0^\tau [\varvec{Z}_j-u_{\bar{\varvec{Z}}}(t;\varvec{\beta }_*)-u_{\tilde{\varvec{Z}}}(t;\varvec{\beta }_*)]g\{m_*(t)+\varvec{\beta }_*'\varvec{Z}_j\}dM_j(t)\right] '. \end{aligned}$$
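The conditional moments of the weights that drive the diagonal term can be checked exactly by enumerating the two values of the selection indicator. The sketch below assumes the weight $w_i=\delta _i+(1-\delta _i)\gamma _i/p_{0i}$, which is the form implied by the identity $w_i-1=-(1-\delta _i)(1-\gamma _i/p_{0i})$ used earlier in the proof, and assumes that $\gamma _i$ is Bernoulli with success probability $p_{0i}$ given $\mathcal {F}$. The function names are hypothetical.

```python
from fractions import Fraction

def weight(delta, gamma, p0):
    """w = delta + (1 - delta) * gamma / p0 (cases get weight 1)."""
    return delta + (1 - delta) * Fraction(gamma) / p0

def conditional_moments(delta, p0):
    """Exact E(w - 1) and E{(w - 1)^2} when gamma ~ Bernoulli(p0)."""
    probs = {1: p0, 0: 1 - p0}   # P(gamma = 1), P(gamma = 0)
    mean = sum(pr * (weight(delta, g, p0) - 1) for g, pr in probs.items())
    second = sum(pr * (weight(delta, g, p0) - 1) ** 2 for g, pr in probs.items())
    return mean, second

p0 = Fraction(1, 4)
mean_noncase, second_noncase = conditional_moments(delta=0, p0=p0)
mean_case, second_case = conditional_moments(delta=1, p0=p0)

# Compare with the closed form (1 - delta) * (1 - p0) / p0.
print(mean_noncase, second_noncase, (1 - p0) / p0)
print(mean_case, second_case)
```

Exact rational arithmetic is used so that the comparison with $(1-\delta _i)(1-p_{0i})/p_{0i}$ is not blurred by rounding.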

According to Samuelsen (1997), for $i\ne j$, $\mathrm{Cov}(\gamma _i,\gamma _j|\mathcal {F})=\rho _{ij}(1-p_{0i})(1-p_{0j})$, where $\rho _{ij}=-\frac{m}{n}\int _0^{\min (\tilde{T}_i,\tilde{T}_j)} \left( \frac{\bar{g}_1(t)}{\bar{y}(t)}dm_*(t)+\frac{\bar{g}_2(t)}{\bar{y}(t)}dt\right) +O_p(n^{-2})$. Thus, with some algebra, $\mathrm{Var}\{\sqrt{n}U_2(\varvec{\beta }_*)|\mathcal {F}\}$ can be written as

$$\begin{aligned}&\frac{1}{n}\sum _{i=1}^n(1-\delta _i)\frac{1-p_{0i}}{p_{0i}}\left[ \int _0^\tau [\varvec{Z}_i-u_{\bar{\varvec{Z}}}(t;\varvec{\beta }_*)-u_{\tilde{\varvec{Z}}}(t;\varvec{\beta }_*)]g\{m_*(t)+\varvec{\beta }_*'\varvec{Z}_i\}dM_i(t)\right] ^{\otimes 2} \\&\qquad -m\int _0^\tau [\frac{1}{n}\sum _{i=1}^nY_i(t)\frac{1-p_{0i}}{p_{0i}}(1-\delta _i)\int _0^\tau [\varvec{Z}_i-u_{\bar{\varvec{Z}}}(t;\varvec{\beta }_*)-u_{\tilde{\varvec{Z}}}(t;\varvec{\beta }_*)] \\&\qquad \quad g\{m_*(t)+\varvec{\beta }_*'\varvec{Z}_i\}dM_i(t)]^{\otimes 2}\left( \frac{\bar{g}_1(t)}{\bar{y}(t)}dm_*(t) +\frac{\bar{g}_2(t)}{\bar{y}(t)}dt\right) +o_p(1), \end{aligned}$$

where $\bar{g}_1(t) = \sum _{j=1}^n\frac{Y_j(t)\dot{g}\{\hat{m}_0(t)+\hat{\varvec{\beta }}'\varvec{Z}_j\}}{g\{\hat{m}_0(t)+\hat{\varvec{\beta }}'\varvec{Z}_j\}}, \bar{g}_2(t) = \sum _{j=1}^n\frac{Y_j(t)}{g\{\hat{m}_0(t)+\hat{\varvec{\beta }}'\varvec{Z}_j\}}, \bar{y}(t)=\sum _{j=1}^nY_j(t)$. By the strong law of large numbers, we have

$$\begin{aligned} \Sigma _2\equiv & {} \lim _{n\rightarrow \infty } \mathrm{Var}\{\sqrt{n}U_2(\varvec{\beta }_*)|\mathcal {F}\} \\= & {} E\left[ \frac{1-s_0}{s_0}(1-\delta )\left[ \int _0^\tau [\varvec{Z}-u_{\bar{\varvec{Z}}}(t;\varvec{\beta }_*)-u_{\tilde{\varvec{Z}}}(t;\varvec{\beta }_*)]g\{m_*(t)+\varvec{\beta }_*'\varvec{Z}\}dM(t)\right] ^{\otimes 2}\right] \\&-m\int _0^\tau \left( E\left[ Y(t)(1-\delta )\frac{1-s_0}{s_0} \int _0^\tau [\varvec{Z}-u_{\bar{\varvec{Z}}}(t;\varvec{\beta }_*)-u_{\tilde{\varvec{Z}}}(t;\varvec{\beta }_*)]g\{m_*(t)+\varvec{\beta }_*'\varvec{Z}\}dM(t)\right] \right) ^{\otimes 2} \\&\qquad \left( \frac{\bar{g}_1(t)}{\bar{y}(t)}dm_*(t)+\frac{\bar{g}_2(t)}{\bar{y}(t)}dt\right) , \end{aligned}$$

where

$$\begin{aligned} s_{0i} = \lim _{n\rightarrow \infty } p_{0i} = 1-\exp \left\{ -m\int _0^{\tilde{T}_i} \left( \frac{\bar{g}_1(t)}{\bar{y}(t)}dm_*(t)+\frac{\bar{g}_2(t)}{\bar{y}(t)}dt\right) \right\} . \end{aligned}$$
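In finite samples, the inclusion probability $p_{0i}$ of a non-case is commonly computed with the product formula attributed to Samuelsen (1997): one minus the probability of never being drawn as one of the $m$ controls at any earlier failure time. The appendix states only the limit $s_{0i}$, so the formula used below is an assumption of this sketch, as are the function name, the absence of tied event times and the absence of left truncation.

```python
import numpy as np

def samuelsen_inclusion_prob(time, event, m):
    """Probability that each subject is ever sampled as a control.

    time  : observed follow-up times (assumed distinct)
    event : 1 for a case, 0 for a censored subject
    m     : number of controls sampled per case
    Assumed formula: p_0i = 1 - prod_{t_j < time_i} {1 - m / (n(t_j) - 1)},
    where the product runs over failure times t_j and n(t_j) is the size of
    the risk set at t_j.
    """
    time = np.asarray(time, dtype=float)
    event = np.asarray(event, dtype=int)
    case_times = np.unique(time[event == 1])
    p0 = np.empty(len(time))
    for i, t_i in enumerate(time):
        never_sampled = 1.0
        for t_j in case_times:
            if t_j < t_i:                        # subject i is at risk at t_j
                n_at_risk = np.sum(time >= t_j)  # includes the case and subject i
                never_sampled *= 1.0 - min(1.0, m / (n_at_risk - 1))
        p0[i] = 1.0 - never_sampled
    return p0

# Toy cohort (assumed values): cases fail at times 1 and 3, one control per case.
time = [1.0, 2.0, 3.0, 4.0, 5.0]
event = [1, 0, 1, 0, 0]
p0 = samuelsen_inclusion_prob(time, event, m=1)
print(p0)
```

Subjects who leave the risk set before the first failure have inclusion probability zero and can never be sampled as controls, so they contribute to neither the weighted sums nor the variance terms.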

Thus, by the central limit theorem for asymptotically negatively dependent random variables (Zhang 2000), we have $\sqrt{n}U_2(\varvec{\beta }_*) \rightarrow N(0,\Sigma _2)$ as $n\rightarrow \infty $, and

$$\begin{aligned} \sqrt{n}U(\varvec{\beta }_*) \rightarrow N(0,\Sigma _1+\Sigma _2), \end{aligned}$$

in distribution as $n \rightarrow \infty $. It is easy to see that $\Sigma _1+\Sigma _2 = \Sigma $. By a Taylor expansion and the consistency of $\hat{\varvec{\beta }}$, we have $\sqrt{n}(\hat{\varvec{\beta }}-\varvec{\beta }_*)\rightarrow N\{0, A^{-1}\Sigma (A^{-1})^{'}\}$ in distribution.

About this article

Cite this article

Jin, P., Zeleniuch-Jacquotte, A. & Liu, M. Generalized mean residual life models for case-cohort and nested case-control studies. Lifetime Data Anal 26, 789–819 (2020). https://doi.org/10.1007/s10985-020-09499-w

Received: 29 September 2019
Accepted: 25 May 2020
Published: 11 June 2020
Issue Date: October 2020
DOI: https://doi.org/10.1007/s10985-020-09499-w
