Case-cohort studies for clustered failure time data with a cure fraction

Xie, Ping; Han, Bo; Wang, Xiaoguang

doi:10.1007/s00362-023-01448-7

Regular Article
Published: 17 April 2023

Statistical Papers (2023)

Abstract

In epidemiological studies, the case-cohort design is widely used for its outstanding cost-effectiveness. Most existing work on the case-cohort design focuses on univariate failure time data. However, clustered failure time data are commonly encountered in epidemiological studies. In this article, we study the marginal nonmixture cure model for clustered failure time data with a cure fraction in the context of the case-cohort design. A sieve semiparametric likelihood method is proposed to estimate the parametric and nonparametric components. The proposed method is easy to implement. The resulting estimators are shown to be strongly consistent and asymptotically normal. Simulation studies are carried out to assess the finite sample performance of the proposed method. We also analyze a real dataset from the Prostate, Lung, Colorectal, and Ovarian Cancer Screening Trial to illustrate our method.

References

Amico M, Van Keilegom I (2018) Cure models in survival analysis. Annu Rev Stat Appl 5:311–345
Bahari F, Parsi S, Ganjali M (2021) Empirical likelihood inference in general linear model with missing values in response and covariates by MNAR mechanism. Stat Pap 62(2):591–622
Barlow WE (1994) Robust variance estimation for the case-cohort design. Biometrics 50(4):1064–1072
Berkson J, Gage RP (1952) Survival curve for cancer patients following treatment. J Am Stat Assoc 47(259):501–515
Boag JW (1949) Maximum likelihood estimates of the proportion of patients cured by cancer therapy. J R Stat Soc Ser B Stat Methodol 11(1):15–53
Box GEP, Cox DR (1964) An analysis of transformations. J R Stat Soc Ser B 26:211–252
Breslow NE, Wellner JA (2007) Weighted likelihood for semiparametric models and two-phase stratified samples, with application to Cox regression. Scand J Stat 34(1):86–102
Chen HY (2001) Weighted semiparametric likelihood method for fitting a proportional odds regression model to data from the case-cohort design. J Am Stat Assoc 96(456):1446–1457
Chen CM, Lu TFC (2012) Marginal analysis of multivariate failure time data with a surviving fraction based on semiparametric transformation cure models. Comput Stat Data Anal 56(3):645–655
Chen MH, Ibrahim JG, Sinha D (1999) A new Bayesian model for survival data with a surviving fraction. J Am Stat Assoc 94(447):909–919
Chen MH, Ibrahim JG, Sinha D (2002) Bayesian inference for multivariate survival data with a cure fraction. J Multivar Anal 80(1):101–126
Chen YH, Chatterjee N, Carroll RJ (2008) Retrospective analysis of haplotype-based case-control studies under a flexible model for gene-environment association. Biostatistics 9(1):81–99
Chen CM, Lu TFC, Hsu CM (2013) Association estimation for clustered failure time data with a cure fraction. Comput Stat Data Anal 57:210–222
Cox DR (1972) Regression models and life-tables (with discussion). J R Stat Soc B 34:187–220
Deng LF, Ding JL, Liu YY et al (2018) Regression analysis for the proportional hazards model with parameter constraints under case-cohort design. Comput Stat Data Anal 117:194–206
Ding JL, Chen XL, Fang HY et al (2018) Case-cohort design for accelerated hazards model. Stat Interface 11(4):657–668
Farewell VT (1982) The use of mixture models for the analysis of survival data with long-term survivors. Biometrics 38(4):1041–1046
Han B, Wang XG (2020) Semiparametric estimation for the non-mixture cure model in case-cohort and nested case-control studies. Comput Stat Data Anal 144:106874
Hu T, Xiang LM (2013) Efficient estimation for semiparametric cure models with interval-censored data. J Multivar Anal 121:139–151
June CH, O’Connor RS, Kawalekar OU et al (2018) CAR T cell immunotherapy for human cancer. Science 359(6382):1361–1365
Kalbfleisch JD, Lawless JF (1988) Likelihood analysis of multi-state models for disease incidence and mortality. Stat Med 7(1–2):149–160
Kuk AYC, Chen CH (1992) A mixture model combining logistic regression with proportional hazards regression. Biometrika 79(3):531–541
Lai X, Yau KKW (2008) Long-term survivor model with bivariate random effects: applications to bone marrow transplant and carcinoma study data. Stat Med 27(27):5692–5708
Li Y, Panagiotou OA, Black A et al (2016) Multivariate piecewise exponential survival modeling. Biometrics 72(2):546–553
Li W, Li RS, Feng ZD et al (2020) Semiparametric isotonic regression analysis for risk assessment under nested case-control and case-cohort designs. Stat Methods Med Res 29(8):2328–2343
Lorentz GG (1986) Bernstein polynomials. Chelsea Publishing Co, New York
Lu SE, Shih JH (2006) Case-cohort designs and analysis for clustered failure time data. Biometrics 62(4):1138–1148
Ma SG (2007) Additive risk model with case-cohort sampled current status data. Stat Pap 48(4):595–608
Maller RA, Zhou S (1992) Estimating the proportion of immunes in a censored sample. Biometrika 79(4):731–739
Niu Y, Peng Y (2013) A semiparametric marginal mixture cure model for clustered survival data. Stat Med 32(14):2364–2373
Niu Y, Peng Y (2014) Marginal regression analysis of clustered failure time data with a cure fraction. J Multivar Anal 123:129–142
Peng YW, Taylor JMG (2011) Mixture cure model with random effects for the analysis of a multi-center tonsil cancer study. Stat Med 30(3):211–223
Peng YW, Taylor JMG (2014) Cure models. In: Handbook of survival analysis. Chapman and Hall, Boca Raton
Peng YW, Xu JF (2012) An extended cure model and model selection. Lifetime Data Anal 18(2):215–233
Peng YW, Taylor JMG, Yu BB (2007) A marginal regression model for multivariate failure time data with a surviving fraction. Lifetime Data Anal 13(3):351–369
Pollard D (1984) Convergence of stochastic processes. Springer, New York
Portier F, El Ghouch A, Van Keilegom I (2017) Efficiency and bootstrap in the promotion time cure model. Bernoulli 23(4B):3437–3468
Prentice RL (1986) A case-cohort design for epidemiologic cohort studies and disease prevention trials. Biometrika 73(1):1–11
Prorok PC, Andriole GL, Bresalier RS et al (2000) Design of the prostate, lung, colorectal and ovarian (PLCO) cancer screening trial. Control Clin Trials 21:273S-309S
Segal MR, Neuhaus JM, James IR (1997) Dependence estimation for marginal models of multivariate survival data. Lifetime Data Anal 3(3):251–268
Self SG, Prentice RL (1988) Asymptotic distribution theory and efficiency results for case-cohort studies. Ann Stat 16(1):64–81
Shen XT (1997) On methods of sieves and penalization. Ann Stat 25(6):2555–2591
Shen XT, Wong WH (1994) Convergence rate of sieve estimates. Ann Stat 22(2):580–615
Stone CJ (1982) Optimal global rates of convergence for nonparametric regression. Ann Stat 10(4):1040–1053
Sy JP, Taylor JMG (2000) Estimation in a Cox proportional hazards cure model. Biometrics 56(1):227–236
Taylor JMG (1995) Semi-parametric estimation in failure time mixture models. Biometrics 51(3):899–907
Tsodikov A (1998) A proportional hazards model taking account of long-term survivors. Biometrics 54(4):1508–1516
Tsodikov AD, Ibrahim JG, Yakovlev AY (2003) Estimating cure rates from survival data: an alternative to two-component mixture models. J Am Stat Assoc 98(464):1063–1078
van de Geer SA (2000) Applications of empirical process theory. Cambridge University Press, Cambridge
van der Vaart AW (1998) Asymptotic statistics, Cambridge series in statistical and probabilistic mathematics, vol 3. Cambridge University Press, Cambridge
van der Vaart AW, Wellner JA (1996) Weak convergence and empirical processes. Springer, New York
Xu J, Peng Y (2014) Nonparametric cure rate estimation with covariates. Can J Stat 42(1):1–17
Yakovlev AY, Tsodikov AD (1996) Stochastic models of tumor latency and their biostatistical applications. World Scientific, Singapore
Yau KKW, Ng ASK (2001) Long-term survivor mixture model with random effects: application to a multi-centre clinical trial of carcinoma. Stat Med 20(11):1591–1607
Zhang H, Schaubel DE, Kalbfleisch JD (2011) Proportional hazards regression for the analysis of clustered survival data from case-cohort studies. Biometrics 67(1):18–28
Zhao W, Chen YQ, Hsu L (2017) On estimation of time-dependent attributable fraction from population-based case-control studies. Biometrics 73(3):866–875

Acknowledgements

The authors thank the associate editor and two reviewers for their constructive and insightful comments. They are grateful to the National Cancer Institute for access to NCI’s data collected by the Prostate, Lung, Colorectal and Ovarian (PLCO) Cancer Screening Trial. The statements contained herein are solely those of the authors and do not represent concurrence by NCI. The second author was partially supported by the China Postdoctoral Science Foundation (Grant No. 2021TQ0349). The third author was partially supported by Dalian High-level Talent Innovation Project (Grant No. 2020RD09).

Author information

Authors and Affiliations

School of Mathematical Sciences, Dalian University of Technology, Dalian, 116024, Liaoning, China
Ping Xie & Xiaoguang Wang
NCMIS, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing, 100190, China
Bo Han

Corresponding author

Correspondence to Xiaoguang Wang.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix

In this Appendix, we present the proofs of Theorems 1–3. Let $W_{\cdot j}=\{T_{\cdot j},\delta _{\cdot j},\varvec{Z}_{\cdot j}\}, j=1,\ldots ,K$ denote the data for a generic cluster and $\varvec{W}=\{W_{\cdot 1}, \ldots , W_{\cdot K}\}$. Similarly, we denote by $W^{\xi }_{\cdot j}=\{T_{\cdot j},\delta _{\cdot j},\xi _{\cdot j}\varvec{Z}_{\cdot j},\xi _{\cdot j}\}, j=1,\ldots ,K$ a single observation for a generic cluster under the case-cohort design, and $\varvec{W}^{\xi }=\{W^{\xi }_{\cdot 1}, \ldots , W^{\xi }_{\cdot K}\}$. Furthermore, we define the function class $\mathcal {L}_{n}=\{l_{K}^{w}(\varvec{\theta };\varvec{W}^{\xi })=\sum _{j=1}^{K}l^{w}(\varvec{\theta };W_{\cdot j}^{\xi }):\varvec{\theta }\in \varvec{\Theta }_{n}\}$. Throughout the proofs, let $Pg=\int g(x)\textrm{d}P(x)$ denote the expectation of $g(X)$ under the distribution $P$, and $P_{n}g=n^{-1}\sum _{i=1}^{n} g(X_{i})$ the expectation of $g(X)$ under the empirical measure $P_n$. We use $\widetilde{C}$ to denote a universal positive constant, which may vary from place to place.

For any $\epsilon >0$, the covering number $N(\epsilon , \mathcal {L}_{n}, L_1(P_n))$ is defined as the smallest positive integer $\kappa $ for which there exist $\{\varvec{\theta }^{(1)},\ldots ,\varvec{\theta }^{(\kappa )}\}$ such that

$$\begin{aligned} \min _{k\in \{1,\ldots ,\kappa \}}\frac{1}{n}\sum ^{n}_{i=1}|l_{K}^w(\varvec{\theta };\varvec{W}_{i}^{\xi })-l_{K}^w(\varvec{\theta }^{(k)};\varvec{W}_{i}^{\xi })|< \epsilon , \end{aligned}$$

for all $\varvec{\theta }\in \varvec{\Theta }_{n}$, where $\varvec{\theta }^{(k)}=(\varvec{\beta }^{(k)}, F^{(k)})\in \varvec{\Theta }_{n}$ for $k=1,\ldots ,\kappa $. If no such $\kappa $ exists, we define $N(\epsilon ,\mathcal {L}_{n},L_1(P_n))=\infty $.
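
To make the definition concrete, the following minimal Python sketch computes an upper bound on an empirical $L_1(P_n)$ covering number by greedy selection of centres. The helper `greedy_l1_cover` and the toy class $f_\theta (x)=\theta x$ are hypothetical and chosen only for illustration; they are not the likelihood class $\mathcal {L}_{n}$ of the paper, and a greedy cover only bounds the covering number from above.

```python
import numpy as np

def greedy_l1_cover(values, eps):
    """Greedy upper bound on the L1(P_n) covering number.

    values[j, i] holds f_j(X_i) for function j and observation i.
    Returns indices of the chosen centres; every row lies strictly
    within eps of some centre in empirical L1 distance.
    """
    values = np.asarray(values, dtype=float)
    uncovered = np.ones(values.shape[0], dtype=bool)
    centres = []
    while uncovered.any():
        j = int(np.argmax(uncovered))          # first function not yet covered
        centres.append(j)
        dist = np.mean(np.abs(values - values[j]), axis=1)
        uncovered &= dist >= eps               # rows within eps of centre j are now covered
    return centres

# Toy class (an assumption for illustration): f_theta(x) = theta * x, theta in [0, 1].
rng = np.random.default_rng(0)
x = rng.uniform(0.0, 1.0, size=50)
thetas = np.linspace(0.0, 1.0, 201)
values = thetas[:, None] * x[None, :]
cover = greedy_l1_cover(values, eps=0.05)
```

Here `len(cover)` plays the role of $\kappa $ in the definition, with the centres restricted to members of the class, as required above.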

Lemma 1

Under conditions (C1)–(C3), the covering number of the function class $\mathcal {L}_{n}$ satisfies

$$\begin{aligned} N(\epsilon ,\mathcal {L}_{n},L_1(P_n))\le \widetilde{C}M_{n}^{(m+1)}\epsilon ^{-(m+p+1)}, \end{aligned}$$

where $\widetilde{C}$ is a constant, $m=o(n^{\nu })$ with $0<\nu <1$ and the size of the sieve space $\varvec{\Theta }_{n}$ is controlled by $M_{n}=O(n^{c})$ with a constant $c \in (0,\infty )$.

Proof of Lemma 1

For any $\varvec{\theta }^{1}=(\varvec{\beta }^{1},F^{1}), \varvec{\theta }^{2}=(\varvec{\beta }^{2},F^{2})\in \varvec{\Theta }_{n}$, under conditions (C1)–(C3), there exists a large enough constant $\widetilde{C}$ such that

$$\begin{aligned} \mid l_{K}^{w}(\varvec{\theta }^{1};\varvec{W}^{\xi }) - l_{K}^{w}(\varvec{\theta }^{2};\varvec{W}^{\xi })\mid \le \widetilde{C} (\Vert \varvec{\beta }^{1}-\varvec{\beta }^{2}\Vert +\Vert F^{1}-F^{2}\Vert _{\infty }), \end{aligned}$$

(7)

where $\Vert g\Vert _{\infty }=\sup _t|g(t)|$ for a function g. Denote $\varvec{\gamma }^{j}=(\gamma _{0,j},\ldots ,\gamma _{m,j})^{T}$ as the Bernstein coefficients vector corresponding to $F^{j},j=1,2$. Then, we obtain that

$$\begin{aligned} \Vert F^{1}-F^{2}\Vert _{\infty } ={}&\sup _{t} \Bigg | \sum _{k=0}^{m}\gamma _{k,1}B_{k}(t,m,\tau )- \sum _{k=0}^{m}\gamma _{k,2}B_{k}(t,m,\tau ) \Bigg | \nonumber \\ \le {}&\max _{0\le k\le m}\mid \gamma _{k,1}- \gamma _{k,2}\mid \nonumber \\ := {}&\Vert \varvec{\gamma }^{1} -\varvec{\gamma }^{2}\Vert _{\infty }. \end{aligned}$$

(8)
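
The inequality in (8) rests on the Bernstein basis being nonnegative and summing to one. The sketch below checks this numerically on a grid, assuming the usual form $B_{k}(t,m,\tau )=\binom{m}{k}(t/\tau )^{k}(1-t/\tau )^{m-k}$, which is not restated in this Appendix; the coefficient vectors are arbitrary illustrative draws.

```python
import numpy as np
from math import comb

def bernstein_basis(t, m, tau):
    """Return the (m+1) Bernstein basis values B_k(t, m, tau), k = 0..m.

    Assumed form: B_k(t, m, tau) = C(m, k) (t/tau)^k (1 - t/tau)^(m-k).
    """
    u = np.asarray(t, dtype=float) / tau
    return np.stack([comb(m, k) * u**k * (1.0 - u)**(m - k) for k in range(m + 1)])

m, tau = 6, 2.0
t = np.linspace(0.0, tau, 1001)
B = bernstein_basis(t, m, tau)                      # shape (m+1, len(t))

rng = np.random.default_rng(1)
gamma1 = rng.normal(size=m + 1)                     # illustrative coefficients only
gamma2 = rng.normal(size=m + 1)

sup_diff = np.max(np.abs((gamma1 - gamma2) @ B))    # sup_t |F1(t) - F2(t)| on the grid
coef_diff = np.max(np.abs(gamma1 - gamma2))         # max_k |gamma_{k,1} - gamma_{k,2}|
```

Under the assumed basis, `sup_diff` never exceeds `coef_diff`, because each value of $F^{1}-F^{2}$ is a convex combination of the coefficient differences.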

By plugging (8) into (7), it is easy to show

$$\begin{aligned} \mid l_{K}^{w}(\varvec{\theta }^{1};\varvec{W}^{\xi }) - l_{K}^{w}(\varvec{\theta }^{2};\varvec{W}^{\xi })\mid \le \widetilde{C} (\Vert \varvec{\beta }^{1}-\varvec{\beta }^{2}\Vert +\Vert \varvec{\gamma }^{1} -\varvec{\gamma }^{2}\Vert _{\infty }). \end{aligned}$$

Thus, for any $\varvec{\theta }\in \varvec{\Theta }_{n}$,

$$\begin{aligned} \frac{1}{n}\sum ^{n}_{i=1}\mid l_{K}^{w}(\varvec{\theta };\varvec{W}^{\xi }_i) - l_{K}^{w}(\varvec{\theta }^{(k)};\varvec{W}^{\xi }_i)\mid \le \widetilde{C}(\Vert \varvec{\beta }-\varvec{\beta }^{(k)}\Vert +\Vert \varvec{\gamma }-\varvec{\gamma }^{(k)}\Vert _{\infty }), \end{aligned}$$

where $k=1,\ldots ,\kappa $. According to Lemma 2.5 of van de Geer (2000), we can show that $\{\varvec{\beta }\in \mathbb {R}^{p}: \Vert \varvec{\beta }\Vert \le M\}$ is covered by $(10\widetilde{C}M/\epsilon )^{p}$ balls with radius $\epsilon /(2\widetilde{C})$ and $\{\varvec{\gamma }\in \mathbb {R}^{m+1}: \sum _{k=0}^{m}| \gamma _{k}|\le M_{n}\}$ is covered by $(10\widetilde{C}M_{n}/\epsilon )^{m+1}$ balls with radius $\epsilon /(2\widetilde{C})$. As a consequence, the covering number of the function class $\mathcal {L}_{n}$ satisfies

$$\begin{aligned} N(\epsilon ,\mathcal {L}_{n},L_1(P_n))\le \left( \frac{10\widetilde{C}M}{\epsilon } \right) ^{p}\cdot \left( \frac{10\widetilde{C}M_{n}}{\epsilon } \right) ^{m+1} \le \widetilde{C}M_{n}^{(m+1)}\epsilon ^{-(m+p+1)}. \end{aligned}$$

This completes the proof of Lemma 1. $\square $

Lemma 2

Under conditions (C1)–(C3), we have

$$\begin{aligned} \sup _{\varvec{\theta }\in \varvec{\Theta }_{n}}|P_{n}l_{K}^{w}(\varvec{\theta };\varvec{W}^{\xi })-Pl_{K}^{w}(\varvec{\theta };\varvec{W}^{\xi })|\rightarrow 0, \end{aligned}$$

almost surely.

Proof of Lemma 2

Under conditions (C1)–(C3), we have the uniform bound of $l_{K}^{w}(\varvec{\theta };\varvec{W}^{\xi })$. Without loss of generality, we assume $\sup _{\varvec{\theta }\in \varvec{\Theta }}|l_{K}^{w}(\varvec{\theta };\varvec{W}^{\xi })|\le 1$. Then we have $P(l_{K}^{w}(\varvec{\theta };\varvec{W}^{\xi }))^{2}\le P(\sup _{\varvec{\theta }\in \varvec{\Theta }}|l_{K}^{w}(\varvec{\theta };\varvec{W}^{\xi })|)^{2}\le 1$. Let $\alpha _{n}= n^{-1/2+\iota }(\log n)^{1/2}$ with $\nu /2<\iota <1/2$. It is easy to see that $\{\alpha _{n}\}$ is a nonincreasing sequence of positive numbers. Let $\epsilon _{n}=\epsilon \alpha _{n}$ with $\epsilon >0$. For any $\varvec{\theta }\in \varvec{\Theta }_{n}$ and sufficiently large n, we then have

$$\begin{aligned} \textrm{var} (P_{n}l_{K}^{w}(\varvec{\theta };\varvec{W}^{\xi }) )/(4\epsilon _{n})^{2}\le \frac{(1/n)P(l_{K}^{w}(\varvec{\theta };\varvec{W}^{\xi }))^{2}}{16\epsilon ^{2}\alpha _{n}^{2}} \le \frac{1}{16\epsilon ^{2}n^{2\iota }\log n}\le \frac{1}{2}. \end{aligned}$$

(9)

Denote the observations as $\mathcal {W}^{\xi }=\{\varvec{W}^{\xi }_1, \ldots , \varvec{W}^{\xi }_n\}$. Following Pollard (1984), we denote by $P^{o}_{n}$ the signed measure that places mass $\pm n^{-1}$ at each of the observations in $\mathcal {W}^{\xi }$. According to Pollard (1984, p. 31) and formula (9), we have the following inequality

$$\begin{aligned} P(\sup _{\varvec{\theta }\in \varvec{\Theta }_{n}}|P_{n}l_{K}^{w}(\varvec{\theta };\varvec{W}^{\xi })-Pl_{K}^{w}(\varvec{\theta };\varvec{W}^{\xi })|>8\epsilon _{n}) \le 4P (\sup _{\varvec{\theta }\in \varvec{\Theta }_{n}} |P^{o}_{n}l_{K}^{w}(\varvec{\theta };\varvec{W}^{\xi }) |>2\epsilon _{n}). \end{aligned}$$

Given $\mathcal {W}^{\xi }$, by the definition of the covering number, we can choose $\varvec{\theta }^{(1)},\ldots ,\varvec{\theta }^{(\kappa )}$, where $\kappa =N(\epsilon _{n}/2,\mathcal {L}_n,L_{1}(P_{n}))$, such that

$$\begin{aligned} \min _{k \in \{1,\ldots ,\kappa \}}P_{n}|l_{K}^{w}(\varvec{\theta };\varvec{W}^{\xi })-l_{K}^{w}(\varvec{\theta }^{(k)};\varvec{W}^{\xi })|< \epsilon _{n}/2, \end{aligned}$$

for all $\varvec{\theta }\in \varvec{\Theta }_{n}$. For each $\varvec{\theta }\in \varvec{\Theta }_{n}$, write $\varvec{\theta }^{*}$ for the $\varvec{\theta }^{(k)}$ at which the minimum is achieved. By some simple calculations, we have the following inequality

$$\begin{aligned} |P_{n}^{o}l_{K}^{w}(\varvec{\theta };\varvec{W}^{\xi })-P_{n}^{o}l_{K}^{w}(\varvec{\theta }^{*};\varvec{W}^{\xi }) | \le P_{n}|l_{K}^{w}(\varvec{\theta };\varvec{W}^{\xi })-l_{K}^{w}(\varvec{\theta }^{*};\varvec{W}^{\xi })|. \end{aligned}$$

(10)

According to the formula (10), we obtain

$$\begin{aligned} {}&P (\sup _{\varvec{\theta }\in \varvec{\Theta }_{n}} |P^{o}_{n}l_{K}^{w}(\varvec{\theta };\varvec{W}^{\xi }) |>2\epsilon _{n}|\mathcal {W}^{\xi } )\nonumber \\ {}&\quad \le P (\sup _{\varvec{\theta }\in \varvec{\Theta }_{n}} \{ |P^{o}_{n}l_{K}^{w}(\varvec{\theta }^{*};\varvec{W}^{\xi }) |+ P_{n}|l_{K}^{w}(\varvec{\theta };\varvec{W}^{\xi })-l_{K}^{w}(\varvec{\theta }^{*};\varvec{W}^{\xi })| \}>2\epsilon _{n}|\mathcal {W}^{\xi } ) \nonumber \\ {}&\quad \le P (\max _{k\in \{1,\ldots ,\kappa \}} |P^{o}_{n}l_{K}^{w}(\varvec{\theta }^{(k)};\varvec{W}^{\xi }) |>3\epsilon _{n}/2|\mathcal {W}^{\xi } )\nonumber \\ {}&\quad \le N(\epsilon _{n}/2,\mathcal {L}_{n},L_{1}(P_n)) \max _{k\in \{1,\ldots ,\kappa \}}P ( |P^{o}_{n}l_{K}^{w}(\varvec{\theta }^{(k)};\varvec{W}^{\xi }) |>3\epsilon _{n}/2|\mathcal {W}^{\xi }). \end{aligned}$$

(11)

By the definition of the covering number $N(\epsilon _{n}/2,\mathcal {L}_{n},L_{1}(P_n))$, for each $\varvec{\theta }^{(k)}$, there exists $\tilde{\varvec{\theta }}^{(k)} \in \varvec{\Theta }_{n}$ such that $P_{n}|l_{K}^{w}(\varvec{\theta }^{(k)};\varvec{W}^{\xi })-l_{K}^{w}(\tilde{\varvec{\theta }}^{(k)};\varvec{W}^{\xi })|<\epsilon _{n}/2$. Then we have

$$\begin{aligned} P ( |P^{o}_{n}l_{K}^{w}(\varvec{\theta }^{(k)};\varvec{W}^{\xi }) |>3\epsilon _{n}/2|\mathcal {W}^{\xi } ) \le {}&P ( (P_{n} |l_{K}^{w}(\varvec{\theta }^{(k)};\varvec{W}^{\xi })-l_{K}^{w}(\tilde{\varvec{\theta }}^{(k)};\varvec{W}^{\xi })| \\ {}&+|P^{o}_{n}l_{K}^{w}(\tilde{\varvec{\theta }}^{(k)};\varvec{W}^{\xi }) | )>3\epsilon _{n}/2|\mathcal {W}^{\xi } ) \\ \le {}&P ( |P^{o}_{n}l_{K}^{w}(\tilde{\varvec{\theta }}^{(k)};\varvec{W}^{\xi }) | >\epsilon _{n}|\mathcal {W}^{\xi }). \end{aligned}$$

Combining Lemma 2.2.7 of van der Vaart and Wellner (1996) with $|l_{K}^{w}(\tilde{\varvec{\theta }}^{(k)};\varvec{W}^{\xi })|\le 1$, we then have

$$\begin{aligned} P ( |P^{o}_{n}l_{K}^{w}(\tilde{\varvec{\theta }}^{(k)};\varvec{W}^{\xi }) | >\epsilon _{n}|\mathcal {W}^{\xi } ) \le 2\exp (-n\epsilon _{n}^2/2). \end{aligned}$$

(12)
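
As a sanity check on (12), the following sketch compares a Monte Carlo estimate of the conditional tail probability of a randomly signed average with the bound $2\exp (-n\epsilon ^{2}/2)$. The fixed values `a` stand in for $l_{K}^{w}(\tilde{\varvec{\theta }}^{(k)};\varvec{W}_{i}^{\xi })$ and are simulated here purely as an assumption; only the bound $|a_{i}|\le 1$ matters.

```python
import numpy as np

def symmetrized_tail(a, eps, n_rep=5000, seed=0):
    """Monte Carlo estimate of P(|n^{-1} sum_i s_i a_i| > eps) for random signs s_i.

    `a` plays the role of the fixed (conditioned-on) values, with |a_i| <= 1.
    """
    rng = np.random.default_rng(seed)
    a = np.asarray(a, dtype=float)
    signs = rng.choice([-1.0, 1.0], size=(n_rep, a.size))
    return float(np.mean(np.abs(signs @ a) / a.size > eps))

n, eps = 200, 0.1
a = np.random.default_rng(42).uniform(-1.0, 1.0, size=n)   # toy bounded values
empirical = symmetrized_tail(a, eps)
hoeffding = 2.0 * np.exp(-n * eps**2 / 2.0)                 # right-hand side of (12)
```

The simulated frequency `empirical` should fall below `hoeffding`; the bound is typically far from tight, which is harmless for the argument since only its exponential decay in $n\epsilon _{n}^{2}$ is used.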

By Lemma 1 and formulas (11) and (12), we have

$$\begin{aligned} P (\sup _{\varvec{\theta }\in \varvec{\Theta }_{n}} |P^{o}_{n}l_{K}^{w}(\varvec{\theta };\varvec{W}^{\xi }) |>2\epsilon _{n}|\mathcal {W}^{\xi } ) {}&\quad \le 2N(\epsilon _{n}/2,\mathcal {L}_{n},L_{1}(P_n))\exp (-n\epsilon _{n}^2/2)\\ {}&\quad \le 2\widetilde{C}M_n^{(m+1)}(\epsilon _{n}/2)^{-(p+m+1)}\exp (-n\epsilon _{n}^2/2). \end{aligned}$$

By taking expectations over $\mathcal {W}^{\xi }$, we obtain

$$\begin{aligned} P (\sup _{\varvec{\theta }\in \varvec{\Theta }_{n}} |P^{o}_{n}l_{K}^{w}(\varvec{\theta };\varvec{W}^{\xi }) |>2\epsilon _{n} ) {}&\le 2\widetilde{C}M_n^{(m+1)}(\epsilon _{n}/2)^{-(p+m+1)}\exp (-n\epsilon _{n}^2/2). \end{aligned}$$

According to $M_n=O(n^c)$, $m=o(n^\nu )$ and $\iota > \nu /2$, we can show that

$$\begin{aligned} {}&P (\sup _{\varvec{\theta }\in \varvec{\Theta }_{n}} |P_{n}l_{K}^{w}(\varvec{\theta };\varvec{W}^{\xi })-Pl_{K}^{w}(\varvec{\theta };\varvec{W}^{\xi }) |>8\epsilon _{n} ) \le 4P (\sup _{\varvec{\theta }\in \varvec{\Theta }_{n}} |P^{o}_{n}l_{K}^{w}(\varvec{\theta };\varvec{W}^{\xi }) |>2\epsilon _{n} )\\ {}&\quad \le 8\widetilde{C}M_n^{(m+1)}(\epsilon _{n}/2)^{-(p+m+1)}\exp (-n\epsilon _{n}^2/2)\\ {}&\quad \le 8\widetilde{C}\exp \{(m+1)c\log n-(p+m+1)[\log (\epsilon n^{-1/2+\iota }(\log n)^{1/2})-\log 2]\\ {}&\qquad -n\epsilon ^2n^{-1+2\iota }\log n/2\}\\ {}&\quad \le 8\widetilde{C}\exp \{(p+m+1)[(c+1/2-\iota )\log n-\log \log n/2 -\log \epsilon +\log 2 ]\\ {}&\qquad -\epsilon ^2n^{2\iota }\log n/2\}\\ {}&\quad \le 8\widetilde{C}\exp (-\widetilde{C}n^{2\iota } \log n). \end{aligned}$$

Hence, we obtain that

$$\begin{aligned} \sum _{n=1}^{\infty } P(\sup _{\varvec{\theta }\in \varvec{\Theta }_{n}}|P_{n}l_{K}^{w}(\varvec{\theta };\varvec{W}^{\xi })-Pl_{K}^{w}(\varvec{\theta };\varvec{W}^{\xi })|>8\epsilon _{n}) <\infty . \end{aligned}$$

By the Borel–Cantelli lemma, we have $\sup _{\varvec{\theta }\in \varvec{\Theta }_{n}}|P_{n}l_{K}^{w}(\varvec{\theta };\varvec{W}^{\xi })-Pl_{K}^{w}(\varvec{\theta };\varvec{W}^{\xi })|\rightarrow 0$ almost surely. $\square $
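
The summability used in the Borel–Cantelli step can be illustrated numerically. The sketch below evaluates the logarithm of $M_n^{(m+1)}(\epsilon _{n}/2)^{-(p+m+1)}\exp (-n\epsilon _{n}^2/2)$ with constants dropped. The specific choices $M_n=n^{c}$, $m=\lfloor n^{0.8\nu }\rfloor $, $p=2$, $\nu =0.25$ and $\iota =0.4$ are assumptions made only so that $m=o(n^{\nu })$ and $\nu /2<\iota <1/2$ hold.

```python
import math

def log_tail_bound(n, eps=1.0, c=1.0, p=2, nu=0.25, iota=0.4, shrink=0.8):
    """Log of M_n^(m+1) (eps_n/2)^-(p+m+1) exp(-n eps_n^2 / 2), constants dropped.

    Assumptions for illustration: M_n = n^c and m = floor(n^(shrink*nu)),
    so that m = o(n^nu); eps_n = eps * n^(-1/2+iota) * sqrt(log n).
    """
    assert nu / 2 < iota < 0.5
    m = math.floor(n ** (shrink * nu))
    eps_n = eps * n ** (-0.5 + iota) * math.sqrt(math.log(n))
    return ((m + 1) * c * math.log(n)
            - (p + m + 1) * math.log(eps_n / 2.0)
            - n * eps_n**2 / 2.0)

log_bounds = {n: log_tail_bound(n) for n in (10**2, 10**3, 10**4, 10**6)}
```

Under these assumed values the log-bound is negative and decreases rapidly in $n$, in line with the final estimate $\exp (-\widetilde{C}n^{2\iota }\log n)$, whose terms are summable.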

Proof of Theorem 1

According to Lemmas 1 and 2, we can show that

$$\begin{aligned} N(\epsilon ,\mathcal {L}_{n},L_1(P_n))\le \widetilde{C}M_{n}^{(m+1)}\epsilon ^{-(m+p+1)}, \end{aligned}$$

and

$$\begin{aligned} \sup _{\varvec{\theta }\in \varvec{\Theta }_{n}}|P_{n}l_{K}^{w}(\varvec{\theta };\varvec{W}^{\xi })-Pl_{K}^{w}(\varvec{\theta };\varvec{W}^{\xi })|\rightarrow 0, \end{aligned}$$

(13)

almost surely as $n\rightarrow \infty $. Define $\varvec{\Theta }_{\epsilon }=\{\varvec{\theta }:\textrm{d}(\varvec{\theta },\varvec{\theta }_0)\ge \epsilon ,\varvec{\theta }\in \varvec{\Theta }_n \}$ for $\epsilon > 0$,

$$\begin{aligned} {}&\zeta _{1n}=\sup _{\varvec{\theta }\in \varvec{\Theta }_n}|P_nM_{K}(\varvec{\theta };\varvec{W}^{\xi })-PM_{K}(\varvec{\theta };\varvec{W}^{\xi })|, \end{aligned}$$

and

$$\begin{aligned} {}&\zeta _{2n}=P_nM_{K}(\varvec{\theta }_0;\varvec{W}^{\xi })-PM_{K}(\varvec{\theta }_0;\varvec{W}^{\xi }), \end{aligned}$$

where $M_{K}(\varvec{\theta };\varvec{W}^{\xi })=-l_{K}^w(\varvec{\theta };\varvec{W}^{\xi })$. We have

$$\begin{aligned} \inf _{\varvec{\Theta }_{\epsilon }}PM_{K}(\varvec{\theta };\varvec{W}^{\xi })={}&\inf _{\varvec{\Theta }_{\epsilon }}\{PM_{K}(\varvec{\theta };\varvec{W}^{\xi })-P_nM_{K}(\varvec{\theta };\varvec{W}^{\xi }) +P_nM_{K}(\varvec{\theta };\varvec{W}^{\xi })\}\\ \le {}&\zeta _{1n}+ \inf _{\varvec{\Theta }_{\epsilon }}P_nM_{K}(\varvec{\theta };\varvec{W}^{\xi }). \end{aligned}$$

If $\hat{\varvec{\theta }}_n\in \varvec{\Theta }_{\epsilon }$, we then have

$$\begin{aligned} \inf _{\varvec{\Theta }_{\epsilon }}P_nM_{K}(\varvec{\theta };\varvec{W}^{\xi })=P_nM_{K}(\hat{\varvec{\theta }}_n;\varvec{W}^{\xi }) {}&\le P_nM_{K}(\varvec{\theta }_0;\varvec{W}^{\xi })=\zeta _{2n}+PM_{K}(\varvec{\theta }_0;\varvec{W}^{\xi }). \end{aligned}$$

By the identifiability of model (1), we obtain that $\delta _\epsilon =\inf _{\varvec{\Theta }_{\epsilon }}PM_{K}(\varvec{\theta };\varvec{W}^{\xi })-PM_{K}(\varvec{\theta }_0;\varvec{W}^{\xi })>0$. By conditions (C1)–(C3) and the two preceding displays, it is easy to show

$$\begin{aligned} \inf _{\varvec{\Theta }_{\epsilon }}PM_{K}(\varvec{\theta };\varvec{W}^{\xi })\le \zeta _{1n}+\zeta _{2n}+PM_{K}(\varvec{\theta }_0;\varvec{W}^{\xi })=\zeta _{n}+PM_{K}(\varvec{\theta }_0;\varvec{W}^{\xi }), \end{aligned}$$

with $\zeta _{n}=\zeta _{1n}+\zeta _{2n}$. Hence, on the event $\{\hat{\varvec{\theta }}_n\in \varvec{\Theta }_{\epsilon }\}$ we have $\zeta _{n}\ge \delta _\epsilon $, that is, $\{\textrm{d}(\hat{\varvec{\theta }}_n,\varvec{\theta }_0)\ge \epsilon \}\subseteq \{\zeta _n\ge \delta _\epsilon \}$. According to formula (13) and the strong law of large numbers, both $\zeta _{1n}\rightarrow 0$ and $\zeta _{2n}\rightarrow 0$ almost surely as $n \rightarrow \infty $. Thus, $\bigcap ^{\infty }_{k=1}\bigcup ^{\infty }_{n=k} \{\textrm{d}(\hat{\varvec{\theta }}_n,\varvec{\theta }_0)\ge \epsilon \} \subseteq \bigcap ^{\infty }_{k=1}\bigcup ^{\infty }_{n=k} \{\zeta _n\ge \delta _\epsilon \}$, and the latter event has probability zero, which proves that $\textrm{d}(\hat{\varvec{\theta }}_n,\varvec{\theta }_0)\rightarrow 0$ almost surely. It is easy to see that $\Vert \hat{\varvec{\beta }}_n- \varvec{\beta }_0\Vert \le \textrm{d}(\hat{\varvec{\theta }}_n,\varvec{\theta }_0)$ and $\Vert \hat{F}_n- F_0\Vert _2 \le \textrm{d}(\hat{\varvec{\theta }}_n,\varvec{\theta }_0)$. Therefore, we have $\Vert \hat{\varvec{\beta }}_n- \varvec{\beta }_0\Vert \rightarrow 0$ and $\Vert \hat{F}_n- F_0\Vert _2 \rightarrow 0$ almost surely. This completes the proof of Theorem 1. $\square $

Proof of Theorem 2

Now we will obtain the convergence rate of $\hat{\varvec{\theta }}_n$. By Theorem 1.6.2 of Lorentz (1986), there exists a Bernstein polynomial $F_{n0}$ which satisfies $\Vert F_{n0}-F_0\Vert _\infty =O(m^{-r/2})$. For convenience, we define $\varvec{\theta }_{n0}=(\varvec{\beta }_0,F_{n0})$. It is easy to see that $\textrm{d}(\varvec{\theta }_{n0},\varvec{\theta }_0)=O(n^{-r\nu /2})$. For any $\eta >0$, we define the class of functions

$$\begin{aligned} \mathcal {F}_{\eta }=\{l_{K}^{w}(\varvec{\theta };\varvec{W}^{\xi })-l_{K}^{w}(\varvec{\theta }_{n0};\varvec{W}^{\xi }):\varvec{\theta }\in \varvec{\Theta }_{n}, \eta /2<\textrm{d}(\varvec{\theta },\varvec{\theta }_{n0})\le \eta \}, \end{aligned}$$

for a given observation $\varvec{W}^{\xi }$. One can easily obtain that $P(l_{K}^{w}(\varvec{\theta }_{0};\varvec{W}^{\xi })-l_{K}^{w}(\varvec{\theta }_{n0};\varvec{W}^{\xi }))\le \widetilde{C}\textrm{d}(\varvec{\theta }_{0},\varvec{\theta }_{n0})\le \widetilde{C}n^{-r\nu /2}$. Using the relationship between Hellinger distance and Kullback–Leibler information, we then have

$$\begin{aligned} P(l_{K}^{w}(\varvec{\theta }_{0};\varvec{W}^{\xi })-l_{K}^{w}(\varvec{\theta };\varvec{W}^{\xi }))\ge \widetilde{C}\textrm{d}^2(\varvec{\theta }_0,\varvec{\theta }). \end{aligned}$$

Therefore, for sufficiently large n, it yields that

$$\begin{aligned} P(l_{K}^{w}(\varvec{\theta };\varvec{W}^{\xi })-l_{K}^{w}(\varvec{\theta }_{n0};\varvec{W}^{\xi })) ={}&P(l_{K}^{w}(\varvec{\theta };\varvec{W}^{\xi })- l_{K}^{w}(\varvec{\theta }_0;\varvec{W}^{\xi }))+P(l_{K}^{w}(\varvec{\theta }_0;\varvec{W}^{\xi }) \\ {}&-l_{K}^{w}(\varvec{\theta }_{n0};\varvec{W}^{\xi })) \\ \le {}&-\widetilde{C}\eta ^2+\widetilde{C}n^{-r\nu /2}\\ ={}&-\widetilde{C}\eta ^2, \end{aligned}$$

for any $l_{K}^{w}(\varvec{\theta };\varvec{W}^{\xi })-l_{K}^{w}(\varvec{\theta }_{n0};\varvec{W}^{\xi }) \in \mathcal {F}_{\eta }$.

Note that $\mathcal {F}_{\eta }$ is uniformly bounded under conditions (C1)–(C3). Furthermore, by some algebraic manipulations, we obtain $P(l_{K}^{w}(\varvec{\theta };\varvec{W}^{\xi })-l_{K}^{w}(\varvec{\theta }_{n0};\varvec{W}^{\xi }))^2\le \widetilde{C}\eta ^2$ for any $l_{K}^{w}(\varvec{\theta };\varvec{W}^{\xi })-l_{K}^{w}(\varvec{\theta }_{n0};\varvec{W}^{\xi }) \in \mathcal {F}_{\eta }$. By Lemma 3.4.2 of van der Vaart and Wellner (1996), we can prove that

$$\begin{aligned} E_P \Vert n^{1/2}(P_n-P) \Vert _{\mathcal {F}_{\eta }} \le \widetilde{C}J_{[]}\{\eta ,\mathcal {F}_{\eta },L_2(P)\} \left[ 1+\frac{J_{[]}\{\eta ,\mathcal {F}_{\eta },L_2(P)\}}{\eta ^2n^{1/2}} \right] , \end{aligned}$$

where $J_{[]}\{\eta ,\mathcal {F}_{\eta },L_2(P)\}=\int ^{\eta }_0 [1+\log N_{[]}\{\epsilon ,\mathcal {F}_{\eta },L_2(P)\}]^{1/2}\textrm{d}\epsilon $. For $0<\epsilon <\eta $, by Shen and Wong (1994, p. 597), we have $\log N_{[]}\{\epsilon ,\mathcal {F}_{\eta },L_2(P)\}\le \widetilde{C}N\log (\eta /\epsilon )$ with $N=m+1$. Then we can show that $J_{[]}\{\eta ,\mathcal {F}_{\eta },L_2(P)\} \le \widetilde{C}N^{1/2}\eta $. This yields $\varphi _n(\eta )=N^{1/2}\eta +N/n^{1/2}$. One can easily show that $\varphi _n(\eta )/\eta $ is monotonically decreasing with respect to $\eta $ and $r_n^2\varphi _n(1/r_n)=r_nN^{1/2}+r_n^2N/n^{1/2}\le \widetilde{C}n^{1/2}$, where $r_n=N^{-1/2}n^{1/2}=n^{(1-\nu )/2}$.
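
The arithmetic behind the choice of $r_n$ can be verified directly: with $r_n=N^{-1/2}n^{1/2}$, both terms of $r_n^2\varphi _n(1/r_n)$ equal $n^{1/2}$. The sketch below checks this numerically, treating $N=n^{\nu }$ as a real number (an assumption made for convenience; in the proof $N=m+1$ is an integer of that order).

```python
import math

def phi_n(eta, N, n):
    """phi_n(eta) = N^{1/2} eta + N / n^{1/2}, as in the proof."""
    return math.sqrt(N) * eta + N / math.sqrt(n)

def rate_check(n, nu):
    """Return r_n^2 * phi_n(1/r_n) / n^{1/2} with N = n^nu treated as real-valued."""
    N = n ** nu                       # assumption: N = m + 1 grows like n^nu
    r_n = math.sqrt(n / N)            # r_n = N^{-1/2} n^{1/2} = n^{(1-nu)/2}
    return r_n**2 * phi_n(1.0 / r_n, N, n) / math.sqrt(n)

ratios = [rate_check(n, nu) for n in (10**2, 10**4, 10**6) for nu in (0.2, 0.5)]
```

Each entry of `ratios` equals 2 up to rounding, so the constant $\widetilde{C}$ in $r_n^2\varphi _n(1/r_n)\le \widetilde{C}n^{1/2}$ can be taken as 2 in this simplified setting.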

Note that $P_n(l_{K}^{w}(\hat{\varvec{\theta }}_n;\varvec{W}^{\xi })-l_{K}^{w}(\varvec{\theta }_{n0};\varvec{W}^{\xi }))\ge 0$ and $\textrm{d}(\hat{\varvec{\theta }}_n,\varvec{\theta }_{n0})\le \textrm{d}(\hat{\varvec{\theta }}_n,\varvec{\theta }_{0})+\textrm{d}(\varvec{\theta }_0,\varvec{\theta }_{n0})\rightarrow 0$ in probability. Therefore, by Theorem 3.4.1 of van der Vaart and Wellner (1996), we have $n^{(1-\nu )/2}\textrm{d}(\hat{\varvec{\theta }}_n,\varvec{\theta }_{n0})=O_p(1)$. Together with $\textrm{d}(\varvec{\theta }_{n0},\varvec{\theta }_{0})=O(n^{-r\nu /2})$, it is easy to show that $\textrm{d}(\hat{\varvec{\theta }}_n,\varvec{\theta }_{0})=O_p(n^{-(1-\nu )/2}+n^{-r\nu /2})$. Since $\Vert \hat{\varvec{\beta }}_n- \varvec{\beta }_0\Vert \le \textrm{d}(\hat{\varvec{\theta }}_n,\varvec{\theta }_0)$ and $\Vert \hat{F}_n- F_0\Vert _2 \le \textrm{d}(\hat{\varvec{\theta }}_n,\varvec{\theta }_0)$, we have $\Vert \hat{\varvec{\beta }}_n-\varvec{\beta }_0\Vert = O_p(n^{-\min (r\nu /2,(1-\nu )/2)})$ and $\Vert \hat{F}_n-F_0\Vert _2 = O_p(n^{-\min (r\nu /2,(1-\nu )/2)})$. The proof of Theorem 2 is completed. $\square $
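
The rate exponent $\min (r\nu /2,(1-\nu )/2)$ trades off approximation error against estimation error. As a small worked illustration, the sketch below finds the value of $\nu $ at which the two terms coincide, namely $\nu =1/(1+r)$, giving the exponent $r/\{2(1+r)\}$. Whether this value of $\nu $ is admissible depends on the regularity conditions of the paper, which are not restated in this Appendix, so the calculation is only indicative.

```python
def rate_exponent(r, nu):
    """Exponent a in the rate n^{-a}, with a = min(r*nu/2, (1-nu)/2)."""
    return min(r * nu / 2.0, (1.0 - nu) / 2.0)

def balancing_nu(r):
    """Value of nu that equates r*nu/2 and (1-nu)/2, namely 1/(1+r)."""
    return 1.0 / (1.0 + r)

# Print (r, balancing nu, resulting exponent) for a few smoothness levels r.
for r in (1, 2, 4):
    nu_star = balancing_nu(r)
    print(r, nu_star, rate_exponent(r, nu_star))
```

As $r$ grows, the balanced exponent $r/\{2(1+r)\}$ approaches $1/2$ from below.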

Proof of Theorem 3

First, we introduce some notation. The linear span of $\varvec{\Theta }-\varvec{\theta }_0$ is denoted by $\mathcal {V}$, where $\varvec{\Theta }$ is the parameter space and $\varvec{\theta }_0$ is the true value of $\varvec{\theta }$. For convenience, we write $\varrho _n=n^{-\min \{r\nu /2,(1-\nu )/2\}}$. Following the arguments of van der Vaart (1998, p. 296), for any $\varvec{\theta }\in \{\varvec{\theta }\in \varvec{\Theta }:\Vert \varvec{\theta }-\varvec{\theta }_0\Vert =O(\varrho _n)\}$, the first order directional derivative of $l_{K}^w(\varvec{\theta };\varvec{W}^\xi )$ in the direction $\psi \in \mathcal {V}$ is defined as

$$\begin{aligned} \dot{l}_{K}^w(\varvec{\theta };\varvec{W}^\xi )[\psi ]=\frac{\textrm{d}l_{K}^w(\varvec{\theta }+t\psi ;\varvec{W}^\xi )}{\textrm{d}t}\biggl |_{t=0}. \end{aligned}$$

The second order directional derivative of $l_{K}^w(\varvec{\theta };\varvec{W}^\xi )$ is given by

$$\begin{aligned} \ddot{l}_{K}^w(\varvec{\theta };\varvec{W}^\xi )[\psi ,\tilde{\psi }]= \frac{\textrm{d}^2l_{K}^w(\varvec{\theta }+t\psi +\tilde{t}\tilde{\psi };\varvec{W}^\xi )}{\textrm{d}\tilde{t}\textrm{d}t}\biggl |_{t=0,\tilde{t}=0} =\frac{\textrm{d}\dot{l}_{K}^w(\varvec{\theta }+\tilde{t}\tilde{\psi };\varvec{W}^\xi )}{\textrm{d}\tilde{t}}\biggl |_{\tilde{t}=0}. \end{aligned}$$

The Fisher inner product on the space $\mathcal {V}$ is defined by $\langle \psi ,\tilde{\psi } \rangle =P(\dot{l}_{K}^w(\varvec{\theta }_0;\varvec{W}^\xi )[\psi ]\dot{l}_{K}^w(\varvec{\theta }_0;\varvec{W}^\xi )[\tilde{\psi }])$. Consider a smooth functional of $\varvec{\theta }$ of the form

$$\begin{aligned} \Lambda (\varvec{\theta }) = \varvec{\mu }_1^T\varvec{\beta }+ \int _0^{\tau }\mu _2(t)F(t)dt, \end{aligned}$$

(14)

where $\Vert \varvec{\mu }_1\Vert \le 1$ and $\mu _2(t) \in \mathcal {F}$. The Fisher norm of $\psi \in \mathcal {V}$ is defined by $\Vert \psi \Vert ^{2}=\langle \psi ,\psi \rangle $. The closed linear span of $\mathcal {V}$ under the Fisher norm is denoted by $\overline{\mathcal {V}}$. It is easy to see that $(\overline{\mathcal {V}},\Vert \cdot \Vert )$ is a Hilbert space. Similar to $\dot{l}_{K}^w(\varvec{\theta };\varvec{W}^\xi )[\psi ]$, for any $\psi \in \mathcal {V}$, the first order directional derivative of $\Lambda (\varvec{\theta })$ at $\varvec{\theta }_0$ is given by

$$\begin{aligned} \dot{\Lambda }(\varvec{\theta }_0)[\psi ]=\frac{\textrm{d}\Lambda (\varvec{\theta }_0+t\psi )}{\textrm{d}t}\biggl |_{t=0}. \end{aligned}$$

Similar to Shen (1997), $\dot{\Lambda }(\varvec{\theta }_0)[\psi ]$ is linear in $\psi $ and

$$\begin{aligned} \Vert \dot{\Lambda }(\varvec{\theta }_0)\Vert =\sup _{\psi \in \overline{\mathcal {V}}:\Vert \psi \Vert >0} \frac{|\dot{\Lambda }(\varvec{\theta }_0)[\psi ]|}{\Vert \psi \Vert }<\infty . \end{aligned}$$

(15)
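Because $\Lambda $ in (14) is linear in $\varvec{\theta }=(\varvec{\beta },F)$, its directional derivative can be written out explicitly. The following sketch writes a direction as $\psi =(\psi _1,\psi _2)$, with $\psi _1$ the Euclidean component and $\psi _2$ the functional component; this component notation, and the use of $s$ as the integration variable to avoid a clash with the scalar $t$, are assumptions introduced here for illustration only.

$$\begin{aligned} \Lambda (\varvec{\theta }_0+t\psi ) ={}&\varvec{\mu }_1^T(\varvec{\beta }_0+t\psi _1)+\int _0^{\tau }\mu _2(s)\{F_0(s)+t\psi _2(s)\}\textrm{d}s \\ ={}&\Lambda (\varvec{\theta }_0)+t\left\{ \varvec{\mu }_1^T\psi _1+\int _0^{\tau }\mu _2(s)\psi _2(s)\textrm{d}s\right\} , \end{aligned}$$

so that $\dot{\Lambda }(\varvec{\theta }_0)[\psi ]=\varvec{\mu }_1^T\psi _1+\int _0^{\tau }\mu _2(s)\psi _2(s)\textrm{d}s$. In particular, the expansion has no remainder term, and $\Lambda (\varvec{\theta })-\Lambda (\varvec{\theta }_0)=\dot{\Lambda }(\varvec{\theta }_0)[\varvec{\theta }-\varvec{\theta }_0]$ holds exactly.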

For any $\psi ^*\in \varvec{\Theta }$, there exists $\psi _n^*\in \varvec{\Theta }_n$ such that $\Vert \psi _n^*-\psi ^*\Vert =O(n^{-r\nu /2})$. Then we have $\varrho _n\Vert \psi _n^*-\psi ^*\Vert =o(n^{-1/2})$ provided that $r>1$ and $r\nu >1/2$. For convenience, we define $\rho [\varvec{\theta }-\varvec{\theta }_0;\varvec{W}^{\xi }]=l_{K}^{w}(\varvec{\theta };\varvec{W}^{\xi })-l_{K}^{w}(\varvec{\theta }_0;\varvec{W}^{\xi })- \dot{l}_{K}^{w}(\varvec{\theta }_0;\varvec{W}^{\xi })[\varvec{\theta }-\varvec{\theta }_0]$ and let $\epsilon _n$ be any positive sequence satisfying $\epsilon _n=o(n^{-1/2})$. Some algebra yields

$$\begin{aligned} 0 \le {}&P_n[l_{K}^{w}(\hat{\varvec{\theta }}_n;\varvec{W}^{\xi })-l_{K}^{w}(\hat{\varvec{\theta }}_n\pm \epsilon _n\psi _n^*;\varvec{W}^{\xi })]\\ ={}&P_n\{[l_{K}^{w}(\hat{\varvec{\theta }}_n;\varvec{W}^{\xi })-l_{K}^{w}(\varvec{\theta }_0;\varvec{W}^{\xi })] - [l_{K}^{w}(\hat{\varvec{\theta }}_n\pm \epsilon _n\psi _n^*;\varvec{W}^{\xi })-l_{K}^{w}(\varvec{\theta }_0;\varvec{W}^{\xi })]\}\\ ={}&\mp \epsilon _n P_n\dot{l}_{K}^{w}(\varvec{\theta }_0;\varvec{W}^{\xi })[\psi _n^*]+P_n\{\rho [\hat{\varvec{\theta }}_n-\varvec{\theta }_0;\varvec{W}^{\xi }]-\rho [\hat{\varvec{\theta }}_n\pm \epsilon _n\psi _n^*-\varvec{\theta }_0;\varvec{W}^{\xi }]\}\\ ={}&\mp \epsilon _n P_n\dot{l}_{K}^{w}(\varvec{\theta }_0;\varvec{W}^{\xi })[\psi ^*]\mp \epsilon _n P_n\dot{l}_{K}^{w}(\varvec{\theta }_0;\varvec{W}^{\xi })[\psi _n^*-\psi ^*] +(P_n-P)\\ {}&\cdot \{\rho [\hat{\varvec{\theta }}_n-\varvec{\theta }_0;\varvec{W}^{\xi }]-\rho [\hat{\varvec{\theta }}_n\pm \epsilon _n\psi _n^*-\varvec{\theta }_0;\varvec{W}^{\xi }]\}+P\{\rho [\hat{\varvec{\theta }}_n-\varvec{\theta }_0;\varvec{W}^{\xi }]\\ {}&-\rho [\hat{\varvec{\theta }}_n\pm \epsilon _n\psi _n^*-\varvec{\theta }_0;\varvec{W}^{\xi }]\}\\ :={}&\mp \epsilon _n P_n\dot{l}_{K}^{w}(\varvec{\theta }_0;\varvec{W}^{\xi })[\psi ^*]+ I_1 + I_2 + I_3. \end{aligned}$$

Next, we treat $I_1, I_2$ and $I_3$ in turn. For $I_1$, by Chebyshev's inequality together with $\Vert \psi _n^*-\psi ^*\Vert =O(n^{-r\nu /2})$ and $P\dot{l}_{K}^{w}(\varvec{\theta }_0;\varvec{W}^{\xi })[\psi _n^*-\psi ^*]=0$, for any $\epsilon >0$ we have

$$\begin{aligned} {}&P\left( \frac{\mid P_n\dot{l}_{K}^{w}(\varvec{\theta }_0;\varvec{W}^{\xi })[\psi _n^*-\psi ^*] \mid }{n^{-1/2}} \ge \epsilon \right) \\ {}&\quad \le \frac{\textrm{Var}(P_n\dot{l}_{K}^{w}(\varvec{\theta }_0;\varvec{W}^{\xi })[\psi _n^*-\psi ^*])}{n^{-1}\epsilon ^2} \\ {}&\quad = \frac{\textrm{Var}(\dot{l}_{K}^{w}(\varvec{\theta }_0;\varvec{W}^{\xi })[\psi _n^*-\psi ^*])}{\epsilon ^{2}} \\ {}&\quad = \frac{P(\dot{l}_{K}^{w}(\varvec{\theta }_0;\varvec{W}^{\xi })[\psi _n^*-\psi ^*]\dot{l}_{K}^{w}(\varvec{\theta }_0;\varvec{W}^{\xi })[\psi _n^*-\psi ^*])}{\epsilon ^2} \\ {}&\quad = \frac{\Vert \psi _n^*-\psi ^* \Vert ^2}{\epsilon ^2} \\ {}&\quad \rightarrow 0, \end{aligned}$$

which implies that $P_n\dot{l}_{K}^{w}(\varvec{\theta }_0;\varvec{W}^{\xi })[\psi _n^*-\psi ^*]=o_p(n^{-1/2})$ and $I_1=\epsilon _n\times o_p(n^{-1/2})$.

For $I_2$, by some algebraic calculations, we have

$$\begin{aligned} I_2 ={}&(P_n-P)(\rho [\hat{\varvec{\theta }}_n-\varvec{\theta }_0;\varvec{W}^{\xi }]-\rho [\hat{\varvec{\theta }}_n\pm \epsilon _n\psi _n^*-\varvec{\theta }_0;\varvec{W}^{\xi }])\\ ={}&(P_n-P)(\dot{l}_{K}^{w}(\tilde{\varvec{\theta }};\varvec{W}^{\xi })[\mp \epsilon _n\psi _n^*] \pm \epsilon _n\dot{l}_{K}^{w}(\varvec{\theta }_0;\varvec{W}^{\xi })[\psi _n^*])\\ ={}&\mp \epsilon _n (P_n-P)(\dot{l}_{K}^{w}(\tilde{\varvec{\theta }};\varvec{W}^{\xi })[\psi _n^*] -\dot{l}_{K}^{w}(\varvec{\theta }_0;\varvec{W}^{\xi })[\psi _n^*]), \end{aligned}$$

where $\tilde{\varvec{\theta }}$ lies between $\hat{\varvec{\theta }}_n$ and $\hat{\varvec{\theta }}_n\pm \epsilon _n\psi _n^*$. Define the function class $\mathcal {L}_3=\{\dot{l}_{K}^{w}(\varvec{\theta };\varvec{W}^{\xi })[\psi _n^*]-\dot{l}_{K}^{w}(\varvec{\theta }_0;\varvec{W}^{\xi })[\psi _n^*]:\varvec{\theta }\in \varvec{\Theta }_n \text { and } \Vert \varvec{\theta }-\varvec{\theta }_0\Vert =O(\varrho _n)\}$. For any $\dot{l}_{K}^{w}(\varvec{\theta }_i;\varvec{W}^{\xi })[\psi _n^*]-\dot{l}_{K}^{w}(\varvec{\theta }_0;\varvec{W}^{\xi })[\psi _n^*]\in \mathcal {L}_3 $ $(i=1,2)$, we have $\bigl |\dot{l}_{K}^{w}(\varvec{\theta }_1;\varvec{W}^{\xi })[\psi _n^*]-\dot{l}_{K}^{w}(\varvec{\theta }_2;\varvec{W}^{\xi })[\psi _n^*]\bigr |\le \widetilde{C}\Vert \varvec{\theta }_1-\varvec{\theta }_2\Vert $. Thus, it follows that

$$\begin{aligned} N(\epsilon ,\mathcal {L}_3,L_2(Q))\le N(\epsilon ,\{\varvec{\theta }:\varvec{\theta }\in \varvec{\Theta }_n \text { and } \Vert \varvec{\theta }-\varvec{\theta }_0\Vert \le \widetilde{C}\varrho _n\},\Vert \cdot \Vert ). \end{aligned}$$

Note that $N(\epsilon ,\mathcal {L}_3,L_2(Q)) \le \widetilde{C}e^{3/\epsilon }$, so that $\int _0^{\infty }\sup _{Q}\sqrt{\log N(\epsilon ,\mathcal {L}_3,L_2(Q))}d\epsilon = \int _0^{\widetilde{C}\varrho _n}\sup _{Q}\sqrt{\log N(\epsilon ,\mathcal {L}_3,L_2(Q))}d\epsilon + \int _{\widetilde{C}\varrho _n}^{\infty } 0 d\epsilon < \infty $ with $\nu >1/2$ and $r>1$. Under conditions (C1)-(C3), $\dot{l}_{K}^{w}(\varvec{\theta };\varvec{W}^{\xi })[\psi _n^*]$ is uniformly bounded. According to Theorem 2.8.3 of van der Vaart and Wellner (1996), $\mathcal {L}_3$ is a Donsker class. Furthermore, it follows from Corollary 2.3.12 of van der Vaart and Wellner (1996) that

$$\begin{aligned} (P_n-P)(\dot{l}_{K}^{w}(\tilde{\varvec{\theta }};\varvec{W}^{\xi })[\psi _n^*] -\dot{l}_{K}^{w}(\varvec{\theta }_0;\varvec{W}^{\xi })[\psi _n^*]) =o_p(n^{-1/2}). \end{aligned}$$

Therefore, we have $I_2=\epsilon _n\times o_p(n^{-1/2})$.

For $I_3$, note that $P(\ddot{l}_{K}^{w}(\varvec{\theta }_0;\varvec{W}^{\xi })[\hat{\varvec{\theta }}_n-\varvec{\theta }_0,\hat{\varvec{\theta }}_n-\varvec{\theta }_0])= -P(\dot{l}_{K}^{w}(\varvec{\theta }_0;\varvec{W}^{\xi })[\hat{\varvec{\theta }}_n-\varvec{\theta }_0]\dot{l}_{K}^{w}(\varvec{\theta }_0;\varvec{W}^{\xi })[\hat{\varvec{\theta }}_n-\varvec{\theta }_0])$. For any $\varvec{\theta }\in \{\varvec{\theta }:\textrm{d}(\varvec{\theta },\varvec{\theta }_0)=O(\varrho _n)\}$, we have $P(\ddot{l}_{K}^{w}(\varvec{\theta };\varvec{W}^{\xi })[\varvec{\theta }-\varvec{\theta }_0,\varvec{\theta }-\varvec{\theta }_0] - \ddot{l}_{K}^{w}(\varvec{\theta }_0;\varvec{W}^{\xi })[\varvec{\theta }-\varvec{\theta }_0,\varvec{\theta }-\varvec{\theta }_0]) = O(\varrho _n^3)$ and $\varrho _n^3 = o(n^{-1})$ with $\nu <1/3$ and $r>2$. It then follows that

$$\begin{aligned} {}&P(\rho [\hat{\varvec{\theta }}_n-\varvec{\theta }_0;\varvec{W}^\xi ]) \\ {}&\quad = P(l_{K}^{w}(\hat{\varvec{\theta }}_n;\varvec{W}^\xi ) - l_{K}^{w}(\varvec{\theta }_0;\varvec{W}^\xi ) - \dot{l}_{K}^{w}(\varvec{\theta }_0;\varvec{W}^\xi )[\hat{\varvec{\theta }}_n-\varvec{\theta }_0]) \\ {}&\quad = \frac{1}{2}P(\ddot{l}_{K}^{w}(\tilde{\varvec{\theta }};\varvec{W}^\xi )[\hat{\varvec{\theta }}_n-\varvec{\theta }_0,\hat{\varvec{\theta }}_n-\varvec{\theta }_0] - \ddot{l}_{K}^{w}(\varvec{\theta }_0;\varvec{W}^\xi )[\hat{\varvec{\theta }}_n-\varvec{\theta }_0,\hat{\varvec{\theta }}_n-\varvec{\theta }_0])\\ {}&\qquad \ + \frac{1}{2}P(\ddot{l}_{K}^{w}(\varvec{\theta }_0;\varvec{W}^\xi )[\hat{\varvec{\theta }}_n-\varvec{\theta }_0,\hat{\varvec{\theta }}_n-\varvec{\theta }_0]) \\ {}&\quad = \epsilon _n \times o_p(n^{-1/2}) + \frac{1}{2}P(\ddot{l}_{K}^{w}(\varvec{\theta }_0;\varvec{W}^\xi )[\hat{\varvec{\theta }}_n-\varvec{\theta }_0,\hat{\varvec{\theta }}_n-\varvec{\theta }_0]) \\ {}&\quad = \epsilon _n \times o_p(n^{-1/2}) - \frac{1}{2}\Vert \hat{\varvec{\theta }}_n-\varvec{\theta }_0\Vert ^2, \end{aligned}$$

where $\tilde{\varvec{\theta }}$ is between $\hat{\varvec{\theta }}_n$ and $\varvec{\theta }_0$. Combining the Cauchy–Schwarz inequality, $\Vert \psi _n^*\Vert ^2\rightarrow \Vert \psi ^*\Vert ^2<\infty $ and $\varrho _n\Vert \psi _n^*-\psi ^*\Vert =o(n^{-1/2})$, we have that

$$\begin{aligned} I_3 ={}&- \frac{1}{2}\Vert \hat{\varvec{\theta }}_n-\varvec{\theta }_0\Vert ^2 + \frac{1}{2}\Vert \hat{\varvec{\theta }}_n\pm \epsilon _n\psi _n^*-\varvec{\theta }_0\Vert ^2 + \epsilon _n\times o_p(n^{-1/2})\\ ={}&\pm \epsilon _n \langle \hat{\varvec{\theta }}_n-\varvec{\theta }_0,\psi _n^*\rangle +\frac{1}{2}\Vert \epsilon _n\psi _n^*\Vert ^2 + \epsilon _n\times o_p(n^{-1/2}) \\ ={}&\pm \epsilon _n\langle \hat{\varvec{\theta }}_n-\varvec{\theta }_0,\psi ^*\rangle +\frac{1}{2}\epsilon _n^2\Vert \psi _n^*\Vert ^2 + \epsilon _n\times o_p(n^{-1/2}) \\ ={}&\pm \epsilon _n\langle \hat{\varvec{\theta }}_n-\varvec{\theta }_0,\psi ^*\rangle + \epsilon _n\times o_p(n^{-1/2}). \end{aligned}$$
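The steps in the display above can be spelled out as follows. This is a sketch that assumes $\Vert \hat{\varvec{\theta }}_n-\varvec{\theta }_0\Vert =O_p(\varrho _n)$ also holds in the Fisher norm, which is how the rate from Theorem 2 appears to be used here. Expanding the squared norm with the bilinearity of the inner product gives

$$\begin{aligned} \frac{1}{2}\Vert \hat{\varvec{\theta }}_n\pm \epsilon _n\psi _n^*-\varvec{\theta }_0\Vert ^2-\frac{1}{2}\Vert \hat{\varvec{\theta }}_n-\varvec{\theta }_0\Vert ^2 =\pm \epsilon _n\langle \hat{\varvec{\theta }}_n-\varvec{\theta }_0,\psi _n^*\rangle +\frac{1}{2}\epsilon _n^2\Vert \psi _n^*\Vert ^2 . \end{aligned}$$

Replacing $\psi _n^*$ by $\psi ^*$ in the inner product costs, by the Cauchy–Schwarz inequality,

$$\begin{aligned} |\langle \hat{\varvec{\theta }}_n-\varvec{\theta }_0,\psi _n^*-\psi ^*\rangle | \le \Vert \hat{\varvec{\theta }}_n-\varvec{\theta }_0\Vert \, \Vert \psi _n^*-\psi ^*\Vert =O_p(\varrho _n)\Vert \psi _n^*-\psi ^*\Vert =o_p(n^{-1/2}), \end{aligned}$$

and the quadratic term satisfies $\frac{1}{2}\epsilon _n^2\Vert \psi _n^*\Vert ^2=\epsilon _n\times O(\epsilon _n)=\epsilon _n\times o(n^{-1/2})$, because $\Vert \psi _n^*\Vert ^2$ is bounded and $\epsilon _n=o(n^{-1/2})$.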

Combining the results for $I_1,I_2$ and $I_3$, we obtain

$$\begin{aligned} 0 \le {}&P_n( l_{K}^{w}(\hat{\varvec{\theta }}_n;\varvec{W}^\xi ) - l_{K}^{w}(\hat{\varvec{\theta }}_n\pm \epsilon _n\psi _n^*;\varvec{W}^\xi ) )\nonumber \\ ={}&\mp \epsilon _nP_n\dot{l}_{K}^{w}(\varvec{\theta }_0;\varvec{W}^\xi )[\psi ^*] \pm \epsilon _n\langle \hat{\varvec{\theta }}_n-\varvec{\theta }_0,\psi ^*\rangle + \epsilon _n\times o_p(n^{-1/2}). \end{aligned}$$
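To see how the two sign choices combine, write $A_n=P_n\dot{l}_{K}^{w}(\varvec{\theta }_0;\varvec{W}^\xi )[\psi ^*]$ and $B_n=\langle \hat{\varvec{\theta }}_n-\varvec{\theta }_0,\psi ^*\rangle $; these are shorthand symbols introduced here for illustration only. Dividing the inequality by $\epsilon _n>0$ and taking the upper and lower signs in turn gives

$$\begin{aligned} 0 \le {}&-A_n+B_n+o_p(n^{-1/2}), \\ 0 \le {}&A_n-B_n+o_p(n^{-1/2}), \end{aligned}$$

so that $|B_n-A_n|=o_p(n^{-1/2})$, that is, $\sqrt{n}B_n=\sqrt{n}A_n+o_p(1)$. Since $P\dot{l}_{K}^{w}(\varvec{\theta }_0;\varvec{W}^\xi )[\psi ^*]=0$, the term $A_n$ can be replaced by $(P_n-P)\dot{l}_{K}^{w}(\varvec{\theta }_0;\varvec{W}^\xi )[\psi ^*]$.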

Dividing by $\epsilon _n$ and combining the two sign choices, we then have $\sqrt{n}\langle \hat{\varvec{\theta }}_n-\varvec{\theta }_0,\psi ^*\rangle = \sqrt{n}(P_n-P)\dot{l}_{K}^{w}(\varvec{\theta }_0;\varvec{W}^\xi )[\psi ^*] + o_p(1)$, where $P\dot{l}_{K}^{w}(\varvec{\theta }_0;\varvec{W}^\xi )[\psi ^*]=0$ and $\textrm{Var}(\dot{l}_{K}^{w}(\varvec{\theta }_0;\varvec{W}^\xi )[\psi ^*]) = \Vert \psi ^*\Vert ^2<\infty $. By the central limit theorem, $\sqrt{n}\langle \hat{\varvec{\theta }}_n-\varvec{\theta }_0,\psi ^*\rangle \rightarrow N(0,\Vert \psi ^*\Vert ^2)$ in distribution. By the Riesz representation theorem, there exists $\psi ^*\in \overline{\mathcal {V}}$ such that $\dot{\Lambda }(\varvec{\theta }_0)[\psi ]=\langle \psi ,\psi ^*\rangle $ for any $\psi \in \overline{\mathcal {V}}$ and $\Vert \psi ^*\Vert = \Vert \dot{\Lambda }(\varvec{\theta }_0)\Vert $. Since $\Lambda $ is linear in $\varvec{\theta }$, we have $\Lambda (\varvec{\theta })-\Lambda (\varvec{\theta }_0)=\dot{\Lambda }(\varvec{\theta }_0)[\varvec{\theta }-\varvec{\theta }_0]$. Therefore, we have

$$\begin{aligned} \sqrt{n}(\Lambda (\hat{\varvec{\theta }}_n)-\Lambda (\varvec{\theta }_0))=\sqrt{n}(P_n-P)(\dot{l}_{K}^{w}(\varvec{\theta }_0;\varvec{W}^\xi )[\psi ^*]) + o_p(1) \rightarrow N(0,\Vert \dot{\Lambda }(\varvec{\theta }_0)\Vert ^2), \end{aligned}$$

in distribution. That is, $\sqrt{n}[\varvec{\mu }_1^T(\hat{\varvec{\beta }}_n-\varvec{\beta }_0)+ \int _0^{\tau }\mu _2(t)(\hat{F}_n(t)-F_0(t))dt] \rightarrow N(0, \Vert \dot{\Lambda }(\varvec{\theta }_0)\Vert ^2)$ in distribution.

In particular, if we set $\mu _2(\cdot )=0$ in the formula (14), then we have $\Lambda _{\varvec{\beta }}(\varvec{\theta })=\varvec{\mu }_1^T\varvec{\beta }$. The first order directional derivative of $\Lambda _{\varvec{\beta }}(\varvec{\theta })$ is defined as $\dot{\Lambda }_{\varvec{\beta }}(\varvec{\theta }_0)[\psi ]=\frac{\textrm{d}\Lambda _{\varvec{\beta }}(\varvec{\theta }_0+t\psi )}{\textrm{d}t}\Big |_{t=0}$ and

$$\begin{aligned} \Vert \dot{\Lambda }_{\varvec{\beta }}(\varvec{\theta }_0)\Vert =\sup _{\psi \in \overline{\mathcal {V}}:\Vert \psi \Vert >0} \frac{|\dot{\Lambda }_{\varvec{\beta }}(\varvec{\theta }_0)[\psi ]|}{\Vert \psi \Vert }. \end{aligned}$$

(16)

Similarly, it follows that $\sqrt{n}\varvec{\mu }_1^T(\hat{\varvec{\beta }}_n-\varvec{\beta }_0) \rightarrow N(0, \Vert \dot{\Lambda }_{\varvec{\beta }}(\varvec{\theta }_0)\Vert ^2)$ in distribution. The proof of Theorem 3 is completed. $\square $

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Xie, P., Han, B. & Wang, X. Case-cohort studies for clustered failure time data with a cure fraction. Stat Papers (2023). https://doi.org/10.1007/s00362-023-01448-7

Received: 24 September 2021
Revised: 05 February 2023
Published: 17 April 2023
DOI: https://doi.org/10.1007/s00362-023-01448-7
