Assessing model prediction performance for the expected cumulative number of recurrent events

Bouaziz, Olivier

doi:10.1007/s10985-023-09610-x


Published: 17 November 2023

Volume 30, pages 262–289, (2024)

Lifetime Data Analysis

Olivier Bouaziz ORCID: orcid.org/0000-0003-3556-9531¹


Abstract

In a recurrent event setting, we introduce a new score designed to evaluate the prediction ability, for a given model, of the expected cumulative number of recurrent events. This score can be seen as an extension of the Brier score for single time-to-event data, but it works for recurrent events with or without a terminal event. Theoretical results are provided that show that, under standard assumptions in a recurrent event context, our score can be asymptotically decomposed as the sum of the theoretical mean squared error between the model and the true expected cumulative number of recurrent events and an inseparability term that does not depend on the model. This decomposition is further illustrated in simulation studies. It is also shown that this score should be used in comparison with a reference model, such as a nonparametric estimator that does not include the covariates. Finally, the score is applied to the prediction of hospitalisations on a dataset of patients suffering from atrial fibrillation, and the prediction performances of different models, such as the Cox model, the Aalen model and the Ghosh and Lin model, are compared.


References

Andersen PK, Angst J, Ravn H (2019) Modeling marginal features in studies of recurrent events in the presence of a terminal event. Lifetime Data Anal 25:681–695
Andersen PK, Borgan Ø, Gill RD, Keiding N (1993) Statistical models based on counting processes. Springer series in statistics. Springer-Verlag, New York
Andersen PK, Gill RD (1982) Cox’s regression model for counting processes: a large sample study. Ann Stat 10(4):1100–1120
Bouaziz O, Geffray S, Lopez O (2015) Semiparametric inference for the recurrent events process by means of a single-index model. Statistics 49:361–385
Bouaziz O, Lopez O (2010) Conditional density estimation in a censored single-index regression model. Bernoulli 16:514–542
Bradley AA, Schwartz SS, Hashino T (2008) Sampling uncertainty and confidence intervals for the Brier score and Brier skill score. Weather Forecast 23:992–1006
Cook RJ, Lawless J (2007) The statistical analysis of recurrent events. Springer Science & Business Media, New York, USA
Cook RJ, Lawless JF (1997) Marginal analysis of recurrent events and a terminating event. Stat Med 16:911–924
Cox DR (1972) Regression models and life-tables. J R Stat Soc Ser B (Methodol) 34:187–202
Dabrowska DM (1989) Uniform consistency of the kernel conditional Kaplan-Meier estimate. Ann Stat 17(3):1157–1167
Gerds TA, Kattan MW, Schumacher M, Yu C (2013) Estimating a time-dependent concordance index for survival prediction models with covariate dependent censoring. Stat Med 32:2173–2184
Gerds TA, Schumacher M (2006) Consistent estimation of the expected Brier score in general survival models with right-censored event times. Biom J 48:1029–1040
Ghosh D, Lin D (2003) Semiparametric analysis of recurrent events data in the presence of dependent censoring. Biometrics 59:877–885
Ghosh D, Lin DY (2002) Marginal regression models for recurrent and terminal events. Stat Sin 12(3):663–688
Graf E, Schmoor C, Sauerbrei W, Schumacher M (1999) Assessment and comparison of prognostic classification schemes for survival data. Stat Med 18:2529–2545
Harrell FE Jr, Lee KL, Mark DB (1996) Multivariable prognostic models: issues in developing models, evaluating assumptions and adequacy, and measuring and reducing errors. Stat Med 15:361–387
Heagerty PJ, Zheng Y (2005) Survival model predictive accuracy and ROC curves. Biometrics 61:92–105
Hougaard P (2000) Analysis of multivariate survival data, vol 564. Springer, New York, USA
Ishwaran H, Kogalur UB, Blackstone EH, Lauer MS (2008) Random survival forests. Ann Appl Stat 2:841–860
Kalbfleisch JD, Prentice RL (2002) The statistical analysis of failure time data, 2nd edn. Wiley series in probability and statistics. Wiley-Interscience (John Wiley & Sons), Hoboken
Lin D, Wei L, Ying Z (1998) Accelerated failure time models for counting processes. Biometrika 85:605–618
Lin DY, Wei L-J, Yang I, Ying Z (2000) Semiparametric regression for the mean and rate functions of recurrent events. J R Stat Soc Ser B (Stat Methodol) 62:711–730
Liu L, Wolfe RA, Huang X (2004) Shared frailty models for recurrent events and a terminal event. Biometrics 60:747–756
Prentice RL, Williams BJ, Peterson AV (1981) On the regression analysis of multivariate failure time data. Biometrika 68:373–379
Rondeau V, Mathoulin-Pelissier S, Jacqmin-Gadda H, Brouste V, Soubeyran P (2007) Joint frailty models for recurring events and death using maximum penalized likelihood estimation: application on cancer events. Biostatistics 8:708–721
Scheike TH (2002) The additive nonparametric and semiparametric Aalen model as the rate function for a counting process. Lifetime Data Anal 8:247–262
Schoop R, Schumacher M, Graf E (2011) Measures of prediction error for survival data with longitudinal covariates. Biom J 53:275–293
Schroder J, Bouaziz O, Agner BR, Martinussen T, Madsen PL, Li D, Dixen U (2019) Recurrent event survival analysis predicts future risk of hospitalization in patients with paroxysmal and persistent atrial fibrillation. PLoS One 14:e0217983
Steyerberg EW, Vickers AJ, Cook NR, Gerds T, Gonen M, Obuchowski N, Pencina MJ, Kattan MW (2010) Assessing the performance of prediction models: a framework for some traditional and novel measures. Epidemiology (Cambridge, Mass) 21:128
Van Oirbeek R, Lesaffre E (2016) Exploring the clustering effect of the frailty survival model by means of the Brier score. Commun Stat-Simul Comput 45:3294–3306


Acknowledgements

We thank the reviewers for their constructive criticisms and comments that have helped improve the paper.

Author information

Authors and Affiliations

Université Paris Cité, CNRS, MAP5, F-75006, Paris, France
Olivier Bouaziz

Authors

Olivier Bouaziz

Corresponding author

Correspondence to Olivier Bouaziz.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Below is the link to the electronic supplementary material.

Supplementary file 1 (pdf 5736 KB)

8 Appendix: proofs of the convergence of the prediction criterion for the expected cumulative number of recurrent events under the two scenarios

In the proof of Proposition 1, we need to verify the key equality from Eq. (3). This result depends on the modelling assumptions and has already been proved in all three different scenarios, see Section 1 of Supplementary Information, Sect. 2.2 of the main manuscript and Section 2 of Supplementary Information for the right-censoring case with no terminal event, the terminal event case, and the dependence on prior counts case, respectively. In the proof of Proposition 2, we also need to have $\mathbb E[\mu ^*(\tau \mid \bar{X}(\tau ))]<\infty$ which also depends on the different modelling assumptions made under each scenario.

8.1 Proof of Proposition 1

In all three scenarios, we directly have:

$$\begin{aligned} \textrm{MSE}(t,\mu )&= \mathbb E\left[ \big (\mu (t\mid {\bar{X}}(t))-\mu ^*(t\mid {\bar{X}}(t))\big )^2\right] + \mathbb E\left[ \Bigg (\int _0^t\frac{dN(u)}{G_c(u\mid {\bar{X}}(u))}-\mu ^*(t\mid {\bar{X}}(t))\Bigg )^2\right] \\&\quad + 2 \mathbb E\left[ \Bigg (\int _0^t\frac{dN(u)}{G_c(u\mid \bar{X}(u))}-\mu ^*(t\mid {\bar{X}}(t))\Bigg )\Bigg (\mu ^*(t\mid \bar{X}(t))-\mu (t\mid {\bar{X}}(t))\Bigg )\right] . \end{aligned}$$

Using the fact that $\mathbb E[\int _0^tdN(u)/(G_c(u\mid \bar{X}(u)))\mid {\bar{X}}(t)]=\mu ^*(t\mid {\bar{X}}(t))$, we conclude that

$$\begin{aligned} \textrm{MSE}(t,\mu )&=\mathbb E\left[ (\mu (t\mid \bar{X}(t))-\mu ^*(t\mid {\bar{X}}(t)))^2\right] +A(t), \end{aligned}$$

where

$$\begin{aligned} A(t)&=\mathbb E\left[ \left( \int _0^t\frac{dN(u)}{G_c(u\mid \bar{X}(u))}\right) ^2\right] -\mathbb E\left[ \left( \mu ^*(t\mid \bar{X}(t))\right) ^2 \right] . \end{aligned}\tag{13}$$
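
For completeness, the passage from the three-term expansion to this two-term decomposition can be spelled out as follows. The shorthand $Z(t)=\int _0^tdN(u)/G_c(u\mid {\bar{X}}(u))$ is introduced here only for readability and is not part of the original notation; the argument is just the tower property applied with $\mathbb E[Z(t)\mid {\bar{X}}(t)]=\mu ^*(t\mid {\bar{X}}(t))$, and it assumes that $\mu (t\mid {\bar{X}}(t))$ is a function of ${\bar{X}}(t)$ only.

$$\begin{aligned} \mathbb E\left[ \big (Z(t)-\mu ^*(t\mid {\bar{X}}(t))\big )\big (\mu ^*(t\mid {\bar{X}}(t))-\mu (t\mid {\bar{X}}(t))\big )\right]&=\mathbb E\left[ \big (\mu ^*(t\mid {\bar{X}}(t))-\mu (t\mid {\bar{X}}(t))\big )\,\mathbb E\big [Z(t)-\mu ^*(t\mid {\bar{X}}(t))\mid {\bar{X}}(t)\big ]\right] =0,\\ \mathbb E\left[ \big (Z(t)-\mu ^*(t\mid {\bar{X}}(t))\big )^2\right]&=\mathbb E\left[ Z(t)^2\right] -2\,\mathbb E\left[ \mu ^*(t\mid {\bar{X}}(t))\,\mathbb E[Z(t)\mid {\bar{X}}(t)]\right] +\mathbb E\left[ \big (\mu ^*(t\mid {\bar{X}}(t))\big )^2\right] \\&=\mathbb E\left[ Z(t)^2\right] -\mathbb E\left[ \big (\mu ^*(t\mid {\bar{X}}(t))\big )^2\right] =A(t). \end{aligned}$$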

Now, using the identity $a^2-b^2=(a-b)(a+b)$ and observing that $\int _0^tdN(u)/(G_c(u\mid \bar{X}(u)))=\sum _{\text {ev.}\le t}\{1/(G_c(\text {ev.}\mid \bar{X}(\text {ev.})))\}$, where the sum runs over the observed recurrent event times, either equals 0 if no recurrent event was observed before time $t$ or is at least 1 if at least one recurrent event was observed before time $t$, we conclude that

$$\begin{aligned}&\left( \int _0^t\frac{dN(u)}{G_c(u\mid {\bar{X}}(u))}-\mu ^*(t\mid {\bar{X}}(t))\right) \left( \int _0^t\frac{dN(u)}{G_c(u\mid {\bar{X}}(u))}+\mu ^*(t\mid {\bar{X}}(t))\right) \\&\quad \ge \left( \int _0^t\frac{dN(u)}{G_c(u\mid \bar{X}(u))}-\mu ^*(t\mid {\bar{X}}(t))\right) \mu ^*(t\mid {\bar{X}}(t)), \end{aligned}$$

almost surely. Taking the expectation on both sides proves $A(t)\ge 0$.
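
The decomposition of Proposition 1 can be checked numerically on a toy example. The sketch below is only an illustration under assumptions that are not taken from the paper: a binary covariate, a homogeneous Poisson process with rate $1+x$, no terminal event, and exponential censoring independent of the covariate with known $G_c(u)=e^{-\gamma u}$. All function and variable names are hypothetical. It compares the score of a covariate-free reference model with the score of the true model $\mu ^*$ and checks that the gap is close to the mean squared distance between the two models, while the score of $\mu ^*$ itself is close to $A(t)$.

```python
import numpy as np


def simulate_ipcw_counts(n, t=1.0, gamma=0.5, seed=0):
    """Return IPCW-weighted event counts at time t and the true mean mu*(t | x).

    Toy design (an assumption of this sketch): rate 1 + x, censoring C ~ Exp(gamma),
    so that G_c(u) = P[C >= u] = exp(-gamma * u) is known exactly.
    """
    rng = np.random.default_rng(seed)
    x = rng.integers(0, 2, size=n)                  # binary covariate
    rate = 1.0 + x                                  # event rate of each subject
    c = rng.exponential(scale=1.0 / gamma, size=n)  # censoring times
    counts = rng.poisson(rate * t)                  # latent number of events on [0, t]
    idx = np.repeat(np.arange(n), counts)           # subject index of each latent event
    times = rng.uniform(0.0, t, size=counts.sum())  # latent event times
    observed = times <= c[idx]                      # an event is seen only before censoring
    weights = np.where(observed, np.exp(gamma * times), 0.0)  # 1 / G_c(event time)
    z = np.bincount(idx, weights=weights, minlength=n)
    return z, rate * t


z, mu_star = simulate_ipcw_counts(200_000)

# Reference model that ignores the covariate: a single constant prediction.
mu_ref = np.full_like(mu_star, mu_star.mean())

score_true = np.mean((z - mu_star) ** 2)        # empirical score of the true model
score_ref = np.mean((z - mu_ref) ** 2)          # empirical score of the reference model
mse_gap = np.mean((mu_ref - mu_star) ** 2)      # squared distance between the two models
a_hat = np.mean(z ** 2) - np.mean(mu_star ** 2)  # empirical analogue of A(t)

print(f"score(mu*) = {score_true:.3f}, A(t) estimate = {a_hat:.3f}")
print(f"score(ref) - score(mu*) = {score_ref - score_true:.3f}, MSE gap = {mse_gap:.3f}")
```

In this design the score of the true model is far from zero, which is why the score is only meaningful relative to a reference model: the model-free term $A(t)$ dominates, and only differences between scores reflect differences in prediction quality.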

8.2 Proof of Proposition 2

We start by proving that $\mathbb E[\mu ^*(\tau \mid \bar{X}(\tau ))]<\infty$ in the presence of a terminal event (the scenario without terminal event follows from the same arguments). We have for all $t\in [0,\tau ]: \mathbb P[C\ge t\mid {\bar{X}}(t)] \ge \mathbb P[T\ge t\mid {\bar{X}}(t)]\ge c$, from Assumption 2. From the same assumption, $N(\tau )$ is almost surely bounded by a constant. As a consequence,

$$\begin{aligned} \mu ^*(\tau \mid {\bar{X}}(\tau ))=\int _0^{\tau } \frac{\mathbb E[dN(t)\mid {\bar{X}}(t)]}{G_c(t\mid {\bar{X}}(t))} \end{aligned}$$

is almost surely bounded, where the equality has been proved in Sect. 2.2. In the dependence on prior counts case, we have for all $t\in [0,\tau ]: \mathbb P[C\ge t\mid {\bar{X}}(t)] \ge \mathbb P[T\ge t\mid \bar{X}(t)]=\sum _{l=1}^{L+1}\mathbb P[T\ge t,N(t-)=l-1\mid {\bar{X}}(t)]\ge (L+1)c>0$, where the last two bounds come from Assumption MSM in the Supplementary Information. From the same assumption, $N(\tau )$ is almost surely bounded by a constant. As a consequence,

$$\begin{aligned} \mu ^*(\tau \mid {\bar{X}}(\tau ))=\int _0^{\tau } \frac{\mathbb E[dN(t)\mid {\bar{X}}(t)]}{G_c(t\mid {\bar{X}}(t))} \end{aligned}$$

is almost surely bounded, where the equality has been proved in the Supplementary Information. The rest of the proof of Proposition 2 is identical in all three scenarios.

We first write $F_{X(t)}(x)=\mathbb P[X(t)\le x]$, we let $\mathcal X_t$ and $\mathcal X_{u,v}$ denote the supports of $X(t)$ and of the pair $(X(u), X(v))$, respectively, and we write $F_{X(u),X(v)}(x,y)=\mathbb P[X(u)\le x,X(v)\le y]$. We then introduce the quantity

$$\begin{aligned} \xi (t)&=\int _{0\le u, v\le t}\int _{\mathcal X_{u,v}}\frac{\mathbb E[dN(u)dN(v) \mid X(u)=x,X(v)=y]}{{\hat{G}}_c(u\mid x){\hat{G}}_c(v\mid y)}dF_{X(u),X(v)}(x,y)\\&\quad -2 \int _{\mathcal X_{t}}\widehat{\mu }(t\mid x)\mu ^*(t\mid x)dF_{X(t)}(x)\\&\quad +\int _{\mathcal X_t} \left( \widehat{\mu }(t\mid x)\right) ^2dF_{X(t)}(x)=:\xi _1(t)+\xi _2(t)+\xi _3(t). \end{aligned}$$

Write:

$$\begin{aligned} \left| \widehat{\textrm{MSE}^1}(t,\hat{\mu })-\textrm{MSE}^1(t,\mu )\right|&\le \left| \xi (t)-\mathbb E\bigg [\bigg (\int _0^t \frac{dN(u)}{G_c(u\mid {\bar{X}}(u))}-\mu (t\mid {\bar{X}}(t))\bigg )^2\bigg ]\right| \\&\quad + \left| \frac{1}{n} \sum _{i=1}^n \bigg (\int _0^t \frac{dN_i(u)}{{\hat{G}}_c(u\mid {\bar{X}}_i(u))}-\widehat{\mu }(t\mid X_i(t))\bigg )^2-\xi (t)\right| \\&=: C(t)+D(t). \end{aligned}$$

By decomposing the square term into three other terms, we bound C(t) in the following way: $C(t)\le |C_1(t)|+|C_2(t)|+|C_3(t)|$ with

$$\begin{aligned} C_1(t)&=\int _{0\le u,v\le t}\int _{\mathcal X_{u,v}}\frac{G_c(u\mid x)G_c(v\mid y)-{\hat{G}}_c(u\mid x){\hat{G}}_c(v\mid y)}{{\hat{G}}_c(u\mid x){\hat{G}}_c(v\mid y)G_c(u\mid x)G_c(v\mid y)}\\&\qquad \mathbb E[dN(u)dN(v) \mid X(u)=x,X(v)=y]dF_{X(u),X(v)}(x,y),\\ C_2(t)&=-2 \int _{\mathcal X_{t}}(\widehat{\mu }(t\mid x)- \mu (t\mid x))\mu ^*(t\mid x)dF_{X(t)}(x),\\ C_3(t)&=\int _{\mathcal X_t} \left( \left( \widehat{\mu }(t\mid x)\right) ^2-\left( \mu (t\mid x)\right) ^2\right) dF_{X(t)}(x). \end{aligned}$$

For $C_1(t)$ we have

$$\begin{aligned}&G_c(u\mid x)G_c(v\mid y)-{\hat{G}}_c(u\mid x){\hat{G}}_c(v\mid y)\\&=\left( {\hat{G}}(u-\mid x)-G(u-\mid x)\right) +\left( {\hat{G}}(v-\mid y)-G(v-\mid y)\right) \\&\quad +G(u-\mid x)\left( G(v-\mid y)-{\hat{G}}(v-\mid y)\right) +\hat{G}(v-\mid y)\left( G(u-\mid x)-{\hat{G}}(u-\mid x)\right) , \end{aligned}$$

and we can deal with all four terms in the same fashion. For instance, for the first term,

$$\begin{aligned}&\int _{0\le u, v\le t}\int _{\mathcal X_{u,v}}\frac{\left( {\hat{G}}(u-\mid x)-G(u-\mid x)\right) \mathbb E[dN(u)dN(v) \mid X(u)=x,X(v)=y]}{{\hat{G}}_c(u\mid x){\hat{G}}_c(v\mid y)G_c(u\mid x)G_c(v\mid y)}dF_{X(u),X(v)}(x,y)\\&\quad \le \int _{0}^{t}\int _{\mathcal X_{u}}\frac{\left| \hat{G}(u-\mid x)-G(u-\mid x)\right| \mathbb E[dN(u) \mid X(u)=x]}{\hat{G}_c(u\mid x)G_c(u\mid x)}dF_{X(u)}(x), \end{aligned}$$

using the fact that $\int _0^t dN(v)/({\hat{G}}_c(v\mid y)G_c(v\mid y))$ is bounded. Then, since $\int _0^t\mathbb E[dN(u)/(1-G(u-\mid X(u))) \mid X(u)=x]=\mu ^*(t\mid {\bar{X}}(t))$ and ${\hat{G}}_c(u\mid x)^{-1}$ is asymptotically uniformly bounded, we conclude that $|C_1(t)|$ tends toward 0 in probability using the uniform consistency of the censoring estimator.

For $C_2(t)$ we use the consistency of $\widehat{\mu }$ and the fact that $\mathbb E[\mu ^*(t\mid {\bar{X}}(t))]$ is finite to prove that $|C_2(t)|$ tends towards 0 in probability.

For $C_3(t)$, we directly write $(\widehat{\mu }(t\mid x))^2-(\mu (t\mid x))^2=(\widehat{\mu }(t\mid x)-\mu (t\mid x))(\widehat{\mu }(t\mid x)+\mu (t\mid x))$ and we use the fact that $\mu (t\mid x)$ is bounded and the consistency of $\widehat{\mu }$ to prove that $|C_3(t)|$ tends towards 0 in probability.

Similarly to C(t) we obtain the following bound: $D(t)\le |D_1(t)|+|D_2(t)|+|D_3(t)|$ with

$$\begin{aligned} D_1(t)&=\frac{1}{n} \sum _{i=1}^n \int _{0\le u, v\le t} \frac{dN_i(u)dN_i(v)}{{\hat{G}}_c(u\mid {\bar{X}}_i(u)){\hat{G}}_c(v\mid {\bar{X}}_i(v))}-\xi _1(t),\\ D_2(t)&=-\frac{2}{n} \sum _{i=1}^n\int _0^t \frac{dN_i(u)}{{\hat{G}}_c(u\mid {\bar{X}}_i(u))}\widehat{\mu }(t\mid X_i(t))-\xi _2(t),\\ D_3(t)&=\frac{1}{n} \sum _{i=1}^n \Big (\widehat{\mu }(t\mid \bar{X}_i(t))\Big )^2-\xi _3(t). \end{aligned}$$

We now use the bound $|D_1(t)|\le |D_{1,1}(t)| + |D_{1,2}(t)|+|D_{1,3}(t)|$ with

$$\begin{aligned} D_{1,1}(t)&=\frac{1}{n} \sum _{i=1}^n \int _{0\le u,v\le t} \frac{dN_i(u)dN_i(v)}{G_c(u\mid {\bar{X}}_i(u))G_c(v\mid {\bar{X}}_i(v))}\\&\quad - \int _{0\le u, v\le t}\int _{\mathcal X_{u,v}}\frac{\mathbb E[dN(u)dN(v) \mid X(u)=x,X(v)=y]}{G_c(u\mid x)G_c(v\mid y)}dF_{X(u),X(v)}(x,y),\\ D_{1,2}(t)&=\frac{1}{n} \sum _{i=1}^n \int _{0\le u,v\le t}\chi (u,v,X_i(u),X_i(v))dN_i(u)dN_i(v)\\ D_{1,3}(t)&=- \int _{0\le u, v\le t}\int _{\mathcal X_{u,v}}\chi (u,v,x,y)\mathbb E[dN(u)dN(v) \mid X(u)=x,X(v)=y]dF_{X(u),X(v)}(x,y) \end{aligned}$$

and

$$\begin{aligned} \chi (u,v,x,y)&=\left\{ \left( {\hat{G}}(u-\mid x)-G(u-\mid x)\right) +\left( {\hat{G}}(v-\mid y)-G(v-\mid y)\right) \right. \\&\quad +G(u-\mid x)\left( G(v-\mid y)-{\hat{G}}(v-\mid y)\right) \\&\quad +\left. {\hat{G}}(v-\mid y)\left( G(u-\mid x)-{\hat{G}}(u-\mid x)\right) \right\} \\&\quad \times \frac{1}{{\hat{G}}_c(u\mid x){\hat{G}}_c(v\mid y)G_c(u\mid x)G_c(v\mid y)}. \end{aligned}$$

The term $|D_{1,1}(t)|$ converges towards 0 in probability from the strong law of large numbers. The term $|D_{1,2}(t)|$ is bounded by

$$\begin{aligned} \sup _{u,v,x,y} |\chi (u,v,x,y)| \frac{1}{n} \sum _{i=1}^n \int _{0\le u, v\le t}dN_i(u)dN_i(v), \end{aligned}$$

where $\sup _{u,v,x,y} |\chi (u,v,x,y)|$ converges towards 0 in probability by the uniform consistency of ${\hat{G}}$, while the other factor converges towards a bounded quantity by the law of large numbers. The same argument applies to $|D_{1,3}(t)|$, which also converges towards 0 in probability.

For $D_2(t)$ we write $|D_2(t)|\le |D_{2,1}(t)| + |D_{2,2}(t)|+ |D_{2,3}(t)|+ |D_{2,4}(t)|$ with

$$\begin{aligned} D_{2,1}(t)&=-\frac{2}{n} \sum _{i=1}^n\int _0^t \frac{dN_i(u)}{G_c(u\mid {\bar{X}}_i(u))}\mu (t\mid {\bar{X}}_i(t))+2 \int _{\mathcal X_{t}}\mu (t\mid x)\mu ^*(t\mid x)dF_{X(t)}(x),\\ D_{2,2}(t)&=\frac{2}{n} \sum _{i=1}^n\int _0^t \frac{dN_i(u)}{G_c(u\mid {\bar{X}}_i(u))}(\mu (t\mid {\bar{X}}_i(t))-\widehat{\mu }(t\mid {\bar{X}}_i(t))),\\ D_{2,3}(t)&=2 \int _{\mathcal X_{t}}(\widehat{\mu }(t\mid x)-\mu (t\mid x))\mu ^*(t\mid x)dF_{X(t)}(x),\\ D_{2,4}(t)&=\frac{2}{n} \sum _{i=1}^n\int _0^t \frac{(G(u-\mid \bar{X}_i(u))-{\hat{G}}(u-\mid {\bar{X}}_i(u)))dN_i(u)}{G_c(u\mid \bar{X}_i(u)){\hat{G}}_c(u\mid {\bar{X}}_i(u))}\widehat{\mu }(t\mid {\bar{X}}_i(t)). \end{aligned}$$

The term $D_{2,1}(t)$ converges towards 0 in probability by the law of large numbers. For $D_{2,2}(t)$, $D_{2,3}(t)$ and $D_{2,4}(t)$ we use the consistency of $\widehat{\mu }$, the convergence in probability of $n^{-1}\sum _i\int _0^t dN_i(u)/G_c(u\mid {\bar{X}}_i(u))$, the boundedness of $\mathbb E[\mu ^*(t\mid {\bar{X}}(t))]$, the uniform consistency of ${\hat{G}}$ and the asymptotic boundedness of $\widehat{\mu }$ and ${\hat{G}}_c(u\mid x)^{-1}$ to prove that all three terms converge towards 0 in probability.

Finally, for $D_3(t)$, we write

$$\begin{aligned} D_3(t)&= \frac{1}{n} \sum _{i=1}^n \Big (\mu (t\mid {\bar{X}}_i(t))\Big )^2-\int _{\mathcal X_t}\Big (\mu (t\mid x)\Big )^2dF_{X(t)}(x)\\&\quad + \frac{1}{n} \sum _{i=1}^n\left( \Big (\widehat{\mu }(t\mid {\bar{X}}_i(t))\Big )^2-\Big (\mu (t\mid {\bar{X}}_i(t))\Big )^2\right) \\&\quad -\int _{\mathcal X_t}\left( \Big (\widehat{\mu }(t\mid x)\Big )^2-\Big (\mu (t\mid x)\Big )^2\right) dF_{X(t)}(x). \end{aligned}$$

Each of the three terms converges towards 0 in probability using the law of large numbers for the first term and the uniform consistency of $\widehat{\mu }$ for the other two.
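
The convergence stated in Proposition 2 can be illustrated with a small simulation. This is a sketch under simplifying assumptions that are not from the paper: the same toy Poisson design with rate $1+x$, no terminal event, censoring independent of the covariate, and censoring times observed for every subject, so that ${\hat{G}}_c$ is simply the empirical survival function of $C$ rather than a Kaplan-Meier or kernel estimator. The model estimate $\widehat{\mu }$ is taken to be the within-group average of the weighted counts, which is one possible consistent estimator in this design. All names are hypothetical.

```python
import numpy as np


def estimated_and_limit_scores(n, t=1.0, gamma=0.5, seed=1):
    """Compare the score built from estimated (G_c, mu) with its known-quantity analogue."""
    rng = np.random.default_rng(seed)
    x = rng.integers(0, 2, size=n)
    rate = 1.0 + x
    c = rng.exponential(scale=1.0 / gamma, size=n)
    counts = rng.poisson(rate * t)
    idx = np.repeat(np.arange(n), counts)
    times = rng.uniform(0.0, t, size=counts.sum())
    observed = times <= c[idx]

    def weighted_counts(g_c_at_times):
        # Sum of 1 / G_c(event time) over the observed events of each subject.
        w = np.where(observed, 1.0 / g_c_at_times, 0.0)
        return np.bincount(idx, weights=w, minlength=n)

    # Analogue with the true censoring survival and the true mean function.
    z_true = weighted_counts(np.exp(-gamma * times))
    mu_star = rate * t
    score_lim = np.mean((z_true - mu_star) ** 2)

    # Estimated censoring survival: empirical fraction of subjects with C_j >= u.
    c_sorted = np.sort(c)
    g_hat = 1.0 - np.searchsorted(c_sorted, times, side="left") / n
    g_hat = np.maximum(g_hat, 1.0 / n)  # guard for unobserved events only
    z_hat = weighted_counts(g_hat)

    # Estimated model: average weighted count within each covariate group.
    group_means = np.array([z_hat[x == g].mean() for g in (0, 1)])
    mu_hat = group_means[x]
    score_hat = np.mean((z_hat - mu_hat) ** 2)
    return score_hat, score_lim


score_hat, score_lim = estimated_and_limit_scores(100_000)
print(f"estimated score = {score_hat:.3f}, known-quantity score = {score_lim:.3f}")
```

The second returned value is itself an empirical average, so it is only a proxy for the population limit; rerunning the function with increasing `n` gives a rough picture of how quickly the two quantities approach each other.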

8.3 Proof of Proposition 3

First, note that the Brier score can be written in the following way:

$$\begin{aligned} \mathrm {MSE^{Brier}}(t,\pi )= \mathbb E[S(t\mid X)] - 2\mathbb E[S(t\mid X)\pi (t\mid X)]+\mathbb E[(\pi (t\mid X))^2]. \end{aligned}$$

We now study our prediction score $\textrm{MSE}'(t,\pi )$. Using standard martingale properties (see for instance Andersen et al. (1993)), we directly have that $\mathbb E[dN(t)\mid X]=H(t\mid X)\lambda ^*(t\mid X)dt$, where $H(t\mid X)=\mathbb P[T>t\mid X]=S(t\mid X) G_c(t\mid X)$ under independent censoring and $\lambda ^*$ is the hazard rate of $T^*$. As a consequence,

$$\begin{aligned} \mathbb E\left[ \int _0^t \frac{dN(u)}{G_c(u\mid X)}\mid X\right] =\int _0^t S(u\mid X)\lambda ^*(u\mid X)du=1-S(t\mid X), \end{aligned}\tag{14}$$

since $S(u\mid X)\lambda ^*(u\mid X)$ is equal to the conditional density function of $T^*$. Also, it is important to notice that

$$\begin{aligned} \mathbb E\left[ \left( \int _0^t \frac{dN(u)}{G_c(u\mid X)}\right) ^2\right] =\mathbb E\left[ \int _0^t \frac{dN(u)}{(G_c(u\mid X))^2}\right] =\mathbb E\left[ \int _0^t \frac{S(u\mid X)}{G_c(u\mid X)}\lambda ^*(u\mid X)du\right] , \end{aligned}$$

where the first equality is due to the fact that N can only jump once and thus $(\int _0^t dN(u)/(G_c(u\mid X)))^2$ is simply equal to $\Delta I(T\le t)/(G_c(T\mid X))^2$. Now,

$$\begin{aligned} \textrm{MSE}'(t,\pi )&= \mathbb E\left[ \left( 1-\int _0^t \frac{dN(u)}{G_c(u\mid X)}\right) ^2\right] -2 \mathbb E\left[ \left( 1-\int _0^t \frac{dN(u)}{G_c(u\mid X)}\right) \pi (t\mid X)\right] \\&\quad +\mathbb E[(\pi (t\mid X))^2]\\&=1-2\mathbb E[(1-S(t\mid X))]+\mathbb E\left[ \left( \int _0^t \frac{dN(u)}{G_c(u\mid X)}\right) ^2\right] -2\mathbb E[S(t\mid X)\pi (t\mid X)]\\&\quad +\mathbb E[(\pi (t\mid X))^2]\\&=\mathrm {MSE^{Brier}}(t,\pi )+B(t), \end{aligned}$$

with

$$\begin{aligned} B(t)=-\mathbb E[1-S(t\mid X)]+\mathbb E\left[ \int _0^t \frac{S(u\mid X)}{G_c(u\mid X)}\lambda ^*(u\mid X)du\right] . \end{aligned}$$

Now, using Eq. (14), we can rewrite B(t) in the following way:

$$\begin{aligned} B(t)&=-\mathbb E\left[ \int _0^t S(u\mid X)\lambda ^*(u\mid X)du\right] +\mathbb E\left[ \int _0^t \frac{S(u\mid X)}{G_c(u\mid X)}\lambda ^*(u\mid X)du\right] \\&=\mathbb E\left[ \int _0^t \frac{G(u-\mid X)}{G_c(u\mid X)}S(u\mid X)\lambda ^*(u\mid X)du\right] . \end{aligned}$$

This shows that $B(t)\ge 0$ and that this quantity does not depend on $\pi$.
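
The relation $\textrm{MSE}'(t,\pi )=\mathrm {MSE^{Brier}}(t,\pi )+B(t)$ can also be checked by simulation in a single-event setting. The sketch below makes assumptions that are not from the paper: no covariates, $T^*\sim \textrm{Exp}(1)$, independent censoring $C\sim \textrm{Exp}(\gamma )$ with known $G_c(u)=e^{-\gamma u}$, and constant predictions $\pi$. Under these assumptions the last display gives $B(t)=\int _0^t(e^{\gamma u}-1)e^{-u}du$ in closed form. The Brier score is computed from the latent $T^*$, which is only available in a simulation; all names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(2)
n, t, gamma = 200_000, 1.0, 0.5

t_star = rng.exponential(scale=1.0, size=n)     # latent event times, hazard 1
c = rng.exponential(scale=1.0 / gamma, size=n)  # censoring times
t_obs = np.minimum(t_star, c)                   # observed time T
delta = t_star <= c                             # event indicator

# Weighted count: Delta * I(T <= t) / G_c(T), with G_c(u) = exp(-gamma * u).
z = np.where(delta & (t_obs <= t), np.exp(gamma * t_obs), 0.0)
still_event_free = (t_star > t).astype(float)   # I(T* > t), latent in practice


def score_gap(pi):
    """Empirical MSE'(t, pi) minus the empirical (uncensored) Brier score."""
    mse_prime = np.mean((1.0 - z - pi) ** 2)
    brier = np.mean((still_event_free - pi) ** 2)
    return mse_prime - brier


# Closed form of B(t) in this toy design (requires gamma != 1).
b_exact = (1.0 - np.exp(-(1.0 - gamma) * t)) / (1.0 - gamma) - (1.0 - np.exp(-t))

gap_low, gap_high = score_gap(0.2), score_gap(0.8)
print(f"B(t) = {b_exact:.4f}, gap(pi=0.2) = {gap_low:.4f}, gap(pi=0.8) = {gap_high:.4f}")
```

Both gaps should be close to the same value $B(t)$ whatever the prediction $\pi$, which is the sense in which the extra term does not depend on the model.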

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article

Cite this article

Bouaziz, O. Assessing model prediction performance for the expected cumulative number of recurrent events. Lifetime Data Anal 30, 262–289 (2024). https://doi.org/10.1007/s10985-023-09610-x


Received: 27 January 2023
Accepted: 19 September 2023
Published: 17 November 2023
Issue Date: January 2024
DOI: https://doi.org/10.1007/s10985-023-09610-x
