Assessing model prediction performance for the expected cumulative number of recurrent events

Lifetime Data Analysis

Abstract

In a recurrent event setting, we introduce a new score designed to evaluate how well a given model predicts the expected cumulative number of recurrent events. This score can be seen as an extension of the Brier score for single time-to-event data, but it works for recurrent events with or without a terminal event. Theoretical results show that, under standard assumptions in a recurrent event context, our score can be asymptotically decomposed as the sum of the theoretical mean squared error between the model and the true expected cumulative number of recurrent events and an inseparability term that does not depend on the model. This decomposition is further illustrated in simulation studies. It is also shown that this score should be used in comparison with a reference model, such as a nonparametric estimator that does not include the covariates. Finally, the score is applied to the prediction of hospitalisations on a dataset of patients suffering from atrial fibrillation, and the prediction performances of different models, such as the Cox model, the Aalen model and the Ghosh and Lin model, are compared.

References

  • Andersen PK, Angst J, Ravn H (2019) Modeling marginal features in studies of recurrent events in the presence of a terminal event. Lifetime Data Anal 25:681–695

  • Andersen PK, Borgan Ø, Gill RD, Keiding N (1993) Statistical models based on counting processes. Springer series in statistics. Springer-Verlag, New York

  • Andersen PK, Gill RD (1982) Cox’s regression model for counting processes: a large sample study. Ann Stat 10(4):1100–1120

  • Bouaziz O, Geffray S, Lopez O (2015) Semiparametric inference for the recurrent events process by means of a single-index model. Statistics 49:361–385

  • Bouaziz O, Lopez O (2010) Conditional density estimation in a censored single-index regression model. Bernoulli 16:514–542

  • Bradley AA, Schwartz SS, Hashino T (2008) Sampling uncertainty and confidence intervals for the Brier score and Brier skill score. Weather Forecast 23:992–1006

  • Cook RJ, Lawless J (2007) The statistical analysis of recurrent events. Springer Science & Business Media, New York, USA

  • Cook RJ, Lawless JF (1997) Marginal analysis of recurrent events and a terminating event. Stat Med 16:911–924

  • Cox DR (1972) Regression models and life-tables. J R Stat Soc Ser B (Methodol) 34:187–202

  • Dabrowska DM (1989) Uniform consistency of the kernel conditional Kaplan-Meier estimate. Ann Stat 17(3):1157–1167

  • Gerds TA, Kattan MW, Schumacher M, Yu C (2013) Estimating a time-dependent concordance index for survival prediction models with covariate dependent censoring. Stat Med 32:2173–2184

  • Gerds TA, Schumacher M (2006) Consistent estimation of the expected Brier score in general survival models with right-censored event times. Biom J 48:1029–1040

  • Ghosh D, Lin D (2003) Semiparametric analysis of recurrent events data in the presence of dependent censoring. Biometrics 59:877–885

  • Ghosh D, Lin DY (2002) Marginal regression models for recurrent and terminal events. Stat Sin 12(3):663–688

  • Graf E, Schmoor C, Sauerbrei W, Schumacher M (1999) Assessment and comparison of prognostic classification schemes for survival data. Stat Med 18:2529–2545

  • Harrell FE Jr, Lee KL, Mark DB (1996) Multivariable prognostic models: issues in developing models, evaluating assumptions and adequacy, and measuring and reducing errors. Stat Med 15:361–387

  • Heagerty PJ, Zheng Y (2005) Survival model predictive accuracy and ROC curves. Biometrics 61:92–105

  • Hougaard P (2000) Analysis of multivariate survival data, vol 564. Springer, New York, USA

  • Ishwaran H, Kogalur UB, Blackstone EH, Lauer MS (2008) Random survival forests. Ann Appl Stat 2:841–860

  • Kalbfleisch JD, Prentice RL (2002) The statistical analysis of failure time data, 2nd edn. Wiley series in probability and statistics. Wiley-Interscience (John Wiley & Sons), Hoboken

  • Lin D, Wei L, Ying Z (1998) Accelerated failure time models for counting processes. Biometrika 85:605–618

  • Lin DY, Wei L-J, Yang I, Ying Z (2000) Semiparametric regression for the mean and rate functions of recurrent events. J R Stat Soc Ser B (Stat Methodol) 62:711–730

  • Liu L, Wolfe RA, Huang X (2004) Shared frailty models for recurrent events and a terminal event. Biometrics 60:747–756

  • Prentice RL, Williams BJ, Peterson AV (1981) On the regression analysis of multivariate failure time data. Biometrika 68:373–379

  • Rondeau V, Mathoulin-Pelissier S, Jacqmin-Gadda H, Brouste V, Soubeyran P (2007) Joint frailty models for recurring events and death using maximum penalized likelihood estimation: application on cancer events. Biostatistics 8:708–721

  • Scheike TH (2002) The additive nonparametric and semiparametric Aalen model as the rate function for a counting process. Lifetime Data Anal 8:247–262

  • Schoop R, Schumacher M, Graf E (2011) Measures of prediction error for survival data with longitudinal covariates. Biom J 53:275–293

  • Schroder J, Bouaziz O, Agner BR, Martinussen T, Madsen PL, Li D, Dixen U (2019) Recurrent event survival analysis predicts future risk of hospitalization in patients with paroxysmal and persistent atrial fibrillation. PLoS One 14:e0217983

  • Steyerberg EW, Vickers AJ, Cook NR, Gerds T, Gonen M, Obuchowski N, Pencina MJ, Kattan MW (2010) Assessing the performance of prediction models: a framework for some traditional and novel measures. Epidemiology (Cambridge, Mass) 21:128

  • Van Oirbeek R, Lesaffre E (2016) Exploring the clustering effect of the frailty survival model by means of the Brier score. Commun Stat-Simul Comput 45:3294–3306

Acknowledgements

We thank the reviewers for their constructive criticisms and comments that have helped improve the paper.

Author information

Corresponding author

Correspondence to Olivier Bouaziz.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Below is the link to the electronic supplementary material.

Supplementary file 1 (pdf 5736 KB)

8 Appendix: proofs of the convergence of the prediction criterion for the expected cumulative number of recurrent events under the two scenarios

In the proof of Proposition 1, we need to verify the key equality from Eq. (3). This result depends on the modelling assumptions and has already been proved in all three different scenarios, see Section 1 of Supplementary Information, Sect. 2.2 of the main manuscript and Section 2 of Supplementary Information for the right-censoring case with no terminal event, the terminal event case, and the dependence on prior counts case, respectively. In the proof of Proposition 2, we also need to have \(\mathbb E[\mu ^*(\tau \mid \bar{X}(\tau ))]<\infty\) which also depends on the different modelling assumptions made under each scenario.

8.1 Proof of Proposition 1

In all three scenarios, we directly have:

$$\begin{aligned} \textrm{MSE}(t,\mu )&= \mathbb E\left[ \big (\mu (t\mid {\bar{X}}(t))-\mu ^*(t\mid {\bar{X}}(t))\big )^2\right] + \mathbb E\left[ \Bigg (\int _0^t\frac{dN(u)}{G_c(u\mid {\bar{X}}(u))}-\mu ^*(t\mid {\bar{X}}(t))\Bigg )^2\right] \\&\quad + 2 \mathbb E\left[ \Bigg (\int _0^t\frac{dN(u)}{G_c(u\mid \bar{X}(u))}-\mu ^*(t\mid {\bar{X}}(t))\Bigg )\Bigg (\mu ^*(t\mid \bar{X}(t))-\mu (t\mid {\bar{X}}(t))\Bigg )\right] . \end{aligned}$$

Using the fact that \(\mathbb E[\int _0^tdN(u)/(G_c(u\mid \bar{X}(u)))\mid {\bar{X}}(t)]=\mu ^*(t\mid {\bar{X}}(t))\), we conclude that

$$\begin{aligned} \textrm{MSE}(t,\mu )&=\mathbb E\left[ (\mu (t\mid \bar{X}(t))-\mu ^*(t\mid {\bar{X}}(t)))^2\right] +A(t), \end{aligned}$$

where

$$\begin{aligned} A(t)&=\mathbb E\left[ \left( \int _0^t\frac{dN(u)}{G_c(u\mid \bar{X}(u))}\right) ^2\right] -\mathbb E\left[ \left( \mu ^*(t\mid \bar{X}(t))\right) ^2 \right] . \end{aligned}\tag{13}$$

Now, using the identity \(a^2-b^2=(a-b)(a+b)\) and observing that \(\int _0^tdN(u)/(G_c(u\mid \bar{X}(u)))=\sum _{\text {ev.}\le t}\{1/(G_c(\text {ev.}\mid \bar{X}(\text {ev.})))\}\) equals 0 if no recurrent event was observed before time t and is at least 1 if at least one recurrent event was observed before time t (since \(G_c\le 1\)), we conclude that

$$\begin{aligned}&\left( \int _0^t\frac{dN(u)}{G_c(u\mid {\bar{X}}(u))}-\mu ^*(t\mid {\bar{X}}(t))\right) \left( \int _0^t\frac{dN(u)}{G_c(u\mid {\bar{X}}(u))}+\mu ^*(t\mid {\bar{X}}(t))\right) \\&\quad \ge \left( \int _0^t\frac{dN(u)}{G_c(u\mid \bar{X}(u))}-\mu ^*(t\mid {\bar{X}}(t))\right) \mu ^*(t\mid {\bar{X}}(t)), \end{aligned}$$

almost surely. Taking the expectation on both sides proves \(A(t)\ge 0\).
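The decomposition \(\textrm{MSE}(t,\mu )=\mathbb E[(\mu -\mu ^*)^2]+A(t)\) can be illustrated numerically. The following is a minimal Monte Carlo sketch under assumptions that are not taken from the paper: a binary covariate, a homogeneous Poisson process with rate \(1+X\) (so that \(\mu ^*(t\mid X)=(1+X)t\)), no terminal event, and exponential censoring independent of the covariate with known \(G_c(u)=e^{-bu}\). The function names are hypothetical. The score of a misspecified constant model should exceed the score of \(\mu ^*\) by approximately \(\mathbb E[(\mu -\mu ^*)^2]=0.25\) at \(t=1\).

```python
import numpy as np

def simulate_ipcw_counts(n, t=1.0, b=0.5, seed=0):
    """Simulate Z_i = sum over observed events <= t of 1 / G_c(event time).

    Assumptions (illustrative only): X ~ Bernoulli(0.5), events from a
    homogeneous Poisson process with rate 1 + X, censoring C ~ Exp(b)
    independent of X, and G_c(u) = P(C >= u) = exp(-b u) treated as known.
    """
    rng = np.random.default_rng(seed)
    x = rng.integers(0, 2, size=n)             # binary covariate
    counts = rng.poisson((1.0 + x) * t)        # uncensored event counts on [0, t]
    c = rng.exponential(1.0 / b, size=n)       # censoring times
    subj = np.repeat(np.arange(n), counts)     # subject index of each event
    ev = rng.uniform(0.0, t, size=subj.size)   # event times given the counts
    observed = ev <= c[subj]                   # an event is seen only before censoring
    weights = np.where(observed, np.exp(b * ev), 0.0)  # 1 / G_c(ev) for seen events
    z = np.bincount(subj, weights=weights, minlength=n)
    return x, z

def mse(z, pred):
    """Empirical version of E[(int dN / G_c - mu)^2]."""
    return float(np.mean((z - pred) ** 2))

t = 1.0
x, z = simulate_ipcw_counts(200_000, t=t)
mu_star = (1.0 + x) * t                 # true conditional mean
mu_flat = np.full(x.shape, 1.5 * t)     # misspecified model ignoring X
gap = mse(z, mu_flat) - mse(z, mu_star)
print("estimated A(t):", mse(z, mu_star))
print("score gap (theory: 0.25):", gap)
```

In this sketch the score of the true model estimates \(A(t)\) alone, which is strictly positive here. This is why the raw value of the score is hard to interpret and a comparison between models is more informative.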

8.2 Proof of Proposition 2

We start by proving that \(\mathbb E[\mu ^*(\tau \mid \bar{X}(\tau ))]<\infty\) in the presence of a terminal event (the scenario without terminal event follows from the same arguments). We have for all \(t\in [0,\tau ]: \mathbb P[C\ge t\mid {\bar{X}}(t)] \ge \mathbb P[T\ge t\mid {\bar{X}}(t)]\ge c\), from Assumption 2. From the same assumption, \(N(\tau )\) is almost surely bounded by a constant. As a consequence,

$$\begin{aligned} \mu ^*(\tau \mid {\bar{X}}(\tau ))=\int _0^{\tau } \frac{\mathbb E[dN(t)\mid {\bar{X}}(t)]}{G_c(t\mid {\bar{X}}(t))} \end{aligned}$$

is almost surely bounded, where the equality has been proved in Sect. 2.2. In the case of dependence on prior counts, we have for all \(t\in [0,\tau ]: \mathbb P[C\ge t\mid {\bar{X}}(t)] \ge \mathbb P[T\ge t\mid \bar{X}(t)]=\sum _{l=1}^{L+1}\mathbb P[T\ge t,N(t-)=l-1\mid {\bar{X}}(t)]\ge (L+1)c>0\), where the last two bounds come from Assumption MSM in the Supplementary Information. From the same assumption, \(N(\tau )\) is almost surely bounded by a constant. As a consequence,

$$\begin{aligned} \mu ^*(\tau \mid {\bar{X}}(\tau ))=\int _0^{\tau } \frac{\mathbb E[dN(t)\mid {\bar{X}}(t)]}{G_c(t\mid {\bar{X}}(t))} \end{aligned}$$

is almost surely bounded, where the equality has been proved in the Supplementary Information. The rest of the proof of Proposition 2 is identical in all three scenarios.

We first write \(F_{X(t)}(x)=\mathbb P[X(t)\le x]\), let \(\mathcal X_{u,v}\) denote the support of the joint distribution of \((X(u), X(v))\), and write \(F_{X(u),X(v)}(x,y)=\mathbb P[X(u)\le x,X(v)\le y]\). We then introduce the quantity

$$\begin{aligned} \xi (t)&=\int _{0\le u, v\le t}\int _{\mathcal X_{u,v}}\frac{\mathbb E[dN(u)dN(v) \mid X(u)=x,X(v)=y]}{{\hat{G}}_c(u\mid x){\hat{G}}_c(v\mid y)}dF_{X(u),X(v)}(x,y)\\&\quad -2 \int _{\mathcal X_{t}}\widehat{\mu }(t\mid x)\mu ^*(t\mid x)dF_{X(t)}(x)\\&\quad +\int _{\mathcal X_t} \left( \widehat{\mu }(t\mid x)\right) ^2dF_{X(t)}(x)=:\xi _1(t)+\xi _2(t)+\xi _3(t). \end{aligned}$$

Write:

$$\begin{aligned} \left| \widehat{\textrm{MSE}^1}(t,\hat{\mu })-\textrm{MSE}^1(t,\mu )\right|&\le \left| \xi (t)-\mathbb E\bigg [\bigg (\int _0^t \frac{dN(u)}{G_c(u\mid {\bar{X}}(u))}-\mu (t\mid {\bar{X}}(t))\bigg )^2\bigg ]\right| \\&\quad + \left| \frac{1}{n} \sum _{i=1}^n \bigg (\int _0^t \frac{dN_i(u)}{{\hat{G}}_c(u\mid {\bar{X}}_i(u))}-\widehat{\mu }(t\mid X_i(t))\bigg )^2-\xi (t)\right| \\&=: C(t)+D(t). \end{aligned}$$

By decomposing the square term into three other terms, we bound C(t) in the following way: \(C(t)\le |C_1(t)|+|C_2(t)|+|C_3(t)|\) with

$$\begin{aligned} C_1(t)&=\int _{0\le u,v\le t}\int _{\mathcal X_{u,v}}\frac{G_c(u\mid x)G_c(v\mid y)-{\hat{G}}_c(u\mid x){\hat{G}}_c(v\mid y)}{{\hat{G}}_c(u\mid x){\hat{G}}_c(v\mid y)G_c(u\mid x)G_c(v\mid y)}\\&\qquad \mathbb E[dN(u)dN(v) \mid X(u)=x,X(v)=y]dF_{X(u),X(v)}(x,y),\\ C_2(t)&=-2 \int _{\mathcal X_{t}}(\widehat{\mu }(t\mid x)- \mu (t\mid x))\mu ^*(t\mid x)dF_{X(t)}(x),\\ C_3(t)&=\int _{\mathcal X_t} \left( \left( \widehat{\mu }(t\mid x)\right) ^2-\left( \mu (t\mid x)\right) ^2\right) dF_{X(t)}(x). \end{aligned}$$

For \(C_1(t)\) we have

$$\begin{aligned}&G_c(u\mid x)G_c(v\mid y)-{\hat{G}}_c(u-\mid x){\hat{G}}_c(v-\mid y)\\&=\left( {\hat{G}}(u-\mid x)-G(u-\mid x)\right) +\left( {\hat{G}}(v-\mid y)-G(v-\mid y)\right) \\&\quad +G(u-\mid x)\left( G(v-\mid y)-{\hat{G}}(v-\mid y)\right) +\hat{G}(v-\mid y)\left( G(u-\mid x)-{\hat{G}}(u-\mid x)\right) , \end{aligned}$$

and we can deal with all four terms in the same fashion. For instance, for the first term,

$$\begin{aligned}&\int _{0\le u, v\le t}\int _{\mathcal X_{u,v}}\frac{\left( {\hat{G}}(u-\mid x)-G(u-\mid x)\right) \mathbb E[dN(u)dN(v) \mid X(u)=x,X(v)=y]}{{\hat{G}}_c(u\mid x){\hat{G}}_c(v\mid y)G_c(u\mid x)G_c(v\mid y)}dF_{X(u),X(v)}(x,y)\\&\quad \le \int _{0}^{t}\int _{\mathcal X_{u}}\frac{\left| \hat{G}(u-\mid x)-G(u-\mid x)\right| \mathbb E[dN(u) \mid X(u)=x]}{\hat{G}_c(u\mid x)G_c(u\mid x)}dF_{X(u)}(x), \end{aligned}$$

using the fact that \(\int _0^t dN(v)/({\hat{G}}_c(v\mid y)G_c(v\mid y))\) is bounded. Then, since \(\int _0^t\mathbb E[dN(u)/(1-G(u-\mid X(u))) \mid X(u)=x]=\mu ^*(t\mid {\bar{X}}(t))\) and \({\hat{G}}_c(u\mid x)^{-1}\) is asymptotically uniformly bounded, we conclude that \(|C_1(t)|\) tends toward 0 in probability using the uniform consistency of the censoring estimator.
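The four-term expansion used for \(C_1(t)\) can be checked in two lines. The sketch below assumes \(G_c=1-G\) and \({\hat{G}}_c=1-{\hat{G}}\) (as suggested by the use of \(1-G(u-\mid X(u))\) in place of \(G_c\) above) and uses the shorthand \(a=G(u-\mid x)\), \(b=G(v-\mid y)\), \({\hat{a}}={\hat{G}}(u-\mid x)\), \({\hat{b}}={\hat{G}}(v-\mid y)\), which is introduced here only for illustration:

$$\begin{aligned} (1-a)(1-b)-(1-{\hat{a}})(1-{\hat{b}})&=({\hat{a}}-a)+({\hat{b}}-b)+ab-{\hat{a}}{\hat{b}},\\ ab-{\hat{a}}{\hat{b}}&=a(b-{\hat{b}})+{\hat{b}}(a-{\hat{a}}). \end{aligned}$$

Each of the four resulting terms contains a factor of the form \({\hat{G}}-G\), which is why the uniform consistency of the censoring estimator is enough to control \(C_1(t)\).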

For \(C_2(t)\) we use the consistency of \(\widehat{\mu }\) and the fact that \(\mathbb E[\mu ^*(t\mid {\bar{X}}(t))]\) is finite to prove that \(|C_2(t)|\) tends towards 0 in probability.

For \(C_3(t)\), we directly write \((\widehat{\mu }(t\mid x))^2-(\mu (t\mid x))^2=(\widehat{\mu }(t\mid x)-\mu (t\mid x))(\widehat{\mu }(t\mid x)+\mu (t\mid x))\) and we use the fact that \(\mu (t\mid x)\) is bounded and the consistency of \(\widehat{\mu }\) to prove that \(|C_3(t)|\) tends towards 0 in probability.

Similarly to C(t) we obtain the following bound: \(D(t)\le |D_1(t)|+|D_2(t)|+|D_3(t)|\) with

$$\begin{aligned} D_1(t)&=\frac{1}{n} \sum _{i=1}^n \int _{0\le u, v\le t} \frac{dN_i(u)dN_i(v)}{{\hat{G}}_c(u\mid {\bar{X}}_i(u)){\hat{G}}_c(v\mid {\bar{X}}_i(v))}-\xi _1(t),\\ D_2(t)&=-\frac{2}{n} \sum _{i=1}^n\int _0^t \frac{dN_i(u)}{{\hat{G}}_c(u\mid {\bar{X}}_i(u))}\widehat{\mu }(t\mid X_i(t))-\xi _2(t),\\ D_3(t)&=\frac{1}{n} \sum _{i=1}^n \Big (\widehat{\mu }(t\mid \bar{X}_i(t))\Big )^2-\xi _3(t). \end{aligned}$$

We now use the bound \(|D_1(t)|\le |D_{1,1}(t)| + |D_{1,2}(t)|+|D_{1,3}(t)|\) with

$$\begin{aligned} D_{1,1}(t)&=\frac{1}{n} \sum _{i=1}^n \int _{0\le u,v\le t} \frac{dN_i(u)dN_i(v)}{G_c(u\mid {\bar{X}}_i(u))G_c(v\mid {\bar{X}}_i(v))}\\&\quad - \int _{0\le u, v\le t}\int _{\mathcal X_{u,v}}\frac{\mathbb E[dN(u)dN(v) \mid X(u)=x,X(v)=y]}{G_c(u\mid x)G_c(v\mid y)}dF_{X(u),X(v)}(x,y),\\ D_{1,2}(t)&=\frac{1}{n} \sum _{i=1}^n \int _{0\le u,v\le t}\chi (u,v,X_i(u),X_i(v))dN_i(u)dN_i(v)\\ D_{1,3}(t)&=- \int _{0\le u, v\le t}\int _{\mathcal X_{u,v}}\chi (u,v,x,y)\mathbb E[dN(u)dN(v) \mid X(u)=x,X(v)=y]dF_{X(u),X(v)}(x,y) \end{aligned}$$

and

$$\begin{aligned} \chi (u,v,x,y)&=\left\{ \left( {\hat{G}}(u-\mid x)-G(u-\mid x)\right) +\left( {\hat{G}}(v-\mid y)-G(v-\mid y)\right) \right. \\&\quad +G(u-\mid x)\left( G(v-\mid y)-{\hat{G}}(v-\mid y)\right) \\&\quad +\left. {\hat{G}}(v-\mid y)\left( G(u-\mid x)-{\hat{G}}(u-\mid x)\right) \right\} \\&\quad \times \frac{1}{{\hat{G}}_c(u\mid x){\hat{G}}_c(v\mid y)G_c(u\mid x)G_c(v\mid y)}. \end{aligned}$$

The term \(|D_{1,1}(t)|\) converges towards 0 in probability by the strong law of large numbers. The term \(|D_{1,2}(t)|\) is bounded by

$$\begin{aligned} \sup _{u,v,x,y} |\chi (u,v,x,y)| \frac{1}{n} \sum _{i=1}^n \int _{0\le u, v\le t}dN_i(u)dN_i(v), \end{aligned}$$

where \(\sup _{u,v,x,y} |\chi (u,v,x,y)|\) converges towards 0 in probability by the uniform consistency of \({\hat{G}}\), while the other term converges towards a bounded quantity by the law of large numbers. The same argument applies to \(|D_{1,3}(t)|\), which also converges towards 0 in probability.

For \(D_2(t)\) we write \(|D_2(t)|\le |D_{2,1}(t)| + |D_{2,2}(t)|+ |D_{2,3}(t)|+ |D_{2,4}(t)|\) with

$$\begin{aligned} D_{2,1}(t)&=-\frac{2}{n} \sum _{i=1}^n\int _0^t \frac{dN_i(u)}{G_c(u\mid {\bar{X}}_i(u))}\mu (t\mid {\bar{X}}_i(t))+2 \int _{\mathcal X_{t}}\mu (t\mid x)\mu ^*(t\mid x)dF_{X(t)}(x),\\ D_{2,2}(t)&=\frac{2}{n} \sum _{i=1}^n\int _0^t \frac{dN_i(u)}{G_c(u\mid {\bar{X}}_i(u))}(\mu (t\mid {\bar{X}}_i(t))-\widehat{\mu }(t\mid {\bar{X}}_i(t))),\\ D_{2,3}(t)&=2 \int _{\mathcal X_{t}}(\widehat{\mu }(t\mid x)-\mu (t\mid x))\mu ^*(t\mid x)dF_{X(t)}(x),\\ D_{2,4}(t)&=\frac{2}{n} \sum _{i=1}^n\int _0^t \frac{(G(u-\mid \bar{X}_i(u))-{\hat{G}}(u-\mid {\bar{X}}_i(u)))dN_i(u)}{G_c(u\mid \bar{X}_i(u)){\hat{G}}_c(u-\mid {\bar{X}}_i(u))}\widehat{\mu }(t\mid {\bar{X}}_i(t)). \end{aligned}$$

The \(D_{2,1}(t)\) term converges towards 0 in probability by the law of large numbers. For \(D_{2,2}(t)\), \(D_{2,3}(t)\) and \(D_{2,4}(t)\) we use the consistency of \(\widehat{\mu }\), the convergence in probability of \(n^{-1}\sum _i\int _0^t dN_i(u)/G_c(u\mid {\bar{X}}_i(u))\), the boundedness of \(\mathbb E[\mu ^*(t\mid {\bar{X}}(t))]\), the uniform consistency of \({\hat{G}}\) and the asymptotic boundedness of \(\widehat{\mu }\) and \({\hat{G}}_c(u\mid x)^{-1}\) to prove that all three terms converge towards 0 in probability.

Finally, for \(D_3(t)\), we write

$$\begin{aligned} D_3(t)&= \frac{1}{n} \sum _{i=1}^n \Big (\mu (t\mid {\bar{X}}_i(t))\Big )^2-\int _{\mathcal X_t}\Big (\mu (t\mid x)\Big )^2dF_{X(t)}(x)\\&\quad + \frac{1}{n} \sum _{i=1}^n\left( \Big (\widehat{\mu }(t\mid {\bar{X}}_i(t))\Big )^2-\Big (\mu (t\mid {\bar{X}}_i(t))\Big )^2\right) \\&\quad -\int _{\mathcal X_t}\left( \Big (\widehat{\mu }(t\mid x)\Big )^2-\Big (\mu (t\mid x)\Big )^2\right) dF_{X(t)}(x). \end{aligned}$$

Each of the three terms converges towards 0 in probability using the law of large numbers for the first term and the uniform consistency of \(\widehat{\mu }\) for the other two.
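The empirical criterion studied above, \(n^{-1}\sum _i(\int _0^t dN_i(u)/{\hat{G}}_c(u\mid {\bar{X}}_i(u))-\widehat{\mu }(t\mid X_i(t)))^2\), can be sketched in a few lines. This is a minimal illustration rather than the paper's implementation: it assumes no terminal event and censoring independent of the covariates, so that every \(C_i\) is observed and \(G_c(u)=\mathbb P[C\ge u]\) can be estimated by an empirical proportion. The function names `censoring_survival` and `empirical_score` are hypothetical.

```python
import numpy as np

def censoring_survival(c, u):
    """Empirical estimate of G_c(u) = P(C >= u), assuming all C_i are observed."""
    c = np.asarray(c, dtype=float)
    u = np.atleast_1d(np.asarray(u, dtype=float))
    return (c[None, :] >= u[:, None]).mean(axis=1)

def empirical_score(event_times, c, pred, t):
    """Average of (sum_{events of i <= t} 1 / G_c_hat(event) - pred_i)^2.

    event_times: one sequence of recurrent event times per subject
    c:           censoring (end of follow-up) time of each subject
    pred:        predicted expected cumulative number of events at t
    """
    n = len(c)
    z = np.zeros(n)
    for i, times in enumerate(event_times):
        times = np.asarray(times, dtype=float)
        # keep only events observed before both t and the subject's censoring
        times = times[(times <= t) & (times <= c[i])]
        if times.size:
            # G_c_hat >= 1/n here because subject i is still at risk
            z[i] = np.sum(1.0 / censoring_survival(c, times))
    return float(np.mean((z - np.asarray(pred, dtype=float)) ** 2))

# Toy data: four subjects followed until times 1, 2, 3 and 4.
c = [1.0, 2.0, 3.0, 4.0]
events = [[0.5], [0.5, 1.5], [], [2.5]]
score = empirical_score(events, c, pred=[1.0, 1.0, 1.0, 1.0], t=3.0)
print(score)
```

With covariate-dependent censoring or a terminal event, the empirical proportion would have to be replaced by a suitable conditional estimator of \(G_c\); the rest of the computation would keep the same structure.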

8.3 Proof of Proposition 3

First, note that the Brier score can be written in the following way:

$$\begin{aligned} \mathrm {MSE^{Brier}}(t,\pi )= \mathbb E[S(t\mid X)] - 2\mathbb E[S(t\mid X)\pi (t\mid X)]+\mathbb E[(\pi (t\mid X))^2]. \end{aligned}$$

We now study our prediction score \(\textrm{MSE}'(t,\pi )\). Using standard martingale properties (see for instance Andersen et al. (1993)), we directly have that \(\mathbb E[dN(t)\mid X]=H(t\mid X)\lambda ^*(t\mid X)dt\), where \(H(t\mid X)=\mathbb P[T>t\mid X]=S(t\mid X) G_c(t\mid X)\) under independent censoring and \(\lambda ^*\) is the hazard rate of \(T^*\). As a consequence,

$$\begin{aligned} \mathbb E\left[ \int _0^t \frac{dN(u)}{G_c(u\mid X)}\mid X\right] =\int _0^t S(u\mid X)\lambda ^*(u\mid X)du=1-S(t\mid X), \end{aligned}\tag{14}$$

since \(S(u\mid X)\lambda ^*(u\mid X)\) is equal to the conditional density function of \(T^*\). Also, it is important to notice that

$$\begin{aligned} \mathbb E\left[ \left( \int _0^t \frac{dN(u)}{G_c(u\mid X)}\right) ^2\right] =\mathbb E\left[ \int _0^t \frac{dN(u)}{(G_c(u\mid X))^2}\right] =\mathbb E\left[ \int _0^t \frac{S(u\mid X)}{G_c(u\mid X)}\lambda ^*(u\mid X)du\right] , \end{aligned}$$

where the first equality is due to the fact that N can only jump once and thus \((\int _0^t dN(u)/(G_c(u\mid X)))^2\) is simply equal to \(\Delta I(T\le t)/(G_c(T\mid X))^2\). Now,

$$\begin{aligned} \textrm{MSE}'(t,\pi )&= \mathbb E\left[ \left( 1-\int _0^t \frac{dN(u)}{G_c(u\mid X)}\right) ^2\right] -2 \mathbb E\left[ \left( 1-\int _0^t \frac{dN(u)}{G_c(u\mid X)}\right) \pi (t\mid X)\right] \\&\quad +\mathbb E[(\pi (t\mid X))^2]\\&=1-2\mathbb E[(1-S(t\mid X))]+\mathbb E\left[ \left( \int _0^t \frac{dN(u)}{G_c(u\mid X)}\right) ^2\right] -2\mathbb E[S(t\mid X)\pi (t\mid X)]\\&\quad +\mathbb E[(\pi (t\mid X))^2]\\&=\mathrm {MSE^{Brier}}(t,\pi )+B(t), \end{aligned}$$

with

$$\begin{aligned} B(t)=-\mathbb E[1-S(t\mid X)]+\mathbb E\left[ \int _0^t \frac{S(u\mid X)}{G_c(u\mid X)}\lambda ^*(u\mid X)du\right] . \end{aligned}$$

Now, using Eq. (14), we can rewrite B(t) in the following way:

$$\begin{aligned} B(t)&=-\mathbb E\left[ \int _0^t S(u\mid X)\lambda ^*(u\mid X)du\right] +\mathbb E\left[ \int _0^t \frac{S(u\mid X)}{G_c(u\mid X)}\lambda ^*(u\mid X)du\right] \\&=\mathbb E\left[ \int _0^t \frac{G(u-\mid X)}{G_c(u\mid X)}S(u\mid X)\lambda ^*(u\mid X)du\right] . \end{aligned}$$

This shows that \(B(t)\ge 0\) and that this quantity does not depend on \(\pi\).
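The two expressions of \(B(t)\) can be compared numerically in a simple case. The sketch below assumes, purely for illustration, an exponential event time with rate `a` and an exponential censoring time with rate `b`, both independent of \(X\), so that \(S(u)=e^{-au}\), \(\lambda ^*(u)=a\), \(G_c(u)=e^{-bu}\) and \(G=1-G_c\). Under these assumptions and for \(a\ne b\), direct integration gives \(B(t)=a(e^{(b-a)t}-1)/(b-a)-(1-e^{-at})\). The function names are hypothetical.

```python
import numpy as np

def trapezoid(y, x):
    """Composite trapezoidal rule on the grid x."""
    return float(np.sum((y[1:] + y[:-1]) * np.diff(x)) / 2.0)

def b_term(t, a=1.0, b=0.5, m=20001):
    """Return B(t) computed from its two equivalent expressions.

    Assumes S(u) = exp(-a u), hazard a, G_c(u) = exp(-b u), G = 1 - G_c.
    """
    u = np.linspace(0.0, t, m)
    S = np.exp(-a * u)
    Gc = np.exp(-b * u)
    # -E[1 - S(t)] + E[int_0^t S / G_c * hazard du]
    first = -(1.0 - np.exp(-a * t)) + trapezoid(S / Gc * a, u)
    # E[int_0^t G / G_c * S * hazard du]
    second = trapezoid((1.0 - Gc) / Gc * S * a, u)
    return first, second

def b_closed_form(t, a=1.0, b=0.5):
    """Closed form of B(t) in the exponential case, valid for a != b."""
    return float(a * np.expm1((b - a) * t) / (b - a) - (1.0 - np.exp(-a * t)))

first, second = b_term(1.0)
print(first, second, b_closed_form(1.0))
```

In this example the three values agree up to quadrature error and are positive, and \(B(t)\) grows with the censoring rate `b`: heavier censoring inflates the gap between the proposed score and the Brier score without affecting the ranking of models.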

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Bouaziz, O. Assessing model prediction performance for the expected cumulative number of recurrent events. Lifetime Data Anal 30, 262–289 (2024). https://doi.org/10.1007/s10985-023-09610-x

  • DOI: https://doi.org/10.1007/s10985-023-09610-x
