An empirical comparison of methods for predicting net survival
Introduction
Cancer registries routinely publish estimates of net survival, and studies comparing cancer survival between countries or regions, and across time periods, are largely based on such estimates [1], [2], [3], [4]. A net survival curve shows the proportion of patients still alive at a given time, assuming the cancer of interest is the only possible cause of death. A rigorous theoretical explanation of net survival has been given by Perme et al. [5]. Unfortunately, for recently diagnosed patient cohorts, only short-term observed net survival is available because of the short follow-up time. Therefore, predictions are needed to estimate net survival for recently diagnosed patients. Throughout this article the term prediction is used for situations when follow-up is incomplete, whereas the term estimation is used when follow-up is complete.
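The definition above can be written compactly. The following is a sketch in notation commonly used in the relative survival literature; the symbols are assumptions introduced here for illustration and are not taken from this article. For patient i, the observed hazard is split into an expected (population) hazard and an excess hazard due to the cancer, and net survival is the average of the individual survival functions implied by the excess hazard alone:

```latex
\lambda_{O,i}(t) = \lambda_{P,i}(t) + \lambda_{E,i}(t),
\qquad
S_{N}(t) = \frac{1}{n}\sum_{i=1}^{n}
  \exp\!\left(-\int_{0}^{t} \lambda_{E,i}(u)\,\mathrm{d}u\right)
```

Here \lambda_{P,i} would typically be taken from population life tables matched on age, sex and calendar year, so only the excess part has to be estimated from the registry data.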
Estimates of net survival can be obtained using the cohort approach, which requires complete potential follow-up on all patients, meaning that 5-year net survival can be estimated only for patients diagnosed at least 5 years ago. However, as treatment regimens change, estimates from the cohort approach quickly become outdated. Alternatively, one could estimate net survival by the so-called period approach, which was introduced in 1996 [6]. The period approach fixes an observation window, and net survival is estimated by left truncation at the start of the window and right censoring at the end of the window. Table 1 illustrates how the period approach may be used to predict net survival up to 10 years for the cohort of patients diagnosed in 2008–2012. Here, patients diagnosed in the period 1998–2007 who are still alive are considered to be at risk from January 1st 2008 until death or until the censoring date, December 31st 2012. In situations where there is delayed recording of incident cases, a commonly implemented period analysis would not make use of the follow-up information available after the last year of recorded cases. Table 1 illustrates this by showing that the period approach does not use any of the follow-up information in 2013. A natural way to solve this would be to shift the observation window 1 year forward to 2009–2013. With this shift, all conditional survival estimates, apart from the first year, contain contributions from 5 potential years of diagnosis. In this situation the observation window should be widened for the first year to also include patients diagnosed in 2008 (see Table 1). This subtle change in implementation is referred to as the hybrid approach, and was introduced in 2004 [7]. The hybrid approach does not restrict follow-up to the last year of diagnosis. Instead, the estimate of net survival is obtained by letting the time at which individuals become at risk (the start of the observation window) differ according to the time of diagnosis.
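The left truncation and right censoring described above can be sketched in a few lines of Python. This is a minimal illustration, not the implementation used in the study: the function names, the 365.25-day year, and in particular the rule in `hybrid_window_start` (patients diagnosed in the last year with recorded cases enter at diagnosis, all others at the shifted window start) are assumptions, since Table 1 is not reproduced here.

```python
from datetime import date
from typing import Optional, Tuple

DAYS_PER_YEAR = 365.25  # assumed conversion from days to years


def period_interval(
    diagnosis: date,
    exit_date: date,      # date of death or last follow-up
    died: bool,
    window_start: date,
    window_end: date,
) -> Optional[Tuple[float, float, bool]]:
    """Return (entry_time, exit_time, event) in years since diagnosis,
    or None if the patient contributes no time inside the window."""
    entry = max(diagnosis, window_start)   # left truncation at window start
    stop = min(exit_date, window_end)      # right censoring at window end
    if stop <= entry:
        return None
    # A death counts as an event only if it happens inside the window.
    event = died and exit_date <= window_end
    t_entry = (entry - diagnosis).days / DAYS_PER_YEAR
    t_exit = (stop - diagnosis).days / DAYS_PER_YEAR
    return (t_entry, t_exit, event)


def hybrid_window_start(
    diagnosis: date,
    last_incidence_year: int = 2008,            # hypothetical default
    shifted_start: date = date(2009, 1, 1),     # hypothetical default
) -> date:
    """Assumed hybrid rule: the window start depends on the year of diagnosis."""
    if diagnosis.year == last_incidence_year:
        return date(last_incidence_year, 1, 1)
    return shifted_start


if __name__ == "__main__":
    # Period approach, window 2008-2012: a patient diagnosed in 2000 who died in 2010.
    print(period_interval(date(2000, 1, 1), date(2010, 1, 1), True,
                          date(2008, 1, 1), date(2012, 12, 31)))
    # Hybrid approach, follow-up to the end of 2013: a patient diagnosed in 2008.
    d = date(2008, 5, 1)
    print(period_interval(d, date(2013, 12, 31), False,
                          hybrid_window_start(d), date(2013, 12, 31)))
```

The returned triples are in the delayed-entry format that most survival software accepts, so the same estimator can be applied to either approach once the intervals have been built.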
Alternatively, predicted net survival can be obtained by fitting a model including historical data, predicting net survival using the estimated parameter values. This is a useful approach if the assumptions of the underlying model are met. In this study, predictions of net survival were obtained from a flexible parametric cumulative excess hazard model (FPM).
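For orientation, a flexible parametric model of this kind is usually written on the log cumulative excess hazard scale, with a restricted cubic spline in log time as the baseline. The form below is a generic sketch under that assumption; the symbols, and the absence of time-dependent effects, are illustrative choices and may differ from the specification used in this study.

```latex
\ln \Lambda_{E}(t \mid \mathbf{x}) = s\!\left(\ln t;\, \boldsymbol{\gamma}\right) + \mathbf{x}^{\top}\boldsymbol{\beta},
\qquad
S_{N}(t \mid \mathbf{x}) = \exp\!\left\{-\Lambda_{E}(t \mid \mathbf{x})\right\}
```

Here s(\cdot;\boldsymbol{\gamma}) denotes the spline function with coefficients \boldsymbol{\gamma}, and \mathbf{x} would hold covariates such as age or year of diagnosis. Including year of diagnosis among the covariates is what lets a model fitted to historical data extrapolate to recently diagnosed patients.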
Several empirical studies have concluded that the period and hybrid approaches are useful for predicting net survival [8], [9], [10]. Studies have also concluded that model-based predictions of net survival are accurate [11], [12]. To our knowledge, no empirical comparison has been made between predictions obtained from the period and hybrid approaches and predictions obtained from flexible parametric models. The aim of this study was to empirically compare predictions of net survival obtained from a flexible parametric excess hazard model to predictions obtained using the period and hybrid approaches.
Data material
We included all diagnoses of cancer reported to the Cancer Registry of Norway between 1953 and 2008. Cancer cases were grouped into 23 categories based on topography, according to the annual report, Cancer in Norway (http://www.kreftregisteret.no/no/Generelt/Publikasjoner/Cancer-in-Norway/Cancer-in-Norway-2013/). Cancers diagnosed at autopsy were excluded from the analyses. A total of 453,202 cancers among 417,138 men and 419,386 cancers among 388,227 women were included in the analysis. Data
Results
The results for 5-year predicted net survival showed that the period approach never produced the lowest average prediction error. The hybrid approach produced the lowest average prediction error in nine of the 23 cancer sites: mouth/pharynx, colon, rectum, cervix uteri, ovary, central nervous system, thyroid, non-Hodgkin lymphoma and leukemia. FPM produced the lowest average prediction error for the remaining 14 cancer sites: esophagus, stomach, liver, gallbladder, pancreas, lung, skin
Discussion
This empirical evaluation of different methods for predicting net survival shows that FPM provides better predictions than the standard non-parametric approaches. The analysis also indicates that the hybrid approach is better than the period approach. FPM systematically outperformed the other methods when predicting net survival for cancer sites with poor survival, and was also better at predicting net survival for cancer sites where survival has improved over time. FPM consistently gave the
Conclusions
We empirically compared predicted net survivals obtained from a flexible parametric cumulative excess hazard model, with predictions obtained from the period and hybrid approaches. When predicting net survival up to 5 years after diagnosis, the differences were quite small. When predicting net survival up to and exceeding 10 years after diagnosis, the differences between the methods increased. Overall, the FPM and the hybrid approach performed better than the period approach, with FPM being
Conflicts of interest
No conflicts of interest.
Authors' contributions
Tor Åge Myklebust formulated the research question, chose the methods to use, collected data, performed the analyses, wrote and reviewed the manuscript.
Bjarte Aagnes formulated the research question, chose the methods to use and critically reviewed the manuscript.
Bjørn Møller formulated the research question, chose the methods to use and critically reviewed the manuscript.
References (16)
- et al. Cancer survival in Europe 1999–2007 by country and age: results of EUROCARE-5 – a population-based study. Lancet Oncol. (2014)
- et al. Cancer survival in Australia, Canada, Denmark, Norway, Sweden, and the UK, 1995–2007 (the International Cancer Benchmarking Partnership): an analysis of population-based cancer registry data. Lancet (2011)
- et al. Hybrid analysis for up-to-date long-term survival rates in cancer registries with delayed recording of incident cases. Eur. J. Cancer (2004)
- et al. Period analysis for 'up-to-date' cancer survival data: theory, empirical evaluation, computational realisation and applications. Eur. J. Cancer (2004)
- et al. Annual report to the nation on the status of cancer, 1975–2001, with a special feature regarding survival. Cancer (2004)
- et al. Trends in the overall survival of cancer patients diagnosed 1964–2003 in the Nordic countries followed up to the end of 2006: the importance of case-mix. Acta Oncol. (2010)
- et al. On estimation in relative survival. Biometrics (2012)
- et al. An alternative approach to monitoring cancer patient survival. Cancer (1996)