Modelling the joint distribution of competing risks survival times using copula functions
Introduction
The competing risks model with independent failure time random variables has been considered by a number of authors in the (bio)statistical, econometric, medical, demographic and actuarial literature, and the list of references, scattered throughout these areas, is extensive. We mention here the textbooks by David and Moeschberger (1978), Elandt-Johnson and Johnson (1980) and Bowers et al. (1997), the recent papers by Salinas-Torres et al. (2002) and Bryant and Dignam (2004), and the papers by Zheng and Klein (1994, 1995), where statistical methods for estimating related survival functions are considered.
The competing risk model, alternatively referred to as the multiple-decrement model, has also been considered under the assumption of dependence between the failure times in the early work of Elandt-Johnson (1976). Later, Yashin et al. (1986) considered conditional independence of the times to death, given an assumed stochastic covariate process. More recently, Carriere (1994) and Escarela and Carriere (2003) modelled dependence between two failure times by a two-dimensional copula. Carriere (1994) has used a bivariate Gaussian copula to model the effect of complete elimination of one of two competing causes of death on human mortality. However, the mortality data used by Carriere (1994) was not complete with respect to older ages, and therefore it was not possible to calculate such important survival characteristics as expected lifetimes and life annuities and draw relevant conclusions. In Escarela and Carriere (2003), the bivariate Frank copula was fitted to a prostate cancer data set. The issues of identifiability of marginal survival functions in a copula-based competing risk model have been considered by Tsiatis (1975), Prentice et al. (1978), Heckman and Honore (1989) and later by Carriere (1994).
In this paper, we consider further the copula-based competing risk model studied by Carriere (1994). We investigate its sensitivity with respect to alternative choices of bivariate copula and its parameter(s). For this purpose, we have closed the survival model by applying a method of spline extrapolation up to a limiting age of 120 and have explored the Gaussian copula, the Student’s t-copula, the Frank copula and the Plackett copula as alternatives. As discussed in Section 3, these copulas allow for modelling the dependence between failure times within the entire range, from perfectly negative to perfectly positive dependence. They belong to different families with different properties, and hence are appropriate for studying the sensitivity of the model. We develop this methodology so as to model the effect of both partial and complete elimination of a cause of death on human mortality. The construction of multiple-decrement tables, derived from the multivariate competing risk model, is also addressed.
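The range of dependence mentioned above can be checked numerically for one of the four families. The following is a minimal sketch, not the authors' implementation: it evaluates the standard bivariate Frank copula and compares it with the independence copula and with the lower and upper Fréchet-Hoeffding bounds. The function names and the test values of the parameter theta are illustrative assumptions.

```python
import math


def frank_copula(u, v, theta):
    """Standard bivariate Frank copula C(u, v) with dependence parameter theta.

    theta > 0 gives positive dependence, theta < 0 negative dependence,
    and theta -> 0 recovers independence. Only moderate |theta| is supported
    here, since math.expm1 overflows for very large negative theta.
    """
    if theta == 0.0:
        return u * v  # limiting case: independence copula
    # expm1 and log1p keep the expression accurate when |theta| is small.
    ratio = math.expm1(-theta * u) * math.expm1(-theta * v) / math.expm1(-theta)
    return -math.log1p(ratio) / theta


def lower_bound(u, v):
    """Frechet-Hoeffding lower bound (perfectly negative dependence)."""
    return max(u + v - 1.0, 0.0)


def upper_bound(u, v):
    """Frechet-Hoeffding upper bound (perfectly positive dependence)."""
    return min(u, v)


if __name__ == "__main__":
    u, v = 0.7, 0.6
    for theta in (-50.0, -5.0, 1e-8, 5.0, 50.0):
        print(f"theta={theta:>8}: C(u, v) = {frank_copula(u, v, theta):.4f}")
    print("lower bound:", lower_bound(u, v), " product:", u * v,
          " upper bound:", upper_bound(u, v))
```

As theta moves from large negative to large positive values, the printed copula values move from the lower bound, through the product u·v, towards the upper bound.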
Since most real-life applications are truly multivariate, i.e., there are more than two mutually dependent competing causes of decrement, our further goal here is to extend the model to the multi-dimensional case and explore its applicability. This is in general a difficult task, since bivariate copula theory does not extend to the multivariate case in a direct way. Although some fundamental results (e.g. Sklar’s theorem) hold, constructing a multivariate copula involves some open problems: for example, there is no unique multivariate dependence measure which extends the (bivariate) definitions of Kendall’s τ and Spearman’s ρ, and the computational complexity increases. This makes multivariate copula applications less appealing. Here we have explored the applicability of the four-dimensional Gaussian, t- and Frank copulas to model the joint distribution of four competing risks: heart diseases, cancer, respiratory diseases and other causes of death, grouped together. The effect of simultaneously removing one, two or three of them on the overall survival, on the life expectancy at birth and at age 65, and on the value of a life annuity, which are important in pricing life insurance products, is also studied.
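To make the summary indices concrete, the sketch below shows one common way of computing a curtate life expectancy and the value of a life annuity-immediate from a survival function. It is an illustration under stated assumptions, not the authors' computation: the Gompertz survival function and its parameters are hypothetical placeholders, the discrete (yearly) formulas are a simplification, and the paper's exact definitions may differ.

```python
import math


def gompertz_survival(t, b=3e-5, c=1.1):
    """Hypothetical toy survival function S(t) = exp(-b (c^t - 1) / ln c).

    The parameters b and c are illustrative only, not fitted to any data.
    """
    return math.exp(-b * (c ** t - 1.0) / math.log(c))


def curtate_life_expectancy(survival, age, limiting_age=120):
    """Expected number of complete future years lived by a life aged `age`:
    the sum over k >= 1 of S(age + k) / S(age), truncated at the limiting age.
    """
    s_x = survival(age)
    return sum(survival(age + k) for k in range(1, limiting_age - age + 1)) / s_x


def life_annuity(survival, age, interest_rate, limiting_age=120):
    """Expected present value of 1 paid at the end of each year survived:
    the sum over k >= 1 of v^k S(age + k) / S(age), with v = 1 / (1 + i).
    """
    v = 1.0 / (1.0 + interest_rate)
    s_x = survival(age)
    return sum(v ** k * survival(age + k)
               for k in range(1, limiting_age - age + 1)) / s_x


if __name__ == "__main__":
    for age in (0, 65):
        e_x = curtate_life_expectancy(gompertz_survival, age)
        a_x = life_annuity(gompertz_survival, age, interest_rate=0.05)
        print(f"age {age}: curtate expectancy {e_x:.2f}, annuity value {a_x:.2f}")
```

Replacing the toy survival function by an overall survival function with one or more causes removed would give the corresponding change in these indices.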
In the next section, we introduce the dependent multiple-decrement model and the related crude, net and overall survival functions. Section 3 is devoted to copulas and their properties, and provides background material on the Gaussian, the Student’s t-, the Frank and the Plackett copulas, which we have used to implement the proposed methodology. In Section 4, we address the problem of selecting an appropriate copula function and estimating its parameter(s). Then, in Section 5, we describe how, given some estimates of the crude survival functions and an appropriately selected copula, one can evaluate the net survival functions for the corresponding competing risks by solving a system of non-linear differential equations. In Section 6, we show how, by introducing an appropriate function, one can modify the net survival functions, obtained as solutions of the system of differential equations, so as to model not only complete but also partial elimination of any of the causes of death in the model. Finally, in Section 7 the proposed methodology is applied to the general US population, using a cause-specific mortality data set provided by the National Center for Health Statistics (NCHS, 1999). Extensive numerical results and graphs illustrate the effect on survival of complete (partial) elimination of cancer, as a cause of death, in a two-dimensional decrement model, and the elimination of any combination of heart diseases, cancer and respiratory diseases in a four-variate model. Details of how the raw mortality data were used to obtain the crude survival functions, and of the method of smoothing and extrapolating the latter up to age 120, are provided in the Appendix.
Section snippets
The dependent multiple-decrement model
We consider a group of lives, exposed to competing causes of death, i.e., to causes of withdrawal from the group. It is assumed that each individual may die from any single one of the causes. To make the problem mathematically tractable, it is assumed that, at birth, each individual is assigned a vector of times, one for each cause, representing his/her potential lifetime if he/she were to die from that cause. Obviously, the actual lifetime span is the
Copulas and their properties
Copulas provide a very convenient way to model and measure the dependence between failure time random variables, since they give the dependence structure which relates the known marginal distributions of the failure times to their multivariate joint distribution. In order to see this, we first provide a short introduction to copulas.
A copula is conventionally defined as a multivariate cumulative distribution function with uniform margins. A
Estimating the model parameters
As noted in Section 2, in order to introduce and evaluate the functions of interest, arising from the competing risk model, one needs to specify a suitable copula, define its parameters and provide estimates of the crude survival functions, based on an appropriate multiple, cause-specific mortality table. This will be discussed in somewhat greater detail in this section.
Evaluating the net and overall survival functions
Having fixed the copula function, one may use (8) to evaluate the joint survival function, provided the net survival functions are known. In order to find them, we may use the relationship between and the crude survival functions, given by Heckman and Honore (1989) and also by Carriere (1994). Thus, under the assumption of differentiability of with respect to and of with respect to
Partial and complete disease elimination
In order to study the effect of partial and complete disease elimination, we have adopted the following approach. Let us recall that, in our model, the failure times are assumed to be the future lifetime spans of a newborn individual under the operation of the competing causes of death, i.e., all the survival functions introduced up to now refer to age zero. We will now need to adjust explicitly the adopted notation for the crude and net survival functions, by adding a subscript, indicating age at
Numerical results
In this section, we apply the methodology, described earlier, to a real data set, related to the US female general population, in which the data are grouped by causes of death, using “Table 10. Number of life table deaths from specific causes during age interval for the female population: United States, 1989–91” of the US Decennial Life Tables for 1989–91 (see NCHS (1999)). For ease of presentation, we consider the two-dimensional and the multi-dimensional competing risk models separately.
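As an illustration of how a table of life table deaths by cause can be turned into crude survival functions, the sketch below uses the usual definition of a crude survival function as the probability of surviving beyond a given age and eventually dying from a given cause. It is a minimal sketch under stated assumptions: the function name is hypothetical, the death counts are made-up toy numbers and not NCHS data, the table is assumed to be closed (every life dies within it), and the smoothing and extrapolation described in the Appendix are not reproduced.

```python
def crude_survival_functions(deaths_by_cause):
    """Estimate crude and overall survival functions at integer ages.

    deaths_by_cause maps a cause label to a list whose x-th entry is the
    number of life table deaths from that cause in the age interval
    [x, x + 1). All lists must have the same length n.

    Returns (crude, overall) where crude[cause][x] estimates
    Pr(lifetime > x and death is from `cause`), and overall[x] is the sum
    of the crude values over causes, for x = 0, ..., n.
    """
    n = len(next(iter(deaths_by_cause.values())))
    # Radix: total number of deaths, i.e. the size of the initial cohort
    # when the table is closed.
    radix = sum(sum(deaths) for deaths in deaths_by_cause.values())

    crude = {}
    for cause, deaths in deaths_by_cause.items():
        tail = [0.0] * (n + 1)
        # Accumulate deaths from this cause at ages x and above.
        for x in range(n - 1, -1, -1):
            tail[x] = tail[x + 1] + deaths[x]
        crude[cause] = [t / radix for t in tail]

    overall = [sum(crude[cause][x] for cause in crude) for x in range(n + 1)]
    return crude, overall


if __name__ == "__main__":
    # Made-up toy counts over four age intervals, for illustration only.
    toy_table = {"cancer": [1, 2, 3, 4], "other": [2, 3, 5, 0]}
    crude, overall = crude_survival_functions(toy_table)
    for cause, values in crude.items():
        print(cause, [round(s, 3) for s in values])
    print("overall", [round(s, 3) for s in overall])
```

By construction, the crude survival functions at age zero sum to one, and their sum at each age gives the overall survival function.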
Conclusions
The objective of this paper is to demonstrate how copulas may be used to model dependence among causes of death, in order to analyze the impact of the complete or partial elimination of causes of death on survival functions and related indices, such as expectations of life and annuity values.
The paper extends the earlier work of Carriere (1994) and Valdez (2001) to include more than two competing risks, to investigate the sensitivity of the model to the choice of copula and to
Acknowledgement
The authors would like to thank the anonymous referee for his valuable comments and suggestions which helped to improve the presentation of the paper.
References (35)
Bivariate analysis of survivorship and persistency. Insurance: Mathematics and Economics (2001)
et al. (1997)
A Practical Guide to Splines (2001)
et al. Semiparametric models for cumulative incidence functions. Biometrics (2004)
Approaches and experiences in projecting mortality patterns for oldest-old. North American Actuarial Journal (2004)
Dependent decrement theory. Transactions, Society of Actuaries (1994)
et al. Copula Methods in Finance (2004)
et al. Revised regional model life tables at very low levels of mortality. Population Index (1989)
et al. Defects in data on old-age mortality in the United States: New procedures for calculating schedules and life tables at the higher ages. Asian and Pacific Population Forum (1990)
et al. The Theory of Competing Risks (1978)