Modelling the joint distribution of competing risks survival times using copula functions

https://doi.org/10.1016/j.insmatheco.2006.11.006

Abstract

The problem of modelling the joint distribution of survival times in a competing risks model, using copula functions, is considered. In order to evaluate this joint distribution and the related overall survival function, a system of non-linear differential equations is solved, which relates the crude and net survival functions of the modelled competing risks, through the copula. A similar approach to modelling dependent multiple decrements was applied by Carriere [Carriere, J., 1994. Dependent decrement theory. Transactions, Society of Actuaries XLVI, 45–65], who used a Gaussian copula applied to an incomplete double-decrement model, which makes it difficult to calculate any actuarial functions and draw relevant conclusions. Here, we extend this methodology by studying the effect of complete and partial elimination of up to four competing risks on the overall survival function, the life expectancy and life annuity values. We further investigate how different choices of the copula function affect the resulting joint distribution of survival times and, in particular, the actuarial functions which are of importance in pricing life insurance and annuity products. For illustrative purposes, we have used a real data set and applied extrapolation to prepare a complete multiple-decrement model up to age 120. Extensive numerical results illustrate the sensitivity of the model with respect to the choice of copula and its parameter(s).

Introduction

The competing risks model with independent failure time random variables has been considered by a number of authors in the (bio)statistical, econometric, medical, demographic and actuarial literature, and the list of references, scattered throughout these areas, is extensive. We mention here the textbooks by David and Moeschberger (1978), Elandt-Johnson and Johnson (1980) and Bowers et al. (1997), the recent papers by Salinas-Torres et al. (2002) and Bryant and Dignam (2004), and those by Zheng and Klein (1994, 1995), where statistical methods for estimating related survival functions are considered.

The competing risk model, alternatively referred to as the multiple-decrement model, has also been considered under the assumption of dependence between the failure times in the early work of Elandt-Johnson (1976). Later, Yashin et al. (1986) considered conditional independence of the times to death, given an assumed stochastic covariate process. More recently, Carriere (1994) and Escarela and Carriere (2003) modelled dependence between two failure times by a two-dimensional copula. Carriere (1994) has used a bivariate Gaussian copula to model the effect of complete elimination of one of two competing causes of death on human mortality. However, the mortality data used by Carriere (1994) was not complete with respect to older ages, and therefore it was not possible to calculate such important survival characteristics as expected lifetimes and life annuities and draw relevant conclusions. In Escarela and Carriere (2003), the bivariate Frank copula was fitted to a prostate cancer data set. The issues of identifiability of marginal survival functions in a copula-based competing risk model have been considered by Tsiatis (1975), Prentice et al. (1978), Heckman and Honore (1989) and later by Carriere (1994).

In this paper, we will consider further the copula-based competing risk model, studied by Carriere (1994). We will investigate its sensitivity with respect to alternative choices of bivariate copula and its parameter(s). For this purpose, we have closed the survival model by applying a method of spline extrapolation up to a limiting age 120 and have explored the Gaussian copula, the Student’s t-copula, the Frank copula and the Plackett copula as alternatives. As discussed in Section 3, these copulas allow for modelling the dependence between failure times within the entire age range, from perfectly negative, to perfectly positive dependence. They belong to different families with different properties, and hence are appropriate for studying the sensitivity of the model. We develop this methodology, so as to model the effect of both partial and complete elimination of a cause of death on human mortality. The construction of multiple-decrement tables, derived from the multivariate competing risk model, is also addressed.

Since most real-life applications are truly multivariate, i.e., there are more than two mutually dependent competing causes of decrement, our further goal here will be to extend and explore the applicability of the model to the multi-dimensional case. This is in general a difficult task, since the bivariate copula theory does not extend to the multivariate case in a direct way. Although some fundamental results (e.g. Sklar’s theorem) hold, constructing a multivariate copula raises some open problems: for example, there is no unique multivariate dependence measure which extends the (bivariate) definitions of Kendall’s τ and Spearman’s ρS, and the computational complexity increases. This makes multivariate copula applications less appealing. Here we have explored the applicability of the four-dimensional Gaussian, t- and Frank copulas to model the joint distribution of four competing risks: heart diseases, cancer, respiratory diseases and other causes of death, grouped together. The effect of simultaneously removing one, two or three of them on the overall survival, on the life expectancy at birth and at age 65, and on the value of a life annuity, which are important in pricing life insurance products, is also studied.

In the next section, we introduce the dependent multiple-decrement model and the related crude, net and overall survival functions. Section 3 is devoted to copulas and their properties, and provides background material on the Gaussian, the Student’s t-, the Frank and the Plackett copulas, which we have used to implement the proposed methodology. In Section 4, we address the problem of selecting an appropriate copula function and estimating its parameter(s). Then, in Section 5, we describe how, given some estimates of the crude survival functions and an appropriately selected copula, one can evaluate the net survival functions for the corresponding competing risks, by solving a system of non-linear differential equations. In Section 6, we show how, by introducing an appropriate function, one can modify the net survival functions, obtained as solutions of the system of differential equations, so as to model not only complete but also partial elimination of any of the causes of death in the model. Finally, in Section 7 the proposed methodology is applied to the general US population, using a cause-specific mortality data set provided by the National Center for Health Statistics, NCHS (1999). Extensive numerical results and graphs illustrate the effect on survival of complete (partial) elimination of cancer, as a cause of death, in a two-dimensional decrement model, and the elimination of any combination of heart diseases, cancer and respiratory diseases in a four-variate model. Details of how the raw mortality data were used to obtain the crude survival functions and the method of smoothing and extrapolating the latter up to age 120 are provided in the Appendix.

Section snippets

The dependent multiple-decrement model

We consider a group of lives, exposed to m competing causes of death, i.e., to m causes of withdrawal from the group. It is assumed that each individual may die from any single one of the m causes. To make the problem more formally (mathematically) tractable it is assumed that, at birth, each individual is assigned a vector of times $T_1,\ldots,T_m$, $0\le T_j<\infty$, $j=1,\ldots,m$, representing his/her potential lifetime, if he/she were to die from each one of the m causes. Obviously, the actual lifetime span is the

Copulas and their properties

Copulas provide a very convenient way to model and measure the dependence between failure time random variables since they give the dependence structure which relates the known marginal distributions of the failure times to their multivariate joint distribution. In order to see this, we first provide a short introduction on copulas.

If we assume that $u=(u_1,\ldots,u_m)$, $u_j\in[0,1]$, an m-copula $C(u)$ is conventionally defined as a multivariate cumulative distribution function with uniform margins. A

Estimating the model parameters

As noted in Section 2, in order to introduce and evaluate the functions of interest, arising from the competing risk model, one needs to specify a suitable copula, define its parameters and provide estimates of the crude survival functions, based on an appropriate multiple, cause-specific mortality table. This will be discussed in somewhat greater detail in this section.
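As a rough illustration of the last ingredient, the sketch below builds empirical crude survival functions from a table of life-table deaths by cause and age interval. It is not the authors' procedure, which also involves smoothing and extrapolation up to age 120. The function names and the toy table are hypothetical, and the sketch assumes a closed table (everyone dies within the tabulated age range), so that the radix equals the total number of deaths.

```python
def crude_survival(deaths_by_cause):
    """Empirical crude survival functions from a closed multiple-decrement table.

    deaths_by_cause[j][x] is the number of life-table deaths from cause j
    in age interval [x, x+1).  The returned curve for cause j gives, at each
    integer age x, the probability of surviving to x and eventually dying
    from cause j.  Assumption: the table is closed, so the radix is the
    total number of deaths over all causes and ages.
    """
    radix = sum(sum(deaths) for deaths in deaths_by_cause)
    curves = []
    for deaths in deaths_by_cause:
        remaining = sum(deaths)          # deaths from this cause at ages >= 0
        curve = [remaining / radix]
        for d in deaths:
            remaining -= d               # drop deaths in the interval just passed
            curve.append(remaining / radix)
        curves.append(curve)
    return curves


def overall_survival(curves):
    """Overall survival is the sum of the crude survival functions at each age."""
    return [sum(values) for values in zip(*curves)]


# Hypothetical toy table: two causes, three one-year age intervals.
toy_deaths = [[10, 20, 30], [5, 15, 20]]
curves = crude_survival(toy_deaths)
overall = overall_survival(curves)
print(curves)
print(overall)
```

Note that each crude survival function starts at the overall probability of dying from that cause rather than at one; only their sum, the overall survival function, equals one at age zero.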

Evaluating the net and overall survival functions

Having fixed the copula function, $C(u_1,\ldots,u_m)$, one may use (8) and evaluate the joint survival function $S(t_1,\ldots,t_m)=C(S'^{(1)}(t_1),\ldots,S'^{(m)}(t_m))$ if the net survival functions $S'^{(j)}(t_j)$, $j=1,\ldots,m$ were known. In order to find them, we may use the relationship between $S'^{(j)}(t)$ and the crude survival functions, $S^{(j)}(t)$, $j=1,\ldots,m$ given by Heckman and Honore (1989) and also by Carriere (1994). Thus, under the assumption of differentiability of $C(u_1,\ldots,u_m)$ with respect to $u_j\in(0,1)$ and of $S'^{(j)}(t_j)$ with respect to
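To make the idea concrete, here is a minimal sketch for two causes. Write $S'^{(j)}$ for the net and $S^{(j)}$ for the crude survival function, and $C_j$ for the partial derivative of the copula with respect to $u_j$. The sketch assumes the relation takes the form $\frac{d}{dt}S^{(j)}(t) = C_j\big(S'^{(1)}(t), S'^{(2)}(t)\big)\,\frac{d}{dt}S'^{(j)}(t)$, which follows from differentiating the joint survival function along the diagonal. The Frank copula, the exponential net margins, the parameter values and the hand-written Runge-Kutta integrator are all illustrative assumptions, not the authors' implementation. The code generates crude derivatives from known net margins and then checks that solving the system recovers those margins.

```python
import math


def frank_du(u, v, theta):
    """Partial derivative dC/du of the bivariate Frank copula (theta != 0)."""
    eu = math.exp(-theta * u)
    ev = math.exp(-theta * v)
    return eu * (ev - 1.0) / ((math.exp(-theta) - 1.0) + (eu - 1.0) * (ev - 1.0))


def solve_net(crude_deriv, theta, t_end, n_steps):
    """Integrate dS'_j/dt = (dS^(j)/dt) / C_j(S'_1, S'_2) from S'_j(0) = 1.

    crude_deriv(t) returns the derivatives of the two crude survival
    functions at time t.  Uses the classical fourth-order Runge-Kutta
    scheme and returns the net survival values at t_end.
    """
    def rhs(t, u, v):
        d1, d2 = crude_deriv(t)
        # The Frank copula is symmetric, so dC/dv at (u, v) is frank_du(v, u).
        return d1 / frank_du(u, v, theta), d2 / frank_du(v, u, theta)

    h = t_end / n_steps
    t, u, v = 0.0, 1.0, 1.0
    for _ in range(n_steps):
        k1 = rhs(t, u, v)
        k2 = rhs(t + h / 2, u + h / 2 * k1[0], v + h / 2 * k1[1])
        k3 = rhs(t + h / 2, u + h / 2 * k2[0], v + h / 2 * k2[1])
        k4 = rhs(t + h, u + h * k3[0], v + h * k3[1])
        u += h / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0])
        v += h / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1])
        t += h
    return u, v


# Hypothetical test set-up: exponential net margins with rates A and B.
A, B, THETA = 0.02, 0.03, 2.0


def synthetic_crude_deriv(t):
    """Crude derivatives implied by the assumed net margins and copula."""
    u, v = math.exp(-A * t), math.exp(-B * t)
    return frank_du(u, v, THETA) * (-A * u), frank_du(v, u, THETA) * (-B * v)


net1, net2 = solve_net(synthetic_crude_deriv, THETA, t_end=30.0, n_steps=3000)
print(net1, math.exp(-A * 30.0))  # recovered vs. assumed net survival, cause 1
print(net2, math.exp(-B * 30.0))  # recovered vs. assumed net survival, cause 2
```

In practice the crude derivatives would come from smoothed mortality data rather than from known margins. Since $C_j$ tends to zero as the net survival values approach zero, the right-hand side becomes numerically delicate at very high ages, so the integration horizon and step size need care.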

Partial and complete disease elimination

In order to study the effect of partial and complete disease elimination, we have adopted the following approach. Let us recall that, in our model, we have assumed that $T_1,T_2,\ldots,T_m$ are the future lifetime spans of a newborn individual, under the operation of m causes of death, i.e., all the survival functions, introduced up to now, refer to age zero. We will now need to adjust explicitly the adopted notation for the crude and net survival functions, by adding a 0 subscript, indicating age at

Numerical results

In this section, we apply the methodology, described earlier, to a real data set, related to the US female general population, in which the data are grouped by causes of death, using “Table 10. Number of life table deaths from specific causes during age interval for the female population: United States, 1989–91” of the US Decennial Life Tables for 1989–91 (see NCHS (1999)). For ease of presentation, we consider the two-dimensional and the multi-dimensional competing risk models separately.

Conclusions

The objective of this paper is to demonstrate how copulas may be used in the modelling of dependences among causes of death for the purposes of analyzing the impact of the complete or partial elimination of causes of death on survival functions and related indices, expectations of life and annuity values.

The paper extends the earlier work of Carriere (1994) and Valdez (2001) to include more than two competing risks, to investigate the sensitivity of the model to the choice of copula and to

Acknowledgement

The authors would like to thank the anonymous referee for his valuable comments and suggestions which helped to improve the presentation of the paper.

References (35)

  • E. Valdez. Bivariate analysis of survivorship and persistency. Insurance: Mathematics and Economics (2001)
  • N.L. Bowers et al. (1997)
  • C. De Boor. A Practical Guide to Splines (2001)
  • J. Bryant et al. Semiparametric models for cumulative incidence functions. Biometrics (2004)
  • T. Buettner. Approaches and experiences in projecting mortality patterns for oldest-old. North American Actuarial Journal (2004)
  • J. Carriere. Dependent decrement theory. Transactions, Society of Actuaries (1994)
  • U. Cherubini et al. Copula Methods in Finance (2004)
  • A. Coale et al. Revised regional model life tables at very low levels of mortality. Population Index (1989)
  • A. Coale et al. Defects in data on old-age mortality in the United States: New procedures for calculating schedules and life tables at the higher ages. Asian and Pacific Population Forum (1990)
  • H.A. David et al. The Theory of Competing Risks (1978)
  • P. Deheuvels. Caractérisation complète des Lois extrèmes multivariées et de la convergence des types extrèmes,... (1978)
  • P. Dellaportas et al. Bayesian analysis of mortality data. Journal of the Royal Statistical Society, Series A (2001)
  • R.C. Elandt-Johnson. Conditional failure time distributions under competing risk theory with dependent failure times and proportional hazard rates. Scandinavian Actuarial Journal (1976)
  • R.C. Elandt-Johnson et al. Survival Models and Data Analysis (1980)
  • P. Embrechts et al. Modelling dependence with copulas and applications to risk management
  • G. Escarela et al. Fitting competing risks with an assumed copula. Statistical Methods in Medical Research (2003)
  • E. Frees et al. Understanding relationships using copulas. North American Actuarial Journal (1998)