Selection between proportional and stratified hazards models based on expected log-likelihood

  • Original Paper
  • Computational Statistics

Abstract

The problem of selecting between semi-parametric and proportional hazards models is considered. We propose to make this choice based on the expectation of the log-likelihood (ELL), which can be estimated by the likelihood cross-validation (LCV) criterion. The criterion is used to choose an estimator in families of semi-parametric estimators defined by the penalized likelihood. A simulation study shows that the ELL criterion performs nearly as well in this problem as the optimal Kullback–Leibler criterion in terms of Kullback–Leibler distance and that LCV performs reasonably well. The approach is applied to a model of age-specific risk of dementia as a function of sex and educational level, using data from a large cohort study.
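
To make the idea of estimating ELL by likelihood cross-validation concrete, the following is a minimal sketch rather than the authors' procedure. It assumes simple exponential (constant-hazard) models fitted to simulated right-censored data, in place of the penalized-likelihood estimators of the paper. The function names (`fit_pooled`, `fit_by_group`, `lcv`), the simulated rates, and the censoring time are all hypothetical choices made for illustration. Each candidate is scored by the average leave-one-out log-likelihood, and the candidate with the larger score is preferred.

```python
import math
import random

def loglik_one(rate, t, d):
    """Log-likelihood of one right-censored observation under a constant hazard.
    d = 1 for an observed event, d = 0 for a censored time."""
    return (math.log(rate) if d else 0.0) - rate * t

def fit_pooled(data):
    """Candidate 1: a single hazard rate shared by all groups (events / time at risk)."""
    rate = sum(d for _, d, _ in data) / sum(t for t, _, _ in data)
    return lambda t, d, g: loglik_one(rate, t, d)

def fit_by_group(data):
    """Candidate 2: a separate hazard rate for each group."""
    rates = {}
    for g in {g for _, _, g in data}:
        sub = [(t, d) for t, d, gg in data if gg == g]
        rates[g] = sum(d for _, d in sub) / sum(t for t, _ in sub)
    return lambda t, d, g: loglik_one(rates[g], t, d)

def lcv(fit, data):
    """Leave-one-out likelihood cross-validation: refit without observation i,
    evaluate the log-likelihood of observation i, and average over i."""
    total = 0.0
    for i, (t, d, g) in enumerate(data):
        predict = fit(data[:i] + data[i + 1:])
        total += predict(t, d, g)
    return total / len(data)

# Simulated data (assumption): two groups with different hazards,
# administrative censoring at time 3.
rng = random.Random(0)
CENSOR_TIME = 3.0
data = []
for g, true_rate in ((0, 0.5), (1, 2.0)):
    for _ in range(100):
        t = rng.expovariate(true_rate)
        data.append((min(t, CENSOR_TIME), int(t <= CENSOR_TIME), g))

lcv_pooled = lcv(fit_pooled, data)
lcv_grouped = lcv(fit_by_group, data)
best = "group-specific" if lcv_grouped > lcv_pooled else "pooled"
print(f"LCV pooled: {lcv_pooled:.3f}, LCV group-specific: {lcv_grouped:.3f}, choice: {best}")
```

Because the Kullback–Leibler distance to the true distribution differs from the negative ELL only by a term that does not depend on the estimator, choosing the candidate with the larger estimated ELL targets the candidate with the smaller Kullback–Leibler distance. The sketch assumes every training subset contains at least one event per group; a real implementation would need to guard against a zero estimated rate.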

Author information

Corresponding author

Correspondence to Benoit Liquet.

About this article

Cite this article

Liquet, B., Saracco, J. & Commenges, D. Selection between proportional and stratified hazards models based on expected log-likelihood. Computational Statistics 22, 619–634 (2007). https://doi.org/10.1007/s00180-007-0079-3
