Targeted Learning

Ensemble Machine Learning

Abstract

Suppose we observe n i.i.d. copies \(O_{1},\ldots,O_{n}\) of a random variable \(O\) with probability distribution \(P_{0}\), and assume it is known that \({P}_{0} \in \mathcal{M}\) for some set of probability distributions \(\mathcal{M}\). One refers to \(\mathcal{M}\) as the statistical model for \(P_{0}\). We consider so-called semiparametric models that cannot be parameterized by a finite-dimensional Euclidean vector. In addition, suppose that our target parameter of interest is a mapping \(\Psi : \mathcal{M}\rightarrow \mathcal{F} =\{ \Psi (P) : P \in \mathcal{M}\}\), so that \(\psi_{0} = \Psi(P_{0})\) denotes the parameter value of interest.
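To make the notation concrete, the following minimal sketch treats a target parameter as a function that takes a probability distribution and returns a number, and evaluates it at the empirical distribution of a sample. Everything in it is an illustrative assumption rather than the chapter's method: the names `empirical_distribution`, `psi_mean`, and `plug_in` are hypothetical, the mean is chosen only because it is the simplest possible parameter, and the simulated normal data merely stand in for draws from an unknown \(P_{0}\).

```python
import random
from typing import Callable, List, Sequence, Tuple

# A discrete distribution: support points and their probabilities.
Distribution = Tuple[List[float], List[float]]


def empirical_distribution(sample: Sequence[float]) -> Distribution:
    """Return the distribution that puts mass 1/n on each observation."""
    n = len(sample)
    return list(sample), [1.0 / n] * n


def psi_mean(P: Distribution) -> float:
    """Illustrative target parameter: Psi(P) = mean of O under P."""
    points, probs = P
    return sum(x * p for x, p in zip(points, probs))


def plug_in(psi: Callable[[Distribution], float],
            sample: Sequence[float]) -> float:
    """Evaluate the parameter mapping at the empirical distribution."""
    return psi(empirical_distribution(sample))


# Simulated stand-in for n i.i.d. copies O_1, ..., O_n (assumed N(2, 1)).
rng = random.Random(0)
sample = [rng.gauss(2.0, 1.0) for _ in range(1000)]
estimate = plug_in(psi_mean, sample)
```

Here `estimate` plays the role of an estimate of \(\psi_{0} = \Psi(P_{0})\). For many parameters of interest in semiparametric models, \(\Psi\) cannot be evaluated directly at the empirical distribution, so this naive plug-in should be read only as an illustration of the mapping \(\Psi\), not as a general estimation strategy.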

Author information

Corresponding author

Correspondence to Maya L. Petersen.

Copyright information

© 2012 Springer Science+Business Media, LLC

About this chapter

Cite this chapter

van der Laan, M.J., Petersen, M.L. (2012). Targeted Learning. In: Zhang, C., Ma, Y. (eds) Ensemble Machine Learning. Springer, New York, NY. https://doi.org/10.1007/978-1-4419-9326-7_4

  • DOI: https://doi.org/10.1007/978-1-4419-9326-7_4

  • Publisher Name: Springer, New York, NY

  • Print ISBN: 978-1-4419-9325-0

  • Online ISBN: 978-1-4419-9326-7

  • eBook Packages: Engineering, Engineering (R0)
