Targeted Learning

van der Laan, Mark J.; Petersen, Maya L.

doi:10.1007/978-1-4419-9326-7_4



Abstract

Suppose we observe \(n\) i.i.d. copies \(O_1, \ldots, O_n\) of a random variable \(O\) with probability distribution \(P_0\), and assume that it is known that \(P_0 \in \mathcal{M}\) for some set of probability distributions \(\mathcal{M}\). One refers to \(\mathcal{M}\) as the statistical model for \(P_0\). We consider so-called semiparametric models that cannot be parameterized by a finite-dimensional Euclidean vector. In addition, suppose that our target parameter of interest is a parameter \(\Psi : \mathcal{M} \rightarrow \mathcal{F} = \{\Psi(P) : P \in \mathcal{M}\}\), so that \(\psi_0 = \Psi(P_0)\) denotes the parameter value of interest.
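
To make the notation concrete, the following is a minimal sketch and not the chapter's own method. It assumes, purely for illustration, that the target parameter is the mean, Ψ(P) = E_P[O], and that the unknown P₀ is an exponential distribution with rate 0.5 (so ψ₀ = 2). The function names `simulate` and `psi_mean` are hypothetical. The estimate is obtained by evaluating Ψ at the empirical distribution of the sample.

```python
import random
import statistics


def simulate(n, seed=0):
    """Draw n i.i.d. copies O_1, ..., O_n from an assumed P_0.

    Assumption for illustration only: P_0 is exponential with rate 0.5,
    so the true parameter value is psi_0 = E[O] = 2.0.
    """
    rng = random.Random(seed)
    return [rng.expovariate(0.5) for _ in range(n)]


def psi_mean(sample):
    """Evaluate Psi(P) = E_P[O] at the empirical distribution of the sample.

    The empirical distribution puts mass 1/n on each observation, so the
    plug-in value is simply the sample average.
    """
    return statistics.fmean(sample)


sample = simulate(10_000)
psi_hat = psi_mean(sample)
print(f"plug-in estimate of psi_0: {psi_hat:.3f} (assumed truth: 2.0)")
```

The mean is a deliberately simple choice: it is defined on a model that places no finite-dimensional restriction on P, which is the kind of setting the abstract describes, yet its plug-in estimate needs nothing beyond the sample average.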



Author information

Authors and Affiliations

Division of Biostatistics, School of Public Health, University of California, Berkeley, CA, USA
Mark J. van der Laan & Maya L. Petersen

Corresponding author

Correspondence to Maya L. Petersen.

Editor information

Editors and Affiliations

Microsoft, One Microsoft Road, Redmond, 98052, USA
Cha Zhang
Honeywell, Douglas Drive North 1985, Golden Valley, 55422, USA
Yunqian Ma

About this chapter

Cite this chapter

van der Laan, M.J., Petersen, M.L. (2012). Targeted Learning. In: Zhang, C., Ma, Y. (eds) Ensemble Machine Learning. Springer, New York, NY. https://doi.org/10.1007/978-1-4419-9326-7_4

DOI: https://doi.org/10.1007/978-1-4419-9326-7_4
Published: 19 January 2012
Publisher Name: Springer, New York, NY
Print ISBN: 978-1-4419-9325-0
Online ISBN: 978-1-4419-9326-7
eBook Packages: Engineering, Engineering (R0)
