Efficient estimation of the maximal association between multiple predictors and a survival outcome
Tzu-Jung Huang, Alex Luedtke, Ian W. McKeague
Ann. Statist. 51(5): 1965-1988 (October 2023). DOI: 10.1214/23-AOS2313

Abstract

This paper develops a new approach to post-selection inference for screening high-dimensional predictors of survival outcomes. Post-selection inference for right-censored outcome data has been investigated in the literature, but much remains to be done to make the methods both reliable and computationally scalable in high dimensions. Machine learning tools are commonly used to provide predictions of survival outcomes, but the estimated effect of a selected predictor suffers from confirmation bias unless the selection is taken into account. The new approach involves the construction of semiparametrically efficient estimators of the linear association between the predictors and the survival outcome, which are used to build a test statistic for detecting the presence of an association between any of the predictors and the outcome. Further, a stabilization technique reminiscent of bagging allows a normal calibration for the resulting test statistic, which enables the construction of confidence intervals for the maximal association between predictors and the outcome and also greatly reduces computational cost. Theoretical results show that this testing procedure is valid even when the number of predictors grows superpolynomially with sample size, and our simulations support this asymptotic guarantee at moderate sample sizes. The new approach is applied to the problem of identifying patterns in viral gene expression associated with the potency of an antiviral drug.
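
The selection bias mentioned in the abstract can be illustrated with a small simulation. The sketch below is not the estimator or the stabilization technique of the paper: it ignores censoring entirely, uses plain Pearson correlation as the association measure, and contrasts the naive maximal correlation with simple sample splitting, a standard alternative substituted here purely for illustration. The function names (`max_abs_marginal_correlation`, `simulate_null`) and all parameter values are hypothetical.

```python
import numpy as np


def max_abs_marginal_correlation(X, y):
    """Return (index, correlation) of the column of X most correlated with y."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    # Pearson correlation of every column of X with y, computed in one pass.
    corr = (Xc * yc[:, None]).sum(axis=0) / (
        np.sqrt((Xc ** 2).sum(axis=0)) * np.sqrt((yc ** 2).sum())
    )
    k = int(np.argmax(np.abs(corr)))
    return k, float(corr[k])


def simulate_null(n=200, p=200, reps=100, seed=0):
    """Simulate data in which no predictor is associated with the outcome.

    Assumption: uncensored Gaussian outcome and independent Gaussian
    predictors; this is a toy setting, not the survival setting of the paper.
    """
    rng = np.random.default_rng(seed)
    naive = np.empty(reps)
    split = np.empty(reps)
    half = n // 2
    for r in range(reps):
        X = rng.standard_normal((n, p))
        y = rng.standard_normal(n)
        # Naive: select and estimate on the same data.
        _, naive[r] = max_abs_marginal_correlation(X, y)
        # Sample splitting: select on the first half, estimate on the second.
        k, _ = max_abs_marginal_correlation(X[:half], y[:half])
        split[r] = np.corrcoef(X[half:, k], y[half:])[0, 1]
    return naive, split


if __name__ == "__main__":
    naive, split = simulate_null()
    print("mean |naive max correlation|:", np.abs(naive).mean())
    print("mean split-sample correlation:", split.mean())
```

Under this null setting the naive maximal correlation is systematically far from zero even though every true correlation is zero, while the split-sample estimate centers near zero at the cost of using only half the data for estimation.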

Funding Statement

AL was supported by the National Institutes of Health (NIH) under award number DP2-LM013340.
IWM was supported by NIH under award 1R01 AG062401 and by the National Science Foundation (NSF) under award DMS-2112938. The content is solely the responsibility of the authors and does not necessarily represent the official views of NIH or NSF.

Acknowledgments

We thank Peter Gilbert for suggesting the application in Section 7.

Citation

Tzu-Jung Huang, Alex Luedtke, Ian W. McKeague. "Efficient estimation of the maximal association between multiple predictors and a survival outcome." Ann. Statist. 51(5): 1965-1988, October 2023. https://doi.org/10.1214/23-AOS2313

Information

Received: 1 January 2023; Published: October 2023
First available in Project Euclid: 14 December 2023

Digital Object Identifier: 10.1214/23-AOS2313

Subjects:
Primary: 62G10, 62N03
Secondary: 62G20

Keywords: Marginal screening, Post-selection inference, Semiparametric efficiency

Rights: Copyright © 2023 Institute of Mathematical Statistics

JOURNAL ARTICLE
24 PAGES

