
15 - Large-Scale Hypothesis Testing and FDRs

from Part III - Twenty-First-Century Topics

Published online by Cambridge University Press: 05 July 2016

Bradley Efron, Stanford University, California
Trevor Hastie, Stanford University, California

Summary

By the final decade of the twentieth century, electronic computation fully dominated statistical practice. Almost all applications, classical or otherwise, were now performed on a suite of computer platforms: SAS, SPSS, Minitab, Matlab, S (later R), and others.

The trend accelerates when we enter the twenty-first century, as statistical methodology struggles, most often successfully, to keep up with the vastly expanding pace of scientific data production. This has been a two-way game of pursuit, with statistical algorithms chasing ever larger data sets, while inferential analysis labors to rationalize the algorithms. Part III of our book concerns topics in twenty-first-century statistics.

The word “topics” is intended to signal selections made from a wide catalog of possibilities. Part II was able to review a large portion (though certainly not all) of the important developments during the postwar period. Now, deprived of the advantage of hindsight, our survey will be more illustrative than definitive.

For many statisticians, microarrays provided an introduction to large-scale data analysis. These were revolutionary biomedical devices that enabled the assessment of individual activity for thousands of genes at once and, in doing so, raised the need to carry out thousands of simultaneous hypothesis tests, done with the prospect of finding only a few interesting genes among a haystack of null cases. This chapter concerns large-scale hypothesis testing and the false-discovery rate, the breakthrough in statistical inference it elicited.

Large-Scale Testing

The prostate cancer data, Figure 3.4, came from a microarray study of n = 102 men, 52 prostate cancer patients and 50 normal controls. Each man's gene expression levels were measured on a panel of N = 6033 genes, yielding a 6033 × 102 matrix of measurements x_ij, where x_ij is the expression level of gene i for man j.

For each gene, a two-sample t statistic (2.17) t_i was computed comparing gene i's expression levels for the 52 patients with those for the 50 controls. Under the null hypothesis H_0i that the patients' and the controls' responses come from the same normal distribution of gene i expression levels, t_i will follow a standard Student t distribution with 100 degrees of freedom, t_100.
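
The gene-by-gene computation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions. It uses the usual pooled-variance form of the two-sample t statistic, since equation (2.17) is not reproduced here. It runs on synthetic all-null Gaussian data, not the prostate study. The column order (controls first, then patients) and the sign convention (patients minus controls) are arbitrary choices, and the names two_sample_t, N_GENES and CRIT are hypothetical.

```python
import random
from statistics import mean, variance


def two_sample_t(patients, controls):
    """Pooled-variance two-sample t statistic (patients minus controls)."""
    n1, n2 = len(patients), len(controls)
    # Pooled variance estimate on n1 + n2 - 2 degrees of freedom.
    pooled = ((n1 - 1) * variance(patients)
              + (n2 - 1) * variance(controls)) / (n1 + n2 - 2)
    se = (pooled * (1 / n1 + 1 / n2)) ** 0.5
    return (mean(patients) - mean(controls)) / se


# Synthetic stand-in for the expression matrix (NOT the prostate data).
# Every gene is null here; N_GENES is kept small so the sketch runs quickly.
rng = random.Random(0)
N_GENES, N_CONTROLS, N_PATIENTS = 200, 50, 52
x = [[rng.gauss(0.0, 1.0) for _ in range(N_CONTROLS + N_PATIENTS)]
     for _ in range(N_GENES)]

# Assumed layout: the first 50 columns are controls, the last 52 are patients.
t = [two_sample_t(row[N_CONTROLS:], row[:N_CONTROLS]) for row in x]
df = N_PATIENTS + N_CONTROLS - 2  # 100 degrees of freedom, as in the text

# Two-sided 5% critical value of a Student t with 100 degrees of freedom.
CRIT = 1.984
frac_rejected = sum(abs(ti) > CRIT for ti in t) / N_GENES
print(f"df = {df}, fraction with |t_i| > {CRIT}: {frac_rejected:.3f}")
```

Because every simulated gene is null, the printed fraction should be near 0.05. Roughly one gene in twenty crosses the conventional threshold by chance alone, which illustrates why running thousands of simultaneous tests calls for something beyond gene-by-gene significance levels.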

Type: Chapter
Information: Computer Age Statistical Inference: Algorithms, Evidence, and Data Science, pp. 271-297
Publisher: Cambridge University Press
Print publication year: 2016
