Elsevier

Social Science Research

Volume 50, March 2015, Pages 277-291

The power of a paired t-test with a covariate

https://doi.org/10.1016/j.ssresearch.2014.12.004

Highlights

  • The paired t-test is a basic but popular statistical test.

  • Use of regression to improve power in t-tests is also common.

  • We provide guidance on computing power for such a design.

Abstract

Many researchers employ the paired t-test to evaluate the mean difference between matched data points. Unfortunately, in many cases this test is inefficient. This paper reviews how to increase the precision of this test by using the mean-centered independent variable x, an approach familiar to researchers who use analysis of covariance (ANCOVA). We add to the literature by demonstrating how to employ these gains in efficiency as a factor for use in finding the statistical power of the test. The key parameters for this factor are the correlation between the two measures and the variance ratio of the dependent measure on the predictor. The paper then demonstrates how to compute the gains in efficiency a priori to amend the power computations for the traditional paired t-test. We include an example analysis from a recent intervention, Families Preparing the New Generation (Familias Preparando la Nueva Generación). Finally, we conclude with an analysis of extant data to derive reasonable parameter values.

Introduction

It is still common practice in social science to collect paired observations, such as pretests and posttests, and to perform a statistical test on the average difference to determine an average change in score. This paired t-test is a one-sample (population) t-test on the mean difference that stems from Gosset’s work on small-sample tests of means (Student, 1908) and is a classic method to test gains from pretests to posttests (Lord, 1956, McNemar, 1958). The analysis of covariance (ANCOVA) literature has also long recognized the gains in statistical power to measure differences when using covariates (Cochran, 1957, Kisbu-Sakarya et al., 2013, Oakes and Feldman, 2001, Porter and Raudenbush, 1987). Previous work has examined the use of covariates to estimate differential gains based on the initial value (Garside, 1956), but to our knowledge a factor for predicting the gains in precision for the mean difference has yet to be developed.

While such pretest–posttest designs are undesirable for causal interpretations due to uncontrolled sources of gains (Shadish et al., 2002), researchers conducting observational studies may still want to plan for the ability to detect changes in a cohort of subjects. For example, researchers surveying older adults may wish to find evidence of increasing or decreasing depression symptoms (see, e.g., O’Muircheartaigh et al., 2014, Payne et al., 2014). Given limited resources for any study, whether observational or experimental, it is important to maximize power for detecting such changes. The factor developed in this paper is directly relevant to planning such studies. We show how studies employing a paired t-test will sometimes have less power, and thus require a larger sample, than an analysis that employs regression with a covariate.

The purpose of this paper is to outline how including the pretest (centered on the pretest mean) in a prediction model of gain scores produces the same mean difference with less sampling variance, thus increasing statistical power (the chance of detecting the mean difference). We then present a factor that predicts the increase in precision based on the correlation and relative variance of the posttest and pretest variables. In the spirit of Guenther (1981), we then present power analysis techniques that employ this factor. This paper recognizes that issues of measurement error are very important with these designs (Althauser and Rubin, 1971, Lord, 1956, Overall and Woodward, 1975). We explore the implications of measurement error for the methods presented here. Moreover, our discussion section incorporates how measurement error may influence which test to use. Future work will incorporate how to include measurement error in the calculations of precision gains.

The regression-based test that we explore in this paper can be run in any conventional regression package. The procedure requires two new variables. The first is the difference between the posttest and the pretest (post minus pre). The second is the pretest minus the pretest mean, that is, the mean-centered pretest. The analysis then regresses the first variable (the difference) on the second (the mean-centered pretest). The intercept of that regression model and its standard error provide the regression-based test. For example, suppose we import our hypothetical data (see Table 1) into R (R Core Team, 2012) and ask for a paired t-test.

We see that the mean difference (6.5) is not statistically significant at the 0.05 level (p-value = 0.05394). We can use the same data, but instead fit a linear model (lm) of the difference between y and x (y - x) regressed on the mean-centered x (I(x - mean(x))). The intercept (which is the mean difference, 6.5000) is statistically significant (p-value = 0.0383) using this method. In this paper we investigate why this is the case and how to estimate the power for a study that would use this analysis method.
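The R output for these two analyses is not reproduced on this page. As a rough stand-in, the comparison can be sketched in plain Python (a substitution made for this example; the paper itself works in R). The scores below are invented for illustration and are not the paper's Table 1, and the helper names `paired_t` and `centered_regression` are hypothetical.

```python
import math
from statistics import mean, stdev


def paired_t(pre, post):
    """Paired t-test: return (mean difference, standard error, t, df)."""
    d = [b - a for a, b in zip(pre, post)]
    n = len(d)
    se = stdev(d) / math.sqrt(n)          # s_d / sqrt(n)
    return mean(d), se, mean(d) / se, n - 1


def centered_regression(pre, post):
    """Regress (post - pre) on the mean-centered pretest and test the intercept.

    Returns (intercept, standard error, t, df).
    """
    d = [b - a for a, b in zip(pre, post)]
    n = len(d)
    x_bar, d_bar = mean(pre), mean(d)
    xc = [a - x_bar for a in pre]         # mean-centered pretest
    sxx = sum(v * v for v in xc)
    slope = sum(v * (di - d_bar) for v, di in zip(xc, d)) / sxx
    # With a mean-centered predictor, the OLS intercept equals the mean difference.
    intercept = d_bar
    sse = sum((di - intercept - slope * v) ** 2 for v, di in zip(xc, d))
    se = math.sqrt(sse / (n - 2) / n)     # sqrt(MSE / n)
    return intercept, se, intercept / se, n - 2


# Hypothetical scores, invented for illustration (NOT the paper's Table 1).
pre = [10, 12, 15, 18, 20, 23, 25, 30]
post = [22, 20, 24, 23, 27, 26, 29, 31]

print("paired t-test:      ", paired_t(pre, post))
print("centered regression:", centered_regression(pre, post))
```

Both approaches return the same point estimate. They differ only in the standard error and in the degrees of freedom (n - 1 for the paired test, n - 2 for the regression), which is why the two p-values can fall on opposite sides of a significance threshold.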

Section snippets

Theoretical model

In this section we outline the explicit data-generating process that a paired t-test analysis seeks to uncover. We then explore the theoretical reason why a regression approach provides a more powerful test. We assume that data are generated from a population where each ith population member increases (or decreases) their score on a given measure by a quantity denoted as δ. We call this quantity a gain, but the following applies even if the change is, on average, negative. Algebraically,

Review of test variances

In this section we move from the theoretical statistical model to the formulation of a practical factor that predicts gains in precision through the use of regression. This requires an examination of the paired- and regression-based approaches to estimating the variance of the gains. Below we present the variances for each test: the variance of the paired t-test and the variance of the intercept when the difference is regressed on the mean-centered covariate. Details about these variances are
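For orientation, the two estimated variances can be written out using standard ordinary least squares results. The notation here is an assumption of this sketch and may differ from the paper's: n is the number of pairs, d_i = y_i - x_i is the difference score, s_d^2 is its sample variance, r_{dx} is the sample correlation between the difference and the pretest, and \hat\beta_0 is the intercept from regressing d on the mean-centered pretest.

```latex
\begin{aligned}
\widehat{\operatorname{Var}}(\bar d) &= \frac{s_d^2}{n},
  && \text{df} = n - 1, \\[4pt]
\widehat{\operatorname{Var}}(\hat\beta_0) &= \frac{\mathrm{SSE}/(n-2)}{n}
  = \frac{s_d^2\,\bigl(1 - r_{dx}^2\bigr)}{n}\cdot\frac{n-1}{n-2},
  && \text{df} = n - 2 .
\end{aligned}
```

Under these expressions the regression-based variance is the smaller of the two whenever r_{dx}^2 > 1/(n-1), although the regression test also gives up one degree of freedom.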

Formulating the paired test inflation factor (PTIF) for power analyses

Most power analysis programs’ regression routines focus on the ability to detect a slope or correlation (see, e.g., Faul et al., 2007). Since the mean difference in this case is estimated by the intercept, most programs are not equipped to give a power estimate for this design. However, modern software such as R, Stata, SAS, or SPSS can estimate power from a noncentrality parameter. This section provides guidance on estimating this noncentrality parameter and how to use
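As a minimal sketch of the idea, the following Python code computes approximate power from a noncentrality parameter. Several things here are assumptions of this example and are not taken from the paper: the power uses a normal approximation in place of the noncentral t distribution (so it slightly overstates power in small samples and ignores the lost degree of freedom), the function names are hypothetical, and `precision_ratio` is a large-sample variance ratio derived from standard regression results, which may or may not match the paper's PTIF exactly.

```python
import math
from statistics import NormalDist


def precision_ratio(rho, k):
    """Large-sample ratio: Var(regression intercept) / Var(paired mean difference).

    rho: correlation between pretest and posttest.
    k:   sd(posttest) / sd(pretest).
    Undefined when the difference score has zero variance (rho == 1 and k == 1).
    """
    return k ** 2 * (1 - rho ** 2) / (k ** 2 + 1 - 2 * rho * k)


def approx_power(delta, sd_diff, n, ratio=1.0, alpha=0.05):
    """Two-sided power for a mean difference `delta`, normal approximation.

    ratio=1.0 gives the paired test; ratio < 1 shrinks the sampling variance,
    as a covariate-adjusted intercept would.
    """
    z = NormalDist()
    ncp = delta * math.sqrt(n) / (sd_diff * math.sqrt(ratio))  # noncentrality
    crit = z.inv_cdf(1 - alpha / 2)
    return (1 - z.cdf(crit - ncp)) + z.cdf(-crit - ncp)


# Hypothetical planning values: equal variances, pre/post correlation of 0.5.
rho, k, sd_pre = 0.5, 1.0, 1.0
sd_diff = sd_pre * math.sqrt(k ** 2 + 1 - 2 * rho * k)
ratio = precision_ratio(rho, k)

print("paired test power:     ", approx_power(0.3, sd_diff, 50))
print("regression-based power:", approx_power(0.3, sd_diff, 50, ratio=ratio))
```

For planning real studies, the noncentral t distribution in a statistical package would be the more accurate choice; the normal approximation is used here only to keep the sketch dependency-free.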

Example of increased power from an intervention

We now turn to an example with actual data from an intervention, Families Preparing the New Generation (Familias Preparando la Nueva Generación; FPNG). FPNG is an efficacy trial examining a culturally specific parenting intervention designed to increase or boost the effects of keepin’ it REAL, an efficacious classroom-based drug abuse prevention intervention targeting middle school students (Marsiglia et al., 2013), through culturally specific materials (Williams et al., 2012). Overall,

Discussion

In this part of the paper we discuss several implications of this analysis strategy. First, we discuss under which conditions this approach is useful. Next, we offer guidance on the key design parameters.

Conclusion

Power analysis is an important aspect of any study design. In this paper we have presented a method for testing paired differences that is more powerful than the standard t-test. More importantly, we have developed a method to gauge the power of these designs before data collection. It is common knowledge that a more sensitive test of the difference between two variables on paired observations can be achieved using a mean-centered covariate. This paper conceptualized the gains in power as a

Acknowledgments

This research was supported by funding from the National Institutes of Health/National Institute on Minority Health and Health Disparities (NIMHD/NIH), award P20 MD002316 (F. Marsiglia, P.I.). The content is solely the responsibility of the authors and does not necessarily represent the official views of the NIMHD or the NIH.

References (28)

  • C.L. Aberson. Applied Power Analysis for the Behavioral Sciences (2011)
  • R.P. Althauser et al. Measurement error and regression to the mean in matched samples. Soc. Forces (1971)
  • W.G. Cochran. Analysis of covariance: its nature and uses. Biometrics (1957)
  • J. Cohen. Statistical Power Analysis for the Behavioral Sciences (1988)
  • J. Cohen. A power primer. Psychol. Bull. (1992)
  • L.E. Dumka et al. Examination of the cross-cultural and cross-language equivalence of the parenting self-agency measure. Fam. Relat. (1996)
  • Esbensen, F.-A. 2006. Evaluation of the Gang Resistance Education and Training (GREAT) Program in the United States,...
  • F.A. Esbensen et al. How great is great? Results from a longitudinal quasi-experimental design. Criminol. Public Policy (2001)
  • F. Faul et al. G*Power 3: a flexible statistical power analysis program for the social, behavioral, and biomedical sciences. Behav. Res. Methods (2007)
  • R. Garside. The regression of gains upon initial scores. Psychometrika (1956)
  • W.C. Guenther. Sample size formulas for normal theory t tests. Am. Statistician (1981)
  • Harris, K.M., Udry, J.R. 2014. National Longitudinal Study of Adolescent to Adult Health (Add Health), 1994–2008...
  • Y. Kisbu-Sakarya et al. A Monte Carlo comparison study of the power of the analysis of covariance, simple difference, and residual change scores in testing two-wave data. Educ. Psychol. Measur. (2013)
  • F.M. Lord. The measurement of growth. Educ. Psychol. Measur. (1956)

Cited by (59)

    • Focus Forward Fellowship: Evaluation of a program for women student service members and veterans

      2022, Evaluation and Program Planning
      Citation Excerpt :

      Following suggestions from Hedberg and Ayers (2015), we first calculated gain scores (i.e., difference scores) for the change in each measure between time points and mean-centered the baseline scores. We chose this method because it is similar to ANCOVA but allows for more precision (via reduction in the sampling variance by the addition of baseline measures) and is useful when baseline measures are correlated with gain scores (Hedberg & Ayers, 2015). Inspection of data prior to analyses indicated that the assumption of normality was violated for some outcomes.
