The power of a paired t-test with a covariate

doi:10.1016/j.ssresearch.2014.12.004

Social Science Research

Volume 50, March 2015, Pages 277-291

https://doi.org/10.1016/j.ssresearch.2014.12.004

Highlights

• The paired t-test is a basic but popular statistical test.
• Use of regression to improve power in t-tests is also common.
• We provide guidance on computing power for such a design.

Abstract

Many researchers employ the paired t-test to evaluate the mean difference between matched data points. Unfortunately, in many cases this test is inefficient. This paper reviews how to increase the precision of this test by using the mean-centered independent variable x, an approach familiar to researchers who use analysis of covariance (ANCOVA). We add to the literature by demonstrating how to employ these gains in efficiency as a factor for use in finding the statistical power of the test. The key parameters for this factor are the correlation between the two measures and the variance ratio of the dependent measure on the predictor. The paper then demonstrates how to compute the gains in efficiency a priori to amend the power computations for the traditional paired t-test. We include an example analysis from a recent intervention, Families Preparing the New Generation (Familias Preparando la Nueva Generación). Finally, we conclude with an analysis of extant data to derive reasonable parameter values.

Introduction

It is still common practice in social science to collect paired observations such as pretests and posttests and perform a statistical test on the average difference to determine an average change in score. This paired t-test is a one-sample (population) t-test on the mean difference that stems from Gosset’s work on small-sample tests of means (Student, 1908) and is a classic method to test gains from pretests to posttests (Lord, 1956, McNemar, 1958). The analysis of covariance (ANCOVA) literature has also long recognized the gains in statistical power to measure differences when using covariates (Cochran, 1957, Kisbu-Sakarya et al., 2013, Oakes and Feldman, 2001, Porter and Raudenbush, 1987). Previous work has examined the use of covariates to estimate differential gains based on the initial value (Garside, 1956), but to our knowledge a factor for predicting the gains in precision for the mean difference has yet to be developed.

While such pretest–posttest designs are undesirable for causal interpretations due to uncontrolled sources of gains (Shadish et al., 2002), researchers conducting observational studies may still want to plan for the ability to detect changes in a cohort of subjects. For example, researchers surveying older adults may wish to find evidence of increasing or decreasing depression symptoms (see, e.g., O’Muircheartaigh et al., 2014, Payne et al., 2014). Given limited resources for any study, whether observational or experimental, it is important to maximize power for detecting such changes. The factor developed in this paper is directly relevant to planning such studies. We show how studies employing a paired t-test will sometimes have less power, and thus require a larger sample, than an analysis that employs regression with a covariate.

The purpose of this paper is to outline how including the pretest (centered on the pretest mean) in a prediction model of gain scores produces the same mean difference with less sampling variance, thus increasing statistical power (the chance of detecting the mean difference). We then present a factor that predicts the increase in precision based on the correlation and relative variance of the posttest and pretest variables. In the spirit of Guenther (1981), we then present power analysis techniques that employ this factor. This paper recognizes that issues of measurement error are very important with these designs (Althauser and Rubin, 1971, Lord, 1956, Overall and Woodward, 1975). We explore the implications of measurement error for the methods presented here. Moreover, our discussion section incorporates how measurement error may influence which test to use. Future work will incorporate how to include measurement error in the calculations of precision gains.

The regression-based test that we explore in this paper can be achieved using any conventional regression package. The procedure is simply to calculate two new variables. The first variable is the difference between the posttest and pretest (post minus pre). The second variable is the pretest minus the pretest mean, or the mean-centered pretest. The analysis then regresses the first variable, the difference between the posttest and the pretest, on the mean-centered pretest. The intercept of that regression model, and its standard error, provide the regression-based test. For example, we can import our hypothetical data (see Table 1) into R and ask for a paired t-test (R Core Team, 2012).

We see that the mean difference (6.5) is not statistically significant at the 0.05 level (p-value = 0.05394). We can use the same data, but instead fit a linear model (lm) of the difference between y and x (y - x) regressed on the mean-centered x (I(x - mean(x))). The intercept (which is the mean difference, 6.5000) is statistically significant (p-value = 0.0383) using this method. In this paper we investigate why this is the case and how to estimate the power for a study that would use this analysis method.
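The same two computations can be sketched outside R. The following minimal Python illustration uses only the standard library and made-up scores (they are not the Table 1 data, which are not reproduced here). The function names `paired_t` and `intercept_test` are hypothetical helpers introduced for this sketch. It shows that the intercept of the gain regressed on the mean-centered pretest equals the mean gain, while its standard error is based on the residual variance rather than the variance of the gains.

```python
import math
import statistics


def paired_t(pre, post):
    """Paired t-test: mean gain, standard error, t statistic, df."""
    n = len(pre)
    gains = [b - a for a, b in zip(pre, post)]
    mean_gain = statistics.fmean(gains)
    se = statistics.stdev(gains) / math.sqrt(n)
    return mean_gain, se, mean_gain / se, n - 1


def intercept_test(pre, post):
    """Regress the gain on the mean-centered pretest and test the intercept."""
    n = len(pre)
    gains = [b - a for a, b in zip(pre, post)]
    mean_pre = statistics.fmean(pre)
    xc = [a - mean_pre for a in pre]  # mean-centered pretest
    mean_xc = statistics.fmean(xc)    # zero up to rounding error
    mean_gain = statistics.fmean(gains)
    sxx = sum((v - mean_xc) ** 2 for v in xc)
    slope = sum((v - mean_xc) * (g - mean_gain) for v, g in zip(xc, gains)) / sxx
    intercept = mean_gain - slope * mean_xc
    residuals = [g - intercept - slope * v for v, g in zip(xc, gains)]
    mse = sum(e * e for e in residuals) / (n - 2)
    # Usual OLS intercept standard error; the second term vanishes after centering.
    se = math.sqrt(mse * (1.0 / n + mean_xc ** 2 / sxx))
    return intercept, se, intercept / se, n - 2


# Made-up illustrative scores (an assumption, not the paper's Table 1).
pre = [10, 12, 14, 16, 18, 20, 22, 24]
post = [22, 21, 24, 23, 24, 25, 25, 28]

for label, result in (("paired t-test", paired_t(pre, post)),
                      ("regression intercept", intercept_test(pre, post))):
    estimate, se, t_stat, df = result
    print(f"{label}: estimate={estimate:.3f} se={se:.3f} t={t_stat:.3f} df={df}")
```

In these made-up data the gains are strongly related to the pretest, so the regression standard error is the smaller of the two. When that relationship is weak, losing one degree of freedom can offset the reduction in residual variance.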

Section snippets

Theoretical model

In this section we outline the explicit data-generating process that a paired t-test analysis seeks to uncover. We then explore the theoretical reason why a regression approach provides a more powerful test. We assume that data are generated from a population where each ith population member increases (or decreases) their score on a given measure by a quantity denoted as δ. We call this quantity a gain, but the following applies even if the change is, on average, negative. Algebraically,

Review of test variances

In this section we move from the theoretical statistical model to the formulation of a practical factor that predicts gains in precision through the use of regression. This requires an examination of the paired- and regression-based approaches to estimating the variance of the gains. Below we present the variances for each test: the variance of the paired t-test and the variance of the intercept when the difference is regressed on the mean-centered covariate. Details about these variances are
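As a rough guide to the comparison this section describes, the two sampling variances can be written with standard results. The notation is an assumption introduced for this sketch and may differ from the paper's own derivation, which is not reproduced in this excerpt: x is the pretest, y the posttest, d = y - x the gain, σ_x and σ_y their standard deviations, ρ their correlation, and n the number of pairs. The pretest values are treated as fixed, as in conventional regression.

```latex
\begin{align}
\sigma_d^2 &= \sigma_y^2 + \sigma_x^2 - 2\rho\,\sigma_x\sigma_y
  && \text{(variance of the gain)} \\
\operatorname{Var}(\bar{d}) &= \frac{\sigma_d^2}{n}
  && \text{(paired $t$-test)} \\
\operatorname{Var}(\hat{\beta}_0) &= \frac{\sigma_e^2}{n},
  \qquad \sigma_e^2 = \sigma_d^2\,(1 - \rho_{dx}^2) = \sigma_y^2\,(1 - \rho^2)
  && \text{(intercept, centered covariate)} \\
\frac{\operatorname{Var}(\hat{\beta}_0)}{\operatorname{Var}(\bar{d})}
  &= \frac{\sigma_y^2\,(1 - \rho^2)}{\sigma_y^2 + \sigma_x^2 - 2\rho\,\sigma_x\sigma_y}
  \;\le\; 1
\end{align}
```

The inequality holds because the denominator minus the numerator equals (σ_x - ρσ_y)², which is never negative. The ratio depends only on ρ and on σ_y/σ_x, in line with the two key parameters named in the abstract. In practice the variances are estimated with n - 1 degrees of freedom for the paired test and n - 2 for the regression.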

Formulating the paired test inflation factor (PTIF) for power analyses

Most power analysis programs’ regression routines focus on the ability to detect a slope or correlation (see, e.g., Faul et al., 2007). Since the mean difference in this case is estimated by the intercept, most programs are not equipped to give a power estimate for this design. However, it is possible with modern software such as R, Stata, SAS, or SPSS, to estimate power with a noncentrality parameter. This section provides guidance on estimating this noncentrality parameter and how to use
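A minimal Python sketch of a noncentrality-based power calculation follows, under loudly stated assumptions. It replaces the noncentral t distribution with a large-sample normal approximation so that it runs with the standard library alone, and it treats the pretest values as fixed. The helper names and planning values are hypothetical, and the noncentrality expressions are standard results rather than the paper's own formulas, which are not reproduced in this excerpt.

```python
from math import sqrt
from statistics import NormalDist


def gain_sd(sd_x, sd_y, rho):
    """Standard deviation of the gain y - x."""
    return sqrt(sd_y ** 2 + sd_x ** 2 - 2 * rho * sd_x * sd_y)


def residual_sd(sd_y, rho):
    """Standard deviation of the gain after adjusting for the pretest."""
    return sd_y * sqrt(1 - rho ** 2)


def noncentrality(delta, sd, n):
    """Noncentrality parameter: mean gain divided by its standard error."""
    return delta * sqrt(n) / sd


def approx_power(ncp, alpha=0.05):
    """Two-sided power using a normal approximation to the noncentral t."""
    z = NormalDist()
    crit = z.inv_cdf(1 - alpha / 2)
    return z.cdf(ncp - crit) + z.cdf(-ncp - crit)


# Hypothetical planning values (assumptions, not taken from the paper).
delta, sd_x, sd_y, rho, n = 2.0, 10.0, 8.0, 0.5, 100

ncp_paired = noncentrality(delta, gain_sd(sd_x, sd_y, rho), n)
ncp_regression = noncentrality(delta, residual_sd(sd_y, rho), n)
power_paired = approx_power(ncp_paired)
power_regression = approx_power(ncp_regression)

print(f"paired t-test:        ncp={ncp_paired:.3f} power={power_paired:.3f}")
print(f"regression intercept: ncp={ncp_regression:.3f} power={power_regression:.3f}")
```

For small samples the normal approximation overstates power. A noncentral t distribution with n - 1 degrees of freedom (paired test) or n - 2 (regression) from a statistics package would be the more careful choice.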

Example of increased power from an intervention

We now turn to an example with actual data from an intervention, Families Preparing the New Generation (Familias Preparando la Nueva Generación) (FPNG). FPNG is an efficacy trial examining a culturally specific parenting intervention designed to increase or boost the effects of keepin’ it REAL, an efficacious classroom-based drug abuse prevention intervention targeting middle school students (Marsiglia et al., 2013), through culturally specific materials (Williams et al., 2012). Overall,

Discussion

In this part of the paper we discuss several implications of this analysis strategy. First, we discuss under which conditions this approach is useful. Next, we offer guidance on the key design parameters.

Conclusion

Power analysis is an important aspect of any study design. In this paper we have presented a method for testing paired differences that is more powerful than the standard t-test. More importantly, we have developed a method to gauge the power of these designs before data collection. It is common knowledge that a more sensitive test of the difference between two variables on paired observations can be achieved using a mean-centered covariate. This paper conceptualized the gains in power as a

Acknowledgments

This research was supported by funding from the National Institutes of Health/National Institute on Minority Health and Health Disparities (NIMHD/NIH), award P20 MD002316 (F. Marsiglia, P.I.). The content is solely the responsibility of the authors and does not necessarily represent the official views of the NIMHD or the NIH.

References (28)

C.L. Aberson, Applied Power Analysis for the Behavioral Sciences (2011)
R.P. Althauser et al., Measurement error and regression to the mean in matched samples, Soc. Forces (1971)
W.G. Cochran, Analysis of covariance: its nature and uses, Biometrics (1957)
J. Cohen, Statistical Power Analysis for the Behavioral Sciences (1988)
J. Cohen, A power primer, Psychol. Bull. (1992)
L.E. Dumka et al., Examination of the cross-cultural and cross-language equivalence of the parenting self-agency measure, Fam. Relat. (1996)
Esbensen, F.-A. 2006. Evaluation of the Gang Resistance Education and Training (GREAT) Program in the United States,...
F.A. Esbensen et al., How great is great? Results from a longitudinal quasi-experimental design, Criminol. Public Policy (2001)
F. Faul et al., G*Power 3: a flexible statistical power analysis program for the social, behavioral, and biomedical sciences, Behav. Res. Methods (2007)
R. Garside, The regression of gains upon initial scores, Psychometrika (1956)
W.C. Guenther, Sample size formulas for normal theory t tests, Am. Statistician (1981)
Harris, K.M., Udry, J.R. 2014. National Longitudinal Study of Adolescent to Adult Health (Add Health), 1994–2008...
Y. Kisbu-Sakarya et al., A Monte Carlo comparison study of the power of the analysis of covariance, simple difference, and residual change scores in testing two-wave data, Educ. Psychol. Measur. (2013)
F.M. Lord, The measurement of growth, Educ. Psychol. Measur. (1956)

