Controlling for occasion-specific effects when assessing the test–retest reliability of self-report health questionnaires

Olsen, Joseph A.; Bloch, Daniel A.; Bloch, George J.

doi:10.1007/s11136-007-9246-9

Original Paper
Published: 31 July 2007

Quality of Life Research, volume 16, pages 1399–1405 (2007)

Abstract

Objective

This study proposes a method for adjusting the test–retest reliability of self-report health questionnaires for changes that occur during the test–retest interval, as identified by an external measure, and for distinguishing such changes from random response errors.

Methods

In our application, eighty participants completed the Symptoms of Illness Checklist (SIC) on two occasions, two weeks apart, immediately before interviews given on each occasion by one of two physicians in a crossover design. The physician interview scores served as external measures, and structural equation modeling was used to estimate the parameters of a model that corrected for the occasion-specific effect of participants’ responses using information from the interviews.

Results

Correcting for changes in symptoms during the test–retest interval increased SIC test–retest reliability from .744 to .804 and significantly improved model fit ($\chi^{2}_{\text{diff}}(1) = 30.78$, $p < .001$).

Conclusions

The results suggest that the evaluation of self-report health questionnaire test–retest reliability can be improved by using an external measure to identify changes during the test–retest interval and by distinguishing these changes from random response errors. Applying this correction increased the estimated SIC test–retest reliability and indicated that the SIC was able to measure changes over the studied time interval. The method can be applied across a broad range of questionnaires.

Notes

AMOS allows the model to be specified graphically in the form of a path diagram—in the present case the illustration presented in Fig. 1, but without the triangle and the lines emanating from it (these intercepts are estimated by default in AMOS). If the reader wishes to obtain a copy of the AMOS program file used in the present study, please contact Joseph Olsen at joseph_olsen@byu.edu.

Author information

Authors and Affiliations

College of Family, Home and Social Sciences, Brigham Young University, Provo, UT, USA
Joseph A. Olsen
Division of Biostatistics, Department of Health Research and Policy, Stanford University, Stanford, CA, USA
Daniel A. Bloch
Department of Psychology, Brigham Young University, 1052 SWKT, Provo, UT, 84602, USA
George J. Bloch

Corresponding author

Correspondence to George J. Bloch.

Appendix: Model specification

Letting boldface characters indicate vectors or matrices, denote $ \mathbf{Y} = \boldsymbol{\alpha} + \boldsymbol{\Lambda}\boldsymbol{\eta} $ and $ \mathrm{Cov}(\boldsymbol{\eta}) = \boldsymbol{\psi} $. Then

$$
\begin{bmatrix} y^{(Q)}_{j1} \\ y^{(Q)}_{j2} \\ y^{(I)}_{j1} \\ y^{(I)}_{j2} \end{bmatrix}
=
\begin{bmatrix} a^{(Q)}_{1} \\ a^{(Q)}_{2} \\ a^{(I)}_{1} \\ a^{(I)}_{2} \end{bmatrix}
+
\begin{bmatrix}
1 & 0 & c & 0 & 1 & 0 \\
1 & 0 & 0 & c & 0 & 1 \\
0 & 1 & 1 & 0 & 0 & 0 \\
0 & 1 & 0 & 1 & 0 & 0
\end{bmatrix}
\begin{bmatrix} T^{(Q)}_{j} \\ T^{(I)}_{j} \\ d^{(I)}_{j1} \\ d^{(I)}_{j2} \\ e^{(Q)}_{j1} \\ e^{(Q)}_{j2} \end{bmatrix},
\quad
\boldsymbol{\psi} =
\begin{bmatrix}
\sigma^{2}_{Q} & \sigma_{QI} & 0 & 0 & 0 & 0 \\
\sigma_{QI} & \sigma^{2}_{I} & 0 & 0 & 0 & 0 \\
0 & 0 & \sigma^{2}_{d} & 0 & 0 & 0 \\
0 & 0 & 0 & \sigma^{2}_{d} & 0 & 0 \\
0 & 0 & 0 & 0 & \sigma^{2}_{e} & 0 \\
0 & 0 & 0 & 0 & 0 & \sigma^{2}_{e}
\end{bmatrix}
$$
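
Read row by row, the matrix equation above corresponds to four scalar measurement equations. This is only a restatement of the loading matrix, with no additional assumptions:

$$
\begin{aligned}
y^{(Q)}_{j1} &= a^{(Q)}_{1} + T^{(Q)}_{j} + c\, d^{(I)}_{j1} + e^{(Q)}_{j1}, \\
y^{(Q)}_{j2} &= a^{(Q)}_{2} + T^{(Q)}_{j} + c\, d^{(I)}_{j2} + e^{(Q)}_{j2}, \\
y^{(I)}_{j1} &= a^{(I)}_{1} + T^{(I)}_{j} + d^{(I)}_{j1}, \\
y^{(I)}_{j2} &= a^{(I)}_{2} + T^{(I)}_{j} + d^{(I)}_{j2}.
\end{aligned}
$$

The interview's occasion-specific factor $ d^{(I)}_{jk} $ therefore enters the questionnaire score on the same occasion $ k $ with weight $ c $, while the questionnaire retains its own random error term $ e^{(Q)}_{jk} $.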

Here $ \sigma^{2}_{Q} $, $ \sigma^{2}_{I} $, $ \sigma^{2}_{d} $, and $ \sigma^{2}_{e} $ are the variances of the factors $ T^{(Q)} $, $ T^{(I)} $, $ d^{(I)} $, and $ e^{(Q)} $, respectively; $ \sigma_{QI} $ is the covariance between $ T^{(Q)} $ and $ T^{(I)} $; and $ c $ is the regression coefficient obtained from regressing $ d^{(Q)}_{jk} $ on $ d^{(I)}_{jk} $. Because the mean structure of the model is saturated (a separate intercept is estimated for each observed variable), the model-implied covariance matrix has the simplified expression $ \hat{\boldsymbol{\Sigma}} = \boldsymbol{\Lambda}\boldsymbol{\psi}\boldsymbol{\Lambda}' $. The maximum likelihood fitting function, which indexes the discrepancy between this model-implied matrix and the sample covariance matrix $ \mathbf{S} $, can be given as $ F_{ML} = \log\left|\hat{\boldsymbol{\Sigma}}\right| + \mathrm{trace}\left[\mathbf{S}\hat{\boldsymbol{\Sigma}}^{-1}\right] - \log\left|\mathbf{S}\right| - t $, where $ t $ is the total number of study variables. In the present study, $ \mathbf{S} $ is the covariance matrix for the four-variate multinormal distribution of the study variables $ y^{(I)}_{j1} $, $ y^{(I)}_{j2} $, $ y^{(Q)}_{j1} $, and $ y^{(Q)}_{j2} $.
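
To make the two expressions above concrete, the following is a minimal sketch in plain Python (standard library only) that builds the model-implied covariance matrix from the loading and factor covariance matrices of this appendix and evaluates the fitting function. The function names and all parameter values are hypothetical and chosen only for illustration; they are not estimates from the study, and the sketch is not the AMOS procedure the authors used.

```python
import math


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def implied_cov(c, var_q, var_i, cov_qi, var_d, var_e):
    """Model-implied covariance Lambda * psi * Lambda' for (yQ1, yQ2, yI1, yI2)."""
    lam = [
        [1, 0, c, 0, 1, 0],
        [1, 0, 0, c, 0, 1],
        [0, 1, 1, 0, 0, 0],
        [0, 1, 0, 1, 0, 0],
    ]
    psi = [[0.0] * 6 for _ in range(6)]
    psi[0][0], psi[1][1] = var_q, var_i
    psi[0][1] = psi[1][0] = cov_qi
    psi[2][2] = psi[3][3] = var_d
    psi[4][4] = psi[5][5] = var_e
    lam_t = [list(col) for col in zip(*lam)]
    return matmul(matmul(lam, psi), lam_t)


def logdet_and_inverse(m):
    """Return (log|det m|, inverse of m) via Gauss-Jordan with partial pivoting."""
    n = len(m)
    aug = [[float(v) for v in row] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(m)]
    logdet = 0.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if aug[pivot][col] == 0.0:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        logdet += math.log(abs(p))  # |det| is the product of the absolute pivots
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return logdet, [row[n:] for row in aug]


def f_ml(s, sigma):
    """F_ML = log|Sigma| + trace(S Sigma^-1) - log|S| - t."""
    logdet_sigma, sigma_inv = logdet_and_inverse(sigma)
    logdet_s, _ = logdet_and_inverse(s)
    t = len(s)
    product = matmul(s, sigma_inv)
    return logdet_sigma + sum(product[i][i] for i in range(t)) - logdet_s - t


# Hypothetical parameter values, for illustration only.
sigma = implied_cov(c=0.5, var_q=4.0, var_i=3.0, cov_qi=2.0, var_d=1.0, var_e=1.0)
# A made-up "sample" matrix that differs from sigma on the diagonal.
s_perturbed = [[v + (0.5 if i == j else 0.0) for j, v in enumerate(row)]
               for i, row in enumerate(sigma)]

print(f_ml(sigma, sigma))        # zero up to rounding: the model reproduces S exactly
print(f_ml(s_perturbed, sigma))  # positive: S and the implied matrix disagree
```

In this sketch the implied covariance between the two questionnaire scores is $ \sigma^{2}_{Q} $ alone, whereas the covariance between a questionnaire score and the interview score from the same occasion is $ \sigma_{QI} + c\,\sigma^{2}_{d} $, which is where the occasion-specific effect becomes identifiable.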

With sample size $ N $, $ F_{ML} $ can be rescaled to approximate a chi-square variate for purposes of model goodness-of-fit testing: $ (N-1)F_{ML} \sim \chi^{2} $. Under standard conditions, this provides a chi-square test of model fit with degrees of freedom equal to the total number of observed means, variances, and covariances minus the number of estimated model parameters.
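
As a rough illustration of the degrees-of-freedom rule and the rescaling, the sketch below counts moments and parameters for the four-variable model in this appendix and computes the upper-tail probability of a chi-square statistic with one degree of freedom. The parameter counts are our own reading of the matrices above (four intercepts, $ \sigma^{2}_{Q} $, $ \sigma^{2}_{I} $, $ \sigma_{QI} $, $ \sigma^{2}_{d} $, $ \sigma^{2}_{e} $, and $ c $), and the comparison model with $ c $ fixed to 0 is an assumption made for illustration, not a specification quoted from the paper. All function names are hypothetical.

```python
import math


def model_df(n_observed, n_free_params):
    """Degrees of freedom: observed means, variances, covariances minus free parameters."""
    n_moments = n_observed + n_observed * (n_observed + 1) // 2
    return n_moments - n_free_params


def chi_square_stat(n, f_ml_value):
    """Rescale the minimized fitting function: (N - 1) * F_ML."""
    return (n - 1) * f_ml_value


def chi2_sf_df1(x):
    """P(X > x) for a chi-square variate with 1 df, using X = Z**2 with Z standard normal."""
    return math.erfc(math.sqrt(x / 2.0))


# Assumed counts: 4 intercepts + 5 variances/covariances + c = 10 free parameters.
df_with_c = model_df(4, 10)
# Assumed comparison model: the same model with c fixed to 0 (9 free parameters).
df_without_c = model_df(4, 9)
df_difference = df_without_c - df_with_c

# Chi-square difference reported in the abstract, tested on 1 degree of freedom.
p_value = chi2_sf_df1(30.78)

print(df_with_c, df_without_c, df_difference)
print(p_value < 0.001)
```

Under these assumed counts the two models differ by a single degree of freedom, which agrees with the one-degree-of-freedom difference test reported in the abstract.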

About this article

Cite this article

Olsen, J.A., Bloch, D.A. & Bloch, G.J. Controlling for occasion-specific effects when assessing the test–retest reliability of self-report health questionnaires. Qual Life Res 16, 1399–1405 (2007). https://doi.org/10.1007/s11136-007-9246-9

Received: 19 March 2007
Accepted: 13 July 2007
Published: 31 July 2007
Issue Date: October 2007
DOI: https://doi.org/10.1007/s11136-007-9246-9
