Abstract
Coefficient alpha is commonly used as a reliability estimator. However, several estimators are believed to be more accurate than alpha, with factor analysis (FA) estimators being the most commonly recommended. Furthermore, unstandardized estimators are considered more accurate than standardized estimators. In other words, the existing literature suggests that unstandardized FA estimators are the most accurate regardless of data characteristics. To test whether this conventional wisdom holds, this study examines the accuracy of 12 estimators using a Monte Carlo simulation. The results show that several estimators are more accurate than alpha, including both FA and non-FA estimators. The most accurate on average is a standardized FA estimator. Unstandardized estimators (e.g., alpha) are less accurate on average than the corresponding standardized estimators (e.g., standardized alpha). However, the accuracy of estimators is affected to varying degrees by data characteristics (e.g., sample size, number of items, outliers). For example, standardized estimators are more accurate than unstandardized estimators with a small sample size and many outliers, whereas unstandardized estimators are more accurate with a large sample size and few outliers. The greatest lower bound is the most accurate when the number of items is 3 but severely overestimates reliability when the number of items is more than 3. In conclusion, each estimator has data characteristics under which it performs best, and no estimator is the most accurate for all data characteristics.
Notes
This quote is taken, with thanks, from a reviewer’s comment.
Funding
Kwangwoon University, Research Grant of Kwangwoon University in 2022; National Research Foundation of Korea, NRF-2021S1A5A2A03061515.
Appendix: The condition for \(\lambda_{4(\text{max})}\) to equal reliability at the population level when \(k = 3\)
Let us denote the three items as \(X_1\), \(X_2\), and \(X_3\), where \(X_i = a_i + b_i F + e_i\), and denote the two split-halves as \(A\) and \(B\), where \(A\) has one item and \(B\) has two items. With the variance of \(F\) fixed at 1 and mutually uncorrelated errors, \(Cov(X_i, X_j) = b_i b_j\) for \(i \ne j\). For simplicity, we assume that \(X_1\) and \(X_2\) have the smallest covariance, and \(X_2\) and \(X_3\) have the greatest covariance (i.e., \(Cov(X_1, X_2) \le Cov(X_1, X_3) \le Cov(X_2, X_3)\)). In this case, the split that maximizes \(\sigma_{AB}\) places \(X_3\) alone in \(A\), because this is the only split that excludes the smallest covariance, \(Cov(X_1, X_2)\). Hence \(\lambda_{4(\text{max})} = 4\sigma_{AB}/\sigma_X^2 = 4\,Cov(X_3, X_1 + X_2)/\sigma_X^2 = 4(b_1 b_3 + b_2 b_3)/\sigma_X^2\). Reliability is \(\rho_{XX'} = (b_1 + b_2 + b_3)^2/\sigma_X^2\). Therefore, \(\lambda_{4(\text{max})}\) is less than \(\rho_{XX'}\) if the following expression is negative: \(4(b_1 b_3 + b_2 b_3) - (b_1 + b_2 + b_3)^2\). Rearranging this expression gives \(-(b_1 + b_2 - b_3)^2\), which is zero if \(b_3 = b_1 + b_2\) and negative otherwise.
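The condition above can be checked numerically. The following is a minimal sketch, not the simulation code used in the study: it assumes a one-factor model with the variance of \(F\) fixed at 1 and uncorrelated errors, and the function names (`population_cov`, `lambda4_max`, `reliability`) and the loading and error-variance values are hypothetical choices made for illustration only.

```python
from itertools import combinations

import numpy as np


def population_cov(loadings, error_vars):
    """Population covariance of X_i = a_i + b_i*F + e_i, assuming Var(F) = 1
    and mutually uncorrelated errors (illustrative assumptions)."""
    b = np.asarray(loadings, dtype=float)
    return np.outer(b, b) + np.diag(np.asarray(error_vars, dtype=float))


def lambda4_max(cov):
    """Largest value of 4 * sigma_AB / sigma_X^2 over all two-part splits."""
    k = cov.shape[0]
    total_var = cov.sum()  # variance of the sum score
    best = -np.inf
    for size in range(1, k // 2 + 1):
        for part_a in combinations(range(k), size):
            part_b = [i for i in range(k) if i not in part_a]
            # Covariance between the two part scores A and B.
            sigma_ab = cov[np.ix_(list(part_a), part_b)].sum()
            best = max(best, 4.0 * sigma_ab / total_var)
    return best


def reliability(loadings, cov):
    """Population reliability of the sum score under the one-factor model."""
    return np.sum(loadings) ** 2 / cov.sum()


# Case 1: b3 = b1 + b2, so lambda4(max) should equal reliability.
b_equal = [0.4, 0.5, 0.9]
cov_equal = population_cov(b_equal, [0.5, 0.5, 0.5])
print(lambda4_max(cov_equal), reliability(b_equal, cov_equal))

# Case 2: b3 != b1 + b2, so lambda4(max) should fall below reliability.
b_unequal = [0.6, 0.7, 0.8]
cov_unequal = population_cov(b_unequal, [0.5, 0.5, 0.5])
print(lambda4_max(cov_unequal), reliability(b_unequal, cov_unequal))
```

In the second case, the difference between the two quantities multiplied by \(\sigma_X^2\) should match \(-(b_1 + b_2 - b_3)^2\), which is \(-0.25\) for these hypothetical loadings. The error variances affect only \(\sigma_X^2\), so they do not change whether the two quantities coincide.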
About this article
Cite this article
Cho, E. Beyond alpha and omega: The accuracy of single-test reliability estimators in unidimensional continuous data. Behav Res (2024). https://doi.org/10.3758/s13428-024-02361-z
DOI: https://doi.org/10.3758/s13428-024-02361-z