Abstract
We present an approach to meta-analytic structural equation models that relies on hierarchical modeling of sample covariance matrices under the assumption that the matrices are Wishart. The approach handles the commonplace fixed- and random-effects meta-analytic SEMs, and solves the problem of dependent covariance matrices where more than one covariance matrix is obtained from a single study or study author. The ability of the approach to adequately recover parameters is examined via a simulation study. The approach is implemented in the bayesianmasem R package and a demonstration shows applications of the model.
Notes
Although this is the convention with TSSEM, we believe that random-effects models should be preferred by default as it is unlikely that different studies sample the exact same population. Moreover, the classification of a fit index as ‘small’ is not straightforward (e.g. McNeish et al., 2018; Savalei, 2012; Ximénez et al., 2022).
Although path and regression models are the most common MASEMs, these models do not involve latent variables.
\(\varepsilon \) is often close in value to the RMSEA obtained from a multi-group SEM where parameters are constrained equal across groups.
As \(m\) gets larger, the random-effects deviations implied by the inverse-Wishart distribution are increasingly well approximated by a zero-mean multivariate normal vector (Wu & Browne, 2015), but with a more constrained covariance matrix than the unstructured covariance matrix estimated by TSSEM.
bayesianmasem does not handle the problem of missing data.
The difference in relative bias for loadings and error variances occurs because variances are on the squared loading scale. Alternatively stated, the relative bias of loadings was the same as the relative bias of error standard deviations.
Following from the mean of an inverse-Wishart distribution, the relative bias of model-implied covariance matrix elements in the random-effects model is: \(\left[ \left( \frac{m_1}{m_1 - p - 1}\right) \big / \left( \frac{m}{m - p - 1}\right) \right] - 1\), where \(m_{(1)} = \varepsilon _{(1)}^{-2} + p - 1\), that is, \(m = \varepsilon ^{-2} + p - 1\) and \(m_1 = \varepsilon _1^{-2} + p - 1\).
The degrees of freedom are based on the number of replications, 1000.
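The relative-bias expression in the note on the inverse-Wishart mean can be evaluated numerically. The following is a minimal sketch, not code from the bayesianmasem package: the function names `iw_df` and `relative_bias` are hypothetical, and \(\varepsilon \), \(\varepsilon _1\) and \(p\) are supplied by the user.

```python
def iw_df(eps, p):
    """Degrees of freedom m = eps^(-2) + p - 1 for a given epsilon and p variables."""
    return eps ** -2 + p - 1


def relative_bias(eps, eps_1, p):
    """Relative bias [(m_1 / (m_1 - p - 1)) / (m / (m - p - 1))] - 1.

    Assumes eps^(-2) > 2 and eps_1^(-2) > 2 so that both denominators are positive.
    """
    m = iw_df(eps, p)
    m_1 = iw_df(eps_1, p)
    return (m_1 / (m_1 - p - 1)) / (m / (m - p - 1)) - 1


# Illustrative values only (not taken from the simulation design).
print(relative_bias(eps=0.10, eps_1=0.05, p=5))
```

When \(\varepsilon _1 = \varepsilon \) the expression is zero, and when \(\varepsilon _1 < \varepsilon \) it is negative.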
References
Archakov, I., & Hansen, P. R. (2021). A new parametrization of correlation matrices. Econometrica, 89(4), 1699–1715. https://doi.org/10.3982/ECTA16910
Archakov, I., Hansen, P. R., & Luo, Y. (2022). A new method for generating random correlation matrices. Retrieved 23 Aug 2023, from arXiv:2210.08147
Betancourt, M. (2017). A conceptual introduction to Hamiltonian Monte Carlo. arXiv:1701.02434
Carpenter, B., Gelman, A., Hoffman, M. D., Lee, D., Goodrich, B., Betancourt, M., ... Riddell, A. (2017). Stan: A probabilistic programming language. Journal of Statistical Software, 76(1). https://doi.org/10.18637/jss.v076.i01
Cheung, M.W.-L. (2014). Fixed- and random-effects meta-analytic structural equation modeling: Examples and analyses in R. Behavior Research Methods, 46(1). https://doi.org/10.3758/s13428-013-0361-y
Cheung, M.W.-L. (2015). metaSEM: An R package for meta-analysis using structural equation modeling. Frontiers in Psychology, 5. https://doi.org/10.3389/fpsyg.2014.01521
Cheung, M.W.-L., & Chan, W. (2005). Meta-analytic structural equation modeling: A two-stage approach. Psychological Methods, 10(1), 40–64. https://doi.org/10.1037/1082-989X.10.1.40
Cheung, M.W.-L., & Chan, W. (2009). A two-stage approach to synthesizing covariance matrices in meta-analytic structural equation modeling. Structural Equation Modeling: A Multidisciplinary Journal, 16(1), 28–53. https://doi.org/10.1080/10705510802561295
Cheung, M.W.-L., & Cheung, S. F. (2016). Random-effects models for meta-analytic structural equation modeling: Review, issues, and illustrations. Research Synthesis Methods, 7(2), 140–155. https://doi.org/10.1002/jrsm.1166
Conti, G., Frühwirth-Schnatter, S., Heckman, J. J., & Piatek, R. (2014). Bayesian exploratory factor analysis. Journal of Econometrics, 183(1), 31–57. https://doi.org/10.1016/j.jeconom.2014.06.008
Cook, S. R., Gelman, A., & Rubin, D. B. (2006). Validation of software for Bayesian models using posterior quantiles. Journal of Computational and Graphical Statistics, 15(3), 675–692. https://doi.org/10.1198/106186006X136976
Digman, J. M. (1997). Higher-order factors of the Big Five. Journal of Personality and Social Psychology, 73(6), 1246–1256. https://doi.org/10.1037/0022-3514.73.6.1246
Furlow, C. F., & Beretvas, S. N. (2005). Meta-analytic methods of pooling correlation matrices for structural equation modeling under different patterns of missing data. Psychological Methods, 10(2), 227–254. https://doi.org/10.1037/1082-989X.10.2.227
Gelman, A. (2006). Prior distributions for variance parameters in hierarchical models (comment on article by Browne and Draper). Bayesian Analysis, 1(3), 515–534. https://doi.org/10.1214/06-BA117A
Granström, K., & Orguner, U. (2011). Properties and approximations of some matrix variate probability density functions. Division of Automatic Control, Linköping University. Linköping, Sweden. Retrieved from http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-88735
Gupta, A. K., & Nagar, D. K. (1999). Matrix variate distributions. CRC Press.
Hedges, L. V., & Olkin, I. (1985). Statistical methods for meta-analysis. San Diego, CA: Academic Press.
Jak, S., & Cheung, M.W.-L. (2018). Accounting for missing correlation coefficients in fixed-effects MASEM. Multivariate Behavioral Research, 53(1), 1–14. https://doi.org/10.1080/00273171.2017.1375886
Jak, S., & Cheung, M.W.-L. (2020). Meta-analytic structural equation modeling with moderating effects on SEM parameters. Psychological Methods, 25(4), 430–455. https://doi.org/10.1037/met0000245
Ke, Z., Zhang, Q., & Tong, X. (2019). Bayesian meta-analytic SEM: A one-stage approach to modeling between-studies heterogeneity in structural parameters. Structural Equation Modeling: A Multidisciplinary Journal, 26(3), 348–370. https://doi.org/10.1080/10705511.2018.1530059
Kim, S., Moon, H., Modrák, M., & Säilynoja, T. (2023). SBC: Simulation based calibration for rstan/cmdstanr models. https://hyunjimoon.github.io/SBC/. https://github.com/hyunjimoon/SBC/.
Lemoine, N. P. (2019). Moving beyond noninformative priors: Why and how to choose weakly informative priors in Bayesian analyses. Oikos, 128(7), 912–928. https://doi.org/10.1111/oik.05985
MacCallum, R. C., & Tucker, L. R. (1991). Representing sources of error in the common-factor model: Implications for theory and practice. Psychological Bulletin, 109(3), 502–511. https://doi.org/10.1037/0033-2909.109.3.502
McNeish, D., An, J., & Hancock, G. R. (2018). The thorny relation between measurement quality and fit index cutoffs in latent variable models. Journal of Personality Assessment, 100(1), 43–52. https://doi.org/10.1080/00223891.2017.1281286
Merkle, E. C., Fitzsimmons, E., Uanhoro, J., & Goodrich, B. (2021). Efficient Bayesian structural equation modeling in Stan. Journal of Statistical Software, 100(6), 1–22. https://doi.org/10.18637/jss.v100.i06
Olkin, I., & Finn, J. D. (1995). Correlations redux. Psychological Bulletin, 118(1), 155–164. https://doi.org/10.1037/0033-2909.118.1.155
Oort, F. J., & Jak, S. (2016). Maximum likelihood estimation in meta-analytic structural equation modeling. Research Synthesis Methods, 7(2), 156–167. https://doi.org/10.1002/jrsm.1203
Peeters, C. F. W. (2012). Rotational uniqueness conditions under oblique factor correlation metric. Psychometrika, 77(2), 288–292. https://doi.org/10.1007/s11336-012-9259-3
Säilynoja, T., Bürkner, P.-C., & Vehtari, A. (2022). Graphical test for discrete uniformity and its applications in goodness of fit evaluation and multiple sample comparison. Statistics and Computing, 32(2), 32. https://doi.org/10.1007/s11222-022-10090-6
Savalei, V. (2012). The relationship between root mean square error of approximation and model misspecification in confirmatory factor analysis models. Educational and Psychological Measurement, 72(6), 910–932. https://doi.org/10.1177/0013164412452564
Talts, S., Betancourt, M., Simpson, D., Vehtari, A., & Gelman, A. (2018). Validating Bayesian inference algorithms with simulation-based calibration. arXiv:1804.06788
Uanhoro, J. O. (2023). Hierarchical covariance estimation approach to meta-analytic structural equation modeling. Structural Equation Modeling: A Multidisciplinary Journal, 30(4), 532–546. https://doi.org/10.1080/10705511.2022.2142128
Vehtari, A., Gabry, J., Magnusson, M., Yao, Y., Bürkner, P.-C., Paananen, T., & Gelman, A. (2020). loo: Efficient leave-one-out cross-validation and WAIC for Bayesian models. R package version 2.3.1. Retrieved from https://mc-stan.org/loo/
Vehtari, A., Gelman, A., & Gabry, J. (2017). Practical Bayesian model evaluation using leave-one-out cross-validation and WAIC. Statistics and Computing, 27(5), 1413–1432. https://doi.org/10.1007/s11222-016-9696-4
Vehtari, A., Gelman, A., Simpson, D., Carpenter, B., & Bürkner, P.-C. (2020). Rank-normalization, folding, and localization: An improved \(\widehat{R}\) for assessing convergence of MCMC. Bayesian Analysis, 1–28. https://doi.org/10.1214/20-BA1221
Viswesvaran, C., & Ones, D. S. (1995). Theory testing: Combining psychometric meta-analysis and structural equations modeling. Personnel Psychology, 48(4), 865–885. https://doi.org/10.1111/j.1744-6570.1995.tb01784.x
Widaman, K. F., Little, T. D., Preacher, K. J., & Sawalani, G. M. (2011). On creating and using short forms of scales in secondary research. In: Secondary data analysis: An introduction for psychologists (pp. 39–61). https://doi.org/10.1037/12350-003
Wilson, S. J., Polanin, J. R., & Lipsey, M. W. (2016). Fitting meta-analytic structural equation models with complex datasets. Research Synthesis Methods, 7(2), 121–139. https://doi.org/10.1002/jrsm.1199
Wu, H., & Browne, M. W. (2015). Quantifying adventitious error in a covariance structure as a random effect. Psychometrika, 80(3), 571–600. https://doi.org/10.1007/s11336-015-9451-3
Ximénez, C., Maydeu-Olivares, A., Shi, D., & Revuelta, J. (2022). Assessing cutoff values of SEM fit indices: Advantages of the unbiased SRMR index and its cutoff criterion based on communality. Structural Equation Modeling: A Multidisciplinary Journal, 29(3), 368–380. https://doi.org/10.1080/10705511.2021.1992596
Yuan, K.-H., & Kano, Y. (2018). Meta-analytical SEM: Equivalence between maximum likelihood and generalized least squares. Journal of Educational and Behavioral Statistics, 43(6), 693–720. https://doi.org/10.3102/1076998618787799
Ethics declarations
Open Practices Statement
All code for simulation studies and data analysis is available at https://osf.io/yd5q4/.
Appendices
Appendix A: Additional simulation results
Appendix B: Simulation-based calibration – Digman (1997) application
The data generation process (DGP) for the SBC study was based on the Digman (1997) example. The exact DGP was:
Priors were chosen such that the generated data would produce valid covariance matrices (e.g. Merkle et al., 2021; Uanhoro, 2023). Loadings and residual standard deviations had median values of 0.8 and 0.6, respectively. The priors on \(m_1\) and \(m_2\) were chosen such that the median value of \(\rho \) would be about 0.25; specifically, \(\exp (-6) / (\exp (-5) + \exp (-6)) \approx 0.27\).
The distribution of parameters based on Eq. B2 is shown in Fig. 8.
For the SBC study, each model was estimated using a single chain. We requested 5000 iterations: the first 1000 were discarded as warmup, and the remaining 4000 were thinned by retaining every second iteration to reduce autocorrelation between posterior samples. Thus, 2000 posterior samples were retained per parameter. Finally, we repeated this process 1000 times.
Evaluation of SBC results was based on graphical summaries recommended by Säilynoja et al. (2022). We report the evaluations in Figs. 9 and 10 – these figures were produced using the SBC package in R (Kim et al., 2023).
Our expectation is that the ranks for each parameter are uniformly distributed. When this is true, the histogram counts will mostly remain within the 95% simultaneous confidence bands. This expectation is met for all parameters with very few exceptions (see Fig. 9), which suggests adequate calibration of all parameters.
The evaluation via histogram is sensitive to the number of bins. Hence, we also assessed the empirical cumulative distribution function (ECDF) of the ranks. Specifically, we assessed the difference between the ECDF and the theoretical CDF of a uniform variable. When these differences are contained within the 95% simultaneous bands, parameters are adequately calibrated. This expectation is met for all parameters (see Fig. 10).
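The ECDF-difference quantity described above can be sketched as follows. This is an illustration only: the function name `ecdf_difference` is hypothetical, ranks are assumed to be integers in \(0, \ldots , L\), and the 95% simultaneous bands of Säilynoja et al. (2022), which the SBC package computes, are not reproduced here.

```python
def ecdf_difference(ranks, n_draws):
    """ECDF of SBC ranks minus the discrete-uniform CDF at each possible rank.

    ranks   : integer ranks, each in 0..n_draws (one per SBC replication)
    n_draws : number of retained posterior samples L
    """
    n = len(ranks)
    counts = [0] * (n_draws + 1)
    for r in ranks:
        counts[r] += 1

    diffs = []
    cumulative = 0
    for k in range(n_draws + 1):
        cumulative += counts[k]
        # Discrete uniform on {0, ..., L}: P(rank <= k) = (k + 1) / (L + 1)
        diffs.append(cumulative / n - (k + 1) / (n_draws + 1))
    return diffs


# Perfectly uniform ranks give differences of zero at every rank.
print(max(abs(d) for d in ecdf_difference(list(range(11)), n_draws=10)))
```

Working with the difference rather than the raw ECDF makes departures from uniformity easier to see, because the reference line is flat at zero.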
We also repeated the testing-based SBC evaluation procedures in Uanhoro (2023). The SBC ranks are first transformed to rankits: \(q_i = (r_i + 0.5)(L + 1)^{-1}\), where \(r_i\) are the ranks and \(L = 2000\) is the number of retained posterior samples. The standard normal quantile function was then applied to the rankits. If the ranks are approximately uniform, the result should be an approximately standard normal variable. The bias of the mean (difference from 0 based on the one-sample \(t_{999}\) test), the bias of the variance (difference from 1 based on the one-sample \(\chi ^2_{999}\) test), and a \(\chi ^2_{1000}\) test of standard normality (Cook et al., 2006) were then used to assess the standard normality expectation (see Footnote 8). As shown in Fig. 11, no parameter produced a statistically significant test result, suggesting adequate calibration for all parameters.
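The rankit transformation and the three test statistics can be sketched with the Python standard library. This is a minimal illustration under stated assumptions, not the code used for the study: the function names are hypothetical, and only the test statistics are computed. The p-values would come from the \(t_{n-1}\), \(\chi ^2_{n-1}\) and \(\chi ^2_{n}\) reference distributions, with \(n\) the number of replications.

```python
from statistics import NormalDist, fmean


def normal_scores(ranks, n_draws):
    """Rankits q_i = (r_i + 0.5) / (L + 1), mapped through the standard normal quantile function."""
    std_normal = NormalDist()
    return [std_normal.inv_cdf((r + 0.5) / (n_draws + 1)) for r in ranks]


def calibration_statistics(z):
    """Test statistics for the standard-normality checks on the normal scores z."""
    n = len(z)
    mean = fmean(z)
    ss_dev = sum((v - mean) ** 2 for v in z)
    sample_var = ss_dev / (n - 1)
    return {
        # One-sample t statistic for H0: mean = 0 (n - 1 degrees of freedom)
        "t_mean": mean / (sample_var / n) ** 0.5,
        # (n - 1) * s^2 / 1 for H0: variance = 1 (n - 1 degrees of freedom)
        "chi2_variance": ss_dev,
        # Sum of squared scores for H0: standard normal (n degrees of freedom)
        "chi2_normality": sum(v * v for v in z),
    }


# Toy input: three ranks out of L = 2000 retained draws.
print(calibration_statistics(normal_scores([0, 1000, 2000], n_draws=2000)))
```

Adding 0.5 to each rank and dividing by \(L + 1\) keeps every rankit strictly between 0 and 1, so the normal quantile function is always finite.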
Appendix C: Simulation study of dependent correlation matrices
As mentioned in the Discussion section, we repeated the simulation study in the paper, but transformed the sample covariance matrices to correlation matrices prior to data analysis. Following from the expectation of an inverse-Wishart distribution, our model when applied to these data should return the following structured covariance matrix: \(\varvec{\Omega }(\varvec{\theta })(m_1 - p - 1)m_1^{-1}\), where \(m_1 = \varepsilon _1^{-2} + p - 1\) and \(\varepsilon _1 = \varepsilon \sqrt{(1-\rho )}\). Hence, the bias and empirical coverage rate evaluations are adjusted to reflect this. We excluded conditions where \(\rho = .75\) and \(\varepsilon = 0.05\) because analysis runtimes for the three corresponding conditions \((c \in \{5, 15, 25\})\) were prohibitively long. Finally, we ran only 300 replications, down from 1000 replications in the original study.
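The scalar that multiplies \(\varvec{\Omega }(\varvec{\theta })\) in the expression above can be computed directly from \(\varepsilon \), \(\rho \) and \(p\). The following is a small sketch; the function name `correlation_scaling_factor` is hypothetical and the input values are illustrative rather than taken from the simulation design.

```python
def correlation_scaling_factor(eps, rho, p):
    """Scalar (m_1 - p - 1) / m_1 applied to the model-implied matrix.

    eps : epsilon, with eps_1 = eps * sqrt(1 - rho)
    rho : value of rho in [0, 1)
    p   : number of observed variables
    Assumes eps_1^(-2) > 2 so that the factor is positive.
    """
    eps_1 = eps * (1.0 - rho) ** 0.5
    m_1 = eps_1 ** -2 + p - 1
    return (m_1 - p - 1) / m_1


# The factor moves toward 1 as rho increases for a fixed epsilon.
for rho in (0.0, 0.25, 0.5):
    print(rho, correlation_scaling_factor(eps=0.10, rho=rho, p=5))
```

Because \(m_1 - p - 1 = \varepsilon _1^{-2} - 2\), the factor is always below 1 and approaches 1 as \(\varepsilon _1\) shrinks.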
Results are reported in Figs. 12, 13, and 14. Most parameters had acceptable levels of bias (absolute relative bias \(< 10\%\)), apart from \(\varepsilon \), which was sometimes downwardly biased (especially when \(\varepsilon = 0.05\)). We believe this downward bias occurs because the process of converting a covariance matrix to a correlation matrix eliminates important variation, given the data generation process. Posterior standard deviations were often upwardly biased, especially for loading parameters. This suggests overly conservative inference, and resulted in higher than nominal coverage rates, especially for loading parameters. Coverage for \(\varepsilon \) was always low given the downward parameter bias, and there were also instances of under-coverage for loading parameters at the combination of high values of \(\varepsilon \) and a larger number of clusters. This under-coverage likely occurs, even with upwardly biased posterior standard deviations, because some parameter bias is combined with the smaller posterior standard deviations obtained at larger sample sizes. Finally, as with the original simulation study, there were problems estimating the dispersion parameters (see Fig. 15).
About this article
Cite this article
Uanhoro, J.O. Handling dependent samples in meta-analytic structural equation models: A Wishart-based approach. Behav Res (2024). https://doi.org/10.3758/s13428-024-02340-4
DOI: https://doi.org/10.3758/s13428-024-02340-4