Assessing agreement with intraclass correlation coefficient and concordance correlation coefficient for data with repeated measures
Introduction
The intraclass correlation coefficient (ICC) and the concordance correlation coefficient (CCC) are two popular indices for assessing agreement between quantitative measurements taken by different observers. Both indices are typically used for data without or with replications, and Chen and Barnhart (2008) compared them for this data structure under a general model. However, this methodology cannot be used when repeated measurements, rather than replications, are collected. Several authors have studied the ICC and the CCC for data with repeated measurements. Vangeneugden et al. (2005) investigated the ICC using linear mixed models with serial correlation for inter-observer, intra-observer, and absolute agreement, where both observer and time are treated as random effects. King et al. (2007) proposed a CCC for assessing inter-observer agreement for a response with repeated measurements, where both observer and time are treated as fixed effects. Chen and Barnhart (2011) also proposed a new ICC for assessing inter-observer, intra-observer, and absolute agreement for data with repeated measurements, where observers and times are treated as random because researchers may assess agreement among many observers who take measurements at different time points.
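For the simplest setting of two observers and one measurement per subject, the classic CCC (Lin's coefficient) can be computed directly from sample moments as twice the covariance divided by the sum of the two variances and the squared mean difference. The Python sketch below illustrates only that simple case, not the repeated-measures indices studied in this paper; the function name `lin_ccc`, the use of divide-by-n moments, and the example data are assumptions made for illustration.

```python
import numpy as np


def lin_ccc(x, y):
    """Classic two-observer CCC from sample moments (illustrative sketch)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if x.ndim != 1 or x.shape != y.shape:
        raise ValueError("x and y must be 1-D arrays of equal length")
    mean_x, mean_y = x.mean(), y.mean()
    # Assumption: divide-by-n moments; a divide-by-(n-1) variant is also used.
    var_x = x.var()
    var_y = y.var()
    cov_xy = ((x - mean_x) * (y - mean_y)).mean()
    # A location shift between observers enters through the squared mean
    # difference, so it lowers the CCC even when the correlation is perfect.
    return 2.0 * cov_xy / (var_x + var_y + (mean_x - mean_y) ** 2)


# Hypothetical readings of the same subjects by two observers.
observer_a = [52.0, 60.5, 47.0, 55.5, 63.0]
observer_b = [54.0, 61.0, 49.5, 56.0, 66.0]
print(lin_ccc(observer_a, observer_b))
```

Unlike the Pearson correlation, this index penalizes systematic differences between observers, which is why it is read as a measure of agreement rather than of association.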
In addition to treating both observer and time as random effects in defining the ICC, there are situations where researchers may be interested in the case of random observer and fixed time, or of fixed observer and random time. For example, suppose a study is designed to assess the agreement among subjects' measurements taken by observers (e.g., nurses) over two shifts. Researchers can conduct such an agreement study by randomly selecting several nurses from the nurse population and obtaining measurements from the chosen nurses at two time points. In this study, nurse is random and time is fixed because different nurses may produce different measurements at a specific time. Another example is assessing agreement among a fixed number of observers in an imaging study, where patients may have no scheduled visits at which these observers take measurements. In this example, observer is fixed and time is random because the same observers may produce different measurements at different times. Therefore, any combination of fixed or random effects for observer and time can arise in an agreement study based on the ICC or the CCC, depending on the goals of the researchers.
In this paper, we propose new ICCs and CCCs for the remaining combinations of random or fixed effects for observer and time. We summarize and compare the ICCs and CCCs across these combinations for data with repeated measurements and illustrate the methodology with an example from an imaging study. Section 2 studies the four combinations of random or fixed effects of observer and time for the two indices, and introduces the new ICCs and CCCs for the remaining combinations for different agreement assessments. Section 3 presents the estimation and inference for the methods introduced in Section 2. Section 4 demonstrates the performance of the new ICC for the case of random observer and fixed time through a simulation study. Section 5 illustrates the four combinations for the ICC and the CCC using image data. Finally, Section 6 discusses the comparison between these two indices.
Methodology
Consider a set of randomly selected subjects whose measurements are taken by observers at time points. The two factors, observer and time, can each be treated as either random or fixed. If a factor is treated as random, its levels are regarded as a random sample from the corresponding population. If a factor is treated as fixed, its levels constitute the entire, finite set of levels for that factor. Chen and Barnhart (2011) proposed ICC for random observers and random
Estimation and inference
The point estimation and statistical inference of the ICC for case (1) have been proposed by Chen and Barnhart (2011). Following their previous work, we present the estimation and inference of the ICC for case (2) in Section 3.1; the remaining cases can be handled in a similar fashion. The estimation and inference of the CCC for cases (1) through (4) are shown in Section 3.2 under different ANOVA assumptions.
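As a rough illustration of moment-based estimation from ANOVA mean squares, the Python sketch below computes the classic single-measurement two-way ICC for a subjects-by-observers matrix at one time point, once treating observers as random (absolute agreement) and once as fixed (consistency). This is the textbook two-way ICC, swapped in here as a simpler stand-in; it is not the repeated-measures estimator developed in this paper. The function name `two_way_icc` and the example data are hypothetical.

```python
import numpy as np


def two_way_icc(ratings):
    """Return (icc_random_observer, icc_fixed_observer) for an n x k matrix.

    Rows are subjects and columns are observers, one reading per cell.
    """
    y = np.asarray(ratings, dtype=float)
    if y.ndim != 2 or y.shape[0] < 2 or y.shape[1] < 2:
        raise ValueError("ratings must be an n x k matrix with n, k >= 2")
    n, k = y.shape
    grand = y.mean()
    # Sums of squares for subjects (rows), observers (columns), and residual.
    ss_rows = k * ((y.mean(axis=1) - grand) ** 2).sum()
    ss_cols = n * ((y.mean(axis=0) - grand) ** 2).sum()
    ss_err = ((y - grand) ** 2).sum() - ss_rows - ss_cols
    msr = ss_rows / (n - 1)
    msc = ss_cols / (k - 1)
    mse = ss_err / ((n - 1) * (k - 1))
    # Random observers: observer variance counts against agreement.
    icc_random = (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)
    # Fixed observers: systematic observer differences are excluded.
    icc_fixed = (msr - mse) / (msr + (k - 1) * mse)
    return icc_random, icc_fixed


# Hypothetical data: 5 subjects measured once by each of 3 observers.
example = [
    [10.1, 10.4, 9.8],
    [12.3, 12.9, 12.0],
    [8.7, 9.1, 8.5],
    [11.5, 11.9, 11.2],
    [9.9, 10.2, 9.6],
]
print(two_way_icc(example))
```

The two values differ only through the observer mean square, which shows in miniature why the choice between random and fixed effects changes the index being estimated.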
Simulation
To evaluate the performance of the proposed indices, we carried out simulations based on 1000 Monte Carlo data sets. We present simulation results only for the case of random observers and fixed times. Simulation results for the case of random observers and random times were reported previously (Chen and Barnhart, 2011). The results for the case of fixed observers and random times as well as fixed observers and fixed times are similar to the case of random observers and fixed
Data analysis
We use the image data discussed in Chen and Barnhart (2011) for illustration. The purpose of the imaging study is to evaluate pulmonary arterial hypertension measures obtained from 2D echocardiograms. To assess the agreement between sonographers who measure the 2D echocardiogram images, two sonographers each make measurements twice on 10 patients. The variables of interest for assessing agreement are Visual Ejection Fraction, Biplane Ejection Fraction, and Right Atrium Volume. The number of patients without
Discussion
In this paper, we have proposed new ICC indices for assessing inter-observer, intra-observer, and absolute agreement under all four combinations of random or fixed effects of the observer and time factors for data with repeated measurements. The point estimates of these ICCs under random or fixed effects are obtained by applying the method of moments to each component of the index. The sample estimates approach used in the paper is a non-parametric approach that has the
References
- Chen and Barnhart (2008). Comparison of ICC and CCC for assessing agreement for data without and with replications. Computational Statistics and Data Analysis.
- et al. (2005). Assessing intra, inter and total agreement with replicated readings. Statistics in Medicine.
- Chen and Barnhart (2011). Assessing agreement with repeated measures for random observers. Statistics in Medicine.
- King et al. (2007). A repeated measures concordance correlation coefficient. Statistics in Medicine.