Abstract
The detection of heterogeneity among objects (products, treatments, medical studies) assessed on a series of blocks (consumers, patients, methods, pathologists) is critical in numerous areas such as clinical research, cosmetic studies, and survey analysis. Cochran’s Q test is the most widely used test for identifying heterogeneity in binary data (success vs. failure, cured vs. not cured, 1 vs. 0, etc.). For a large number of blocks, the distribution of Q can be approximated by a χ² distribution. Unfortunately, this approximation does not hold for limited sample sizes or sparse tables. In such situations, one has to either run Monte Carlo simulations or compute the exact distribution of Q to obtain an accurate and reliable result. However, the exact computation is often disregarded in favor of simulation because of its computational expense. The purpose of this article is to propose an extremely fast implementation of the exact Cochran’s Q test, so that one can benefit from its accuracy at virtually no cost in computation time. It is implemented as part of the XLSTAT statistical software (Addinsoft 2015). After a short presentation of Cochran’s Q test and the motivation for its exact version, we detail our approach and present its actual implementation. We then demonstrate the gain of this algorithm with performance evaluations and measurements. Comparisons against a well-established implementation show a speedup by a factor ranging from 100 up to 1 × 10⁶ in the most favorable cases.
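To make the quantities named in the abstract concrete, the following is a minimal sketch of the Q statistic and of a naive exact p-value obtained by enumerating every arrangement of successes within each block. It is not the algorithm proposed in the article: the function names `cochran_q` and `exact_p_value` are hypothetical, the statistic uses the standard textbook formula, and the brute-force enumeration grows exponentially with the number of blocks, which is precisely the cost a fast exact method aims to avoid.

```python
from itertools import combinations, product


def cochran_q(table):
    """Cochran's Q for a table of blocks (rows) by treatments (columns), entries 0/1."""
    k = len(table[0])
    col_totals = [sum(row[j] for row in table) for j in range(k)]
    row_totals = [sum(row) for row in table]
    n = sum(row_totals)
    denom = k * n - sum(r * r for r in row_totals)
    if denom == 0:
        # Every block is all-0 or all-1: no information about heterogeneity.
        return 0.0
    return (k - 1) * (k * sum(c * c for c in col_totals) - n * n) / denom


def exact_p_value(table):
    """Naive exact p-value: enumerate all within-block arrangements (assumed equally likely)."""
    k = len(table[0])
    q_obs = cochran_q(table)
    # For each block, list every way to place its successes among the k treatments.
    per_block = []
    for row in table:
        successes = sum(row)
        per_block.append([
            [1 if j in ones else 0 for j in range(k)]
            for ones in combinations(range(k), successes)
        ])
    hits = total = 0
    for candidate in product(*per_block):
        total += 1
        # Small tolerance guards against floating-point ties.
        if cochran_q(candidate) >= q_obs - 1e-12:
            hits += 1
    return hits / total


if __name__ == "__main__":
    # Illustrative data only: 4 blocks assessed on 3 treatments.
    data = [[1, 1, 0], [1, 0, 0], [1, 1, 0], [1, 0, 1]]
    print("Q =", cochran_q(data))
    print("exact p =", exact_p_value(data))
```

For the illustrative table above, the enumeration visits only 3⁴ = 81 arrangements; with dozens of blocks the same loop becomes infeasible, which is why Monte Carlo sampling is commonly substituted for it.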
Supplemental Material
Available for Download
Software for Fast Computation of the Non-Asymptotic Cochran's Q Statistic for Heterogeneity Detection
- Addinsoft. 2015. XLSTAT 2015: Data analysis and statistical solution for Microsoft Excel. ADDINSOFT Corporation (2015).
- Nils Blomqvist and others. 1951. Some tests based on dichotomization. Ann. Math. Stat. 22, 3 (1951), 362--371.
- Brian S. Cade and Jon D. Richards. 2005. User manual for BLOSSOM statistical software. US Geological Survey Open-File Report 1353 (2005), 124.
- William G. Cochran. 1950. The comparison of percentages in matched samples. Biometrika (1950), 256--266.
- William Feller. 1950. An introduction to probability theory and its applications. John Wiley & Sons.
- Quinn McNemar. 1947. Note on the sampling error of the difference between correlated proportions or percentages. Psychometrika 12, 2 (1947), 153--157.
- Cyrus Mehta and Nitin Ratilal Patel. 1992. StatXact: User Manual: Statistical Software for Exact Nonparametric Inference. CYTEL Software Corporation.
- Paul W. Mielke and Kenneth J. Berry. 1995. Nonasymptotic inferences based on Cochran’s Q test. Percept. Motor Skills 81, 1 (1995), 319--322.
- Paul W. Mielke and Kenneth J. Berry. 2007. Permutation Methods: A Distance Function Approach. Springer Science & Business Media.
- Kashinath D. Patil. 1975. Cochran’s Q test: Exact distribution. J. Am. Stat. Assoc. 70, 349 (1975), 186--189.
Index Terms
- Algorithm 983: Fast Computation of the Non-Asymptotic Cochran’s Q Statistic for Heterogeneity Detection