A SAS® macro for constructing simultaneous confidence intervals for multinomial proportions
Introduction
Several methods of constructing simultaneous confidence intervals for multinomial proportions have been proposed; however, statistical computing packages do not always support them. Although a function could be written in a high-level programming language such as C or FORTRAN, researchers need a method of constructing simultaneous confidence intervals that is readily available. We have written a SAS macro using PROC IML that takes multinomial cell counts as input and constructs simultaneous confidence intervals at the specified coverage level using any of several methods of construction.
In Section 2, we motivate the problem with a brief overview of the six methods of construction provided in the macro. Because the binomial distribution is a special case, we discuss simultaneous construction for binomial proportions in Section 3. We discuss the macro for confidence interval construction in Section 4 and, in Section 5, suggest the method that performs best in most practical situations. In Appendix A, we list the macro code, and in Appendix B, we illustrate the macro call with an example and provide program output.
Section snippets
Methods
Suppose outcomes on a categorical response variable are such that each outcome can be classified in exactly one of k cells of a k×1 table. The cell counts can be written in vector form as n = (n_1, …, n_k)^T, where n_i (i = 1, …, k) is the number of observations in the ith cell and the superscript T denotes the transpose operator. If the total sample size n_+ = n_1 + ⋯ + n_k is fixed, n represents a sample vector from a multinomial distribution with underlying multinomial probability vector π = (π_1, …, π_k)^T. The maximum
Application to binomial proportions
Because the binomial is a special case of the multinomial distribution, it is instructive to consider the methods proposed in relation to the binomial. Let π_1 and π_2 denote the binomial parameters and let n_1, n_2, p_1 and p_2 represent the cell counts and observed proportions. Let π_{0,1} and π_{0,2} be the hypothesized values of π_1 and π_2, respectively. To develop Pearson's goodness-of-fit statistic, let A = [1 0] so that Ap = p_1 and AΣA^T = π_1(1 − π_1)/n_+. Pearson's goodness-of-fit statistic
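As a small numerical illustration of the binomial setup above, the following Python sketch compares the quadratic form built from Ap = p1 and its variance π1(1 − π1)/n+, evaluated at the hypothesized value, with the usual sum-over-cells form of Pearson's goodness-of-fit statistic. The two agree by a standard algebraic identity. The function name `pearson_binomial` and the example counts are hypothetical and are not taken from the paper.

```python
def pearson_binomial(n1, n2, pi01):
    """Return (quadratic_form, pearson_sum) for a two-cell table.

    n1, n2 : observed cell counts
    pi01   : hypothesized probability of the first cell (pi02 = 1 - pi01)
    """
    n_plus = n1 + n2
    p1 = n1 / n_plus

    # Quadratic form: (Ap - A*pi0)^2 / (A Sigma0 A^T) with A = [1 0],
    # where A Sigma0 A^T = pi01 * (1 - pi01) / n_plus.
    quadratic_form = (p1 - pi01) ** 2 / (pi01 * (1.0 - pi01) / n_plus)

    # Classical form: sum over cells of (observed - expected)^2 / expected.
    expected = [n_plus * pi01, n_plus * (1.0 - pi01)]
    pearson_sum = sum((o - e) ** 2 / e for o, e in zip([n1, n2], expected))

    return quadratic_form, pearson_sum


# Hypothetical example: 30 and 70 observations, hypothesized pi01 = 0.4.
quad, pearson = pearson_binomial(30, 70, 0.4)
print(f"quadratic form = {quad:.4f}, Pearson sum = {pearson:.4f}")
```

Both values coincide up to floating-point rounding, which is why a single row of A suffices in the binomial case.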
The SAS macro
To aid the researcher, we have written a SAS macro using PROC IML [10] that takes cell counts as input and constructs confidence intervals for the multinomial proportions as output. In addition to selecting the method used for confidence interval construction, the user can specify the desired coverage probability using a simple macro call. The macro is given in Appendix A, and a sample calling program is given in Appendix B along with program output.
First, we prepare a SAS data set in a DATA step
Conclusions
Based on simulation studies, May and Johnson [4] stated that the Bonferroni adjustment suggested by Goodman [2] performs well in most practical situations when the number of categories is greater than 2 and each cell count is greater than 5, provided the number of categories is not too large. The methods of Quesenberry and Hurst [1] are conservative and, in general, yield wider intervals than those based on Goodman's [2] method. As an added advantage of both the Quesenberry and Hurst [1] and
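To make the Bonferroni-adjusted intervals concrete, here is a minimal Python sketch of the score-type interval commonly attributed to Goodman [2]. It is not the authors' SAS macro and makes no claim to match its output format. It assumes the usual form in which each cell's limits solve (p_i − π_i)² = A·π_i(1 − π_i)/n+, with A the upper α/k quantile of a chi-square distribution with 1 degree of freedom. The function name `goodman_intervals` and the example counts are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def goodman_intervals(counts, alpha=0.05):
    """Return a list of (lower, upper) simultaneous intervals, one per cell."""
    k = len(counts)
    n_plus = sum(counts)

    # Bonferroni adjustment: upper alpha/k quantile of chi-square with 1 df,
    # obtained as the square of a standard normal quantile.
    z = NormalDist().inv_cdf(1.0 - alpha / (2.0 * k))
    a = z * z

    intervals = []
    for n_i in counts:
        # Roots of the quadratic (p_i - pi)^2 = a * pi * (1 - pi) / n_plus.
        centre = a + 2.0 * n_i
        half = sqrt(a * (a + 4.0 * n_i * (n_plus - n_i) / n_plus))
        denom = 2.0 * (n_plus + a)
        intervals.append(((centre - half) / denom, (centre + half) / denom))
    return intervals


# Hypothetical cell counts, each greater than 5.
counts = [56, 72, 73, 59]
for n_i, (lo, hi) in zip(counts, goodman_intervals(counts, alpha=0.05)):
    print(f"n_i={n_i:3d}  p_i={n_i / sum(counts):.3f}  [{lo:.3f}, {hi:.3f}]")
```

Replacing A with a chi-square quantile on k − 1 degrees of freedom would give intervals of the Quesenberry and Hurst [1] type; that quantile is not available in the Python standard library, so it is omitted from this sketch.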
References (10)
- C.P. Quesenberry and D.C. Hurst, Large sample simultaneous confidence intervals for multinomial proportions,...
- L.A. Goodman, On simultaneous confidence intervals for multinomial proportions, Technometrics 7 (1965)...
- S. Fitzpatrick and A. Scott, Quick simultaneous confidence intervals for multinomial proportions, J. Am. Stat. Assoc....
- W.L. May and W.D. Johnson, Properties of simultaneous confidence intervals for multinomial proportions, Commun. Stat....
- K. Pearson, On the criterion that a given system of deviations from the probable in the case of a correlated system of...