A SAS® macro for constructing simultaneous confidence intervals for multinomial proportions

https://doi.org/10.1016/S0169-2607(97)01809-9

Abstract

Quesenberry and Hurst (1964), Goodman (1965) and Fitzpatrick and Scott (1987) proposed methods for the simultaneous construction of confidence intervals for multinomial proportions; however, statistical computing packages do not generally let the user specify the type of construction to be used. We have written a SAS macro using PROC IML that takes multinomial cell counts as input and returns simultaneous confidence intervals with the user-specified coverage probability. Two main features of the macro are its ease of use and its flexibility in allowing the user to choose among six methods of constructing confidence intervals for multinomial proportions. Based on simulation, May and Johnson (1997) recommended the intervals proposed by Goodman (1965) for most practical applications.

Introduction

Several methods of constructing simultaneous confidence intervals for multinomial proportions have been proposed; however, statistical computing packages do not always support the various methods of construction. Although a function could be written in a high-level programming language such as C or FORTRAN, researchers need a method of constructing simultaneous confidence intervals that is readily available. We have written a SAS macro using PROC IML that takes multinomial cell counts as input and constructs simultaneous confidence intervals at the specified coverage level for a variety of methods of construction.

In Section 2 we motivate the problem with a brief overview of the six methods of construction provided in the macro. Because the binomial distribution is a special case, we discuss simultaneous construction for binomial proportions in Section 3. We discuss the macro for confidence interval construction in Section 4 and make suggestions concerning the method that performs best in most practical situations in Section 5. In Appendix A, we list the macro code, and in Appendix B, we illustrate the macro call with an example and provide program output.

Section snippets

Methods

Suppose outcomes on a categorical response variable are such that each outcome can be classified in exactly one of $k$ cells of a $k \times 1$ table. The cell counts can be written in vector form as $n=(n_1,\ldots,n_k)^T$, where $n_i$ ($i=1,\ldots,k$) is the number of observations in the $i$th cell and the superscript $T$ denotes the transpose operator. If the total sample size $n_+=n_1+\cdots+n_k$ is fixed, $n$ represents a sample vector from a multinomial distribution with underlying multinomial probability vector $\pi=(\pi_1,\ldots,\pi_k)^T$. The maximum

Application to binomial proportions

Because the binomial is a special case of the multinomial distribution, it is instructive to consider the methods proposed in relation to the binomial. Let $\pi_1$ and $\pi_2$ denote the binomial parameters and let $n_1$, $n_2$, $p_1$ and $p_2$ represent the cell counts and observed proportions. Let $\pi_{0,1}$ and $\pi_{0,2}$ be the hypothesized values of $\pi_1$ and $\pi_2$, respectively. To develop Pearson's goodness-of-fit statistic, let $A=[1\ 0]$ so that $Ap=p_1$ and $A\Sigma A^T=\pi_1(1-\pi_1)/n_+$. Pearson's goodness-of-fit statistic $X^2=n_+(p_1-\pi_{0,1})^2/\pi_{0,1}+n_+$

The SAS macro

To aid the researcher, we have written a SAS macro using PROC IML [10] that takes cell counts as input and constructs confidence intervals for the multinomial proportions as output. In addition to selecting the method used for confidence interval construction, the user can specify the desired coverage probability using a simple macro call. The macro is given in Appendix A and a sample calling program is given in Appendix B along with program output.
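To make the input and output described above concrete, the following is a minimal sketch of one of the six constructions, the Bonferroni-adjusted intervals of Goodman [2]. It is written in Python rather than SAS/IML and is not the code of the macro in Appendix A; the function name `goodman_intervals`, its arguments, and the example counts are hypothetical. It assumes the usual form of Goodman's interval, in which the limits for cell $i$ are the two roots in $\pi_i$ of $(p_i-\pi_i)^2 = B\,\pi_i(1-\pi_i)/n_+$, with $B$ the upper $\alpha/k$ point of the chi-square distribution with 1 degree of freedom.

```python
from math import sqrt
from statistics import NormalDist


def goodman_intervals(counts, coverage=0.95):
    """Goodman-type simultaneous intervals for multinomial proportions.

    Illustrative sketch only: the names here are hypothetical and this is
    not the SAS macro described in the article.
    """
    k = len(counts)            # number of cells
    total = sum(counts)        # n_+, the fixed total sample size
    alpha = 1.0 - coverage

    # Bonferroni adjustment: B is the upper alpha/k point of chi-square(1),
    # which equals the square of the upper alpha/(2k) standard normal point.
    z = NormalDist().inv_cdf(1.0 - alpha / (2.0 * k))
    b = z * z

    intervals = []
    for n_i in counts:
        # Roots of the quadratic (p_i - pi_i)^2 = B * pi_i * (1 - pi_i) / n_+
        half = sqrt(b * (b + 4.0 * n_i * (total - n_i) / total))
        denom = 2.0 * (total + b)
        lower = (b + 2.0 * n_i - half) / denom
        upper = (b + 2.0 * n_i + half) / denom
        intervals.append((lower, upper))
    return intervals


if __name__ == "__main__":
    counts = [56, 72, 73, 59, 62, 87, 58]  # made-up cell counts
    total = sum(counts)
    for n_i, (lower, upper) in zip(counts, goodman_intervals(counts, 0.95)):
        print(f"p = {n_i / total:.4f}   [{lower:.4f}, {upper:.4f}]")
```

As in the macro call, the only inputs are the cell counts and the desired coverage probability. The other constructions differ mainly in the constant that replaces $B$ or in the form of the interval, so they are not reproduced here.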

First, we prepare a SAS data set in a DATA step

Conclusions

Based on simulation studies, May and Johnson [4] stated that the Bonferroni adjustment suggested by Goodman [2] performs well in most practical situations when the number of categories is greater than 2 and each cell count is greater than 5, provided the number of categories is not too large. The methods of Quesenberry and Hurst [1] are conservative and, in general, yield wider intervals than those based on Goodman's [2] method. As an added advantage of both the Quesenberry and Hurst [1] and

References (10)

  • C.P. Quesenberry and D.C. Hurst, Large sample simultaneous confidence intervals for multinomial proportions,...
  • L.A. Goodman, On simultaneous confidence intervals for multinomial proportions, Technometrics 7 (1965)...
  • S. Fitzpatrick and A. Scott, Quick simultaneous confidence intervals for multinomial proportions, J. Am. Stat. Assoc....
  • W.L. May and W.D. Johnson, Properties of simultaneous confidence intervals for multinomial proportions, Commun. Stat....
  • K. Pearson, On the criterion that a given system of deviations from the probable in the case of a correlated system of...
There are more references available in the full text version of this article.
