A generalized Conway–Maxwell–Poisson distribution which includes the negative binomial distribution
Introduction
In statistical research, it is important to select an adequate distribution to describe the observed variation of counts. The Poisson distribution is the classical model for count data. However, it imposes a serious restriction: the variance must equal the mean or, equivalently, the index of dispersion (the ratio of variance to mean) must be one. Observed count data often violate this equality. The sample variance is commonly greater or smaller than the sample mean, which is referred to as over-dispersion or under-dispersion, respectively, relative to the Poisson distribution. Information on dispersion is useful for selecting an appropriate model for count data. For example, the negative binomial distribution is often selected for over-dispersed data and the binomial distribution for under-dispersed data.
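As a small illustration of the index of dispersion described above, the following sketch computes the ratio of sample variance to sample mean and labels a sample accordingly. The function names, the tolerance `tol`, and the two toy samples are hypothetical and chosen only for illustration. They are not taken from the paper.

```python
from statistics import mean, variance


def index_of_dispersion(counts):
    """Return the sample variance divided by the sample mean."""
    m = mean(counts)
    if m == 0:
        raise ValueError("index of dispersion is undefined when the mean is zero")
    return variance(counts) / m


def classify_dispersion(counts, tol=0.1):
    """Label a sample relative to the Poisson benchmark of 1.

    `tol` is an arbitrary illustrative threshold, not a statistical test.
    """
    d = index_of_dispersion(counts)
    if d > 1 + tol:
        return "over-dispersed"
    if d < 1 - tol:
        return "under-dispersed"
    return "roughly equi-dispersed"


# Hypothetical toy samples, for illustration only.
spread_out = [0, 0, 1, 0, 7, 2, 0, 9, 1, 0]
tight = [3, 4, 3, 4, 3, 4, 3, 4, 3, 4]
```

Calling `classify_dispersion(spread_out)` and `classify_dispersion(tight)` places the two samples on opposite sides of the Poisson benchmark. A formal dispersion test would be needed before drawing conclusions from real data.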
Shmueli et al. [7] revived the Conway–Maxwell–Poisson (COM-Poisson for short) distribution, originally developed by Conway and Maxwell [2] as a solution to handling queueing systems with state-dependent arrival or service rates, and showed its flexibility in adapting to over- and under-dispersion. The COM-Poisson distribution has the probability mass function (pmf)
$$P(X = x) = \frac{\lambda^{x}}{(x!)^{\nu}\, Z(\lambda, \nu)}, \quad x = 0, 1, 2, \ldots, \qquad Z(\lambda, \nu) = \sum_{j=0}^{\infty} \frac{\lambda^{j}}{(j!)^{\nu}},$$
for $\lambda > 0$ and $\nu \ge 0$, and reduces to the geometric distribution when $\nu = 0$ and $\lambda < 1$ and to the Bernoulli distribution when $\nu \to \infty$. This means that the COM-Poisson distribution can become an over- or under-dispersed model. This flexibility greatly expands the types of problems for which the COM-Poisson distribution can be used to model count data.
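The special cases mentioned above can be checked numerically. The sketch below evaluates the standard COM-Poisson pmf, with weights proportional to $\lambda^{x}/(x!)^{\nu}$, by truncating the normalizing series. The function name `com_poisson_pmf`, the parameter names `lam` and `nu`, and the truncation length `terms` are assumptions made for this example.

```python
import math


def com_poisson_pmf(x, lam, nu, terms=500):
    """Approximate P(X = x) for the COM-Poisson distribution.

    Weights are lam**j / (j!)**nu. The normalizing series is truncated
    after `terms` terms, which is an approximation and is only adequate
    when the neglected tail is negligible (for nu = 0 this needs lam < 1).
    """
    if lam <= 0 or nu < 0:
        raise ValueError("requires lam > 0 and nu >= 0")

    def log_weight(j):
        # log of lam**j / (j!)**nu, using lgamma(j + 1) = log(j!)
        return j * math.log(lam) - nu * math.lgamma(j + 1)

    log_w = [log_weight(j) for j in range(terms)]
    top = max(log_w)
    # log-sum-exp keeps the truncated sum numerically stable
    log_z = top + math.log(sum(math.exp(w - top) for w in log_w))
    return math.exp(log_weight(x) - log_z)
```

With `nu=1.0` the values agree with the Poisson pmf, with `nu=0.0` and `lam < 1` they agree with the geometric pmf $(1-\lambda)\lambda^{x}$, and for large `nu` almost all mass sits on 0 and 1, which is the Bernoulli limit.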
In empirical modeling, the length of the tail of the distribution is an important factor. The negative binomial distribution generalizes the geometric distribution and can have a longer tail. This paper proposes a generalization of the COM-Poisson (GCOM-Poisson for short) distribution, which includes the negative binomial distribution as a special case and, therefore, can become a longer-tailed model than the original COM-Poisson distribution. Moreover, the GCOM-Poisson distribution can become bimodal with one of the modes at zero and, therefore, can be adapted to count data with excess zeros. The flexibility in dispersion and tail length, together with the applicability to excess zeros, makes the proposed distribution more versatile than the COM-Poisson distribution.
This paper is arranged as follows. The definition of the GCOM-Poisson distribution and some of its properties are given in Section 2. In Section 3, we consider methods of estimation for fitting the proposed distribution to real data sets, and numerical examples using these methods are given in Section 4. Finally, our conclusion is given in Section 5.
Definition
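A random variable $X$ is said to have the GCOM-Poisson distribution with three parameters $r$, $\nu$ and $\theta$ if
$$P(X = x) = \frac{\Gamma(\nu + x)^{r}\, \theta^{x}}{x!\, C(r, \nu, \theta)}, \quad x = 0, 1, 2, \ldots,$$
where the normalizing constant $C(r, \nu, \theta)$ is given by
$$C(r, \nu, \theta) = \sum_{k=0}^{\infty} \frac{\Gamma(\nu + k)^{r}\, \theta^{k}}{k!}$$
for $r < 1$, $\nu > 0$ and $\theta > 0$, or $r = 1$, $\nu > 0$ and $0 < \theta < 1$. The ratios of consecutive probabilities are
$$\frac{P(X = x + 1)}{P(X = x)} = \frac{\theta\, (\nu + x)^{r}}{x + 1},$$
and it can be seen that $C(r, \nu, \theta)$ converges for $r < 1$, or for $r = 1$ and $|\theta| < 1$. Hence, the parameter space of the GCOM-Poisson distribution is $r < 1$, $\nu > 0$ and $\theta > 0$, or $r = 1$, $\nu > 0$ and $0 < \theta < 1$.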
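The definition can be explored numerically with a short sketch. It assumes the pmf has weights $\Gamma(\nu + x)^{r}\,\theta^{x}/x!$ on $x = 0, 1, 2, \ldots$, which is the form consistent with the negative binomial special case at $r = 1$. The function names, the parameter names `r`, `nu`, `theta`, and the truncation length `terms` are assumptions of this example, and the truncated series is only an approximation to the normalizing constant.

```python
import math


def gcom_log_weight(x, r, nu, theta):
    """log of Gamma(nu + x)**r * theta**x / x! (assumed unnormalized pmf)."""
    return r * math.lgamma(nu + x) + x * math.log(theta) - math.lgamma(x + 1)


def gcom_poisson_pmf(x, r, nu, theta, terms=2000):
    """Approximate P(X = x) by truncating the normalizing series.

    The truncation is adequate only when the tail beyond `terms` is
    negligible. Convergence is slow when r is close to 1 and theta is
    large, so `terms` may need to be increased in that region.
    """
    in_space = nu > 0 and ((r < 1 and theta > 0) or (r == 1 and 0 < theta < 1))
    if not in_space:
        raise ValueError("parameters lie outside the assumed parameter space")
    log_w = [gcom_log_weight(k, r, nu, theta) for k in range(terms)]
    top = max(log_w)
    # log-sum-exp for a numerically stable normalizing constant
    log_c = top + math.log(sum(math.exp(w - top) for w in log_w))
    return math.exp(gcom_log_weight(x, r, nu, theta) - log_c)


def negative_binomial_pmf(x, nu, theta):
    """Gamma(nu + x) / (Gamma(nu) x!) * theta**x * (1 - theta)**nu."""
    log_p = (math.lgamma(nu + x) - math.lgamma(nu) - math.lgamma(x + 1)
             + x * math.log(theta) + nu * math.log(1.0 - theta))
    return math.exp(log_p)
```

Under these assumptions, `gcom_poisson_pmf(x, 1, nu, theta)` agrees with `negative_binomial_pmf(x, nu, theta)`, and the ratio of consecutive values equals $\theta(\nu + x)^{r}/(x + 1)$. Setting `nu=1` gives weights $\theta^{x}/(x!)^{1-r}$, since $\Gamma(1 + x) = x!$, which is a COM-Poisson form with dispersion exponent $1 - r$.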
Estimation
In this section, we deal with methods that use the first three moments or four consecutive probabilities to estimate the parameters of the GCOM-Poisson distribution. The estimates obtained from these methods are crude and are therefore refined by using them as initial values for maximum likelihood estimation (MLE), which gives more accurate estimates.
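One way to obtain crude starting values from four consecutive probabilities is sketched below. It is not necessarily the estimator used in the paper. It is a derivation from the assumed ratio identity $q_x = (x+1)P(X = x+1)/P(X = x) = \theta(\nu + x)^{r}$, so that $\log q_x = \log\theta + r\log(\nu + x)$. Three consecutive ratios give an equation in $\nu$ alone, which is solved here by bisection, after which $r$ and $\theta$ follow directly. All function names, the bracket `nu_lo`, `nu_hi`, and the pmf form are assumptions of this example.

```python
import math


def gcom_pmf_table(r, nu, theta, terms=400):
    """Truncated table of the assumed GCOM-Poisson pmf (for the demo only)."""
    log_w = [r * math.lgamma(nu + k) + k * math.log(theta) - math.lgamma(k + 1)
             for k in range(terms)]
    top = max(log_w)
    total = sum(math.exp(w - top) for w in log_w)
    return [math.exp(w - top) / total for w in log_w]


def initial_estimates(p0, p1, p2, p3, x=0, nu_lo=1e-8, nu_hi=1e6):
    """Crude (r, nu, theta) from probabilities at x, x+1, x+2, x+3.

    In practice the probabilities would be replaced by observed relative
    frequencies, and the bracket check below may then fail.
    """
    a = math.log((x + 1) * p1 / p0)   # log q_x
    b = math.log((x + 2) * p2 / p1)   # log q_{x+1}
    c = math.log((x + 3) * p3 / p2)   # log q_{x+2}
    if a == b or b == c:
        raise ValueError("ratios imply r = 0, so nu is not identified")
    target = (b - a) / (c - b)

    def g(nu):
        # log((t + 1) / t) / log((t + 2) / (t + 1)) with t = nu + x;
        # decreasing in nu and approaching 1 from above
        t = nu + x
        return math.log1p(1.0 / t) / math.log1p(1.0 / (t + 1.0))

    if not g(nu_hi) < target < g(nu_lo):
        raise ValueError("no solution for nu inside the search bracket")
    lo, hi = nu_lo, nu_hi
    for _ in range(200):
        mid = math.sqrt(lo * hi)      # bisection on a log scale
        if g(mid) > target:
            lo = mid
        else:
            hi = mid
    nu = math.sqrt(lo * hi)
    r = (b - a) / math.log1p(1.0 / (nu + x))
    theta = math.exp(a - r * math.log(nu + x))
    return r, nu, theta
```

Feeding exact probabilities, for example `initial_estimates(*gcom_pmf_table(0.5, 2.0, 1.5)[:4])`, recovers the generating parameters up to rounding error. With sample frequencies the result is only a rough starting point, which would then be passed to a numerical optimizer of the log-likelihood.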
Numerical examples
In this section, we fit the GCOM-Poisson distribution to three real data sets and compare the fits with those of the COM-Poisson distribution to illustrate its utility and flexibility.
The first is quarterly sales of a well-known brand of a particular article of clothing (Shmueli et al. [7]), which are over-dispersed and long-tailed count data, and the second is the length of words in a Hungarian dictionary (Wimmer et al. [8]), which are under-dispersed count data. These two
Conclusion
The Conway–Maxwell–Poisson distribution was originally developed for queueing systems and later revived as a distribution flexible enough to handle over- and under-dispersion. The generalized Conway–Maxwell–Poisson distribution proposed in this paper has the flexibility to model both tail behavior and dispersion. Moreover, the proposed distribution can become bimodal with one of the modes at zero, which makes it suitable for count data with excess zeros.
Acknowledgments
I am most grateful to the reviewer for his positive and constructive suggestions, which led to a greatly improved version of this paper. My sincere gratitude also goes to Keio University and The Institute of Statistical Mathematics for their support during the preparation of this paper.
References (8)
- et al., Overdispersed and underdispersed Poisson generalizations, J. Stat. Plan. Inference (2005)
- et al., A queueing model with state dependent service rates, J. Ind. Eng. (1962)
- et al., Analysis of discrete data by Conway–Maxwell Poisson distribution, AStA Adv. Stat. Anal. (2014)
- T.P. Minka, G. Shmueli, J.B. Kadane, S. Borle, P. Boatwright, Computing With The COM-Poisson Distribution, Technical...