Abstract
Finite mixture models have been applied to a variety of data mining tasks. Most work on finite mixture models has focused on mixtures for continuous data. However, many applications involve and generate discrete data, for which discrete mixtures are better suited. In this paper, we investigate the problem of discrete data modeling using finite mixture models. We propose a novel mixture that we call the multinomial generalized Dirichlet mixture. We designed experiments involving the modeling and summarization of spatial color image databases to show the robustness, flexibility, and merits of our approach.
Copyright information
© 2008 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Bouguila, N., ElGuebaly, W. (2008). On Discrete Data Clustering. In: Washio, T., Suzuki, E., Ting, K.M., Inokuchi, A. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2008. Lecture Notes in Computer Science, vol 5012. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-68125-0_44
DOI: https://doi.org/10.1007/978-3-540-68125-0_44
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-68124-3
Online ISBN: 978-3-540-68125-0
eBook Packages: Computer Science, Computer Science (R0)