Abstract
In this paper, a novel feature selection scheme is proposed that exploits the potential of a recent probabilistic generative model, the Counting Grid. This model clusters similar observations together, highlighting the compactness of a class and its underlying structure. The proposed scheme is applied to expression microarray data, a challenging setting with very few patterns and a very large number of features. Experiments on benchmark datasets show that the proposed approach is effective and stable, achieving state-of-the-art classification accuracy.
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Lovato, P., Bicego, M., Cristani, M., Jojic, N., Perina, A. (2012). Feature Selection Using Counting Grids: Application to Microarray Data. In: Gimel’farb, G., et al. Structural, Syntactic, and Statistical Pattern Recognition. SSPR /SPR 2012. Lecture Notes in Computer Science, vol 7626. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-34166-3_69
DOI: https://doi.org/10.1007/978-3-642-34166-3_69
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-34165-6
Online ISBN: 978-3-642-34166-3
eBook Packages: Computer Science, Computer Science (R0)