Top-Down Hierarchical Ensembles of Classifiers for Predicting G-Protein-Coupled-Receptor Functions

Costa, Eduardo P.; Lorena, Ana C.; Carvalho, André C. P. L. F.; Freitas, Alex A.

doi:10.1007/978-3-540-85557-6_4

Part of the book series: Lecture Notes in Computer Science (LNBI, volume 5167)

Included in the following conference series:

Brazilian Symposium on Bioinformatics

Abstract

Despite recent advances in molecular biology, the function of a large number of proteins is still unknown. One approach to predicting protein function is to search secondary databases, also known as signature databases. Protein signatures can be used for function prediction in different ways; a sophisticated strategy is to induce a classification model from them. This paper applies five hierarchical classification methods based on the standard Top-Down approach, and one hierarchical classification method based on a new approach named Top-Down Ensembles, which combines classifiers hierarchically, to three protein functional classification datasets that employ protein signatures. The algorithm based on the Top-Down Ensembles approach produced slightly better results than the other algorithms, indicating that combinations of classifiers can improve the performance of hierarchical classification models.
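
The abstract does not spell out the algorithms, so the following is only a rough sketch of the general idea of a top-down ensemble, not the authors' implementation. It assumes that each node of the class hierarchy holds several local classifiers and that their predictions are combined by majority vote before descending to the chosen child. The three learner classes, the voting rule, the signature identifiers (`PS1`, `PF1`, ...) and the class labels are all hypothetical and chosen purely for illustration.

```python
from collections import Counter, defaultdict


class FrequencyLearner:
    """Hypothetical baseline: always predicts the most frequent local label."""
    def fit(self, X, y):
        self.label = Counter(y).most_common(1)[0][0]
        return self

    def predict(self, x):
        return self.label


class NearestNeighbourLearner:
    """Hypothetical 1-NN using Jaccard similarity between signature sets."""
    def fit(self, X, y):
        self.data = list(zip(X, y))
        return self

    def predict(self, x):
        def jaccard(a, b):
            return len(a & b) / len(a | b) if a | b else 0.0
        return max(self.data, key=lambda pair: jaccard(x, pair[0]))[1]


class SignatureCountLearner:
    """Hypothetical scorer: counts how often the query's signatures co-occurred with each label."""
    def fit(self, X, y):
        self.counts = defaultdict(Counter)
        for signatures, label in zip(X, y):
            self.counts[label].update(signatures)
        return self

    def predict(self, x):
        return max(self.counts, key=lambda lab: sum(self.counts[lab][s] for s in x))


class TopDownEnsemble:
    """One ensemble per internal node; prediction walks the hierarchy from the root."""
    def __init__(self, learner_factories):
        self.learner_factories = learner_factories
        self.nodes = {}  # class-path prefix -> list of fitted local learners

    def fit(self, X, paths):
        # Collect, for every internal node, the examples below it and their next-level label.
        local = defaultdict(lambda: ([], []))
        for x, path in zip(X, paths):
            for depth in range(len(path)):
                xs, ys = local[path[:depth]]
                xs.append(x)
                ys.append(path[depth])
        for prefix, (xs, ys) in local.items():
            self.nodes[prefix] = [make().fit(xs, ys) for make in self.learner_factories]
        return self

    def predict(self, x):
        prefix = ()
        while prefix in self.nodes:
            votes = [learner.predict(x) for learner in self.nodes[prefix]]
            winner = max(votes, key=votes.count)  # majority vote; ties go to the earliest vote
            prefix = prefix + (winner,)
        return prefix


# Toy data: each protein is a set of made-up signature ids with a two-level class path.
X = [{"PS1", "PF1"}, {"PS1", "PF2"}, {"PS2", "PF3"}, {"PS2", "PF4"}]
paths = [("A", "A1"), ("A", "A2"), ("B", "B1"), ("B", "B2")]

model = TopDownEnsemble(
    [FrequencyLearner, NearestNeighbourLearner, SignatureCountLearner]
).fit(X, paths)
print(model.predict({"PS1", "PF2"}))
```

Using a single learner in the list passed to `TopDownEnsemble` reduces this sketch to a plain top-down classifier, which is one way to picture the contrast the abstract draws between the standard Top-Down methods and the ensemble variant.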

Author information

Authors and Affiliations

Depto. Ciências de Computação, ICMC/USP - São Carlos, Caixa Postal 668, 13560-970, São Carlos, SP, Brazil
Eduardo P. Costa & André C. P. L. F. Carvalho
Universidade Federal do ABC, 09.210-170, Santo André, SP, Brazil
Ana C. Lorena
Computing Laboratory and Centre for BioMedical Informatics, University of Kent, Canterbury, CT2 7NF, UK
Alex A. Freitas

Editor information

Ana L. C. Bazzan, Mark Craven, Natália F. Martins

About this paper

Cite this paper

Costa, E.P., Lorena, A.C., Carvalho, A.C.P.L.F., Freitas, A.A. (2008). Top-Down Hierarchical Ensembles of Classifiers for Predicting G-Protein-Coupled-Receptor Functions. In: Bazzan, A.L.C., Craven, M., Martins, N.F. (eds) Advances in Bioinformatics and Computational Biology. BSB 2008. Lecture Notes in Computer Science, vol 5167. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-85557-6_4

DOI: https://doi.org/10.1007/978-3-540-85557-6_4
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-85556-9
Online ISBN: 978-3-540-85557-6
eBook Packages: Computer Science, Computer Science (R0)
