Top-Down Hierarchical Ensembles of Classifiers for Predicting G-Protein-Coupled-Receptor Functions

  • Conference paper
Advances in Bioinformatics and Computational Biology (BSB 2008)

Part of the book series: Lecture Notes in Computer Science (LNBI, volume 5167)

Abstract

Despite recent advances in Molecular Biology, the function of a large number of proteins is still unknown. One approach to predicting a protein's function is to search secondary databases, also known as signature databases. Protein signatures can be used for function prediction in several ways; a sophisticated one is to induce a classification model. This paper applies five hierarchical classification methods based on the standard Top-Down approach, and one based on a new approach named Top-Down Ensembles, which combines classifiers hierarchically, to three protein functional classification datasets that employ protein signatures. The algorithm based on Top-Down Ensembles achieved slightly better results than the other algorithms, indicating that combining classifiers can improve the performance of hierarchical classification models.
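
The abstract does not spell out how the Top-Down approach or its ensemble variant works, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes that each protein is described by a set of signature identifiers, that each class is a path in a tree-shaped hierarchy, that one local model is trained per parent node, and that an ensemble combines its base classifiers by a simple majority vote at each node. The class names (`NearestNeighbour`, `SignatureVote`, `TopDown`), the voting rule, and the toy data are all hypothetical.

```python
from collections import Counter, defaultdict


def jaccard(a, b):
    """Similarity between two sets of signature identifiers."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


class NearestNeighbour:
    """Toy base classifier: label of the most similar training example."""

    def fit(self, X, y):
        self.X, self.y = list(X), list(y)
        return self

    def predict(self, x):
        best = max(range(len(self.X)), key=lambda i: jaccard(x, self.X[i]))
        return self.y[best]


class SignatureVote:
    """Toy base classifier: every signature votes for the labels it was seen with."""

    def fit(self, X, y):
        self.prior = Counter(y)
        self.counts = defaultdict(Counter)
        for x, label in zip(X, y):
            for signature in x:
                self.counts[signature][label] += 1
        return self

    def predict(self, x):
        votes = Counter()
        for signature in x:
            votes.update(self.counts.get(signature, {}))
        # Fall back to the most frequent label if no signature is known.
        source = votes if votes else self.prior
        return source.most_common(1)[0][0]


class TopDown:
    """One local model (or ensemble of models) per parent node of the hierarchy.

    With a single base classifier this is the plain top-down scheme; with
    several, each node combines them by majority vote (an assumed rule).
    """

    def __init__(self, make_models):
        self.make_models = make_models  # returns a list of fresh base classifiers
        self.nodes = {}                 # path prefix -> list of fitted models

    def fit(self, X, paths):
        # Group the training examples by the parent node they pass through.
        groups = defaultdict(list)
        for x, path in zip(X, paths):
            for depth in range(len(path)):
                groups[path[:depth]].append((x, path[depth]))
        for prefix, examples in groups.items():
            xs, ys = zip(*examples)
            self.nodes[prefix] = [m.fit(xs, ys) for m in self.make_models()]
        return self

    def predict(self, x):
        # Start at the root and descend until no local model remains.
        prefix = ()
        while prefix in self.nodes:
            votes = Counter(m.predict(x) for m in self.nodes[prefix])
            # Ties go to the label proposed by the earliest model in the list.
            prefix += (votes.most_common(1)[0][0],)
        return prefix


# Toy data: four proteins, two top-level classes with two subclasses each.
X = [{"s1", "s2"}, {"s1", "s3"}, {"s4", "s5"}, {"s4", "s6"}]
paths = [("A", "A1"), ("A", "A2"), ("B", "B1"), ("B", "B2")]

single = TopDown(lambda: [NearestNeighbour()]).fit(X, paths)
ensemble = TopDown(lambda: [NearestNeighbour(), SignatureVote()]).fit(X, paths)

print(single.predict({"s1", "s3"}))
print(ensemble.predict({"s4", "s6"}))
```

Because each prediction step only sees the children of the node chosen so far, an error made near the root cannot be corrected further down. Combining several classifiers at each node is one way to try to reduce such errors.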

Editor information

Ana L. C. Bazzan Mark Craven Natália F. Martins

Copyright information

© 2008 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Costa, E.P., Lorena, A.C., Carvalho, A.C.P.L.F., Freitas, A.A. (2008). Top-Down Hierarchical Ensembles of Classifiers for Predicting G-Protein-Coupled-Receptor Functions. In: Bazzan, A.L.C., Craven, M., Martins, N.F. (eds) Advances in Bioinformatics and Computational Biology. BSB 2008. Lecture Notes in Computer Science, vol 5167. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-85557-6_4

  • DOI: https://doi.org/10.1007/978-3-540-85557-6_4

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-85556-9

  • Online ISBN: 978-3-540-85557-6

  • eBook Packages: Computer Science, Computer Science (R0)
