
A New Feature Selection Method for Improving the Precision of Diagnosing Abnormal Protein Sequences by Support Vector Machine and Vectorization Method

  • Conference paper
Adaptive and Natural Computing Algorithms (ICANNGA 2007)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 4432)


Abstract

Pattern recognition and classification are among the most popular problems in machine learning, and they seem to be entering a second golden age with bioinformatics. However, bioinformatics data sets have several distinctive characteristics compared with the data sets of classical pattern recognition and classification research. One of the main difficulties in applying these techniques to bioinformatics is that raw DNA or protein sequences cannot be used directly as input to machine learning, because each sequence has a different length. This paper therefore introduces one method for overcoming this difficulty, and shows through simple experiments that the generalization capability of this method is very poor. Finally, the paper proposes a different approach that selects a fixed number of effective features using a Support Vector Machine and a noise whitening method. It also defines the criteria of the proposed method and shows, in an experiment classifying an ovarian cancer data set, that the method improves the precision of diagnosing abnormal protein sequences.
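The variable-length problem described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the vectorization method the paper uses, so the example below substitutes a simple amino-acid composition encoding as a stand-in: it maps a sequence of any length to a vector of 20 relative frequencies. The function name `composition_vector` and the toy sequences are hypothetical and are not taken from the paper or its ovarian cancer data set.

```python
from collections import Counter

# The 20 standard amino acids, in a fixed order so every vector shares one layout.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def composition_vector(sequence):
    """Map a protein sequence of any length to 20 relative frequencies.

    Assumption: this composition encoding is only an illustration of
    fixed-length vectorization, not the method used in the paper.
    """
    counts = Counter(sequence.upper())
    # Count only standard residues; other symbols (e.g. 'X') are ignored.
    total = sum(counts[aa] for aa in AMINO_ACIDS)
    if total == 0:
        return [0.0] * len(AMINO_ACIDS)
    return [counts[aa] / total for aa in AMINO_ACIDS]


# Hypothetical toy sequences of different lengths.
sequences = ["MKTAYIAKQR", "GAVLI", "MKKLLPTAAAGLLLLAAQPAMA"]
vectors = [composition_vector(s) for s in sequences]

for seq, vec in zip(sequences, vectors):
    # Input lengths differ, but every feature vector has the same length.
    print(len(seq), len(vec))
```

Vectors produced this way all have the same dimension, so they could be passed to a classifier such as a Support Vector Machine. The encoding discards residue order, which is one reason a real study might prefer a richer vectorization followed by feature selection.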




Editor information

Bartlomiej Beliczynski, Andrzej Dzielinski, Marcin Iwanowski, Bernardete Ribeiro


Copyright information

© 2007 Springer Berlin Heidelberg

About this paper

Cite this paper

Kim, EM., Jeong, JC., Pae, HY., Lee, BH. (2007). A New Feature Selection Method for Improving the Precision of Diagnosing Abnormal Protein Sequences by Support Vector Machine and Vectorization Method. In: Beliczynski, B., Dzielinski, A., Iwanowski, M., Ribeiro, B. (eds) Adaptive and Natural Computing Algorithms. ICANNGA 2007. Lecture Notes in Computer Science, vol 4432. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-71629-7_41


  • DOI: https://doi.org/10.1007/978-3-540-71629-7_41

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-71590-0

  • Online ISBN: 978-3-540-71629-7

  • eBook Packages: Computer Science, Computer Science (R0)
