J. Chem. Inf. Comput. Sci., 44 (1), 161-167, 2004. 10.1021/ci034173u S0095-2338(03)04173-8
Web Release Date: December 2, 2003

Copyright © 2003 American Chemical Society

Prediction of the Isoelectric Point of an Amino Acid Based on GA-PLS and SVMs

H. X. Liu, R. S. Zhang,* X. J. Yao, M. C. Liu, Z. D. Hu, and B. T. Fan

Departments of Chemistry and Computer Science, Lanzhou University, Lanzhou 730000, China, and Université Paris 7-Denis Diderot, ITODYS 1, Rue Guy de la Brosse, 75005 Paris, France

Received August 10, 2003

Abstract:

The support vector machine (SVM), a novel type of learning machine, was used for the first time to develop a QSPR model relating the structures of 35 amino acids to their isoelectric points. Molecular descriptors calculated from the structure alone were used to represent the molecular structures. Seven descriptors were selected using GA-PLS, a hybrid approach that combines the genetic algorithm (GA) as an optimization method with partial least squares (PLS) as a robust statistical method for variable selection. These descriptors were used as inputs to radial basis function neural networks (RBFNNs) and the SVM to predict the isoelectric point of an amino acid. The optimal QSPR model was the one based on the SVM, which gave a root-mean-square error of 0.2383 and a prediction correlation coefficient R = 0.9702 for the whole data set. These satisfactory results indicate that GA-PLS is a very effective method for variable selection and that the SVM is a very promising tool for nonlinear approximation.
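
The two statistics quoted in the abstract, the root-mean-square error and the correlation coefficient R, can be illustrated with a minimal Python sketch. This is not the authors' code. The function names `rmse` and `pearson_r` are hypothetical, the RMSE is assumed to divide by the number of samples n (the abstract does not say whether n or n - 1 was used), and the numbers in the usage lines are placeholders, not values from the paper.

```python
import math


def rmse(observed, predicted):
    """Root-mean-square error, assuming division by n (an assumption)."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    return math.sqrt(squared / len(observed))


def pearson_r(observed, predicted):
    """Pearson correlation coefficient between observed and predicted values."""
    if len(observed) != len(predicted) or len(observed) < 2:
        raise ValueError("inputs must have equal length of at least 2")
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_o * var_p)


# Placeholder values for illustration only; they are NOT data from the paper.
observed_pi = [5.0, 6.0, 7.0, 9.0]
predicted_pi = [5.5, 5.5, 7.5, 8.5]
print(rmse(observed_pi, predicted_pi))
print(pearson_r(observed_pi, predicted_pi))
```

A low RMSE together with R close to 1 means that the predicted values track the observed ones closely across the data set, which is how the reported 0.2383 and 0.9702 should be read.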

