Abstract
This paper introduces a novel data mining technique, Clustering and Classification Algorithm-Supervised (CCA-S). CCA-S supports incremental learning and non-hierarchical clustering, and scales to large data sets. It incorporates class information when making clustering decisions and uses the resulting clusters to classify new data records. We apply and test CCA-S on several common data sets for classification problems. The results show that the classification performance of CCA-S is comparable to that of other classification algorithms such as decision trees, artificial neural networks, and discriminant analysis.
A US and international patent on this algorithm has been filed and is currently pending under ASU Case No. M1-015.
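The abstract gives no algorithmic detail, so the following is only a minimal sketch of the general idea it describes: clusters are built incrementally, class labels influence clustering decisions, and new records are classified by the clusters. Everything specific here is an assumption made for illustration and is not taken from the paper: the class name `SupervisedClusterer`, the Euclidean distance, the running-mean centroid update, and the rule "join the nearest cluster only if its label matches, otherwise start a new cluster".

```python
import math


class SupervisedClusterer:
    """Toy incremental, non-hierarchical supervised clustering.

    Illustrative only: this is NOT the CCA-S algorithm from the paper,
    just one simple way to let class labels steer clustering decisions.
    """

    def __init__(self):
        # Each cluster stores a centroid, a single class label, and a count.
        self.clusters = []

    def _nearest(self, x):
        """Return the cluster whose centroid is closest to x, or None."""
        if not self.clusters:
            return None
        return min(self.clusters, key=lambda c: math.dist(c["centroid"], x))

    def learn_one(self, x, label):
        """Process one labeled record (incremental learning)."""
        nearest = self._nearest(x)
        if nearest is not None and nearest["label"] == label:
            # Same class: absorb the record and update the running mean.
            n = nearest["count"] + 1
            nearest["centroid"] = [
                m + (xi - m) / n for m, xi in zip(nearest["centroid"], x)
            ]
            nearest["count"] = n
        else:
            # No cluster yet, or the nearest one has another class:
            # start a new cluster seeded with this record.
            self.clusters.append(
                {"centroid": list(x), "label": label, "count": 1}
            )

    def predict(self, x):
        """Classify a new record by the label of its nearest cluster."""
        nearest = self._nearest(x)
        if nearest is None:
            raise ValueError("no clusters learned yet")
        return nearest["label"]
```

In this sketch each training record is seen once and only cluster summaries are kept, which is what makes the approach incremental; a record is classified by calling `predict` after any number of `learn_one` calls.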
© 2001 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Ye, N., Li, X. (2001). A Machine Learning Algorithm Based on Supervised Clustering and Classification. In: Liu, J., Yuen, P.C., Li, Ch., Ng, J., Ishida, T. (eds) Active Media Technology. AMT 2001. Lecture Notes in Computer Science, vol 2252. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45336-9_38
DOI: https://doi.org/10.1007/3-540-45336-9_38
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-43035-3
Online ISBN: 978-3-540-45336-9