Abstract
Applying traditional clustering algorithms to high-dimensional datasets reduces both the efficiency of the clustering and the quality of the output clusters. H-K Means improves on K-means by addressing the randomness and a priori selection of the initial centers, but it does not remove the "dimensional disaster" (the curse of dimensionality), which is associated with high computational complexity and poor cluster quality. Subspace clustering and ensemble clustering algorithms each improve the clustering of high-dimensional data from a different angle and to a different degree, but each does so from a single viewpoint. The proposed model addresses the limitations of the traditional H-K Means algorithm by combining a subspace clustering algorithm (ORCLUS) and an ensemble clustering algorithm with H-K Means, which partitions and merges clusters based on the number of dimensions, so that the quality of the output clusters improves automatically. The proposed model is evaluated on several real datasets.
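The abstract does not spell out how H-K Means chooses its initial centers. As a rough illustration only, the sketch below assumes the common reading in which a hierarchical (agglomerative) pass produces k centroids that then seed ordinary K-means, which removes the random choice of starting centers. The function names (`hierarchical_seeds`, `kmeans`, `hk_means`) and the use of centroid linkage are assumptions made for this example, not the authors' implementation, and the ORCLUS and ensemble stages are not covered.

```python
import math


def centroid(points):
    """Component-wise mean of a non-empty list of equal-length tuples."""
    dims = len(points[0])
    return tuple(sum(p[d] for p in points) / len(points) for d in range(dims))


def hierarchical_seeds(points, k):
    """Assumed seeding step: merge the two clusters with the closest
    centroids until k clusters remain, then return their centroids."""
    clusters = [[p] for p in points]
    while len(clusters) > k:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = math.dist(centroid(clusters[i]), centroid(clusters[j]))
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return [centroid(c) for c in clusters]


def kmeans(points, centers, max_iter=100):
    """Plain Lloyd iterations starting from the given centers."""
    centers = list(centers)
    labels = []
    for _ in range(max_iter):
        # Assign each point to its nearest current center.
        labels = [
            min(range(len(centers)), key=lambda c: math.dist(p, centers[c]))
            for p in points
        ]
        # Recompute centers; an empty cluster keeps its previous center.
        new_centers = []
        for c in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == c]
            new_centers.append(centroid(members) if members else centers[c])
        if new_centers == centers:
            break
        centers = new_centers
    return labels, centers


def hk_means(points, k):
    """Deterministic seeding followed by K-means (illustrative only)."""
    return kmeans(points, hierarchical_seeds(points, k))


if __name__ == "__main__":
    data = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
    labels, centers = hk_means(data, 2)
    print(labels, centers)
```

Because the seeds come from a deterministic merge order rather than random draws, repeated runs on the same data give the same partition. The pairwise merge loop is far more expensive than K-means itself, so this form is only practical for small inputs.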
Copyright information
© 2016 Springer Science+Business Media Singapore
About this paper
Cite this paper
Paithankar, R., Tidke, B. (2016). An Integrated Approach to High-Dimensional Data Clustering. In: Saini, H., Sayal, R., Rawat, S. (eds) Innovations in Computer Science and Engineering. Advances in Intelligent Systems and Computing, vol 413. Springer, Singapore. https://doi.org/10.1007/978-981-10-0419-3_11
DOI: https://doi.org/10.1007/978-981-10-0419-3_11
Publisher Name: Springer, Singapore
Print ISBN: 978-981-10-0417-9
Online ISBN: 978-981-10-0419-3
eBook Packages: Engineering, Engineering (R0)