An Integrated Approach to High-Dimensional Data Clustering

Paithankar, Rashmi; Tidke, Bharat

doi:10.1007/978-981-10-0419-3_11

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 413)

Abstract

Applying traditional clustering algorithms to high-dimensional datasets reduces both the efficiency of the computation and the quality of the output clusters. H-K means improves on K-means by addressing the randomness and a priori choice of the initial centers, but it does not resolve the "dimensional disaster": the high computational complexity and poor cluster quality that arise in high-dimensional data. Subspace clustering and ensemble clustering algorithms each improve the clustering of high-dimensional datasets from a different angle and to a different degree, but each does so from a single viewpoint only. The proposed model overcomes the limitations of the traditional H-K means clustering algorithm and automatically improves the quality of the output clusters by combining a subspace clustering algorithm (ORCLUS) and an ensemble clustering algorithm with the H-K means algorithm, which partitions and merges clusters based on the number of dimensions. The proposed model is evaluated on several real datasets.
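The abstract does not spell out the algorithm, so the following is only a minimal sketch of the seeding idea commonly associated with H-K means, under the assumption that a hierarchical (agglomerative) pass supplies the initial centers that plain K-means would otherwise pick at random. The function names `hierarchical_seeds`, `k_means`, and `hk_means` are hypothetical, the centroid-distance merge rule is an assumed choice, and the ORCLUS and ensemble steps of the proposed model are not shown.

```python
import math


def centroid(points):
    """Component-wise mean of a non-empty list of equal-length points."""
    dim = len(points[0])
    return [sum(p[i] for p in points) / len(points) for i in range(dim)]


def hierarchical_seeds(data, k):
    """Agglomerative pass (assumed merge rule: closest centroids).

    Starts with one cluster per point and repeatedly merges the two
    clusters whose centroids are nearest, until k clusters remain.
    Returns the k centroids, to be used as deterministic K-means seeds.
    """
    clusters = [[p] for p in data]
    while len(clusters) > k:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = math.dist(centroid(clusters[i]), centroid(clusters[j]))
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]  # j > i, so index i is unaffected
    return [centroid(c) for c in clusters]


def k_means(data, centers, max_iter=100):
    """Lloyd's iterations starting from the supplied centers."""
    labels = []
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest center.
        labels = [
            min(range(len(centers)), key=lambda c: math.dist(p, centers[c]))
            for p in data
        ]
        # Update step: recompute each center; keep it if the cluster is empty.
        new_centers = []
        for c in range(len(centers)):
            members = [p for p, lab in zip(data, labels) if lab == c]
            new_centers.append(centroid(members) if members else centers[c])
        if new_centers == centers:
            break
        centers = new_centers
    return labels, centers


def hk_means(data, k):
    """Hierarchical seeding followed by K-means refinement (sketch only)."""
    return k_means(data, hierarchical_seeds(data, k))


if __name__ == "__main__":
    # Toy data: two well-separated groups in two dimensions.
    data = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.3],
            [5.0, 5.1], [5.2, 4.9], [4.9, 5.3]]
    labels, centers = hk_means(data, k=2)
    print(labels, centers)
```

Because the seeds come from the data rather than from a random draw, repeated runs give the same partition. The pairwise merge loop is cubic in the number of points in this naive form, which hints at why the abstract still reports high computational cost for H-K means on large, high-dimensional inputs.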

Author information

Authors and Affiliations

Department of Computer Engineering, Flora Institute of Technology, Pune, Maharashtra, India
Rashmi Paithankar & Bharat Tidke

Corresponding author

Correspondence to Rashmi Paithankar.

Editor information

Editors and Affiliations

Guru Nanak Institutions, Professor & Managing Director, Ibrahimpatnam, Andhra Pradesh, India
H. S. Saini
Guru Nanak Institutions, Professor & Associate Director, Ibrahimpatnam, Andhra Pradesh, India
Rishi Sayal
Guru Nanak Institutions, Professor and Head – CSE and IT, Ibrahimpatnam, India
Sandeep Singh Rawat

About this paper

Cite this paper

Paithankar, R., Tidke, B. (2016). An Integrated Approach to High-Dimensional Data Clustering. In: Saini, H., Sayal, R., Rawat, S. (eds) Innovations in Computer Science and Engineering. Advances in Intelligent Systems and Computing, vol 413. Springer, Singapore. https://doi.org/10.1007/978-981-10-0419-3_11

DOI: https://doi.org/10.1007/978-981-10-0419-3_11
Published: 20 February 2016
Publisher Name: Springer, Singapore
Print ISBN: 978-981-10-0417-9
Online ISBN: 978-981-10-0419-3
eBook Packages: Engineering, Engineering (R0)
