An Integrated Approach to High-Dimensional Data Clustering

  • Conference paper
Innovations in Computer Science and Engineering

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 413)

Abstract

Applying traditional clustering algorithms to high-dimensional datasets reduces both the efficiency of the clustering and the effectiveness of the output clusters. The H-K means algorithm is an advance over K-means that addresses problems such as the randomness and a priori choice of the initial centers, but it does not remove the "dimensional disaster" of high-dimensional data, which brings high computational complexity and poor cluster quality. Subspace and ensemble clustering algorithms improve the clustering of high-dimensional datasets from different angles and to different degrees, but each does so from a single viewpoint. The proposed model overcomes the limitations of the traditional H-K means clustering algorithm and automatically improves the performance of the output clusters by merging a subspace clustering algorithm (ORCLUS) and an ensemble clustering algorithm with the H-K means algorithm, which partitions and merges the clusters based on the number of dimensions. The proposed model is evaluated on various real datasets.
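As a rough illustration of the H-K means idea mentioned in the abstract, the sketch below seeds K-means with centroids obtained from a simple agglomerative (hierarchical) pass, so that the initial centers are not chosen at random. It is a minimal sketch under stated assumptions, not the authors' implementation: the centroid-linkage merge rule, the function names (`hierarchical_seeds`, `k_means`, `hk_means`), and the toy data are all hypothetical, and the ORCLUS subspace step and the ensemble step of the proposed model are not shown.

```python
import math


def mean(points):
    """Component-wise mean of a non-empty list of equal-length tuples."""
    n = len(points)
    return tuple(sum(p[d] for p in points) / n for d in range(len(points[0])))


def hierarchical_seeds(points, k):
    """Merge the two clusters with the closest centroids until k remain.

    Assumption: centroid linkage; the paper's hierarchical step may differ.
    Returns the k cluster centroids, used as deterministic K-means seeds.
    """
    clusters = [[p] for p in points]
    while len(clusters) > k:
        centroids = [mean(c) for c in clusters]
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = math.dist(centroids[i], centroids[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i].extend(clusters[j])
        del clusters[j]
    return [mean(c) for c in clusters]


def k_means(points, centers, max_iter=100):
    """Lloyd iterations starting from the given centers."""
    centers = list(centers)
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(len(centers)), key=lambda c: math.dist(p, centers[c]))
            for p in points
        ]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for c in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old center if a cluster goes empty
                centers[c] = mean(members)
    return labels, centers


def hk_means(points, k):
    """Hierarchical seeding followed by K-means refinement."""
    return k_means(points, hierarchical_seeds(points, k))


# Toy data (hypothetical): two well-separated groups in 2-D.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
labels, centers = hk_means(points, 2)
```

Because the seeds come from the merge procedure rather than a random draw, repeated runs on the same data give the same partition. The pairwise centroid search is quadratic per merge, so this form is only practical for small inputs.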

Author information

Corresponding author

Correspondence to Rashmi Paithankar.

Copyright information

© 2016 Springer Science+Business Media Singapore

About this paper

Cite this paper

Paithankar, R., Tidke, B. (2016). An Integrated Approach to High-Dimensional Data Clustering. In: Saini, H., Sayal, R., Rawat, S. (eds) Innovations in Computer Science and Engineering. Advances in Intelligent Systems and Computing, vol 413. Springer, Singapore. https://doi.org/10.1007/978-981-10-0419-3_11

  • DOI: https://doi.org/10.1007/978-981-10-0419-3_11

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-10-0417-9

  • Online ISBN: 978-981-10-0419-3

  • eBook Packages: Engineering, Engineering (R0)
