An Improved CURE Algorithm

Cai, Mingjuan; Liang, Yongquan

doi:10.1007/978-3-030-01313-4_11

Part of the book series: IFIP Advances in Information and Communication Technology (IFIPAICT, volume 539)

Included in the following conference series:

International Conference on Intelligence Science

Abstract

CURE is an efficient hierarchical clustering algorithm for large data sets. This paper presents an improved CURE algorithm named ISE-RS-CURE. The algorithm uses a sample extraction procedure based on statistical ideas that selects sample points according to the differing data densities, which makes the sample set more representative. The data set is partitioned while the sample set is being extracted, which reduces the time spent assigning non-sample points. Representative points are chosen with a selection strategy based on a partition influence factor, which takes into account the overall correlation among the data in the region where a representative point lies and so makes the chosen representative points more reasonable. Experiments show that the improved algorithm preserves the accuracy of the clustering results while improving running efficiency.
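For readers unfamiliar with representative points, the following is a minimal sketch of how the original CURE algorithm chooses them for a single cluster: it picks a few well-scattered points and shrinks each one toward the cluster centroid. It is not the partition-influence-factor strategy of ISE-RS-CURE, whose details are not given in the abstract. The function names and the parameters `c` (number of representatives) and `alpha` (shrink factor) are assumptions made for illustration.

```python
import math


def centroid(points):
    """Return the coordinate-wise mean of a list of equal-length tuples."""
    n = len(points)
    dim = len(points[0])
    return tuple(sum(p[d] for p in points) / n for d in range(dim))


def select_representatives(points, c, alpha):
    """Classic CURE-style representative selection for one cluster.

    points: list of coordinate tuples belonging to the cluster
    c:      maximum number of representative points (assumed parameter name)
    alpha:  shrink factor in [0, 1]; 0 keeps the points, 1 collapses
            them onto the centroid (assumed parameter name)
    """
    center = centroid(points)
    # Drop exact duplicates so that each candidate can be picked at most once.
    unique = list(dict.fromkeys(points))
    scattered = []
    for _ in range(min(c, len(unique))):
        best, best_dist = None, -1.0
        for p in unique:
            if p in scattered:
                continue
            if not scattered:
                # First pick: the point farthest from the centroid.
                d = math.dist(p, center)
            else:
                # Later picks: the point farthest from those already chosen.
                d = min(math.dist(p, q) for q in scattered)
            if d > best_dist:
                best, best_dist = p, d
        scattered.append(best)
    # Shrink every scattered point toward the centroid by the factor alpha.
    return [
        tuple(x + alpha * (m - x) for x, m in zip(p, center))
        for p in scattered
    ]


def cluster_distance(reps_a, reps_b):
    """Distance between two clusters: closest pair of representatives."""
    return min(math.dist(a, b) for a in reps_a for b in reps_b)
```

In this sketch, shrinking damps the effect of outliers on the cluster boundary, and `cluster_distance` is the quantity a CURE-style hierarchical merge would minimize when choosing which two clusters to join next.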

Author information

Authors and Affiliations

College of Computer Science and Engineering, Shandong Province Key Laboratory of Wisdom Mine Information Technology, Shandong University of Science and Technology, Qingdao, 266590, China
Mingjuan Cai
College of Computer Science and Engineering, Shandong University of Science and Technology, Qingdao, 266590, China
Yongquan Liang

Corresponding author

Correspondence to Yongquan Liang.

Editor information

Editors and Affiliations

Chinese Academy of Sciences, Beijing, China
Zhongzhi Shi
University of Amsterdam, Amsterdam, The Netherlands
Cyriel Pennartz
Peking University, Beijing, China
Tiejun Huang

About this paper

Cite this paper

Cai, M., Liang, Y. (2018). An Improved CURE Algorithm. In: Shi, Z., Pennartz, C., Huang, T. (eds) Intelligence Science II. ICIS 2018. IFIP Advances in Information and Communication Technology, vol 539. Springer, Cham. https://doi.org/10.1007/978-3-030-01313-4_11

DOI: https://doi.org/10.1007/978-3-030-01313-4_11
Published: 02 October 2018
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-01312-7
Online ISBN: 978-3-030-01313-4
eBook Packages: Computer Science, Computer Science (R0)
