Comparing Local Search Initialization for K-Means and K-Medoids Clustering in a Planar Pareto Front, a Computational Study

  • Conference paper
Optimization and Learning (OLA 2021)

Abstract

Given N points in a planar Pareto Front (2D PF), k-means and k-medoids are solvable in \(O(N^3)\) time by dynamic programming algorithms. Standard local search approaches, PAM and Lloyd’s heuristics, are investigated in the 2D PF case to solve large instances faster. Initialization strategies specific to the 2D PF case are implemented alongside the generic ones (Forgy’s, Hartigan’s, k-means++). After applying PAM and Lloyd’s local search iterations, the quality of the local minima is compared with the optimal values. Numerical results are computed on generated instances, which have been made public. This study highlights that local minima of poor quality exist in the 2D PF case. A parallel or multi-start heuristic using four initialization strategies improves accuracy by avoiding poor local optima. Perspectives remain open for improving local search heuristics in the specific 2D PF case.
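The multi-start idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (`pareto_front`, `forgy_init`, `kmeanspp_init`, `lloyd`, `multi_start`), the synthetic instance, the choice to minimise both objectives, and the number of restarts are all assumptions made for illustration. Only two generic initializations (Forgy's and k-means++) are shown, and k-medoids with PAM is left out.

```python
import random


def pareto_front(points):
    """Keep the non-dominated points, assuming both coordinates are minimised."""
    front, best_y = [], float("inf")
    for x, y in sorted(points):
        if y < best_y:  # strictly better on y than every point with smaller x
            front.append((x, y))
            best_y = y
    return front


def sq_dist(p, q):
    """Squared Euclidean distance between two 2D points."""
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2


def forgy_init(points, k, rng):
    """Forgy-style initialization: k distinct data points chosen uniformly."""
    return rng.sample(points, k)


def kmeanspp_init(points, k, rng):
    """k-means++ seeding: each new center is drawn with probability
    proportional to its squared distance to the closest chosen center."""
    centers = [rng.choice(points)]
    while len(centers) < k:
        weights = [min(sq_dist(p, c) for c in centers) for p in points]
        centers.append(rng.choices(points, weights=weights, k=1)[0])
    return centers


def lloyd(points, centers, max_iter=100):
    """Lloyd's iterations: assign to nearest center, then move centers to means."""
    centers = list(centers)
    for _ in range(max_iter):
        clusters = [[] for _ in centers]
        for p in points:
            j = min(range(len(centers)), key=lambda i: sq_dist(p, centers[i]))
            clusters[j].append(p)
        new_centers = [
            (sum(x for x, _ in c) / len(c), sum(y for _, y in c) / len(c))
            if c else centers[i]  # an empty cluster keeps its previous center
            for i, c in enumerate(clusters)
        ]
        if new_centers == centers:  # local minimum reached
            break
        centers = new_centers
    cost = sum(min(sq_dist(p, c) for c in centers) for p in points)
    return centers, cost


def multi_start(points, k, inits, restarts=5, seed=0):
    """Run Lloyd's heuristic from several initializations and keep the best
    cost found per strategy. `restarts` and `seed` are illustrative choices."""
    rng = random.Random(seed)
    costs = {
        name: min(lloyd(points, init(points, k, rng))[1] for _ in range(restarts))
        for name, init in inits.items()
    }
    best_name = min(costs, key=costs.get)
    return best_name, costs


if __name__ == "__main__":
    # Synthetic front: x increases while y strictly decreases,
    # so no point dominates another.
    front = pareto_front([(i / 200, (1 - i / 200) ** 2) for i in range(201)])
    inits = {"forgy": forgy_init, "kmeans++": kmeanspp_init}
    print(multi_start(front, k=5, inits=inits))
```

Comparing the per-strategy costs returned by `multi_start` gives a rough sense of how much the local minimum reached by Lloyd's heuristic depends on the starting centers. Judging the quality of those local minima against optimal values, as the study does, would additionally require an exact solver such as the dynamic programming algorithms mentioned above.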

References

  1. Aloise, D., Deshpande, A., Hansen, P., Popat, P.: NP-hardness of Euclidean sum-of-squares clustering. Mach. Learn. 75(2), 245–248 (2009)

  2. Arthur, D., Vassilvitskii, S.: K-means++: the advantages of careful seeding. In: Proceedings of the Eighteenth Annual ACM-SIAM Symposium on Discrete Algorithms, pp. 1027–1035. Society for Industrial and Applied Mathematics (2007)

  3. Celebi, M., Kingravi, H., Vela, P.: A comparative study of efficient initialization methods for the k-means clustering algorithm. Expert Syst. Appl. 40, 200–210 (2013). https://doi.org/10.1016/j.eswa.2012.07.021

  4. Dupin, N.: Polynomial algorithms for p-dispersion problems in a 2D Pareto Front. arXiv preprint arXiv:2002.11830 (2020)

  5. Dupin, N., Nielsen, F., Talbi, E.-G.: K-medoids clustering is solvable in polynomial time for a 2D Pareto Front. In: Le Thi, H.A., Le, H.M., Pham Dinh, T. (eds.) WCGO 2019. AISC, vol. 991, pp. 790–799. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-21803-4_79

  6. Dupin, N., Nielsen, F., Talbi, E.-G.: Clustering a 2D Pareto Front: P-center problems are solvable in polynomial time. In: Dorronsoro, B., Ruiz, P., de la Torre, J.C., Urda, D., Talbi, E.-G. (eds.) OLA 2020. CCIS, vol. 1173, pp. 179–191. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-41913-4_15

  7. Dupin, N., Nielsen, F., Talbi, E.: Unified polynomial Dynamic Programming algorithms for p-center variants in a 2D Pareto Front. Mathematics 9(4), 453 (2021)

  8. Dupin, N., Talbi, E.: Parallel matheuristics for the discrete unit commitment problem with min-stop ramping constraints. Int. Trans. Oper. Res. 27(1), 219–244 (2020)

  9. Dupin, N., Talbi, E., Nielsen, F.: Dynamic programming heuristic for k-means clustering among a 2-dimensional Pareto frontier. In: 7th International Conference on Metaheuristics and Nature Inspired Computing, META 2018 (2018)

  10. Erkut, E., Neuman, S.: Comparison of four models for dispersing facilities. INFOR: Inf. Syst. Oper. Res. 29(2), 68–86 (1991)

  11. Forgy, E.: Cluster analysis of multivariate data: efficiency vs. interpretability of classification. Biometrics 21(3), 768–769 (1965)

  12. Grønlund, A., et al.: Fast exact k-means, k-medians and Bregman divergence clustering in 1D. arXiv preprint arXiv:1701.07204 (2017)

  13. Hartigan, J., Wong, M.: Algorithm AS 136: a k-means clustering algorithm. J. R. Stat. Soc. Ser. C (Appl. Stat.) 28(1), 100–108 (1979)

  14. Hassin, R., Tamir, A.: Improved complexity bounds for location problems on the real line. Oper. Res. Lett. 10(7), 395–402 (1991)

  15. Hsu, W., Nemhauser, G.: Easy and hard bottleneck location problems. Discret. Appl. Math. 1(3), 209–215 (1979)

  16. Jain, A.: Data clustering: 50 years beyond k-means. Pattern Recogn. Lett. 31(8), 651–666 (2010)

  17. Kaufman, L., Rousseeuw, P.: Clustering by Means of Medoids. North-Holland (1987)

  18. Lloyd, S.: Least squares quantization in PCM. IEEE Trans. Inf. Theory 28(2), 129–137 (1982)

  19. Mahajan, M., Nimbhorkar, P., Varadarajan, K.: The planar K-means problem is NP-hard. Theor. Comput. Sci. 442, 13–21 (2012)

  20. Nielsen, F.: Introduction to HPC with MPI for Data Science. Springer, Heidelberg (2016)

  21. Pena, J., Lozano, J., Larranaga, P.: An empirical comparison of four initialization methods for the k-means algorithm. Pattern Recogn. Lett. 20(10), 1027–1040 (1999)

  22. Wang, H., Song, M.: Ckmeans.1d.dp: optimal k-means clustering in one dimension by dynamic programming. R J. 3(2), 29–33 (2011)

Corresponding author

Correspondence to Nicolas Dupin.

Copyright information

© 2021 Springer Nature Switzerland AG

About this paper

Cite this paper

Huang, J., Chen, Z., Dupin, N. (2021). Comparing Local Search Initialization for K-Means and K-Medoids Clustering in a Planar Pareto Front, a Computational Study. In: Dorronsoro, B., Amodeo, L., Pavone, M., Ruiz, P. (eds) Optimization and Learning. OLA 2021. Communications in Computer and Information Science, vol 1443. Springer, Cham. https://doi.org/10.1007/978-3-030-85672-4_2

  • DOI: https://doi.org/10.1007/978-3-030-85672-4_2

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-85671-7

  • Online ISBN: 978-3-030-85672-4

  • eBook Packages: Computer Science (R0)
