DOI: 10.1145/3437963.3441767
research-article

Towards Scalable Spectral Embedding and Data Visualization via Spectral Coarsening

Published: 08 March 2021

ABSTRACT

This paper proposes a scalable multilevel framework for the spectral embedding of large undirected graphs. The method first uses a nearly-linear time spectral graph coarsening approach to compute much smaller, sparse graphs that preserve the key spectral (structural) properties of the original graph. The resulting spectrally-coarsened graphs are then used to build much faster algorithms for multilevel spectral graph embedding (clustering) and for the visualization of large data sets. Extensive experiments on a variety of large graphs and data sets give very promising results. For instance, the "coPapersCiteseer" graph with 0.43 million nodes and 16 million edges is coarsened into a much smaller graph with only 13K (32X fewer) nodes and 17K (950X fewer) edges in about 16 seconds; the spectrally-coarsened graphs yield up to 1,100X speedup for multilevel spectral graph embedding (clustering) and up to 60X speedup for t-SNE visualization of large data sets.
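The coarsen-then-embed idea in the abstract can be illustrated with a minimal sketch. It is not the paper's algorithm: the abstract does not detail the spectral coarsening step, so the sketch assumes a fixed, hand-chosen node-to-cluster assignment, and the helper names (`laplacian`, `coarsen`, `spectral_embedding`, `multilevel_embedding`) are hypothetical. It merges nodes into clusters, embeds the small graph with the eigenvectors of its Laplacian, and copies each cluster's coordinates back to its member nodes.

```python
import numpy as np

def laplacian(adj):
    # Combinatorial graph Laplacian L = D - A of a dense adjacency matrix.
    return np.diag(adj.sum(axis=1)) - adj

def coarsen(adj, assignment):
    # ASSUMPTION: assignment[i] is a given cluster index for node i.
    # The paper derives its coarsening spectrally; this sketch does not.
    n = adj.shape[0]
    k = int(assignment.max()) + 1
    P = np.zeros((n, k))
    P[np.arange(n), assignment] = 1.0   # node-to-cluster indicator matrix
    coarse = P.T @ adj @ P              # total edge weight between clusters
    np.fill_diagonal(coarse, 0.0)       # drop self-loops from merged edges
    return coarse, P

def spectral_embedding(adj, dim):
    # Eigenvectors for the `dim` smallest nontrivial Laplacian eigenvalues.
    # Assumes a connected graph, so only the first eigenvector is trivial.
    _, vecs = np.linalg.eigh(laplacian(adj))
    return vecs[:, 1:dim + 1]

def multilevel_embedding(adj, assignment, dim):
    # Embed the coarse graph, then copy each cluster's coordinates
    # back to the original nodes it contains.
    coarse, P = coarsen(adj, assignment)
    return P @ spectral_embedding(coarse, dim)

# Toy input: two 4-node cliques joined by a single bridge edge (3, 4).
adj = np.zeros((8, 8))
adj[:4, :4] = 1.0
adj[4:, 4:] = 1.0
np.fill_diagonal(adj, 0.0)
adj[3, 4] = adj[4, 3] = 1.0

assignment = np.array([0, 0, 1, 1, 2, 2, 3, 3])  # merge nodes pairwise
coarse, _ = coarsen(adj, assignment)
emb = multilevel_embedding(adj, assignment, dim=1)
print(coarse)
print(emb[:, 0])
```

In this toy case the eigenproblem is solved on a 4-node graph instead of an 8-node one, and the sign of the one-dimensional embedding still separates the two cliques. A dense `eigh` call is used only for brevity; graphs of the size reported in the abstract would need sparse matrices and an iterative eigensolver.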


Published in

WSDM '21: Proceedings of the 14th ACM International Conference on Web Search and Data Mining
March 2021
1192 pages
ISBN: 9781450382977
DOI: 10.1145/3437963

Copyright © 2021 ACM

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery
New York, NY, United States

Overall Acceptance Rate: 498 of 2,863 submissions, 17%