ABSTRACT
This paper proposes a scalable multilevel framework for the spectral embedding of large undirected graphs. The method first computes much smaller, sparse graphs that preserve the key spectral (structural) properties of the original graph by exploiting a nearly-linear time spectral graph coarsening approach. The resulting spectrally-coarsened graphs are then leveraged to develop much faster algorithms for multilevel spectral graph embedding (clustering) and for visualization of large data sets. Extensive experiments on a variety of large graphs and data sets show very promising results. For instance, the "coPapersCiteseer" graph with 0.43 million nodes and 16 million edges can be coarsened into a much smaller graph with only 13K (32X fewer) nodes and 17K (950X fewer) edges in about 16 seconds; the spectrally-coarsened graphs yield up to 1,100X speedup for multilevel spectral graph embedding (clustering) and up to 60X speedup for t-SNE visualization of large data sets.
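The coarsen-then-embed idea described in the abstract can be illustrated on a toy graph. The following is a minimal sketch under stated assumptions, not the paper's algorithm: the node aggregation (`membership`) is hand-picked rather than produced by spectral coarsening, the helper names (`laplacian`, `coarsen`, `fiedler_vector`) are hypothetical, and the lifting step simply copies each coarse value to its member nodes without any refinement.

```python
import numpy as np

def laplacian(adj):
    """Combinatorial Laplacian L = D - A of a weighted undirected graph."""
    return np.diag(adj.sum(axis=1)) - adj

def coarsen(adj, membership):
    """Merge nodes that share a cluster id; weights between clusters are summed.

    `membership` is an assumed, hand-picked aggregation, not the output of
    the spectral coarsening method from the abstract.
    """
    n, k = len(membership), int(membership.max()) + 1
    P = np.zeros((n, k))
    P[np.arange(n), membership] = 1.0   # P[i, c] = 1 if node i belongs to cluster c
    coarse = P.T @ adj @ P
    np.fill_diagonal(coarse, 0.0)       # drop self-loops created by merging
    return coarse, P

def fiedler_vector(adj):
    """Eigenvector of the second-smallest Laplacian eigenvalue."""
    _, vecs = np.linalg.eigh(laplacian(adj))  # eigenvalues come back in ascending order
    return vecs[:, 1]

# Toy graph: two 4-node cliques (nodes 0-3 and 4-7) joined by the edge 3-4.
adj = np.zeros((8, 8))
for block in (range(0, 4), range(4, 8)):
    for i in block:
        for j in block:
            if i != j:
                adj[i, j] = 1.0
adj[3, 4] = adj[4, 3] = 1.0

# Assumed aggregation: merge consecutive node pairs, giving 4 coarse nodes.
membership = np.array([0, 0, 1, 1, 2, 2, 3, 3])
coarse, P = coarsen(adj, membership)

# Embed the small graph, then lift the embedding back to the original nodes.
coarse_embedding = fiedler_vector(coarse)
fine_embedding = P @ coarse_embedding

# Two-way clustering by the sign of the lifted embedding.
labels = fine_embedding > 0
print(labels)
```

In this sketch the eigendecomposition runs on a 4-node graph instead of the 8-node original, and the sign of the lifted vector separates the two cliques. On large graphs the dense `np.linalg.eigh` call would typically be replaced by a sparse eigensolver.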