DOI: 10.1145/3583780.3614955 · CIKM Conference Proceedings · Research Article · Open Access

Low-bit Quantization for Deep Graph Neural Networks with Smoothness-aware Message Propagation

Published: 21 October 2023

ABSTRACT

Graph Neural Network (GNN) training and inference face significant scalability challenges with respect to both model size and number of layers, degrading efficiency and accuracy for large, deep GNNs. We present an end-to-end solution that addresses these challenges for efficient GNNs in resource-constrained environments while avoiding the oversmoothing problem in deep GNNs. We introduce a quantization-based approach for all stages of GNNs, from message passing in training to node classification, compressing the model and enabling efficient processing. The proposed GNN quantizer learns quantization ranges and reduces the model size with comparable accuracy even under low-bit quantization. To scale with the number of layers, we devise a message propagation mechanism in training that controls layer-wise changes of similarities between neighboring nodes. This objective is incorporated into a Lagrangian function with constraints, and a differential multiplier method is used to iteratively find optimal embeddings. This mitigates oversmoothing and suppresses the quantization error to a bound. Significant improvements are demonstrated over state-of-the-art quantization methods and deep GNN approaches in both full-precision and quantized models. The proposed quantizer achieves notable accuracy even in INT2 configurations across all stages of the GNN, whereas existing quantization approaches fail to reach satisfactory accuracy. Finally, inference with INT2 and INT4 representations yields speedups of 5.11× and 4.70×, respectively, over full-precision counterparts.
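To make the quantization idea concrete, the sketch below shows a generic uniform fake-quantizer with a learnable step size (the "quantization range" parameter the abstract refers to). This is a minimal illustration of low-bit uniform quantization in general, not the paper's exact quantizer; the function name and the choice of a symmetric signed grid are assumptions for the example.

```python
import numpy as np

def fake_quantize(x, step, bits=2):
    """Uniform symmetric fake-quantization with a learnable step size.

    `step` plays the role of the learned quantization range: values are
    scaled by 1/step, rounded to the signed integer grid of the given
    bit-width, clipped, then rescaled back to floating point.
    """
    # Signed b-bit integer grid, e.g. bits=2 gives {-2, -1, 0, 1}.
    qmin, qmax = -(2 ** (bits - 1)), 2 ** (bits - 1) - 1
    q = np.clip(np.round(x / step), qmin, qmax)
    return q * step  # dequantized ("fake-quantized") values

x = np.array([-1.3, -0.2, 0.0, 0.4, 0.9])
print(fake_quantize(x, step=0.5, bits=2))
```

During quantization-aware training, `step` would be treated as a trainable parameter (with a straight-through estimator passing gradients through the non-differentiable rounding), which is how learned-range quantizers in this family typically keep accuracy at low bit-widths.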
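The abstract's constrained propagation objective is solved with a differential multiplier method: gradient descent on the variables and simultaneous gradient ascent on the Lagrange multipliers. The toy sketch below illustrates that mechanism on a scalar problem; the problem instance, learning rate, and function names are illustrative assumptions, not the paper's actual objective.

```python
def differential_multiplier(f_grad, g, g_grad, x0, lr=0.05, steps=2000):
    """Minimize f(x) subject to g(x) = 0 (Platt-Barr style).

    Descends on x using the gradient of the Lagrangian
    L(x, lam) = f(x) + lam * g(x), while ascending on lam,
    so the iterates are driven toward a feasible stationary point.
    """
    x, lam = x0, 0.0
    for _ in range(steps):
        x = x - lr * (f_grad(x) + lam * g_grad(x))  # descend on x
        lam = lam + lr * g(x)                        # ascend on lam
    return x, lam

# Example: minimize x^2 subject to x - 1 = 0; the optimum is x = 1
# with multiplier lam = -2 (from 2x + lam = 0 at x = 1).
x, lam = differential_multiplier(f_grad=lambda x: 2 * x,
                                 g=lambda x: x - 1.0,
                                 g_grad=lambda x: 1.0,
                                 x0=0.0)
print(round(x, 3), round(lam, 3))
```

In the paper's setting, x would be the node embeddings, and the constraints would bound layer-wise changes in similarity between neighboring nodes; the same descend/ascend iteration then controls smoothness during message propagation.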

