ABSTRACT
Graph Neural Network (GNN) training and inference face significant scalability challenges with respect to both model size and number of layers, degrading efficiency and accuracy for large and deep GNNs. We present an end-to-end solution that addresses these challenges for efficient GNNs in resource-constrained environments while avoiding the oversmoothing problem in deep GNNs. We introduce a quantization-based approach for all stages of GNNs, from message passing in training to node classification, compressing the model and enabling efficient processing. The proposed GNN quantizer learns quantization ranges and reduces the model size with comparable accuracy even under low-bit quantization. To scale with the number of layers, we devise a message propagation mechanism in training that controls layer-wise changes of similarities between neighboring nodes. This objective is incorporated into a Lagrangian function with constraints, and a differential multiplier method is used to iteratively find optimal embeddings. This mitigates oversmoothing and bounds the quantization error. We demonstrate significant improvements over state-of-the-art quantization methods and deep GNN approaches in both full-precision and quantized models. The proposed quantizer achieves notable accuracy in INT2 configurations across all stages of GNNs, whereas existing quantization approaches fail to reach satisfactory accuracy levels. Finally, inference with INT2 and INT4 representations achieves speedups of 5.11× and 4.70×, respectively, over the full-precision counterparts.
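To make the range-learning idea concrete, the following is a minimal sketch of symmetric uniform quantize-dequantize with a learned clipping range, in the spirit of the quantizer described above. All names (`fake_quantize`, `learn_scale`) are illustrative, not from the paper, and a grid search over candidate scales stands in for the gradient-based range learning the paper would use during training.

```python
import numpy as np

def fake_quantize(x, scale, num_bits=2):
    """Symmetric uniform quantize-dequantize of a tensor.

    Values are divided by `scale`, rounded to the nearest integer,
    clipped to the signed `num_bits` range, then rescaled.
    """
    qmax = 2 ** (num_bits - 1) - 1
    qmin = -(2 ** (num_bits - 1))
    q = np.clip(np.round(x / scale), qmin, qmax)
    return q * scale

def learn_scale(x, num_bits=2, candidates=200):
    """Pick the clipping scale that minimizes quantization MSE.

    A simple grid search over fractions of the max magnitude;
    a trained quantizer would instead learn the range end-to-end.
    """
    qmax = 2 ** (num_bits - 1) - 1
    max_abs = np.abs(x).max()
    best_scale, best_err = max_abs / qmax, np.inf
    for frac in np.linspace(0.05, 1.0, candidates):
        scale = frac * max_abs / qmax
        err = np.mean((x - fake_quantize(x, scale, num_bits)) ** 2)
        if err < best_err:
            best_scale, best_err = scale, err
    return best_scale

rng = np.random.default_rng(0)
x = rng.normal(size=10_000)           # stand-in for node embeddings
s = learn_scale(x, num_bits=2)
xq = fake_quantize(x, s, num_bits=2)  # INT2: at most 4 distinct levels
```

Because most embedding mass sits near zero, the learned scale clips outliers and quantizes the dense region more finely, which is why it yields lower error than naively scaling to the maximum value.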
Index Terms
- Low-bit Quantization for Deep Graph Neural Networks with Smoothness-aware Message Propagation