ABSTRACT
In recent years, graph neural networks (GNNs) have been widely used in many domains due to their powerful capability in representation learning on graph-structured data. While a majority of extant studies focus on mitigating the over-smoothing problem, recent works also reveal the limitation of GNN from a new over-correlation perspective which states that the learned representation becomes highly correlated after feature transformation and propagation in GNNs. In this paper, we thoroughly re-examine the issue of over-correlation in deep GNNs, both empirically and theoretically. We demonstrate that the propagation operator in GNNs exacerbates the feature correlation. In addition, we discovered through empirical study that existing decorrelation solutions fall short of maintaining a low feature correlation, potentially encoding redundant information. Thus, to more effectively address the over-correlation problem, we propose a decorrelated propagation scheme (DeProp) as a fundamental component to decorrelate the feature learning in GNN models, which achieves feature decorrelation at the propagation step. Comprehensive experiments on multiple real-world datasets demonstrate that DeProp can be easily integrated into prevalent GNNs, leading to significant performance enhancements. Furthermore, we find that it can be used to solve over-smoothing and over-correlation problems simultaneously and significantly outperform state-of-the-art methods on missing feature settings. The code is available at https://github.com/hualiu829/DeProp.
Supplemental Material
- Jacob Benesty, Jingdong Chen, Yiteng Huang, and Israel Cohen. 2009. Pearson correlation coefficient. In Noise reduction in speech processing. Springer, 1--4.Google ScholarDigital Library
- Joan Bruna,Wojciech Zaremba, Arthur Szlam, and Yann LeCun. 2013. Spectral networks and locally connected networks on graphs. arXiv preprint arXiv:1312.6203 (2013).Google Scholar
- Deli Chen, Yankai Lin, Wei Li, Peng Li, Jie Zhou, and Xu Sun. 2020. Measuring and relieving the over-smoothing problem for graph neural networks from the topological view. In Proceedings of the AAAI conference on artificial intelligence, Vol. 34. 3438--3445.Google ScholarCross Ref
- Ming Chen, Zhewei Wei, Zengfeng Huang, Bolin Ding, and Yaliang Li. 2020. Simple and deep graph convolutional networks. In International Conference on Machine Learning. PMLR, 1725--1735.Google Scholar
- Michael Cogswell, Faruk Ahmed, Ross Girshick, Larry Zitnick, and Dhruv Batra. 2015. Reducing overfitting in deep networks by decorrelating representations. arXiv preprint arXiv:1511.06068 (2015).Google Scholar
- Michaël Defferrard, Xavier Bresson, and Pierre Vandergheynst. 2016. Convolutional neural networks on graphs with fast localized spectral filtering. Advances in neural information processing systems 29 (2016).Google Scholar
- David Duvenaud, Dougal Maclaurin, Jorge Aguilera-Iparraguirre, Rafael Gómez- Bombarelli, Timothy Hirzel, Alán Aspuru-Guzik, and Ryan P. Adams. 2015. Convolutional Networks on Graphs for Learning Molecular Fingerprints. In NeurIPS.Google Scholar
- Wenqi Fan, Xiaorui Liu, Wei Jin, Xiangyu Zhao, Jiliang Tang, and Qing Li. 2022. Graph Trend Filtering Networks for Recommendation. In Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval. 112--121.Google ScholarDigital Library
- Xinshun Feng, Herun Wan, Shangbin Feng, Hongrui Wang, Qinghua Zheng, Jun Zhou, and Minnan Luo. 2022. GraTO: Graph Neural Network Framework Tackling Over-smoothing with Neural Architecture Search. In Proceedings of the 31st ACM International Conference on Information & Knowledge Management. 520--529.Google ScholarDigital Library
- Justin Gilmer, Samuel S Schoenholz, Patrick F Riley, Oriol Vinyals, and George E Dahl. 2017. Neural message passing for quantum chemistry. In International conference on machine learning. PMLR, 1263--1272.Google Scholar
- Kai Guo, Kaixiong Zhou, Xia Hu, Yu Li, Yi Chang, and Xin Wang. 2022. Orthogonal graph neural networks. In Proceedings of the AAAI Conference on Artificial Intelligence, Vol. 36. 3996--4004.Google ScholarCross Ref
- Will Hamilton, Zhitao Ying, and Jure Leskovec. 2017. Inductive representation learning on large graphs. Advances in neural information processing systems 30 (2017).Google Scholar
- Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. 2016. Deep residual learning for image recognition. In Proceedings of the IEEE conference on computer vision and pattern recognition. 770--778.Google ScholarCross Ref
- Weihua Hu, Matthias Fey, Marinka Zitnik, Yuxiao Dong, Hongyu Ren, Bowen Liu, Michele Catasta, and Jure Leskovec. 2020. Open graph benchmark: Datasets for machine learning on graphs. Advances in neural information processing systems 33 (2020), 22118--22133.Google Scholar
- Sergey Ioffe and Christian Szegedy. 2015. Batch normalization: Accelerating deep network training by reducing internal covariate shift. In International conference on machine learning. PMLR, 448--456.Google Scholar
- Wei Jin, Xiaorui Liu, Yao Ma, Charu Aggarwal, and Jiliang Tang. 2022. Feature overcorrelation in deep graph neural networks: A new perspective. arXiv preprint arXiv:2206.07743 (2022).Google Scholar
- Diederik P Kingma and Jimmy Ba. 2014. Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014).Google Scholar
- Johannes Klicpera, Aleksandar Bojchevski, and Stephan Günnemann. 2018. Predict then propagate: Graph neural networks meet personalized pagerank. arXiv preprint arXiv:1810.05997 (2018).Google Scholar
- Guohao Li, Matthias Müller, Ali K. Thabet, and Bernard Ghanem. 2019. Deep-GCNs: Can GCNs Go As Deep As CNNs?. In ICCV.Google Scholar
- Meng Liu, Hongyang Gao, and Shuiwang Ji. 2020. Towards deeper graph neural networks. In Proceedings of the 26th ACM SIGKDD international conference on knowledge discovery & data mining. 338--348.Google ScholarDigital Library
- Xiaorui Liu, Jiayuan Ding, Wei Jin, Han Xu, Yao Ma, Zitao Liu, and Jiliang Tang. 2021. Graph neural networks with adaptive residual. Advances in Neural Information Processing Systems 34 (2021), 9720--9733.Google Scholar
- Xiaorui Liu,Wei Jin, Yao Ma, Yaxin Li, Hua Liu, YiqiWang, Ming Yan, and Jiliang Tang. 2021. Elastic graph neural networks. In International Conference on Machine Learning. PMLR, 6837--6849.Google Scholar
- Yao Ma, Xiaorui Liu, Tong Zhao, Yozen Liu, Jiliang Tang, and Neil Shah. 2021. A unified viewon graph neural networks as graph signal denoising. In Proceedings of the 30th ACM International Conference on Information & Knowledge Management.Google ScholarDigital Library
- Yao Ma and Jiliang Tang. 2021. Deep learning on graphs. Cambridge University Press.Google Scholar
- Kenta Oono and Taiji Suzuki. 2019. Graph neural networks exponentially lose expressive power for node classification. arXiv preprint arXiv:1905.10947 (2019).Google Scholar
- Hongbin Pei, Bingzhe Wei, Kevin Chen-Chuan Chang, Yu Lei, and Bo Yang. 2020. Geom-gcn: Geometric graph convolutional networks. arXiv preprint arXiv:2002.05287 (2020).Google Scholar
- Al Mamunur Rashid, George Karypis, and John Riedl. 2008. Learning preferences of new users in recommender systems: an information theoretic approach. Acm Sigkdd Explorations Newsletter 10, 2 (2008), 90--100.Google ScholarDigital Library
- Pau Rodríguez, Jordi Gonzalez, Guillem Cucurull, Josep M Gonfaus, and Xavier Roca. 2016. Regularizing cnns with locally constrained decorrelations. arXiv preprint arXiv:1611.01967 (2016).Google Scholar
- Yu Rong, Wenbing Huang, Tingyang Xu, and Junzhou Huang. 2020. DropEdge: Towards Deep Graph Convolutional Networks on Node Classification. In International Conference on Learning Representations. https://openreview.net/forum? id=Hkx1qkrKPrGoogle Scholar
- B Rozemberczki, C Allen, and R Sarkar. 2019. Multi-scale attributed node embedding. CoRR. arXiv preprint arXiv:1909.13021 (2019).Google Scholar
- Prithviraj Sen, Galileo Namata, Mustafa Bilgic, Lise Getoor, Brian Galligher, and Tina Eliassi-Rad. 2008. Collective classification in network data. AI magazine 29, 3 (2008), 93--93.Google Scholar
- Oleksandr Shchur, Maximilian Mumme, Aleksandar Bojchevski, and Stephan Günnemann. 2018. Pitfalls of graph neural network evaluation. arXiv preprint arXiv:1811.05868 (2018).Google Scholar
- Petar Veličković, Guillem Cucurull, Arantxa Casanova, Adriana Romero, Pietro Lio, and Yoshua Bengio. 2017. Graph attention networks. arXiv preprint arXiv:1710.10903 (2017).Google Scholar
- Daixin Wang, Jianbin Lin, Peng Cui, Quanhui Jia, Zhen Wang, Yanming Fang, Quan Yu, Jun Zhou, Shuang Yang, and Yuan Qi. 2019. A Semi-Supervised Graph Attentive Network for Financial Fraud Detection. In ICDM.Google Scholar
- MaxWelling and Thomas N Kipf. 2016. Semi-supervised classification with graph convolutional networks. In J. International Conference on Learning Representations (ICLR 2017).Google Scholar
- Zonghan Wu, Shirui Pan, Fengwen Chen, Guodong Long, Chengqi Zhang, and S Yu Philip. 2020. A comprehensive survey on graph neural networks. IEEE transactions on neural networks and learning systems 32, 1 (2020), 4--24.Google ScholarCross Ref
- Di Xie, Jiang Xiong, and Shiliang Pu. 2017. All you need is beyond a good init: Exploring better solution for training extremely deep convolutional neural networks with orthonormality and modulation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 6176--6185.Google ScholarCross Ref
- Yujun Yan, Milad Hashemi, Kevin Swersky, Yaoqing Yang, and Danai Koutra. 2022. Two sides of the same coin: Heterophily and oversmoothing in graph convolutional neural networks. In 2022 IEEE International Conference on Data Mining (ICDM). IEEE, 1287--1292.Google ScholarCross Ref
- Liang Yang, Chuan Wang, Junhua Gu, Xiaochun Cao, and Bingxin Niu. 2021. Why Do Attributes Propagate in Graph Convolutional Neural Networks?. In Proceedings of the AAAI Conference on Artificial Intelligence, Vol. 35. 4590--4598.Google ScholarCross Ref
- Yongyi Yang, Tang Liu, Yangkun Wang, Jinjing Zhou, Quan Gan, Zhewei Wei, Zheng Zhang, Zengfeng Huang, and David Wipf. 2021. Graph neural networks inspired by classical iterative algorithms. In International Conference on Machine Learning. PMLR, 11773--11783.Google Scholar
- Rex Ying, Ruining He, Kaifeng Chen, Pong Eksombatchai, William L. Hamilton, and Jure Leskovec. 2018. Graph Convolutional Neural Networks for Web-Scale Recommender Systems. In KDD.Google Scholar
- Lingxiao Zhao and Leman Akoglu. 2019. Pairnorm: Tackling oversmoothing in gnns. arXiv preprint arXiv:1909.12223 (2019).Google Scholar
- Kaixiong Zhou, Xiao Huang, Yuening Li, Daochen Zha, Rui Chen, and Xia Hu. 2020. Towards deeper graph neural networks with differentiable group normalization. Advances in neural information processing systems 33 (2020), 4917--4928.Google Scholar
- Kaixiong Zhou, Xiao Huang, Daochen Zha, Rui Chen, Li Li, Soo-Hyun Choi, and Xia Hu. 2021. Dirichlet energy constrained learning for deep graph neural networks. Advances in Neural Information Processing Systems 34 (2021),Google Scholar
- Meiqi Zhu, XiaoWang, Chuan Shi, Houye Ji, and Peng Cui. 2021. Interpreting and unifying graph neural networks with an optimization framework. In Proceedings of the Web Conference 2021. 1215--1226.Google ScholarDigital Library
Index Terms
- Enhancing Graph Representations Learning with Decorrelated Propagation
Recommendations
Node classification using kernel propagation in graph neural networks
Highlights- Spectral Kernel propagation layer for differentiating local/global connectivity.
AbstractIn this work, we introduce a kernel propagation method that enables graph neural networks (GNNs) to leverage higher-order network structural information without increasing the complexity of the networks. Recent studies have introduced ...
Inflation Improves Graph Neural Networks
WWW '22: Proceedings of the ACM Web Conference 2022Graph neural networks (GNNs) have gained significant success in graph representation learning and become the go-to approach for many graph-based tasks. Despite their effectiveness, the performance of GNNs is known to decline gradually as the number of ...
Policy-GNN: Aggregation Optimization for Graph Neural Networks
KDD '20: Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data MiningGraph data are pervasive in many real-world applications. Recently, increasing attention has been paid on graph neural networks (GNNs), which aim to model the local graph structures and capture the hierarchical patterns by aggregating the information ...
Comments