ABSTRACT
Graph contrastive learning (GCL) improves graph representation learning, leading to SOTA on various downstream tasks. The graph augmentation step is a vital but scarcely studied step of GCL. In this paper, we show that the node embedding obtained via the graph augmentations is highly biased, somewhat limiting contrastive models from learning discriminative features for downstream tasks.Thus, instead of investigating graph augmentation in the input space, we alternatively propose to perform augmentations on the hidden features (feature augmentation). Inspired by so-called matrix sketching, we propose COSTA, a novel Covariance-preServing feaTure space Augmentation framework for GCL, which generates augmented features by maintaining a "good sketch" of original features. To highlight the superiority of feature augmentation with COSTA, we investigate a single-view setting (in addition to multi-view one) which conserves memory and computations. We show that the feature augmentation with COSTA achieves comparable/better results than graph augmentation based models.
Supplemental Material
- Philip Bachman, R Devon Hjelm, and William Buchwalter. 2019. Learning representations by maximizing mutual information across views. arXiv preprint arXiv:1906.00910 (2019).Google Scholar
- Piotr Bielak, Tomasz Kajdanowicz, and Nitesh V Chawla. 2021. Graph Barlow Twins: A self-supervised representation learning framework for graphs. arXiv preprint arXiv:2106.02466 (2021).Google Scholar
- Ting Chen, Simon Kornblith, Mohammad Norouzi, and Geoffrey Hinton. 2020. A simple framework for contrastive learning of visual representations. In International conference on machine learning. PMLR, 1597--1607.Google Scholar
- Yankai Chen, Menglin Yang, Yingxue Zhang, Mengchen Zhao, Ziqiao Meng, Jianye Hao, and Irwin King. 2022. Modeling Scale-free Graphs with Hyperbolic Geometry for Knowledge-aware Recommendation. In WSDM '22: The Fifteenth ACM International Conference on Web Search and Data Mining. ACM.Google Scholar
- Yankai Chen, Yaming Yang, Yujing Wang, Jing Bai, Xiangchen Song, and Irwin King. 2022. Attentive Knowledge-aware Graph Convolutional Networks with Collaborative Guidance for Personalized Recommendation. In The 38th IEEE International Conference on Data Engineering.Google Scholar
- Terrance DeVries and Graham W Taylor. 2017. Dataset augmentation in feature space. arXiv preprint arXiv:1702.05538 (2017).Google Scholar
- Petros Drineas, Ravi Kannan, and Michael W Mahoney. 2006. Fast Monte Carlo algorithms for matrices I: Approximating matrix multiplication. SIAM J. Comput. 36, 1 (2006), 132--157.Google ScholarDigital Library
- Steven Y Feng, Varun Gangal, Jason Wei, Sarath Chandar, Soroush Vosoughi, Teruko Mitamura, and Eduard Hovy. 2021. A survey of data augmentation approaches for nlp. arXiv preprint arXiv:2105.03075 (2021).Google Scholar
- Tianyu Gao, Xingcheng Yao, and Danqi Chen. 2021. SimCSE: Simple Contrastive Learning of Sentence Embeddings. arXiv preprint arXiv:2104.08821 (2021).Google Scholar
- Gene H Golub, Alan Hoffman, and Gilbert W Stewart. 1987. A generalization of the Eckart-Young-Mirsky matrix approximation theorem. Linear Algebra and its applications 88 (1987), 317--327.Google Scholar
- Aditya Grover and Jure Leskovec. 2016. node2vec: Scalable feature learning for networks. In Proceedings of the 22nd ACM SIGKDD international conference on Knowledge discovery and data mining. 855--864.Google ScholarDigital Library
- Hakim Hafidi, Mounir Ghogho, Philippe Ciblat, and Ananthram Swami. 2020. Graphcl: Contrastive self-supervised learning of graph representations. arXiv preprint arXiv:2007.08025 (2020).Google Scholar
- William L Hamilton, Rex Ying, and Jure Leskovec. 2017. Inductive representation learning on large graphs. In Proceedings of the 31st International Conference on Neural Information Processing Systems. 1025--1035.Google Scholar
- Bharath Hariharan and Ross Girshick. 2017. Low-shot visual recognition by shrinking and hallucinating features. In Proceedings of the IEEE International Conference on Computer Vision. 3018--3027.Google ScholarCross Ref
- Kaveh Hassani and Amir Hosein Khasahmadi. 2020. Contrastive multi-view representation learning on graphs. In International Conference on Machine Learning. PMLR, 4116--4126.Google Scholar
- Kaiming He, Haoqi Fan, YuxinWu, Saining Xie, and Ross Girshick. 2020. Momentum contrast for unsupervised visual representation learning. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 9729--9738.Google ScholarCross Ref
- Thomas N Kipf and MaxWelling. 2016. Semi-supervised classification with graph convolutional networks. arXiv preprint arXiv:1609.02907 (2016).Google Scholar
- Thomas N Kipf and Max Welling. 2016. Variational graph auto-encoders. arXiv preprint arXiv:1611.07308 (2016).Google Scholar
- Piotr Koniusz and Hongguang Zhang. 2020. Power Normalizations in Finegrained Image, Few-shot Image and Graph Classification. In IEEE Transactions on Pattern Analysis and Machine Intelligence. https://doi.org/10.1109/TPAMI.2021.3107164Google ScholarDigital Library
- Ping Li, Trevor J Hastie, and Kenneth W Church. 2006. Very sparse random projections. In Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining. 287--296.Google ScholarDigital Library
- Edo Liberty. 2013. Simple and deterministic matrix sketching. In Proceedings of the 19th ACM SIGKDD international conference on Knowledge discovery and data mining. 581--588.Google ScholarDigital Library
- Julian McAuley, Christopher Targett, Qinfeng Shi, and Anton Van Den Hengel. 2015. Image-based recommendations on styles and substitutes. In Proceedings of the 38th international ACM SIGIR conference on research and development in information retrieval. 43--52.Google ScholarDigital Library
- Péter Mernyei and Cătălina Cangea. 2020. Wiki-cs: A wikipedia-based benchmark for graph neural networks. arXiv preprint arXiv:2007.02901 (2020).Google Scholar
- Lawrence Page, Sergey Brin, Rajeev Motwani, and Terry Winograd. 1999. The PageRank citation ranking: Bringing order to the web. Technical Report.Google Scholar
- Fabian Pedregosa, Gaël Varoquaux, Alexandre Gramfort, Vincent Michel, Bertrand Thirion, Olivier Grisel, Mathieu Blondel, Peter Prettenhofer, Ron Weiss, Vincent Dubourg, et al. 2011. Scikit-learn: Machine learning in Python. the Journal of machine Learning research 12 (2011), 2825--2830.Google Scholar
- Zhen Peng, Wenbing Huang, Minnan Luo, Qinghua Zheng, Yu Rong, Tingyang Xu, and Junzhou Huang. 2020. Graph representation learning via graphical mutual information maximization. In Proceedings of The Web Conference 2020.Google ScholarDigital Library
- Bryan Perozzi, Rami Al-Rfou, and Steven Skiena. 2014. Deepwalk: Online learning of social representations. In Proceedings of the 20th ACM SIGKDD international conference on Knowledge discovery and data mining. 701--710.Google ScholarDigital Library
- Connor Shorten and Taghi M Khoshgoftaar. 2019. A survey on image data augmentation for deep learning. Journal of Big Data 6, 1 (2019), 1--48.Google ScholarCross Ref
- Arnab Sinha, Zhihong Shen, Yang Song, Hao Ma, Darrin Eide, Bo-June Hsu, and Kuansan Wang. 2015. An overview of microsoft academic service (mas) and applications. In Proceedings of the 24th international conference on world wide web. 243--246.Google ScholarDigital Library
- Zixing Song, Ziqiao Meng, Yifei Zhang, and Irwin King. 2021. Semi-supervised Multi-label Learning for Graph-structured Data. In CIKM. ACM, 1723--1733.Google Scholar
- Zixing Song, Xiangli Yang, Zenglin Xu, and Irwin King. 2022. Graph-Based Semi-Supervised Learning: A Comprehensive Review. IEEE Transactions on Neural Networks and Learning Systems (2022), 1--21. https://doi.org/10.1109/TNNLS. 2022.3155478Google Scholar
- Ke Sun, Piotr Koniusz, and Zhen Wang. 2019. Fisher-Bures Adversary Graph Convolutional Networks. Conference on Uncertainty in Artificial Intelligence 115 (2019), 465--475.Google Scholar
- Susheel Suresh, Pan Li, Cong Hao, and Jennifer Neville. 2021. Adversarial Graph Augmentation to Improve Graph Contrastive Learning. CoRR abs/2106.05819 (2021).Google Scholar
- Yonglong Tian, Chen Sun, Ben Poole, Dilip Krishnan, Cordelia Schmid, and Phillip Isola. 2020. What Makes for Good Views for Contrastive Learning?. In Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, NeurIPS 2020.Google Scholar
- Petar Velickovic, William Fedus, William L Hamilton, Pietro Liò, Yoshua Bengio, and R Devon Hjelm. 2019. Deep Graph Infomax. ICLR (Poster) (2019).Google Scholar
- Yulin Wang, Xuran Pan, Shiji Song, Hong Zhang, Gao Huang, and Cheng Wu. 2019. Implicit semantic data augmentation for deep networks. Advances in Neural Information Processing Systems 32 (2019), 12635--12644.Google Scholar
- Jason Wei and Kai Zou. 2019. Eda: Easy data augmentation techniques for boosting performance on text classification tasks. arXiv preprint arXiv:1901.11196 (2019).Google Scholar
- Jiancan Wu, Xiang Wang, Fuli Feng, Xiangnan He, Liang Chen, Jianxun Lian, and Xing Xie. 2021. Self-supervised graph learning for recommendation. In Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval. 726--735.Google ScholarDigital Library
- Menglin Yang, Ziqiao Meng, and Irwin King. 2020. FeatureNorm: L2 Feature Normalization for Dynamic Graph Embedding. In 2020 IEEE International Conference on Data Mining (ICDM). IEEE, 731--740.Google ScholarCross Ref
- Menglin Yang, Min Zhou, Jiahong Liu, Defu Lian, and Irwin King. 2022. HRCF: Enhancing collaborative filtering via hyperbolic geometric regularization. In Proceedings of the ACM Web Conference 2022. 2462--2471.Google ScholarDigital Library
- Yuning You, Tianlong Chen, Yongduo Sui, Ting Chen, Zhangyang Wang, and Yang Shen. 2020. Graph contrastive learning with augmentations. Advances in Neural Information Processing Systems 33 (2020), 5812--5823.Google Scholar
- Junliang Yu, Hongzhi Yin, Xin Xia, Tong Chen, Lizhen Cui, and Nguyen Quoc Viet Hung. 2022. Are Graph Augmentations Necessary? Simple Graph Contrastive Learning for Recommendation. arXiv preprint arXiv:2112.08679 (2022).Google Scholar
- Yifei Zhang and Hao Zhu. 2019. Doc2hash: Learning Discrete Latent variables for Documents Retrieval. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers). https://doi.org/10.18653/v1/N19--1232Google Scholar
- Yifei Zhang, Hao Zhu, Ziqiao Meng, Piotr Koniusz, and Irwin King. 2022. Graphadaptive Rectified Linear Unit for Graph Neural Networks. In WWW '22: The ACM Web Conference 2022, Virtual Event, Lyon, France, April 25 - 29, 2022. ACM, 1331--1339.Google Scholar
- Hao Zhu and Piotr Koniusz. 2021. REFINE: Random RangE FInder for Network Embedding. In ACM Conference on Information and Knowledge Management.Google ScholarDigital Library
- Hao Zhu and Piotr Koniusz. 2021. Simple Spectral Graph Convolution. In International Conference on Learning Representations.Google Scholar
- Hao Zhu, Ke Sun, and Peter Koniusz. 2021. Contrastive Laplacian Eigenmaps. Advances in Neural Information Processing Systems 34 (2021).Google Scholar
- Yanqiao Zhu, Yichen Xu, Feng Yu, Qiang Liu, Shu Wu, and Liang Wang. 2020. Deep graph contrastive representation learning. arXiv preprint arXiv:2006.04131 (2020).Google Scholar
- Yanqiao Zhu, Yichen Xu, Feng Yu, Qiang Liu, Shu Wu, and Liang Wang. 2021. Graph contrastive learning with adaptive augmentation. In Proceedings of the Web Conference 2021. 2069--2080.Google ScholarDigital Library
Index Terms
- COSTA: Covariance-Preserving Feature Augmentation for Graph Contrastive Learning
Recommendations
Cross-view graph contrastive learning with hypergraph
AbstractGraph contrastive learning (GCL) provides a new perspective to alleviate the reliance on labeled data for graph representation learning. Recent efforts on GCL leverage various graph augmentation strategies, i.e., node dropping and edge masking, ...
Highlights- We proposed that hypergraphs are used as a paradigm to enhance graph contrastive learning.
- We propose a novel diffusion model-based fusion mechanism that aligns the positive examples.
- Our experimental results all exceed existing ...
Features based adaptive augmentation for graph contrastive learning
AbstractSelf-supervised learning aims to eliminate the need for expensive annotation in graph representation learning, where graph contrastive learning (GCL) is trained with the self-supervision signals containing data-data pairs. These data-data pairs ...
Community-aware graph contrastive learning for collaborative filtering
AbstractRecently, graph neural networks have demonstrated superior performance in the field of collaborative filtering (CF). The graph collaborative filtering (GCF) method learns the interactions between users and items, whose performance is susceptible ...
Comments