ABSTRACT
Computing graph similarity is an important task in many graph-related applications, such as retrieval in graph databases and graph clustering. While numerous measures have been proposed to capture the similarity between a pair of graphs, Graph Edit Distance (GED) and Maximum Common Subgraph (MCS) are the two measures widely used in practice. GED and MCS are domain-agnostic measures of structural similarity: they define similarity as a function of the pairwise alignment of entities (such as nodes, edges, and subgraphs) in the two graphs. The explicit explainability offered by this pairwise alignment provides transparency and a justification for the similarity score, which gives GED and MCS important practical applications. However, computing either measure exactly is known to be NP-hard. While recently proposed neural-network-based approximations have been shown to compute these similarity scores accurately, they are limited in their ability to provide comprehensive explanations compared to classical combinatorial algorithms such as beam search. This paper aims to approximate these domain-agnostic similarity measures efficiently with a neural network while simultaneously learning alignments (i.e., explanations) similar to those produced by classical intractable methods. Specifically, we formulate the similarity between a pair of graphs as the minimal "transformation" cost from one graph to the other in a learnable node-embedding space. We show that, if each node embedding captures its neighborhood context closely, our proposed similarity function closely approximates both the alignment and the similarity score of classical methods. Furthermore, we propose an efficient differentiable computation of our proposed objective for model training.
Empirically, we demonstrate that the proposed method achieves up to 50%-100% reduction in the Mean Squared Error for the graph similarity approximation task and up to 20% improvement in the retrieval evaluation metrics for the graph retrieval task. The source code is available at https://github.com/khoadoan/GraphOTSim.
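To make the idea of a minimal "transformation" cost between two sets of node embeddings concrete, the following is a minimal sketch and not the paper's implementation. It assumes entropy-regularized optimal transport solved with Sinkhorn iterations as the differentiable alignment step, uniform node masses, squared Euclidean distance as the cost, and hand-written embeddings in place of learned ones. The function names and the parameters `eps` and `n_iters` are hypothetical and chosen only for illustration.

```python
import math


def pairwise_sq_dist(X, Y):
    """Squared Euclidean distance between every embedding in X and in Y."""
    return [[sum((a - b) ** 2 for a, b in zip(x, y)) for y in Y] for x in X]


def sinkhorn_alignment(cost, eps=0.1, n_iters=200):
    """Soft node-to-node alignment via Sinkhorn iterations.

    Assumption: uniform mass on the nodes of both graphs. `eps` controls how
    soft the alignment is (smaller values give sharper plans).
    """
    n, m = len(cost), len(cost[0])
    a = [1.0 / n] * n
    b = [1.0 / m] * m
    # Gibbs kernel of the cost matrix.
    K = [[math.exp(-c / eps) for c in row] for row in cost]
    u = [1.0] * n
    v = [1.0] * m
    for _ in range(n_iters):
        # Alternately rescale rows and columns to match the marginals.
        u = [a[i] / sum(K[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [b[j] / sum(K[i][j] * u[i] for i in range(n)) for j in range(m)]
    return [[u[i] * K[i][j] * v[j] for j in range(m)] for i in range(n)]


def transport_cost(X, Y, eps=0.1, n_iters=200):
    """Return (total alignment cost, soft alignment plan) for two embedding sets."""
    cost = pairwise_sq_dist(X, Y)
    plan = sinkhorn_alignment(cost, eps, n_iters)
    total = sum(
        plan[i][j] * cost[i][j] for i in range(len(X)) for j in range(len(Y))
    )
    return total, plan


if __name__ == "__main__":
    # Toy embeddings: Y is a permutation of X, so the cost should be near zero
    # and the plan should recover the permutation.
    X = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
    Y = [[0.0, 1.0], [0.0, 0.0], [1.0, 0.0]]
    total, plan = transport_cost(X, Y)
    matches = [max(range(len(Y)), key=lambda j: row[j]) for row in plan]
    print("cost:", total)
    print("node i in graph 1 aligned to node matches[i] in graph 2:", matches)
```

The returned plan plays the role of the explanation: each row shows how strongly a node of the first graph is matched to each node of the second. Every step consists of smooth arithmetic operations, so the same computation written in an automatic-differentiation framework could in principle pass gradients back to the embeddings. The sketch does not handle graphs of different sizes or node insertions and deletions, which a GED-style measure would need.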
Index Terms
- Interpretable Graph Similarity Computation via Differentiable Optimal Alignment of Node Embeddings