DOI: 10.1145/3404835.3462960
research-article

Interpretable Graph Similarity Computation via Differentiable Optimal Alignment of Node Embeddings

Published: 11 July 2021

ABSTRACT

Computing graph similarity is an important task in many graph-related applications, such as retrieval in graph databases and graph clustering. While numerous measures have been proposed to capture the similarity between a pair of graphs, Graph Edit Distance (GED) and Maximum Common Subgraph (MCS) are two widely used measures in practice. GED and MCS are domain-agnostic measures of structural similarity that define similarity as a function of the pairwise alignment of entities (such as nodes, edges, and subgraphs) in the two graphs. The explicit explainability offered by this pairwise alignment provides transparency and justification for the similarity score; thus, GED and MCS have important practical applications. However, computing them exactly is known to be NP-hard. While recently proposed neural-network-based approximations have been shown to compute these similarity scores accurately, they have limited ability to provide comprehensive explanations compared to classical combinatorial algorithms, e.g., beam search. This paper aims to approximate these domain-agnostic similarity measures efficiently with a neural network while simultaneously learning alignments (i.e., explanations) similar to those of classical intractable methods. Specifically, we formulate the similarity between a pair of graphs as the minimal "transformation" cost from one graph to the other in a learnable node-embedding space. We show that if each node embedding captures its neighborhood context closely, the proposed similarity function closely approximates both the alignment and the similarity score of classical methods. Furthermore, we propose an efficient differentiable computation of the proposed objective for model training. Empirically, we demonstrate that the proposed method achieves up to 50%-100% reduction in the Mean Squared Error for the graph similarity approximation task and up to 20% improvement in the retrieval evaluation metrics for the graph retrieval task. The source code is available at https://github.com/khoadoan/GraphOTSim.
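
To make the idea of a minimal "transformation" cost in node-embedding space concrete, the following is a minimal sketch and not the authors' implementation. It assumes entropy-regularized optimal transport solved with Sinkhorn iterations, a common differentiable relaxation of the assignment problem; the abstract does not say which solver the paper uses. The function names, the hand-picked two-dimensional embeddings, the squared Euclidean cost, the uniform node masses, and the values of `epsilon` and `n_iters` are all illustrative assumptions.

```python
import math


def pairwise_cost(xs, ys):
    """Squared Euclidean distance between every embedding in xs and in ys."""
    return [[sum((a - b) ** 2 for a, b in zip(x, y)) for y in ys] for x in xs]


def sinkhorn_alignment(cost, epsilon=0.05, n_iters=200):
    """Entropy-regularized optimal transport with uniform node masses.

    Returns (plan, total_cost), where plan[i][j] is the soft alignment
    weight between node i of the first graph and node j of the second.
    epsilon and n_iters are illustrative defaults, not values from the paper.
    """
    n, m = len(cost), len(cost[0])
    a = [1.0 / n] * n  # mass on each node of the first graph
    b = [1.0 / m] * m  # mass on each node of the second graph
    kernel = [[math.exp(-c / epsilon) for c in row] for row in cost]
    u, v = [1.0] * n, [1.0] * m
    for _ in range(n_iters):
        # Alternately rescale rows and columns to match the marginals.
        u = [a[i] / sum(kernel[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [b[j] / sum(kernel[i][j] * u[i] for i in range(n)) for j in range(m)]
    plan = [[u[i] * kernel[i][j] * v[j] for j in range(m)] for i in range(n)]
    total = sum(plan[i][j] * cost[i][j] for i in range(n) for j in range(m))
    return plan, total


# Hypothetical node embeddings for two three-node graphs. In a learned model
# these would come from a graph neural network rather than being hand-picked.
g1 = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
g2 = [[0.1, 0.9], [0.0, 0.1], [0.9, 0.0]]

plan, total = sinkhorn_alignment(pairwise_cost(g1, g2))
# Hard alignment read off the soft plan: the best match for each node of g1.
alignment = [max(range(len(row)), key=row.__getitem__) for row in plan]
print(alignment, total)
```

The transport plan plays the role of the explanation: each row shows how strongly a node of the first graph is matched to each node of the second, and the total cost serves as the dissimilarity score. Every step of the loop is a differentiable arithmetic operation, so the same computation written in an automatic-differentiation framework would let gradients flow back to the embeddings during training.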

Published in

SIGIR '21: Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval
July 2021
2998 pages
ISBN: 9781450380379
DOI: 10.1145/3404835

Copyright © 2021 ACM

Publisher

Association for Computing Machinery
New York, NY, United States

Acceptance Rates

Overall Acceptance Rate: 792 of 3,983 submissions, 20%
