Interpretable Graph Similarity Computation via Differentiable Optimal Alignment of Node Embeddings

Authors:
Khoa D. Doan (Virginia Tech, Arlington, VA, USA),
Saurav Manchanda (University of Minnesota, Minneapolis, MN, USA),
Suchismit Mahapatra (University of Buffalo, Buffalo, NY, USA),
Chandan K. Reddy (Virginia Tech, Arlington, VA, USA)

SIGIR '21: Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval, July 2021, Pages 665–674. https://doi.org/10.1145/3404835.3462960

Published: 11 July 2021

ABSTRACT

Computing graph similarity is an important task in many graph-related applications, such as retrieval in graph databases and graph clustering. While numerous measures have been proposed to capture the similarity between a pair of graphs, Graph Edit Distance (GED) and Maximum Common Subgraph (MCS) are the two measures widely used in practice. GED and MCS are domain-agnostic measures of structural similarity that define the similarity as a function of the pairwise alignment of different entities (such as nodes, edges, and subgraphs) in the two graphs. The explicit explainability offered by this pairwise alignment provides transparency and justification for the similarity score, which gives GED and MCS important practical applications. However, computing them exactly is known to be NP-hard. While recently proposed neural-network-based approximations have been shown to compute these similarity scores accurately, they have limited ability to provide comprehensive explanations compared to classical combinatorial algorithms such as beam search. This paper aims to approximate these domain-agnostic similarity measures efficiently with a neural network while simultaneously learning alignments (i.e., explanations) similar to those of the classical intractable methods. Specifically, we formulate the similarity between a pair of graphs as the minimal "transformation" cost from one graph to another in the learnable node-embedding space. We show that, if each node embedding captures its neighborhood context closely, our proposed similarity function closely approximates both the alignment and the similarity score of classical methods. Furthermore, we propose an efficient differentiable computation of our proposed objective for model training. Empirically, we demonstrate that the proposed method achieves up to a 50%-100% reduction in Mean Squared Error for the graph similarity approximation task and up to a 20% improvement in retrieval evaluation metrics for the graph retrieval task. The source code is available at https://github.com/khoadoan/GraphOTSim.
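To make the idea of a minimal "transformation" cost between node embeddings concrete, the following is a minimal sketch, not the authors' implementation. It assumes that the alignment is computed with entropy-regularized optimal transport (Sinkhorn iterations), that the cost is the squared Euclidean distance, and that each node carries uniform mass. The function names, the `eps` and `n_iters` parameters, and the toy embeddings are all hypothetical. Every step uses only smooth operations (exponentials, products, divisions), which is what would let gradients flow through the alignment if the same steps were written in an automatic-differentiation framework.

```python
import math

def pairwise_sq_dist(X, Y):
    """Squared Euclidean cost between every embedding in X and every one in Y."""
    return [[sum((a - b) ** 2 for a, b in zip(x, y)) for y in Y] for x in X]

def sinkhorn_alignment(X, Y, eps=0.1, n_iters=200):
    """Return (plan, cost): a soft node alignment and its transport cost.

    X, Y    : lists of node-embedding vectors (hypothetical, e.g. GNN outputs).
    eps     : entropic regularization strength (assumed hyperparameter).
    n_iters : number of Sinkhorn scaling steps (assumed hyperparameter).
    """
    n, m = len(X), len(Y)
    C = pairwise_sq_dist(X, Y)
    # Gibbs kernel: low-cost pairs receive weight close to 1.
    K = [[math.exp(-C[i][j] / eps) for j in range(m)] for i in range(n)]
    a = [1.0 / n] * n  # assumption: uniform mass on the nodes of graph 1
    b = [1.0 / m] * m  # assumption: uniform mass on the nodes of graph 2
    v = [1.0] * m
    for _ in range(n_iters):
        # Alternately rescale rows and columns to match the two marginals.
        u = [a[i] / sum(K[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [b[j] / sum(K[i][j] * u[i] for i in range(n)) for j in range(m)]
    P = [[u[i] * K[i][j] * v[j] for j in range(m)] for i in range(n)]
    cost = sum(P[i][j] * C[i][j] for i in range(n) for j in range(m))
    return P, cost

def hard_alignment(P):
    """Read off an explanation: the best-matching node in graph 2 for each node in graph 1."""
    return [max(range(len(row)), key=row.__getitem__) for row in P]

# Toy embeddings: Y is a slightly perturbed, reordered copy of X.
X = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
Y = [[0.0, 1.1], [0.1, 0.0], [1.0, 0.1]]
P, cost = sinkhorn_alignment(X, Y)
matches = hard_alignment(P)  # index of the matched node in Y for each node in X
```

In this sketch the transport plan `P` plays the role of the explanation (which node aligns with which), while `cost` plays the role of the dissimilarity score; a smaller `eps` gives a sharper, closer-to-one-to-one alignment at the price of numerical stability.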

Supplemental Material

SIGIR21-fp0937.mp4 (MP4 video, 31.7 MB)

Published in
SIGIR '21: Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval
July 2021
2998 pages
ISBN: 9781450380379
DOI: 10.1145/3404835
General Chairs: Fernando Diaz (Google), Chirag Shah (University of Washington), Torsten Suel (New York University)
Program Chairs: Pablo Castells (Universidad Autónoma de Madrid, Amazon), Rosie Jones (Spotify), Tetsuya Sakai (Waseda University)
Copyright © 2021 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
GCN
graph similarity
model interpretability
similarity search
Qualifiers
- research-article
Acceptance Rates
Overall Acceptance Rate: 792 of 3,983 submissions, 20%