Abstract
Network embedding has emerged as an effective way to deal with downstream tasks, such as node classification [16, 31, 42]. Most existing methods leverage multiple similarities between nodes, such as connectivity, which treats closely connected vertices as similar, and structural similarity, which is measured by assessing a vertex's relations to its neighbors. However, these methods focus only on static graphs. In this work, we bridge connectivity and structural similarity in a uniform representation via motifs, and present an algorithm for Learning Embeddings by leveraging Motifs Of Networks (LEMON), which aims to learn embeddings for vertices and various motifs. Moreover, LEMON is inherently capable of dealing with inductive learning tasks for dynamic graphs. To validate its effectiveness and efficiency, we conduct various experiments on two real-world datasets and five public datasets from diverse domains. Through comparison with state-of-the-art baseline models, we find that LEMON achieves significant improvements in downstream tasks. We release our code on GitHub at https://github.com/larry2020626/LEMON.
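The difference between connectivity and structural similarity, and the way a motif can expose both, can be illustrated on a toy graph. The sketch below is not the LEMON algorithm: the example graph, the helper names `triangle_counts` and `structural_signature`, and the choice of the triangle as the only motif are all assumptions made purely for illustration.

```python
from itertools import combinations


def triangle_counts(edges):
    """Return (counts, adj): per-vertex triangle counts and the adjacency map.

    The triangle is used here as a stand-in motif (an illustrative assumption).
    """
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    counts = {v: 0 for v in adj}
    for v, nbrs in adj.items():
        # A triangle through v exists for every pair of its neighbors
        # that are themselves connected.
        for a, b in combinations(sorted(nbrs), 2):
            if b in adj[a]:
                counts[v] += 1
    return counts, adj


def structural_signature(v, counts, adj):
    """Hypothetical structural descriptor: (degree, triangles containing v)."""
    return (len(adj[v]), counts[v])


# Toy graph (assumed): triangle {0, 1, 2}, path 2 - 3 - 4, triangle {4, 5, 6}.
edges = [(0, 1), (1, 2), (0, 2), (2, 3), (3, 4), (4, 5), (5, 6), (4, 6)]
counts, adj = triangle_counts(edges)

# Vertices 0 and 5 are far apart (not similar by connectivity),
# yet they play the same structural role.
same_role = structural_signature(0, counts, adj) == structural_signature(5, counts, adj)

# Vertices 2 and 3 are adjacent (similar by connectivity),
# yet their structural signatures differ.
adjacent_but_different = (
    3 in adj[2]
    and structural_signature(2, counts, adj) != structural_signature(3, counts, adj)
)

print(same_role, adjacent_but_different)
```

In this toy setting, membership in the same triangle captures connectivity, while the per-vertex motif counts capture structural role, which gives a rough intuition for why motifs can carry both signals in one representation.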
- [1] . 2019. Link prediction via higher-order motif features. In Proceedings of the Joint European Conference on Machine Learning and Knowledge Discovery in Databases. Springer, 412–429.
- [2] . 2018. Sub2vec: Feature learning for subgraphs. In Proceedings of the Pacific-Asia Conference on Knowledge Discovery and Data Mining. Springer, 170–182.
- [3] . 2013. Distributed large-scale natural graph factorization. In Proceedings of the 22nd International Conference on World Wide Web. ACM, 37–48.
- [4] . 2015. Efficient graphlet counting for large networks. In Proceedings of the 2015 IEEE International Conference on Data Mining. IEEE, 1–10.
- [5] . 2019. role2vec: Role-based network embeddings. In Proceedings of the 1st International Workshop on Deep Learning on Graphs: Methods and Applications. 1–7.
- [6] . 2016. Estimation of local subgraph counts. In Proceedings of the 2016 IEEE International Conference on Big Data. IEEE, 586–595.
- [7] . 2020. SubRank: Subgraph embeddings via a subgraph proximity measure. In Proceedings of the Pacific-Asia Conference on Knowledge Discovery and Data Mining. Springer, 487–498.
- [8] . 2006. Fast low-rank modifications of the thin singular value decomposition. Linear Algebra and Its Applications 415, 1 (2006), 20–30.
- [9] . 2007. The BioGRID interaction database: 2008 update. Nucleic Acids Research 36, suppl_1 (2007), 637–640.
- [10] . 2018. Motif counting beyond five nodes. Transactions on Knowledge Discovery from Data 12, 4 (2018), 1–25.
- [11] . 2019. motif2vec: Motif aware node representation learning for heterogeneous networks. In ICBD. IEEE, 1052–1059.
- [12] . 2017. E-CLoG: Counting edge-centric local graphlets. In Proceedings of the 2017 IEEE International Conference on Big Data. IEEE, 586–595.
- [13] . 2017. metapath2vec: Scalable representation learning for heterogeneous networks. In Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, 135–144.
- [14] . 2018. Learning structural node embeddings via diffusion wavelets. In Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining. ACM, 1320–1329.
- [15] . 2017. Hin2vec: Explore meta-paths in heterogeneous information networks for representation learning. In Proceedings of the 2017 ACM on Conference on Information and Knowledge Management. ACM, 1797–1806.
- [16] . 2016. node2vec: Scalable feature learning for networks. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, 855–864.
- [17] . 2017. Inductive representation learning on large graphs. In Proceedings of the 31st International Conference on Neural Information Processing Systems. 1024–1034.
- [18] . 2012. Rolx: Structural role extraction & mining in large graphs. In Proceedings of the 18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, 1231–1239.
- [19] . 2014. A combinatorial approach to graphlet counting. Bioinformatics 30, 4 (2014), 559–565.
- [20] . 2004. Efficient sampling algorithm for estimating subgraph concentrations and detecting network motifs. Bioinformatics 20, 11 (2004), 1746–1758.
- [21] . 2017. Lightgbm: A highly efficient gradient boosting decision tree. In Proceedings of the 31st International Conference on Neural Information Processing Systems.
- [22] . 2019. Graph convolutional networks with motif-based attention. In Proceedings of the 28th ACM International Conference on Information and Knowledge Management. ACM, 499–508.
- [23] . 2019. Multi-level network embedding with boosted low-rank matrix approximation. In Proceedings of the 2019 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining. 49–56.
- [24] . 2017. Enhancing the network embedding quality with structural similarity. In Proceedings of the 2017 ACM on Conference on Information and Knowledge Management. ACM, 147–156.
- [25] . 2011. Large text compression benchmark. Retrieved on August 2019 from http://www.mattmahoney.net/dc/textdata. (2011).
- [26] . 2012. Rage–a rapid graphlet enumerator for large networks. Computer Networks 56, 2 (2012), 810–819.
- [27] . 2013. Efficient estimation of word representations in vector space. In Proceedings of the 2013 International Conference on Learning Representations.
- [28] . 2002. Network motifs: Simple building blocks of complex networks. Science 298, 5594 (2002), 824–827.
- [29] . 2017. Motif-aware graph embeddings. In Proceedings of the International Joint Conference on Artificial Intelligence.
- [30] . 2016. Asymmetric transitivity preserving graph embedding. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, 1105–1114.
- [31] . 2014. Deepwalk: Online learning of social representations. In Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, 701–710.
- [32] . 2017. Escape: Efficiently counting all 5-vertex subgraphs. In Proceedings of the 26th International Conference on World Wide Web. 1431–1440.
- [33] . 2018. Network embedding as matrix factorization: Unifying deepwalk, line, pte, and node2vec. In Proceedings of the 11th ACM International Conference on Web Search and Data Mining. ACM, 459–467.
- [34] . 2017. struc2vec: Learning node representations from structural identity. In Proceedings of the Special Interest Group on Knowledge Discovery and Data. ACM, 385–394.
- [35] . 2015. The network data repository with interactive graph analytics and visualization. In AAAI. 4292–4293.
- [36] . 2015. The network data repository with interactive graph analytics and visualization. In Proceedings of the 29th AAAI Conference on Artificial Intelligence 3–4.
- [37] . 2018. Higher-order network representation learning. In Companion Proceedings of the Web Conference 2018.
- [38] . 2018. HONE: Higher-order network embeddings. arXiv:1801.09303. Retrieved from https://arxiv.org/abs/1801.09303.
- [39] . 2020. HONEM: Learning embedding for higher order networks. Big Data 8, 4 (2020), 255–269.
- [40] . 2017. Motif-based convolutional neural network on graphs. arXiv:1711.05697. Retrieved from https://arxiv.org/abs/1711.05697.
- [41] . 2019. Meta-GNN: Metagraph neural network for semi-supervised learning in attributed heterogeneous information networks. In Proceedings of the 2019 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining. IEEE (2019), 137–144.
- [42] . 2015. Line: Large-scale information network embedding. In Proceedings of the 24th International Conference on World Wide Web. 1067–1077.
- [43] . 2008. Arnetminer: Extraction and mining of academic social networks. In Proceedings of the 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, 990–998.
- [44] . 2020. GLEE: Geometric laplacian eigenmap embedding. Journal of Complex Networks 8, 2 (2020).
- [45] . 2018. Verse: Versatile graph embeddings from similarity measures. In Proceedings of the 2018 World Wide Web Conference. 539–548.
- [46] . 2020. MODEL: Motif-based deep feature learning for link prediction. IEEE Transactions on Computational Social Systems 7, 2 (2020), 503–516.
- [47] . 2017. Community preserving network embedding. In Proceedings of the 31st AAAI Conference on Artificial Intelligence. 203–209.
- [48] . 2006. FANMOD: A tool for fast network motif detection. Bioinformatics 22, 9 (2006), 1152–1153.
- [49] . 2019. A comprehensive survey on graph neural networks. arXiv:1901.00596. Retrieved from https://arxiv.org/abs/1901.00596.
- [50] . 2018. Node, motif and subgraph: Leveraging network functional blocks through structural convolution. In Proceedings of the 2018 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining. IEEE, 47–52.
- [51] . 2017. Fast network embedding enhancement via high order proximity approximation. In Proceedings of the 26th International Joint Conference on Artificial Intelligence. 3894–3900.
- [52] . 2019. Understanding default behavior in online lending. In Proceedings of the 28th ACM International Conference on Information and Knowledge Management. 2043–2052.
- [53] . 2018. Higher-order clustering in networks. Physical Review E 97, 5 (2018), 052306.
- [54] . 2017. Local higher-order graph clustering. In Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining.
- [55] . 2020. OFFER: A motif dimensional framework for network representation learning. In Proceedings of the 29th ACM International Conference on Information & Knowledge Management. 3349–3352.
- [56] . 2019. Rum: Network representation learning using motifs. In Proceedings of the 2019 IEEE 35th International Conference on Data Engineering. IEEE, 1382–1393.
- [57] . [n.d.]. ProNE: Fast and scalable network representation learning. In Proceedings of the 28th International Joint Conference on Artificial Intelligence Vol. 19. 4278–4284.
- [58] . 2019. Motif enhanced recommendation over heterogeneous information network. In Proceedings of the 28th ACM International Conference on Information and Knowledge Management.
- [59] . 2018. Dynamic network embedding by modeling triadic closure process. In Proceedings of the 32nd AAAI Conference on Artificial Intelligence. 571–578.
Index Terms
- Network Embedding via Motifs