research-article

Network Embedding via Motifs

Authors:
Ping Shao, Zhejiang University, Zhejiang, China
Yang Yang, Zhejiang University, Zhejiang, China
Shengyao Xu, Finvolution Group Inc., Shanghai, China
Chunping Wang, Finvolution Group Inc., Shanghai, China

ACM Transactions on Knowledge Discovery from Data, Volume 16, Issue 3, Article No. 44, pp. 1–20. https://doi.org/10.1145/3473911

Published: 22 October 2021

Abstract

Network embedding has emerged as an effective way to handle downstream tasks such as node classification [16, 31, 42]. Most existing methods leverage several kinds of similarity between nodes: connectivity, which treats closely connected vertices as similar, and structural similarity, which is measured by assessing a vertex's relations to its neighbors. However, these methods focus only on static graphs. In this work, we bridge connectivity and structural similarity in a uniform representation via motifs, and consequently present an algorithm for Learning Embeddings by leveraging Motifs Of Networks (LEMON), which aims to learn embeddings for vertices and various motifs. Moreover, LEMON is inherently capable of handling inductive learning tasks on dynamic graphs. To validate its effectiveness and efficiency, we conduct experiments on two real-world datasets and five public datasets from diverse domains. In comparison with state-of-the-art baseline models, LEMON achieves significant improvements in downstream tasks. We release our code on GitHub at https://github.com/larry2020626/LEMON.
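
To make the distinction between connectivity and structural similarity concrete, the following is a minimal sketch on a toy graph. It is not the LEMON algorithm and is not taken from the released code; the helper names (`build_adjacency`, `triangle_counts`, `signature`) and the toy edge list are assumptions made purely for illustration. It uses the triangle as the simplest example of a motif and describes each vertex by its degree and the number of triangles it belongs to.

```python
from itertools import combinations


def build_adjacency(edges):
    """Build an undirected adjacency map {node: set(neighbors)} from an edge list."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def triangle_counts(adj):
    """Count, for every node, how many triangles (3-node cliques) contain it."""
    counts = {node: 0 for node in adj}
    for node, nbrs in adj.items():
        # A triangle through `node` is a pair of its neighbors that are adjacent.
        for a, b in combinations(sorted(nbrs), 2):
            if b in adj[a]:
                counts[node] += 1
    return counts


def signature(adj, tri, node):
    """Hypothetical structural signature: (degree, triangle count)."""
    return (len(adj[node]), tri[node])


# Toy graph (an assumption for illustration): two triangles, (0, 1, 2) and
# (4, 5, 6), joined by the path 2 - 3 - 4.
edges = [(0, 1), (1, 2), (0, 2), (2, 3), (3, 4), (4, 5), (5, 6), (4, 6)]
adj = build_adjacency(edges)
tri = triangle_counts(adj)

# Vertices 0 and 6 are far apart, yet play the same structural role.
print(signature(adj, tri, 0), signature(adj, tri, 6))  # (2, 1) (2, 1)

# Vertices 2 and 3 are directly connected, yet differ structurally.
print(signature(adj, tri, 2), signature(adj, tri, 3))  # (3, 1) (2, 0)
```

In this sketch, a connectivity-based view would place vertices 2 and 3 close together because they share an edge, while a structure-based view would instead group vertices 0 and 6, which sit in different triangles but have identical signatures. The abstract describes using motifs to capture both views in one representation; how LEMON actually does so is specified in the full paper, not here.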

REFERENCES

[1] AbuOda Ghadeer, Morales Gianmarco De Francisci, and Aboulnaga Ashraf. 2019. Link prediction via higher-order motif features. In Proceedings of the Joint European Conference on Machine Learning and Knowledge Discovery in Databases. Springer, 412–429.
[2] Adhikari Bijaya, Zhang Yao, Ramakrishnan Naren, and Prakash B. Aditya. 2018. Sub2vec: Feature learning for subgraphs. In Proceedings of the Pacific-Asia Conference on Knowledge Discovery and Data Mining. Springer, 170–182.
[3] Ahmed Amr, Shervashidze Nino, Narayanamurthy Shravan, Josifovski Vanja, and Smola Alexander J. 2013. Distributed large-scale natural graph factorization. In Proceedings of the 22nd International Conference on World Wide Web. ACM, 37–48.
[4] Ahmed Nesreen K., Neville Jennifer, Rossi Ryan A., and Duffield Nick. 2015. Efficient graphlet counting for large networks. In Proceedings of the 2015 IEEE International Conference on Data Mining. IEEE, 1–10.
[5] Ahmed Nesreen K., Rossi Ryan A., Lee John Boaz, Willke Theodore L., Zhou Rong, Kong Xiangnan, and Eldardiry Hoda. 2019. role2vec: Role-based network embeddings. In Proceedings of the 1st International Workshop on Deep Learning on Graphs: Methods and Applications. 1–7.
[6] Ahmed Nesreen K., Willke Theodore L., and Rossi Ryan A. 2016. Estimation of local subgraph counts. In Proceedings of the 2016 IEEE International Conference on Big Data. IEEE, 586–595.
[7] Balalau Oana and Goyal Sagar. 2020. SubRank: Subgraph embeddings via a subgraph proximity measure. In Proceedings of the Pacific-Asia Conference on Knowledge Discovery and Data Mining. Springer, 487–498.
[8] Brand Matthew. 2006. Fast low-rank modifications of the thin singular value decomposition. Linear Algebra and Its Applications 415, 1 (2006), 20–30.
[9] Breitkreutz Bobby-Joe, Stark Chris, Reguly Teresa, Boucher Lorrie, Breitkreutz Ashton, Livstone Michael, Oughtred Rose, Lackner Daniel H., Bähler Jürg, Wood Valerie, Dolinski Kara, and Tyers Mike. 2007. The BioGRID interaction database: 2008 update. Nucleic Acids Research 36, suppl_1 (2007), 637–640.
[10] Bressan Marco, Chierichetti Flavio, Kumar Ravi, Leucci Stefano, and Panconesi Alessandro. 2018. Motif counting beyond five nodes. Transactions on Knowledge Discovery from Data 12, 4 (2018), 1–25.
[11] Dareddy Manoj Reddy, Das Mahashweta, and Yang Hao. 2019. motif2vec: Motif aware node representation learning for heterogeneous networks. In ICBD. IEEE, 1052–1059.
[12] Dave Vachik S., Ahmed Nesreen K., and Hasan Mohammad Al. 2017. E-CLoG: Counting edge-centric local graphlets. In Proceedings of the 2017 IEEE International Conference on Big Data. IEEE, 586–595.
[13] Dong Yuxiao, Chawla Nitesh V., and Swami Ananthram. 2017. metapath2vec: Scalable representation learning for heterogeneous networks. In Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, 135–144.
[14] Donnat Claire, Zitnik Marinka, Hallac David, and Leskovec Jure. 2018. Learning structural node embeddings via diffusion wavelets. In Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining. ACM, 1320–1329.
[15] Fu Tao-yang, Lee Wang-Chien, and Lei Zhen. 2017. Hin2vec: Explore meta-paths in heterogeneous information networks for representation learning. In Proceedings of the 2017 ACM on Conference on Information and Knowledge Management. ACM, 1797–1806.
[16] Grover Aditya and Leskovec Jure. 2016. node2vec: Scalable feature learning for networks. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, 855–864.
[17] Hamilton Will, Ying Zhitao, and Leskovec Jure. 2017. Inductive representation learning on large graphs. In Proceedings of the 31st International Conference on Neural Information Processing Systems. 1024–1034.
[18] Henderson Keith, Gallagher Brian, Eliassi-Rad Tina, Tong Hanghang, Basu Sugato, Akoglu Leman, Koutra Danai, Faloutsos Christos, and Li Lei. 2012. Rolx: Structural role extraction & mining in large graphs. In Proceedings of the 18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, 1231–1239.
[19] Hočevar Tomaž and Demšar Janez. 2014. A combinatorial approach to graphlet counting. Bioinformatics 30, 4 (2014), 559–565.
[20] Kashtan Nadav, Itzkovitz Shalev, Milo Ron, and Alon Uri. 2004. Efficient sampling algorithm for estimating subgraph concentrations and detecting network motifs. Bioinformatics 20, 11 (2004), 1746–1758.
[21] Ke Guolin, Meng Qi, Finley Thomas, Wang Taifeng, Chen Wei, Ma Weidong, Ye Qiwei, and Liu Tie-Yan. 2017. Lightgbm: A highly efficient gradient boosting decision tree. In Proceedings of the 31st International Conference on Neural Information Processing Systems.
[22] Lee John Boaz, Rossi Ryan A., Kong Xiangnan, Kim Sungchul, Koh Eunyee, and Rao Anup. 2019. Graph convolutional networks with motif-based attention. In Proceedings of the 28th ACM International Conference on Information and Knowledge Management. ACM, 499–508.
[23] Li Jundong, Wu Liang, Guo Ruocheng, Liu Chenghao, and Liu Huan. 2019. Multi-level network embedding with boosted low-rank matrix approximation. In Proceedings of the 2019 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining. 49–56.
[24] Lyu Tianshu, Zhang Yuan, and Zhang Yan. 2017. Enhancing the network embedding quality with structural similarity. In Proceedings of the 2017 ACM on Conference on Information and Knowledge Management. ACM, 147–156.
[25] Mahoney Matt. 2011. Large text compression benchmark. Retrieved on August 2019 from http://www.mattmahoney.net/dc/textdata. (2011).
[26] Marcus Dror and Shavitt Yuval. 2012. Rage–a rapid graphlet enumerator for large networks. Computer Networks 56, 2 (2012), 810–819.
[27] Mikolov Tomas, Chen Kai, Corrado Greg, and Dean Jeffrey. 2013. Efficient estimation of word representations in vector space. In Proceedings of the 2013 International Conference on Learning Representations.
[28] Milo Ron, Shen-Orr Shai, Itzkovitz Shalev, Kashtan Nadav, Chklovskii Dmitri, and Alon Uri. 2002. Network motifs: Simple building blocks of complex networks. Science 298, 5594 (2002), 824–827.
[29] Nguyen Hoang and Murata Tsuyoshi. 2017. Motif-aware graph embeddings. In Proceedings of the International Joint Conference on Artificial Intelligence.
[30] Ou Mingdong, Cui Peng, Pei Jian, Zhang Ziwei, and Zhu Wenwu. 2016. Asymmetric transitivity preserving graph embedding. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, 1105–1114.
[31] Perozzi Bryan, Al-Rfou Rami, and Skiena Steven. 2014. Deepwalk: Online learning of social representations. In Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, 701–710.
[32] Pinar Ali, Seshadhri C., and Vishal Vaidyanathan. 2017. Escape: Efficiently counting all 5-vertex subgraphs. In Proceedings of the 26th International Conference on World Wide Web. 1431–1440.
[33] Qiu Jiezhong, Dong Yuxiao, Ma Hao, Li Jian, Wang Kuansan, and Tang Jie. 2018. Network embedding as matrix factorization: Unifying deepwalk, line, pte, and node2vec. In Proceedings of the 11th ACM International Conference on Web Search and Data Mining. ACM, 459–467.
[34] Ribeiro Leonardo FR, Saverese Pedro HP, and Figueiredo Daniel R. 2017. struc2vec: Learning node representations from structural identity. In Proceedings of the Special Interest Group on Knowledge Discovery and Data. ACM, 385–394.
[35] Rossi Ryan and Ahmed Nesreen. 2015. The network data repository with interactive graph analytics and visualization. In AAAI. 4292–4293.
[36] Rossi Ryan and Ahmed Nesreen. 2015. The network data repository with interactive graph analytics and visualization. In Proceedings of the 29th AAAI Conference on Artificial Intelligence 3–4.
[37] Rossi Ryan A., Ahmed Nesreen K., and Koh Eunyee. 2018. Higher-order network representation learning. In Companion Proceedings of the Web Conference 2018.
[38] Rossi Ryan A., Ahmed Nesreen K., Koh Eunyee, Kim Sungchul, Rao Anup, and Yadkori Yasin Abbasi. 2018. HONE: Higher-order network embeddings. arXiv:1801.09303. Retrieved from https://arxiv.org/abs/1801.09303.
[39] Saebi Mandana, Ciampaglia Giovanni Luca, Kaplan Lance M., and Chawla Nitesh V. 2020. HONEM: Learning embedding for higher order networks. Big Data 8, 4 (2020), 255–269.
[40] Sankar Aravind, Zhang Xinyang, and Chang Kevin Chen-Chuan. 2017. Motif-based convolutional neural network on graphs. arXiv:1711.05697. Retrieved from https://arxiv.org/abs/1711.05697.
[41] Sankar Aravind, Zhang Xinyang, and Chang Kevin Chen-Chuan. 2019. Meta-GNN: Metagraph neural network for semi-supervised learning in attributed heterogeneous information networks. In Proceedings of the 2019 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining. IEEE (2019), 137–144.
[42] Tang Jian, Qu Meng, Wang Mingzhe, Zhang Ming, Yan Jun, and Mei Qiaozhu. 2015. Line: Large-scale information network embedding. In Proceedings of the 24th International Conference on World Wide Web. 1067–1077.
[43] Tang Jie, Zhang Jing, Yao Limin, Li Juanzi, Zhang Li, and Su Zhong. 2008. Arnetminer: Extraction and mining of academic social networks. In Proceedings of the 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, 990–998.
[44] Torres Leo, Chan Kevin S., and Eliassi-Rad Tina. 2020. GLEE: Geometric laplacian eigenmap embedding. Journal of Complex Networks 8, 2 (2020).
[45] Tsitsulin Anton, Mottin Davide, Karras Panagiotis, and Müller Emmanuel. 2018. Verse: Versatile graph embeddings from similarity measures. In Proceedings of the 2018 World Wide Web Conference. 539–548.
[46] Wang Lei, Ren Jing, Xu Bo, Li Jianxin, Luo Wei, and Xia Feng. 2020. MODEL: Motif-based deep feature learning for link prediction. IEEE Transactions on Computational Social Systems 7, 2 (2020), 503–516.
[47] Wang Xiao, Cui Peng, Wang Jing, Pei Jian, Zhu Wenwu, and Yang Shiqiang. 2017. Community preserving network embedding. In Proceedings of the 31st AAAI Conference on Artificial Intelligence. 203–209.
[48] Wernicke Sebastian and Rasche Florian. 2006. FANMOD: A tool for fast network motif detection. Bioinformatics 22, 9 (2006), 1152–1153.
[49] Wu Zonghan, Pan Shirui, Chen Fengwen, Long Guodong, Zhang Chengqi, and Yu Philip S. 2019. A comprehensive survey on graph neural networks. arXiv:1901.00596. Retrieved from https://arxiv.org/abs/1901.00596.
[50] Yang Carl, Liu Mengxiong, Zheng Vincent W., and Han Jiawei. 2018. Node, motif and subgraph: Leveraging network functional blocks through structural convolution. In Proceedings of the 2018 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining. IEEE, 47–52.
[51] Yang Cheng, Sun Maosong, Liu Zhiyuan, and Tu Cunchao. 2017. Fast network embedding enhancement via high order proximity approximation. In Proceedings of the 26th International Joint Conference on Artificial Intelligence. 3894–3900.
[52] Yang Yang, Xu Yuhong, Wang Chunping, Sun Yizhou, Wu Fei, Zhuang Yueting, and Gu Ming. 2019. Understanding default behavior in online lending. In Proceedings of the 28th ACM International Conference on Information and Knowledge Management. 2043–2052.
[53] Yin Hao, Benson Austin R., and Leskovec Jure. 2018. Higher-order clustering in networks. Physical Review E 97, 5 (2018), 052306.
[54] Yin Hao, Benson Austin R., Leskovec Jure, and Gleich David F. 2017. Local higher-order graph clustering. In Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining.
[55] Yu Shuo, Xia Feng, Xu Jin, Chen Zhikui, and Lee Ivan. 2020. OFFER: A motif dimensional framework for network representation learning. In Proceedings of the 29th ACM International Conference on Information & Knowledge Management. 3349–3352.
[56] Yu Yanlei, Lu Zhiwu, Liu Jiajun, Zhao Guoping, and Wen Ji-rong. 2019. Rum: Network representation learning using motifs. In Proceedings of the 2019 IEEE 35th International Conference on Data Engineering. IEEE, 1382–1393.
[57] Zhang Jie, Dong Yuxiao, Wang Yan, Tang Jie, and Ding Ming. [n.d.]. ProNE: Fast and scalable network representation learning. In Proceedings of the 28th International Joint Conference on Artificial Intelligence Vol. 19. 4278–4284.
[58] Zhao Huan, Zhou Yingqi, Song Yangqiu, and Lee Dik Lun. 2019. Motif enhanced recommendation over heterogeneous information network. In Proceedings of the 28th ACM International Conference on Information and Knowledge Management.
[59] Zhou Lekui, Yang Yang, Ren Xiang, Wu Fei, and Zhuang Yueting. 2018. Dynamic network embedding by modeling triadic closure process. In Proceedings of the 32nd AAAI Conference on Artificial Intelligence. 571–578.

Index Terms

Network Embedding via Motifs
1. Computing methodologies
  1. Artificial intelligence


Published in
ACM Transactions on Knowledge Discovery from Data, Volume 16, Issue 3
June 2022
494 pages
ISSN: 1556-4681
EISSN: 1556-472X
DOI: 10.1145/3485152
Editor: Charu Aggarwal, IBM T. J. Watson Research, USA
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].
Publisher
Association for Computing Machinery
New York, NY, United States
Publication History
- Published: 22 October 2021
- Accepted: 1 July 2021
- Revised: 1 May 2021
- Received: 1 December 2020
Author Tags
Motif
network embedding
motif super-vertex
motif embedding
Qualifiers
- research-article
- Refereed
Article Metrics
- Total Citations: 6
- Total Downloads: 865
- Downloads (Last 12 months): 202
- Downloads (Last 6 weeks): 25