ABSTRACT
Knowledge graphs, which consist of entities and their relations, have become a popular way to store structured knowledge. Knowledge graph embedding (KGE), which derives a representation for each entity and relation, has been widely used to capture the semantics of the information in the knowledge graphs, and has demonstrated great success in many downstream applications, such as the extraction of similar entities in response to a query entity. However, existing KGE methods cannot work well on emerging knowledge graphs that are large-scale due to the constraints in storage and inference efficiency. In this paper, we propose a lightweight KGE model, LightKG, which significantly reduces storage as well as running time needed for inference. Instead of storing a continuous vector for every entity, LightKG only needs to store a few codebooks, each of which contains some codewords that correspond to the representatives among the embeddings, and the indices that correspond to the codeword selections for entities. Hence LightKG can achieve highly efficient storage. The efficiency of the downstream querying process can be significantly boosted too with the proposed LightKG model as the relevance score between the query and an entity can be efficiently calculated via a quick look-up in a table that contains the scores between the query and codewords. The storage and inference efficiency of LightKG is achieved by its novel design. LightKG is an end-to-end framework that automatically infers codebooks and codewords and generates an approximated embedding for each entity. A residual module is included in LightKG to induce the diversity among codebooks, and a continuous function is adopted to approximate codeword selection, which is non-differential. In addition, to further improve the performance of KGE, we propose a novel dynamic negative sampling method based on quantization, which can be applied to the proposed LightKG or other KGE methods. We conduct extensive experiments on five public datasets. The experiments show that LightKG is search and memory efficient with high approximate search accuracy. Also, the dynamic negative sampling can dramatically improve model performance with over 19% improvement on average.
Supplemental Material
- Artem Babenko and Victor Lempitsky. 2014. The inverted multi-index. IEEE transactions on pattern analysis and machine intelligence, Vol. 37, 6 (2014), 1247--1260.Google Scholar
- Yebo Bao, Hui Jiang, Lirong Dai, and Cong Liu. [n.d.]. Incoherent training of deep neural networks to de-correlate bottleneck features for speech recognition. In 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.Google Scholar
- Yoshua Bengio, Nicholas Léonard, and Aaron Courville. 2013. Estimating or propagating gradients through stochastic neurons for conditional computation. arXiv preprint arXiv:1308.3432 (2013).Google Scholar
- Antoine Bordes, Nicolas Usunier, Alberto Garcia-Duran, Jason Weston, and Oksana Yakhnenko. 2013. Translating embeddings for modeling multi-relational data. Advances in neural information processing systems, Vol. 26 (2013), 2787--2795.Google Scholar
- Liwei Cai and William Yang Wang. 2017. Kbgan: Adversarial learning for knowledge graph embeddings. arXiv preprint arXiv:1711.04071 (2017).Google Scholar
- Ines Chami, Adva Wolf, Da-Cheng Juan, Frederic Sala, Sujith Ravi, and Christopher Ré. 2020. Low-Dimensional Hyperbolic Knowledge Graph Embeddings. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics. 6901--6914.Google ScholarCross Ref
- Xiaojun Chen, Shengbin Jia, and Yang Xiang. 2020. A review: Knowledge reasoning over knowledge graph. Expert Systems with Applications, Vol. 141 (2020), 112948.Google ScholarDigital Library
- Yongjian Chen, Tao Guan, and Cheng Wang. 2010. Approximate nearest neighbor search by residual vector quantization. Sensors, Vol. 10, 12 (2010), 11259--11273.Google ScholarCross Ref
- Matthieu Courbariaux, Yoshua Bengio, and Jean-Pierre David. 2014. Training deep neural networks with low precision multiplications. arXiv preprint arXiv:1412.7024 (2014).Google Scholar
- Tim Dettmers, Pasquale Minervini, Pontus Stenetorp, and Sebastian Riedel. 2018. Convolutional 2d knowledge graph embeddings. In Proceedings of the AAAI Conference on Artificial Intelligence, Vol. 32.Google ScholarCross Ref
- Laura Dietz, Alexander Kotov, and Edgar Meij. 2018. Utilizing knowledge graphs for text-centric information retrieval. In The 41st International ACM SIGIR Conference on Research & Development in Information Retrieval. 1387--1390.Google ScholarDigital Library
- Tiezheng Ge, Kaiming He, Qifa Ke, and Jian Sun. 2013. Optimized product quantization. IEEE transactions on pattern analysis and machine intelligence, Vol. 36, 4 (2013), 744--755.Google Scholar
- Aristides Gionis, Piotr Indyk, Rajeev Motwani, et al. 1999. Similarity search in high dimensions via hashing. In Vldb, Vol. 99. 518--529.Google ScholarDigital Library
- Xu Han, Shulin Cao, Xin Lv, Yankai Lin, Zhiyuan Liu, Maosong Sun, and Juanzi Li. 2018. Openke: An open toolkit for knowledge embedding. In Proceedings of the 2018 conference on empirical methods in natural language processing: system demonstrations. 139--144.Google ScholarCross Ref
- John A Hartigan and Manchek A Wong. 1979. AK-means clustering algorithm. Journal of the Royal Statistical Society: Series C (Applied Statistics), Vol. 28, 1 (1979), 100--108.Google ScholarCross Ref
- Xiao Huang, Jingyuan Zhang, Dingcheng Li, and Ping Li. 2019. Knowledge graph embedding based question answering. In Proceedings of the Twelfth ACM International Conference on Web Search and Data Mining. 105--113.Google ScholarDigital Library
- Itay Hubara, Matthieu Courbariaux, Daniel Soudry, Ran El-Yaniv, and Yoshua Bengio. 2016. Binarized neural networks. Advances in neural information processing systems, Vol. 29 (2016), 4107--4115.Google Scholar
- Itay Hubara, Matthieu Courbariaux, Daniel Soudry, Ran El-Yaniv, and Yoshua Bengio. 2017. Quantized neural networks: Training neural networks with low precision weights and activations. The Journal of Machine Learning Research, Vol. 18, 1 (2017), 6869--6898.Google ScholarDigital Library
- Eric Jang, Shixiang Gu, and Ben Poole. 2016. Categorical reparameterization with gumbel-softmax. arXiv preprint arXiv:1611.01144 (2016).Google Scholar
- Herve Jegou, Matthijs Douze, and Cordelia Schmid. 2010. Product quantization for nearest neighbor search. IEEE transactions on pattern analysis and machine intelligence, Vol. 33, 1 (2010), 117--128.Google Scholar
- Jin-Hwa Kim, Jaehyun Jun, and Byoung-Tak Zhang. 2018. Bilinear attention networks. arXiv preprint arXiv:1805.07932 (2018).Google Scholar
- Brian Kulis, Mátyás A Sustik, and Inderjit S Dhillon. 2009. Low-Rank Kernel Learning with Bregman Matrix Divergences. Journal of Machine Learning Research, Vol. 10, 2 (2009).Google Scholar
- Defu Lian, Qi Liu, and Enhong Chen. 2020a. Personalized ranking with importance sampling. In Proceedings of The Web Conference 2020. 1093--1103.Google ScholarDigital Library
- Defu Lian, Haoyu Wang, Zheng Liu, Jianxun Lian, Enhong Chen, and Xing Xie. 2020b. Lightrec: A memory and search-efficient recommender system. In Proceedings of The Web Conference 2020. 695--705.Google ScholarDigital Library
- Defu Lian, Yongji Wu, Yong Ge, Xing Xie, and Enhong Chen. 2020c. Geography-aware sequential location recommendation. In Proceedings of the 26th ACM SIGKDD international conference on knowledge discovery & data mining. 2009--2019.Google ScholarDigital Library
- Defu Lian, Xing Xie, Enhong Chen, and Hui Xiong. [n.d.]. Product Quantized Collaborative Filtering. IEEE Transactions on Knowledge and Data Engineering ( [n.,d.]).Google Scholar
- Yankai Lin, Zhiyuan Liu, Maosong Sun, Yang Liu, and Xuan Zhu. 2015. Learning entity and relation embeddings for knowledge graph completion. In Proceedings of the AAAI Conference on Artificial Intelligence, Vol. 29.Google ScholarCross Ref
- Zhenghao Liu, Chenyan Xiong, Maosong Sun, and Zhiyuan Liu. 2018. Entity-duet neural ranking: Understanding the role of knowledge graph semantics in neural information retrieval. arXiv preprint arXiv:1805.07591 (2018).Google Scholar
- Chris J Maddison, Andriy Mnih, and Yee Whye Teh. 2016. The concrete distribution: A continuous relaxation of discrete random variables. arXiv preprint arXiv:1611.00712 (2016).Google Scholar
- Maximilian Nickel, Volker Tresp, and Hans-Peter Kriegel. 2011. A three-way model for collective learning on multi-relational data.. In Icml, Vol. 11. 809--816.Google ScholarDigital Library
- Mohammad Rastegari, Vicente Ordonez, Joseph Redmon, and Ali Farhadi. 2016. Xnor-net: Imagenet classification using binary convolutional neural networks. In European conference on computer vision. Springer, 525--542.Google ScholarCross Ref
- Mrinmaya Sachan. 2020. Knowledge Graph Embedding Compression. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics.Google ScholarCross Ref
- Shengtian Sang, Zhihao Yang, Lei Wang, Xiaoxia Liu, Hongfei Lin, and Jian Wang. 2018. SemaTyP: a knowledge graph based literature mining method for drug discovery. BMC bioinformatics, Vol. 19, 1 (2018), 193.Google Scholar
- Michael Schlichtkrull, Thomas N Kipf, Peter Bloem, Rianne Van Den Berg, Ivan Titov, and Max Welling. 2018. Modeling relational data with graph convolutional networks. In European Semantic Web Conference. Springer, 593--607.Google ScholarDigital Library
- Fabian M Suchanek, Gjergji Kasneci, and Gerhard Weikum. 2007. Yago: a core of semantic knowledge. In Proceedings of the 16th international conference on World Wide Web. 697--706.Google ScholarDigital Library
- Zhiqing Sun, Zhi-Hong Deng, Jian-Yun Nie, and Jian Tang. 2019. Rotate: Knowledge graph embedding by relational rotation in complex space. arXiv preprint arXiv:1902.10197 (2019).Google Scholar
- Kristina Toutanova, Danqi Chen, Patrick Pantel, Hoifung Poon, Pallavi Choudhury, and Michael Gamon. 2015. Representing text for joint embedding of text and knowledge bases. In Proceedings of the 2015 conference on empirical methods in natural language processing. 1499--1509.Google ScholarCross Ref
- Théo Trouillon, Christopher R Dance, Johannes Welbl, Sebastian Riedel, Éric Gaussier, and Guillaume Bouchard. 2017. Knowledge graph completion via complex tensor factorization. arXiv preprint arXiv:1702.06879 (2017).Google Scholar
- Hongwei Wang, Fuzheng Zhang, Jialin Wang, Miao Zhao, Wenjie Li, Xing Xie, and Minyi Guo. 2018b. Ripplenet: Propagating user preferences on the knowledge graph for recommender systems. In Proceedings of the 27th ACM International Conference on Information and Knowledge Management. 417--426.Google ScholarDigital Library
- Peifeng Wang, Shuangyin Li, and Rong Pan. 2018a. Incorporating gan for negative sampling in knowledge representation learning. In Proceedings of the AAAI Conference on Artificial Intelligence, Vol. 32.Google ScholarCross Ref
- Yuxiang Wang, Zhangpeng Ge, Haijiang Yan, Xiaoliang Xu, and Yixing Xia. 2019. Semantic locality-based approximate knowledge graph query. Concurrency and Computation: Practice and Experience, Vol. 31, 24 (2019), e5345.Google ScholarCross Ref
- Zhen Wang, Jianwen Zhang, Jianlin Feng, and Zheng Chen. 2014. Knowledge graph embedding by translating on hyperplanes. In Proceedings of the AAAI Conference on Artificial Intelligence, Vol. 28.Google ScholarCross Ref
- Pengtao Xie, Yuntian Deng, and Eric Xing. 2015. Diversifying restricted boltzmann machine for document modeling. In Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. 1315--1324.Google ScholarDigital Library
- Pengtao Xie, Aarti Singh, and Eric P Xing. 2017. Uncorrelation and evenness: a new diversity-promoting regularizer. In International Conference on Machine Learning. 3811--3820.Google Scholar
- Bishan Yang, Wen-tau Yih, Xiaodong He, Jianfeng Gao, and Li Deng. 2014. Embedding entities and relations for learning and inference in knowledge bases. arXiv preprint arXiv:1412.6575 (2014).Google Scholar
- Yang Yu, Yu-Feng Li, and Zhi-Hua Zhou. 2011. Diversity regularized machine. In Twenty-Second International Joint Conference on Artificial Intelligence.Google Scholar
- Fuzheng Zhang, Nicholas Jing Yuan, Defu Lian, Xing Xie, and Wei-Ying Ma. 2016. Collaborative knowledge base embedding for recommender systems. In Proceedings of the 22nd ACM SIGKDD international conference on knowledge discovery and data mining. 353--362.Google ScholarDigital Library
- Yuyu Zhang, Hanjun Dai, Zornitsa Kozareva, Alexander Smola, and Le Song. 2018. Variational reasoning for question answering with knowledge graph. In Proceedings of the AAAI Conference on Artificial Intelligence, Vol. 32.Google ScholarCross Ref
Index Terms
- A Lightweight Knowledge Graph Embedding Framework for Efficient Inference and Storage
Recommendations
Efficient Non-Sampling Knowledge Graph Embedding
WWW '21: Proceedings of the Web Conference 2021Knowledge Graph (KG) is a flexible structure that is able to describe the complex relationship between data entities. Currently, most KG embedding models are trained based on negative sampling, i.e., the model aims to maximize some similarity of the ...
Weighted Knowledge Graph Embedding
SIGIR '23: Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information RetrievalKnowledge graph embedding (KGE) aims to project both entities and relations in a knowledge graph (KG) into low-dimensional vectors. Indeed, existing KGs suffer from the data imbalance issue, i.e., entities and relations conform to a long-tail ...
Embedding based Link Prediction for Knowledge Graph Completion
CIKM '20: Proceedings of the 29th ACM International Conference on Information & Knowledge ManagementKnowledge Graphs (KGs) have recently gained attention for representing knowledge about a particular domain. Since its advent, the Linked Open Data (LOD) cloud has constantly been growing containing many KGs about many different domains such as ...
Comments