ABSTRACT
The Natural Language Processing (NLP) community has contributed significantly to solutions for recognizing entities and relations in natural language text and, where possible, linking them to their proper matches in Knowledge Graphs (KGs). With Wikidata as the background KG, however, only a limited number of tools exist for linking knowledge within text to Wikidata. In this paper, we present Falcon 2.0, the first joint entity and relation linking tool over Wikidata. It receives a short natural language text in English and outputs a ranked list of entities and relations annotated with their proper candidates in Wikidata. The candidates are represented by their Internationalized Resource Identifiers (IRIs) in Wikidata. Falcon 2.0 relies on the English language model for the recognition task (e.g., N-Gram tiling and N-Gram splitting) and then applies an optimization approach for the linking task. We have empirically studied the performance of Falcon 2.0 on Wikidata and conclude that it outperforms all existing baselines. Falcon 2.0 is open source and can be reused by the community; all required instructions are documented in our GitHub repository (https://github.com/SDM-TIB/falcon2.0). We also demonstrate an online API, which can be used without any technical expertise. Falcon 2.0 and its background knowledge bases are available as resources at https://labs.tib.eu/falcon/falcon2/.
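To make the input/output contract described above concrete, the following is a minimal sketch of our own and not the Falcon 2.0 implementation. It assumes a tiny in-memory label index (`ENTITY_LABELS`, `RELATION_LABELS`) in place of the real background knowledge base, and it replaces the recognition and optimization steps with a plain longest-match lookup over token n-grams. The function name `link` and the index contents are hypothetical; only the Wikidata identifiers are real.

```python
from typing import Dict, List, Tuple

WD = "http://www.wikidata.org/entity/"

# Hypothetical toy indexes standing in for the background knowledge base.
ENTITY_LABELS: Dict[str, List[str]] = {
    "germany": [WD + "Q183"],
    "new york": [WD + "Q1384"],
    "new york city": [WD + "Q60"],
}
RELATION_LABELS: Dict[str, List[str]] = {
    "capital": [WD + "P36"],
}


def link(text: str) -> List[Tuple[str, str, List[str]]]:
    """Return (surface form, kind, candidate IRIs) for each matched n-gram.

    Simplification (an assumption, not the authors' method): at each token
    position the longest n-gram is tried first, then shorter ones.
    """
    tokens = text.lower().rstrip("?.!").split()
    results: List[Tuple[str, str, List[str]]] = []
    i = 0
    while i < len(tokens):
        matched = False
        # Try the longest span starting at token i, then shrink it.
        for n in range(len(tokens) - i, 0, -1):
            phrase = " ".join(tokens[i:i + n])
            for kind, index in (("entity", ENTITY_LABELS),
                                ("relation", RELATION_LABELS)):
                if phrase in index:
                    results.append((phrase, kind, index[phrase]))
                    matched = True
                    break
            if matched:
                i += n  # skip past the tokens consumed by the match
                break
        if not matched:
            i += 1  # no label starts here; move to the next token
    return results


if __name__ == "__main__":
    for surface, kind, iris in link("What is the capital of Germany?"):
        print(kind, surface, iris)
```

Trying the longest span first means that "new york city" is preferred over the shorter label "new york" when both are present in the index. A real linker would additionally rank several candidates per surface form, which this sketch omits.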
Index Terms
- Falcon 2.0: An Entity and Relation Linking Tool over Wikidata