DOI: 10.1145/2237867.2237871
research-article

Named entity recognition and disambiguation using linked data and graph-based centrality scoring

Published: 20 May 2012

ABSTRACT

Named Entity Recognition (NER) is a subtask of information extraction that aims to identify atomic entities in text that fall into predefined categories such as person, location, and organization. Recent efforts in NER try to extract entities and link them to linked data entities. Linked data is a term for data resources, such as DBpedia, that are created using semantic web standards. A number of online tools try to identify named entities in text and link them to linked data resources. Although these tools can be used via their APIs and web interfaces, they rely on different data resources and different techniques to identify named entities, and not all of them reveal this information. One of the major tasks in NER is disambiguation, that is, identifying the right entity among a number of entities with the same name; for example, "apple" can stand for both "Apple, Inc." the company and the fruit. We developed a similar tool called NERSO, short for Named Entity Recognition Using Semantic Open Data, to automatically extract named entities, disambiguate them, and link them to DBpedia entities. Our disambiguation method is based on constructing a graph of linked data entities and scoring them using a graph-based centrality algorithm. We evaluate our system by comparing its performance with two publicly available NER tools. The results show that NERSO performs better.
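
The graph-based disambiguation idea described in the abstract can be illustrated with a minimal sketch. The abstract does not say which centrality algorithm NERSO uses, so this sketch assumes plain degree centrality as a stand-in; the candidate sets, entity identifiers, and links are hypothetical toy data rather than actual DBpedia content, and the function names `build_graph` and `disambiguate` are introduced here only for illustration.

```python
from collections import defaultdict

# Hypothetical candidate sets: each mention maps to possible entity IDs.
candidates = {
    "apple": ["Apple_Inc.", "Apple_(fruit)"],
    "steve jobs": ["Steve_Jobs"],
    "iphone": ["IPhone"],
}

# Hypothetical undirected links between entities (toy data, not from DBpedia).
links = [
    ("Apple_Inc.", "Steve_Jobs"),
    ("Apple_Inc.", "IPhone"),
    ("Steve_Jobs", "IPhone"),
]


def build_graph(candidates, links):
    """Build an undirected graph over candidate entities.

    Only links that connect candidates of *different* mentions are kept,
    so rival candidates of one mention cannot support each other.
    """
    owner = {e: m for m, ents in candidates.items() for e in ents}
    graph = defaultdict(set)
    for a, b in links:
        if a in owner and b in owner and owner[a] != owner[b]:
            graph[a].add(b)
            graph[b].add(a)
    return graph


def disambiguate(candidates, links):
    """Pick, for each mention, the candidate with the highest degree.

    Degree centrality is an assumption made for this sketch; ties are
    resolved in favor of the candidate listed first.
    """
    graph = build_graph(candidates, links)
    return {
        mention: max(ents, key=lambda e: len(graph[e]))
        for mention, ents in candidates.items()
    }


result = disambiguate(candidates, links)
print(result)
# {'apple': 'Apple_Inc.', 'steve jobs': 'Steve_Jobs', 'iphone': 'IPhone'}
```

In this toy input, "Apple_Inc." is connected to the candidates of the other two mentions while "Apple_(fruit)" has no connections, so the company sense is selected for "apple". A different centrality measure could be swapped in by changing the `key` function.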


Published in

SWIM '12: Proceedings of the 4th International Workshop on Semantic Web Information Management
May 2012
73 pages
ISBN: 9781450314466
DOI: 10.1145/2237867

Copyright © 2012 ACM

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher: Association for Computing Machinery, New York, NY, United States
