Cross-Document Entity Tracking

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 4425)

Abstract

The main focus of this work is to analyze features that are useful for linking and disambiguating person entities across documents. The more general problem of linking and disambiguating any kind of entity is known as entity detection and tracking (EDT) or noun phrase coreference resolution. EDT has applications in many important areas of information retrieval: clustering search engine results when looking for a particular person; answering questions such as “Who was Woodward’s source in the Plame scandal?” with “senior administration official” or “Richard Armitage”; and fusing information from multiple documents. In this work, person entities are limited to names and nominal entities. We emphasize the linguistic aspect of cross-document EDT by testing novel features, such as the syntactic and semantic characteristics of the entities. The most important class of new features is contextual features, at varying levels of detail: events, related named entities, and local context. The validity of the features is evaluated on a corpus annotated for cross-document coreference resolution of person names and nominals, and also on a corpus annotated only for names.
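To make the idea of a local-context feature concrete, the following is a minimal sketch, not the authors' method. It assumes a generic bag-of-words representation of the words surrounding a person name and links mentions across documents when their context vectors are similar. The function names (`context_vector`, `cosine`, `link_mentions`), the tiny stopword list, the greedy clustering strategy, and the threshold of 0.3 are all hypothetical choices made for illustration.

```python
import math
import re
from collections import Counter

# Hypothetical, deliberately tiny stopword list (an assumption for this sketch).
STOPWORDS = {"the", "a", "an", "in", "on", "of", "and", "to", "before"}


def context_vector(text, ignore=frozenset()):
    """Bag-of-words counts over lowercase tokens, skipping stopwords and `ignore`."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return Counter(t for t in tokens if t not in STOPWORDS and t not in ignore)


def cosine(a, b):
    """Cosine similarity between two sparse count vectors; 0.0 if either is empty."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def link_mentions(name, mentions, threshold=0.3):
    """Greedily cluster (doc_id, context) mentions of `name`.

    A mention joins the first cluster that contains a member whose context
    similarity reaches `threshold`; otherwise it starts a new cluster.
    The name's own tokens are ignored so they do not inflate similarity.
    """
    name_tokens = frozenset(re.findall(r"[a-z]+", name.lower()))
    clusters = []
    for doc_id, context in mentions:
        vec = context_vector(context, ignore=name_tokens)
        for cluster in clusters:
            if any(cosine(vec, other) >= threshold for _, other in cluster):
                cluster.append((doc_id, vec))
                break
        else:
            clusters.append([(doc_id, vec)])
    return [[doc_id for doc_id, _ in cluster] for cluster in clusters]
```

Under these assumptions, two documents that mention the same name alongside words such as "senator" and "budget" fall into one cluster, while a document that mentions the name in a sports context starts a separate one. The richer features named in the abstract (events, related named entities, syntactic and semantic characteristics) would replace or extend the plain token counts used here.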

Editor information

Giambattista Amati, Claudio Carpineto, Giovanni Romano

Copyright information

© 2007 Springer Berlin Heidelberg

About this paper

Cite this paper

Angheluta, R., Moens, MF. (2007). Cross-Document Entity Tracking. In: Amati, G., Carpineto, C., Romano, G. (eds) Advances in Information Retrieval. ECIR 2007. Lecture Notes in Computer Science, vol 4425. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-71496-5_65

  • DOI: https://doi.org/10.1007/978-3-540-71496-5_65

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-71494-1

  • Online ISBN: 978-3-540-71496-5

  • eBook Packages: Computer Science, Computer Science (R0)
