DOI: 10.1145/2600428.2609495
poster

A burstiness-aware approach for document dating

Published: 03 July 2014

ABSTRACT

Many mainstream applications, such as temporal search, event detection, and trend identification, assume that the timestamp of every document in a given textual collection is known. In many cases, however, the required timestamps are either unavailable or ambiguous. A characteristic instance of this problem emerges in the context of large repositories of old digitized documents. For such documents, the timestamp may be corrupted during the digitization process, or may simply be unavailable. In this paper, we study the task of approximating the timestamp of a document, so-called document dating. We propose a content-based method that uses recent advances in the domain of term burstiness, which allow it to overcome the drawbacks of previous document dating methods, e.g. the fixed time partition strategy. An extensive experimental evaluation on different datasets validates the efficacy and advantages of our methodology, showing that our method outperforms state-of-the-art document dating methods.
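
The abstract does not spell out the algorithm, so the following is only an illustrative sketch of the general idea of dating a document from term bursts, not the authors' method. It assumes a timestamped reference corpus split into integer time steps, and it assumes a deliberately simple burst rule (a term is bursty at a step when its document count exceeds the mean plus `k` standard deviations). The names `term_series`, `burst_intervals`, `date_document`, and `toy_corpus` are hypothetical.

```python
from collections import defaultdict
from statistics import mean, pstdev


def term_series(corpus, n_steps):
    """Map each term to its per-step document counts.

    corpus: list of (time_step, tokens) pairs with 0 <= time_step < n_steps.
    """
    series = defaultdict(lambda: [0] * n_steps)
    for step, tokens in corpus:
        for term in set(tokens):  # count each document once per term
            series[term][step] += 1
    return dict(series)


def burst_intervals(counts, k=1.0):
    """Return maximal runs (start, end) of steps whose count is above
    mean + k * std. This threshold rule is an assumption for illustration."""
    threshold = mean(counts) + k * pstdev(counts)
    intervals, start = [], None
    for step, count in enumerate(counts):
        if count > threshold and start is None:
            start = step
        elif count <= threshold and start is not None:
            intervals.append((start, step - 1))
            start = None
    if start is not None:
        intervals.append((start, len(counts) - 1))
    return intervals


def date_document(tokens, series, n_steps, k=1.0):
    """Let every bursty interval of every term in the document vote for the
    steps it covers; return the step with the most votes, or None."""
    votes = [0] * n_steps
    for term in set(tokens):
        for lo, hi in burst_intervals(series.get(term, [0] * n_steps), k):
            for step in range(lo, hi + 1):
                votes[step] += 1
    best = max(range(n_steps), key=votes.__getitem__)
    return best if votes[best] > 0 else None


# Toy reference corpus: six time steps, with bursts at steps 2 and 4.
toy_corpus = [
    (0, ["market", "news"]),
    (1, ["market", "news"]),
    (2, ["earthquake", "rescue", "news"]),
    (2, ["earthquake", "rescue", "news"]),
    (2, ["earthquake", "news"]),
    (3, ["market", "news"]),
    (4, ["election", "vote", "news"]),
    (4, ["election", "vote", "news"]),
    (5, ["market", "news"]),
]
toy_series = term_series(toy_corpus, n_steps=6)
print(date_document(["earthquake", "rescue", "teams"], toy_series, n_steps=6))
```

Because the vote is taken over burst intervals whose boundaries come from the data, the sketch never has to commit to a fixed partition of the timeline in advance, which is the kind of drawback the abstract attributes to earlier methods.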

    • Published in

      SIGIR '14: Proceedings of the 37th international ACM SIGIR conference on Research & development in information retrieval
      July 2014
      1330 pages
      ISBN: 9781450322577
      DOI: 10.1145/2600428

      Copyright © 2014 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Acceptance Rates

      SIGIR '14 Paper Acceptance Rate: 82 of 387 submissions, 21%
      Overall Acceptance Rate: 792 of 3,983 submissions, 20%
