DOI: 10.1145/2682862.2682869
research-article

Examining New Event Detection

Published: 26 November 2014

ABSTRACT

We examine the accuracy of first story detection on traditional news collections and on a re-purposed source of academic material. We also examine the impact on accuracy of detecting an early story rather than the first story: accuracy increases under a broader time window, but the increases on some collections are small. Even on collections where the increase is large, many new events are still missed, and an underlying challenge to detecting new events remains. We analyse the temporal and vocabulary profiles of topics within their source collections. This analysis establishes the underlying causes of the patterns seen in the experimental results with respect to source type and performance. The usefulness of new criteria for new event detection and success across source types is discussed.
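The difference between crediting only the first story of a topic and crediting any early story can be illustrated with a small scoring sketch. This is not the paper's evaluation code: the function name `topic_hit_rate`, the `(timestamp, topic_id)` story format, the `window` parameter, and the toy data are all assumptions made for illustration.

```python
from collections import defaultdict


def topic_hit_rate(stories, flagged, window=0):
    """Fraction of topics for which a flagged story falls within
    `window` time units of the topic's first story.

    stories: list of (timestamp, topic_id) in arrival order (hypothetical format).
    flagged: set of indices into `stories` that a detector marked as "new".
    window:  0 credits only stories sharing the first story's timestamp;
             larger values also credit later ("early") stories.
    """
    first_seen = {}            # topic_id -> timestamp of its first story
    hit = defaultdict(bool)    # topic_id -> True once credited
    for idx, (ts, topic) in enumerate(stories):
        first_seen.setdefault(topic, ts)
        if idx in flagged and ts - first_seen[topic] <= window:
            hit[topic] = True
    if not first_seen:
        return 0.0
    return sum(hit[t] for t in first_seen) / len(first_seen)


# Toy stream with three topics; the detector flags stories 0, 4 and 5.
stories = [(0, "a"), (1, "b"), (2, "a"), (3, "c"), (5, "b"), (9, "c")]
flagged = {0, 4, 5}

strict = topic_hit_rate(stories, flagged, window=0)   # first story only
relaxed = topic_hit_rate(stories, flagged, window=4)  # early stories allowed
```

On this toy stream the strict criterion credits one topic of three and the relaxed criterion credits two, so the score rises with the wider window while one new event is still missed.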

Published in

    ADCS '14: Proceedings of the 19th Australasian Document Computing Symposium
    November 2014
    132 pages
    ISBN: 9781450330008
    DOI: 10.1145/2682862

    Copyright © 2014 ACM

    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Qualifiers

    • research-article
    • Research
    • Refereed limited

    Acceptance Rates

    Overall Acceptance Rate: 30 of 57 submissions, 53%
