Query-Free News Search

World Wide Web

Abstract

Many daily activities present information in the form of a stream of text, and often people can benefit from additional information on the topic discussed. TV broadcast news can be treated as one such stream of text; in this paper we discuss finding news articles on the web that are relevant to news currently being broadcast.

We evaluated a variety of algorithms for this problem, examining the impact of inverse document frequency, stemming, compounds, history, and query length on the relevance and coverage of news articles returned in real time during a broadcast. We also evaluated several postprocessing techniques for improving precision, including reranking using additional terms, reranking by document similarity, and filtering on document similarity. For the best algorithm, 84–91% of the articles found were relevant, with at least 64% of the articles being on the exact topic of the broadcast. In addition, a relevant article was found for at least 70% of the topics.
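
To make the ingredients named above more concrete, the following is a minimal Python sketch of how a short query might be built from a segment of broadcast text using inverse document frequency, and how candidate articles might then be filtered on document similarity. It is an illustration under stated assumptions, not the algorithm evaluated in the paper: the function names (`build_idf`, `make_query`, `filter_results`), the smoothed idf formula, the tf-idf cosine similarity, and the threshold value are all hypothetical choices made for this example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and return its alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def build_idf(background_docs):
    """Return an idf function estimated from a background collection.

    Assumption: smoothed idf = log((N + 1) / (df + 1)), so a term that
    occurs in every background document gets weight 0.
    """
    n = len(background_docs)
    df = Counter()
    for doc in background_docs:
        df.update(set(tokenize(doc)))

    def idf(term):
        return math.log((n + 1) / (df[term] + 1))

    return idf


def make_query(segment, idf, query_length=2):
    """Pick the query_length terms with the highest tf * idf weight."""
    tf = Counter(tokenize(segment))
    # Ties are broken alphabetically so the result is deterministic.
    ranked = sorted(tf, key=lambda t: (-tf[t] * idf(t), t))
    return ranked[:query_length]


def cosine(a, b, idf):
    """Cosine similarity between the tf-idf vectors of two texts."""
    va = {t: c * idf(t) for t, c in Counter(tokenize(a)).items()}
    vb = {t: c * idf(t) for t, c in Counter(tokenize(b)).items()}
    dot = sum(w * vb.get(t, 0.0) for t, w in va.items())
    na = math.sqrt(sum(w * w for w in va.values()))
    nb = math.sqrt(sum(w * w for w in vb.values()))
    return dot / (na * nb) if na and nb else 0.0


def filter_results(segment, candidates, idf, threshold=0.1):
    """Keep only candidates that are similar enough to the segment."""
    return [c for c in candidates if cosine(segment, c, idf) >= threshold]
```

In this sketch, `query_length` controls how many terms are sent to the search engine, and `threshold` trades coverage for precision in the filtering step: a higher value discards more candidates. Both values are placeholders and would need tuning on real data.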

Author information

Corresponding author

Correspondence to Bay-Wei Chang.

About this article

Cite this article

Henzinger, M., Chang, BW., Milch, B. et al. Query-Free News Search. World Wide Web 8, 101–126 (2005). https://doi.org/10.1007/s11280-004-4870-6

  • DOI: https://doi.org/10.1007/s11280-004-4870-6
