ABSTRACT
We propose a new personalized document summarization algorithm. The key idea is to use the attention (reading) time that an individual user spends on single words in a document as the essential clue. The prediction of a user's attention over every word in a document is based on the user's attention during their previous reads, which is acquired via a vision-based commodity eye-tracking mechanism. Once the user's attention over a small collection of words is known, our algorithm predicts the user's attention over every word in the document through word semantics analysis. The algorithm then summarizes the document according to the user's attention on each individual word. Based on this algorithm, we have developed a prototype document summarization system. We compare the summaries produced by our algorithm with summaries written manually by users and with summaries produced by commercial summarization software; the comparison demonstrates the advantages of our algorithm for user-oriented document summarization.
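The pipeline described in the abstract (known attention on a few words, prediction for all other words through word similarity, then summarization by attention) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names are hypothetical, the character-bigram similarity is a toy stand-in for a real semantic similarity measure, and the similarity-weighted average and mean-per-sentence scoring are assumptions made only for illustration.

```python
import re
from typing import Callable, Dict, List


def toy_similarity(a: str, b: str) -> float:
    """Hypothetical stand-in for a semantic similarity measure:
    Jaccard overlap of character bigrams (NOT a real semantic measure)."""
    def bigrams(w: str) -> set:
        w = w.lower()
        return {w[i:i + 2] for i in range(len(w) - 1)} or {w}
    x, y = bigrams(a), bigrams(b)
    return len(x & y) / len(x | y)


def predict_attention(word: str,
                      known: Dict[str, float],
                      sim: Callable[[str, str], float] = toy_similarity) -> float:
    """Predict attention time for a word as the similarity-weighted
    average of the attention times measured on known words (assumed rule)."""
    word = word.lower()
    if word in known:
        return known[word]
    weights = {w: sim(word, w) for w in known}
    total = sum(weights.values())
    if total == 0:
        return 0.0  # no similar known word: assume no evidence of attention
    return sum(weights[w] * known[w] for w in known) / total


def summarize(sentences: List[str],
              known: Dict[str, float],
              k: int = 1,
              sim: Callable[[str, str], float] = toy_similarity) -> List[str]:
    """Keep the k sentences with the highest mean predicted attention,
    returned in their original document order."""
    def score(sentence: str) -> float:
        words = re.findall(r"[a-z']+", sentence.lower())
        if not words:
            return 0.0
        return sum(predict_attention(w, known, sim) for w in words) / len(words)

    top = sorted(range(len(sentences)),
                 key=lambda i: score(sentences[i]),
                 reverse=True)[:k]
    return [sentences[i] for i in sorted(top)]


# Attention times (in seconds) measured on a small set of words; made-up values.
known_attention = {"tracking": 2.0, "weather": 0.1}
document = ["Eye tracking measures attention", "The weather was mild"]
summary = summarize(document, known_attention, k=1)
```

In a real system the `sim` argument would be replaced by a proper word-level semantic similarity function, and the measured attention times would come from the eye tracker rather than a hand-written dictionary.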
Index Terms
- User-oriented document summarization through vision-based eye-tracking