DOI: 10.1145/2232817.2232864
research-article

Improving multi-faceted book search by incorporating sparse latent semantic analysis of click-through logs

Published: 10 June 2012

ABSTRACT

A multi-faceted book search engine presents diverse category-style options that let users refine search results without re-entering a query. In this paper, we propose a novel multi-faceted book search engine that uses the query-related latent intents of users, mined from click-through logs, as facets for books. These latent query intents can be discovered effectively and efficiently by applying the Sparse Latent Semantic Analysis (Sparse LSA) model to the query and click behavior recorded in the click-through logs. We describe how to improve multi-faceted book search by incorporating the compact representation of query-intent-book relationships generated by Sparse LSA into both the off-line and online processing procedures. The specificity of the latent query intents can be tuned by adjusting the sparsity level of the projection matrix in the Sparse LSA model. We evaluated our approach on CADAL click-through logs containing 45,892 queries and 164,822 books. The experimental results show that a Sparse LSA model with a sparser projection matrix tends to discover more specific latent query intents, and that the latent query intents suggested by our approach usually achieve a high user satisfaction ratio.
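
The link between the sparsity level of the projection matrix and the specificity of an intent can be illustrated with a small sketch. It assumes that sparsity comes from an l1 penalty on the projection matrix, applied here through soft-thresholding (the proximal operator of the l1 norm). The abstract does not state the solver, so this is an illustration under that assumption and not the authors' implementation. The matrix values, the penalty values, and the function names are all hypothetical.

```python
def soft_threshold(value, lam):
    """Shrink value toward zero by lam; anything within [-lam, lam] becomes 0."""
    if value > lam:
        return value - lam
    if value < -lam:
        return value + lam
    return 0.0


def sparsify(matrix, lam):
    """Apply soft-thresholding to every entry of a projection matrix."""
    return [[soft_threshold(v, lam) for v in row] for row in matrix]


def sparsity(matrix):
    """Fraction of entries that are exactly zero."""
    entries = [v for row in matrix for v in row]
    return sum(1 for v in entries if v == 0.0) / len(entries)


# Hypothetical dense projection weights: rows are latent intents, columns are books.
dense_projection = [
    [0.90, 0.05, -0.30, 0.02],
    [0.10, -0.75, 0.20, 0.60],
]

# A larger penalty zeroes more weights, so each intent keeps fewer books.
for lam in (0.0, 0.1, 0.25):
    print(lam, sparsity(sparsify(dense_projection, lam)))
```

In this toy setting, a larger penalty leaves each latent intent tied to fewer books, which is one way to read the abstract's observation that a sparser projection matrix yields more specific intents.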

Published in

JCDL '12: Proceedings of the 12th ACM/IEEE-CS joint conference on Digital Libraries
June 2012
458 pages
ISBN: 9781450311540
DOI: 10.1145/2232817

Copyright © 2012 ACM

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery
New York, NY, United States

Acceptance Rates

Overall acceptance rate: 415 of 1,482 submissions, 28%
