ABSTRACT
A multi-faceted book search engine presents diverse category-style options that let users refine search results without re-entering a query. In this paper, we propose a novel multi-faceted book search engine that uses users' query-related latent intents, mined from click-through logs, as multiple facets for books. The latent query intents can be discovered effectively and efficiently by applying the Sparse Latent Semantic Analysis (Sparse LSA) model to the users' queries and click behavior recorded in the click-through logs. This paper details how to improve multi-faceted book search by incorporating the compact representation of query-intent-book relationships generated by Sparse LSA into the offline and online processing procedures. The specificity of the latent query intents can be changed flexibly by adjusting the sparsity level of the projection matrix in the Sparse LSA model. We evaluated our approach on CADAL click-through logs containing 45,892 queries and 164,822 books. The experimental results show that a Sparse LSA model with a sparser projection matrix tends to discover more specific latent query intents. The latent query intents suggested by our approach usually achieve a high user satisfaction ratio.
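To make the role of the sparsity level concrete, the following is a minimal sketch, not the authors' implementation. It assumes the commonly stated Sparse LSA objective, min ½‖X − UA‖² + λ‖A‖₁ subject to UᵀU = I, solved by alternating an orthogonal Procrustes step for U with a soft-thresholding step for A. The toy query-by-book click matrix `X`, the function names, and the parameter values are all hypothetical.

```python
import numpy as np

def soft_threshold(Z, lam):
    """Element-wise soft-thresholding: shrink toward zero by lam."""
    return np.sign(Z) * np.maximum(np.abs(Z) - lam, 0.0)

def sparse_lsa(X, n_intents, lam, n_iter=50, seed=0):
    """Alternating minimization sketch for the assumed Sparse LSA objective.

    X         : (n_queries, n_books) click-count matrix (hypothetical layout)
    n_intents : number of latent query intents
    lam       : L1 penalty; larger values give a sparser projection matrix A
    Returns U (n_queries, n_intents) with orthonormal columns and
            A (n_intents, n_books), the sparse projection matrix.
    """
    rng = np.random.default_rng(seed)
    n_queries, _ = X.shape
    # Start from a random matrix with orthonormal columns.
    U, _ = np.linalg.qr(rng.standard_normal((n_queries, n_intents)))
    A = soft_threshold(U.T @ X, lam)
    for _ in range(n_iter):
        # U-step: orthogonal Procrustes solution via the SVD of X A^T.
        P, _, Qt = np.linalg.svd(X @ A.T, full_matrices=False)
        U = P @ Qt
        # A-step: closed-form lasso solution, valid because U^T U = I.
        A = soft_threshold(U.T @ X, lam)
    return U, A

# Toy click-through counts: 6 queries x 5 books (made-up numbers).
X = np.array([
    [5, 4, 0, 0, 1],
    [4, 5, 1, 0, 0],
    [3, 4, 0, 1, 0],
    [0, 0, 5, 4, 0],
    [0, 1, 4, 5, 0],
    [1, 0, 3, 4, 1],
], dtype=float)

for lam in (0.0, 0.5, 2.0):
    U, A = sparse_lsa(X, n_intents=2, lam=lam)
    print(f"lam={lam}: non-zero entries in A = {np.count_nonzero(A)}")
```

In this sketch each row of `A` links one latent intent to the books it covers, so raising `lam` leaves each intent attached to fewer books, which is one way to read the abstract's statement that a sparser projection matrix yields more specific intents.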
Index Terms
- Improving multi-faceted book search by incorporating sparse latent semantic analysis of click-through logs