ABSTRACT
Accurately and effectively detecting the locations that search queries are truly about could substantially improve search relevance. In this paper, we define a search query's dominant location (QDL) and propose a solution for detecting it. A QDL is the geographical location (or locations) associated with a query in collective human knowledge, i.e., the one or few prominent locations agreed upon by the majority of people who know the answer to the query. QDL is a subjective and collective attribute of search queries, and we are able to detect QDLs both in queries that contain geographical location names and in queries that do not. The key challenges in QDL detection are suppressing false positives (not every location name contained in a query refers to a geographical location) and detecting locations implied by the query's context. In our solution, a query is recursively broken into atomic tokens according to its most popular web usage in order to reduce false positives. If this step does not find a dominant location, we mine the top search results and/or query logs (using different approaches discussed in this paper) to discover implicit query locations. Our large-scale experiments on recent MSN Search queries show that our query location detection solution achieves consistently high accuracy across all query frequency ranges.
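The abstract does not spell out the tokenization procedure, so the following is only a minimal sketch of one way the recursive splitting step could work, not the authors' implementation. It assumes a hypothetical table of multi-word phrase usage counts (`PHRASE_COUNTS`, standing in for "most popular web usage") and a toy gazetteer (`GAZETTEER`). All names and numbers are invented for illustration.

```python
from typing import Dict, List, Optional, Set, Tuple

# Hypothetical usage counts for multi-word units (assumption, not real data).
PHRASE_COUNTS: Dict[str, int] = {
    "paris hilton": 900,
    "new york": 1200,
}

# Toy gazetteer of location names (assumption, not real data).
GAZETTEER: Set[str] = {"paris", "new york", "seattle"}


def segment(words: List[str]) -> List[str]:
    """Recursively split a word list into atomic tokens.

    The most frequently used known multi-word unit is kept whole, and the
    words to its left and right are segmented recursively. Words that are
    not part of any known unit become single-word tokens.
    """
    if not words:
        return []
    best: Optional[Tuple[int, int, int]] = None  # (count, start, end)
    for start in range(len(words)):
        for end in range(start + 2, len(words) + 1):
            count = PHRASE_COUNTS.get(" ".join(words[start:end]), 0)
            if count > 0 and (best is None or count > best[0]):
                best = (count, start, end)
    if best is None:
        return list(words)
    _, start, end = best
    unit = " ".join(words[start:end])
    return segment(words[:start]) + [unit] + segment(words[end:])


def explicit_locations(query: str) -> List[str]:
    """Return atomic tokens of the query that are gazetteer locations."""
    tokens = segment(query.lower().split())
    return [token for token in tokens if token in GAZETTEER]


if __name__ == "__main__":
    print(explicit_locations("paris hilton photos"))  # []
    print(explicit_locations("hotels in paris"))      # ['paris']
    print(explicit_locations("new york pizza"))       # ['new york']
```

In this sketch, "paris hilton" is kept as one atomic token, so "paris" is never matched against the gazetteer and the false positive is suppressed. A query that yields no location at this stage would then go to the mining step over search results or query logs, which is not sketched here.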
Index Terms
- Detecting dominant locations from search queries