Abstract
In an expert search task, users need to identify people who have relevant expertise on a topic of interest. An expert search system predicts and ranks the expertise of a set of candidate persons with respect to the user's query. In this paper, we propose a novel approach for predicting and ranking candidate expertise with respect to a query, called the Voting Model for Expert Search. The Voting Model treats the ranking of experts as a voting problem, which we model using 12 voting techniques inspired by the data fusion field. We investigate the effectiveness of the Voting Model and its associated voting techniques across a range of document weighting models, in the context of the TREC 2005 and TREC 2006 Enterprise tracks. The evaluation results show that the voting paradigm is very effective, without using any query- or collection-specific heuristics. Moreover, we show that improving the quality of the underlying document representation can significantly improve the retrieval performance of the voting techniques on an expert search task. In particular, we demonstrate that applying field-based weighting models improves the ranking of candidates. Finally, we demonstrate that the relative performance of the voting techniques is stable on a given task regardless of the weighting model used, suggesting that some of the proposed voting techniques consistently outperform others on that task.
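The abstract does not define the individual voting techniques, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes that each candidate has a profile of associated documents and that every retrieved document in the profile counts as a vote for that candidate. The aggregation rules shown (a plain vote count, plus the well-known data fusion methods CombSUM and CombMNZ) are illustrative choices; the function name `rank_candidates`, the input dictionaries, and the scores are all hypothetical.

```python
from collections import defaultdict


def rank_candidates(doc_scores, profiles, technique="comb_mnz"):
    """Rank candidates by aggregating votes from retrieved documents.

    doc_scores: dict mapping doc_id -> retrieval score for the query
                (assumed to come from some document weighting model).
    profiles:   dict mapping candidate -> set of doc_ids associated
                with that candidate (hypothetical profile data).
    technique:  "votes", "comb_sum", or "comb_mnz" (illustrative only).
    """
    votes = defaultdict(int)        # number of retrieved profile documents
    score_sum = defaultdict(float)  # sum of their retrieval scores

    for candidate, docs in profiles.items():
        for doc_id in docs:
            if doc_id in doc_scores:  # only retrieved documents vote
                votes[candidate] += 1
                score_sum[candidate] += doc_scores[doc_id]

    if technique == "votes":
        final = {c: float(votes[c]) for c in votes}
    elif technique == "comb_sum":
        final = {c: score_sum[c] for c in votes}
    elif technique == "comb_mnz":
        # CombMNZ: sum of scores multiplied by the number of votes.
        final = {c: votes[c] * score_sum[c] for c in votes}
    else:
        raise ValueError(f"unknown technique: {technique}")

    # Highest score first; ties broken alphabetically for determinism.
    return sorted(final.items(), key=lambda kv: (-kv[1], kv[0]))


# Toy data: three retrieved documents and three candidate profiles.
doc_scores = {"d1": 4.0, "d2": 2.0, "d3": 1.0}
profiles = {"alice": {"d1"}, "bob": {"d2", "d3"}, "carol": {"d4"}}

for name in ("votes", "comb_sum", "comb_mnz"):
    print(name, rank_candidates(doc_scores, profiles, name))
```

On this toy data the aggregation rule changes the outcome: counting votes favours the candidate with more retrieved documents, while summing scores favours the candidate with the single highest-scoring document. A candidate with no retrieved documents receives no votes and is left out of the ranking.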
Additional information
Extended version of ‘Voting for candidates: adapting data fusion techniques for an expert search task’. C. Macdonald and I. Ounis. In Proceedings of ACM CIKM 2006, Arlington, VA. 2006. doi: 10.1145/1183614.1183671.
Cite this article
Macdonald, C., Ounis, I. Voting techniques for expert search. Knowl Inf Syst 16, 259–280 (2008). https://doi.org/10.1007/s10115-007-0105-3
DOI: https://doi.org/10.1007/s10115-007-0105-3