ABSTRACT
While neural rankers continue to show notable performance improvements across a wide variety of information retrieval tasks, recent studies have shown that such rankers may intensify certain stereotypical biases. In this paper, we investigate whether neural rankers introduce retrieval effectiveness (performance) disparities over queries related to different genders; specifically, whether there are significant performance differences between male and female queries when retrieved by neural rankers. Through an empirical study over the MS MARCO collection, we find that such performance disparities are notable, and that they may stem from differences in how queries and their relevance judgments are collected and distributed across gendered queries. More specifically, we observe that male queries are more closely associated with their relevant documents than female queries, and hence neural rankers can more easily learn associations between male queries and their relevant documents. We show that it is possible to systematically balance relevance judgment collections so as to reduce the performance disparity between gendered queries without compromising overall model performance.
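The abstract proposes "systematically balancing" relevance judgment collections across gendered queries. As an illustration only (not the authors' actual procedure), the simplest form of such balancing is downsampling the over-represented group's judgments so each gender group contributes the same number of query-document pairs; all function and variable names below are assumptions for the sketch:

```python
import random

def balance_judgments(judgments, query_gender, seed=0):
    """Downsample relevance judgments so each gender group of queries
    contributes the same number of (query, document) pairs.

    judgments: list of (query_id, doc_id) relevance pairs
    query_gender: dict mapping query_id -> "male" or "female"
    The grouping keys and downsampling strategy are illustrative.
    """
    by_group = {"male": [], "female": []}
    for qid, did in judgments:
        by_group[query_gender[qid]].append((qid, did))

    # Target size: the smaller group's judgment count.
    target = min(len(pairs) for pairs in by_group.values())

    rng = random.Random(seed)  # fixed seed for reproducibility
    balanced = []
    for pairs in by_group.values():
        balanced.extend(rng.sample(pairs, target))
    return balanced
```

A ranker trained on such a balanced collection sees an equal number of training associations for each gendered query group, which is one plausible way to reduce the disparity the paper describes.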
Index Terms
- Addressing Gender-related Performance Disparities in Neural Rankers