Bias in Bios: A Case Study of Semantic Representation Bias in a High-Stakes Setting

Authors:
Maria De-Arteaga (Carnegie Mellon University)
Alexey Romanov (University of Massachusetts Lowell)
Hanna Wallach (Microsoft Research)
Jennifer Chayes (Microsoft Research)
Christian Borgs (Microsoft Research)
Alexandra Chouldechova (Carnegie Mellon University)
Sahin Geyik (LinkedIn)
Krishnaram Kenthapadi (LinkedIn)
Adam Tauman Kalai (Microsoft Research)

FAT* '19: Proceedings of the Conference on Fairness, Accountability, and Transparency
January 2019
Pages 120–128
https://doi.org/10.1145/3287560.3287572

Published: 29 January 2019

ABSTRACT

We present a large-scale study of gender bias in occupation classification, a task where the use of machine learning may lead to negative outcomes in people's lives. We analyze the potential allocation harms that can result from semantic representation bias. To do so, we study the impact on occupation classification of including explicit gender indicators (such as first names and pronouns) in different semantic representations of online biographies. Additionally, we quantify the bias that remains when these indicators are "scrubbed," and describe proxy behavior that occurs in the absence of explicit gender indicators. As we demonstrate, differences in true positive rates between genders are correlated with existing gender imbalances in occupations, which may compound these imbalances.
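The two measurement ideas named in the abstract, "scrubbing" explicit gender indicators and comparing true positive rates between genders, can be illustrated with a minimal Python sketch. This is not the authors' implementation. The word list `GENDER_WORDS`, the function names `scrub` and `tpr_gender_gap`, the `"F"`/`"M"` coding, the sign convention (female TPR minus male TPR), and the toy records are all assumptions made for illustration.

```python
import re
from collections import defaultdict

# Hypothetical, deliberately small list of explicit gender indicators.
GENDER_WORDS = {"he", "she", "him", "her", "his", "hers",
                "himself", "herself", "mr", "mrs", "ms"}


def scrub(bio, first_name=None):
    """Replace explicit gender indicators (and optionally a first name) with '_'."""
    targets = set(GENDER_WORDS)
    if first_name:
        targets.add(first_name.lower())

    def replace_word(match):
        word = match.group(0)
        return "_" if word.lower() in targets else word

    # Visit every alphabetic token and leave punctuation and spacing untouched.
    return re.sub(r"[A-Za-z]+", replace_word, bio)


def tpr_gender_gap(records):
    """Return {occupation: TPR_F - TPR_M}.

    records is an iterable of (true_occupation, predicted_occupation, gender)
    tuples with gender coded as "F" or "M" (an assumed coding). Occupations
    that lack examples for either gender are skipped.
    """
    hits = defaultdict(int)    # correct predictions per (occupation, gender)
    totals = defaultdict(int)  # examples per (occupation, gender)
    for true_occ, pred_occ, gender in records:
        totals[(true_occ, gender)] += 1
        if pred_occ == true_occ:
            hits[(true_occ, gender)] += 1

    gaps = {}
    for occ in {occ for occ, _ in totals}:
        n_f = totals.get((occ, "F"), 0)
        n_m = totals.get((occ, "M"), 0)
        if n_f and n_m:
            tpr_f = hits.get((occ, "F"), 0) / n_f
            tpr_m = hits.get((occ, "M"), 0) / n_m
            gaps[occ] = tpr_f - tpr_m
    return gaps


# Toy records, invented for illustration only.
example_records = [
    ("surgeon", "surgeon", "M"), ("surgeon", "surgeon", "M"),
    ("surgeon", "physician", "F"), ("surgeon", "surgeon", "F"),
    ("nurse", "nurse", "F"), ("nurse", "nurse", "F"),
    ("nurse", "physician", "M"), ("nurse", "nurse", "M"),
]
print(scrub("She is a nurse. Her patients like Mary.", first_name="Mary"))
print(tpr_gender_gap(example_records))
```

A negative gap under this convention means that women in that occupation are correctly classified less often than men. Plotting such gaps against the share of women in each occupation is one way to examine the correlation the abstract describes.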

Index Terms

1. Applied computing
  1. Document management and text processing
2. Computing methodologies
  1. Machine learning

Published in

FAT* '19: Proceedings of the Conference on Fairness, Accountability, and Transparency
January 2019
388 pages
ISBN: 9781450361255
DOI: 10.1145/3287560

Copyright © 2019 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
Supervised learning
algorithmic fairness
automated hiring
compounding injustices
gender bias
online recruiting
Qualifiers
- research-article
- Research
- Refereed limited