DOI: 10.1145/3287560.3287572
research-article

Bias in Bios: A Case Study of Semantic Representation Bias in a High-Stakes Setting

Published: 29 January 2019

ABSTRACT

We present a large-scale study of gender bias in occupation classification, a task where the use of machine learning may lead to negative outcomes on people's lives. We analyze the potential allocation harms that can result from semantic representation bias. To do so, we study the impact on occupation classification of including explicit gender indicators (such as first names and pronouns) in different semantic representations of online biographies. Additionally, we quantify the bias that remains when these indicators are "scrubbed," and describe proxy behavior that occurs in the absence of explicit gender indicators. As we demonstrate, differences in true positive rates between genders are correlated with existing gender imbalances in occupations, which may compound these imbalances.
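The abstract refers to differences in true positive rates between genders but does not give a formula. The following is a minimal sketch of one way such a gap could be computed for a single occupation, assuming the gap is defined as the true positive rate for one gender minus that for the other. The function names, the "F"/"M" labels, and the toy data are all hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def tpr_by_gender(y_true, y_pred, genders, occupation):
    """True positive rate for `occupation`, computed separately per gender.

    Only biographies whose true label is `occupation` are counted.
    """
    hits = defaultdict(int)    # correct predictions of `occupation`, per gender
    totals = defaultdict(int)  # biographies truly in `occupation`, per gender
    for truth, pred, gender in zip(y_true, y_pred, genders):
        if truth == occupation:
            totals[gender] += 1
            if pred == occupation:
                hits[gender] += 1
    return {gender: hits[gender] / totals[gender] for gender in totals}


def tpr_gap(y_true, y_pred, genders, occupation, a="F", b="M"):
    """Assumed gap definition: TPR for gender `a` minus TPR for gender `b`."""
    rates = tpr_by_gender(y_true, y_pred, genders, occupation)
    return rates[a] - rates[b]


# Toy, made-up labels purely for illustration (not data from the paper).
y_true = ["surgeon"] * 4 + ["nurse"] * 4
genders = ["F", "F", "M", "M", "F", "F", "M", "M"]
y_pred = ["surgeon", "nurse", "surgeon", "surgeon",
          "nurse", "nurse", "nurse", "surgeon"]

surgeon_gap = tpr_gap(y_true, y_pred, genders, "surgeon")
nurse_gap = tpr_gap(y_true, y_pred, genders, "nurse")
print(surgeon_gap, nurse_gap)
```

Under this assumed definition, a negative gap means the classifier recovers the occupation less often for the first gender than for the second. Comparing per-occupation gaps against each occupation's gender balance is the kind of correlation the abstract describes.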


      • Published in

        FAT* '19: Proceedings of the Conference on Fairness, Accountability, and Transparency
        January 2019
        388 pages
        ISBN: 9781450361255
        DOI: 10.1145/3287560

        Copyright © 2019 ACM

        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

        Publisher

        Association for Computing Machinery

        New York, NY, United States


        Qualifiers

        • research-article
        • Research
        • Refereed limited
