ABSTRACT
When a pre-trained machine learning model is deployed, the target population it serves may not be reflected in the source population on which it was trained. The resulting model can be biased at deployment time, reducing its performance. One risk is that, as the population changes, certain demographic groups will be under-served or otherwise disadvantaged by the model, even as they become more represented in the target population. The field of domain adaptation offers techniques for settings in which labeled data for the target population is unavailable but some information about the target distribution is. In this paper we contribute to the domain adaptation literature by introducing domain-adaptive decision trees (DADT). We focus on decision trees given their growing popularity due to their interpretability and their performance relative to more complex models. With DADT we aim to improve the accuracy of models trained on a source domain (the training data) that differs from the target domain (the test data). We propose an in-processing step that adjusts the information-gain split criterion using outside information about the distribution of the target population. We demonstrate DADT on real data and find that it improves accuracy over a standard decision tree when tested on a shifted target population. We also study the change in fairness under demographic parity and equal opportunity, and find that DADT improves fairness as well.
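The adjusted split criterion described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's exact formulation: the function names are hypothetical, and it uses importance weights derived from a known target marginal over a demographic attribute as the "outside information" that reweights the entropy computation.

```python
import numpy as np

def weighted_entropy(labels, weights):
    """Shannon entropy of the labels, with each sample's contribution
    reweighted by its importance weight (target/source density ratio)."""
    total = weights.sum()
    if total == 0:
        return 0.0
    entropy = 0.0
    for c in np.unique(labels):
        p = weights[labels == c].sum() / total
        if p > 0:
            entropy -= p * np.log2(p)
    return entropy

def adjusted_information_gain(feature, labels, weights, threshold):
    """Information gain of a binary split on `feature` at `threshold`,
    with parent and child entropies computed from reweighted counts."""
    left = feature <= threshold
    right = ~left
    n = weights.sum()
    child = (weights[left].sum() / n) * weighted_entropy(labels[left], weights[left]) + \
            (weights[right].sum() / n) * weighted_entropy(labels[right], weights[right])
    return weighted_entropy(labels, weights) - child

# Toy example: the source sample has two demographic groups at roughly
# 50/50, but the target population is assumed to be 20/80. Each sample
# receives the importance weight P_target(group) / P_source(group).
rng = np.random.default_rng(0)
x = rng.normal(size=200)
y = (x > 0).astype(int)
group = rng.integers(0, 2, size=200)
source_marginal = np.bincount(group, minlength=2) / len(group)
target_marginal = np.array([0.2, 0.8])         # assumed outside information
w = (target_marginal / source_marginal)[group]
gain = adjusted_information_gain(x, y, w, threshold=0.0)
```

In a full tree learner this criterion would replace standard information gain when ranking candidate splits. The importance-weighting scheme shown here is one common way to inject target-distribution information into a split criterion; the paper's actual adjustment may differ.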