ABSTRACT
When a pre-trained machine learning model is deployed, the target population it serves may not be reflected in the source population on which it was trained. The resulting model can be biased at deployment time, reducing its performance. One risk is that, as the population changes, certain demographic groups will be under-served or otherwise disadvantaged by the model, even as they become more represented in the target population. The field of domain adaptation offers techniques for settings in which labeled data for the target population is unavailable but some information about the target distribution is. In this paper we contribute to the domain adaptation literature by introducing domain-adaptive decision trees (DADT). We focus on decision trees given their growing popularity due to their interpretability and their performance relative to more complex models. With DADT we aim to improve the accuracy of models trained on a source domain (the training data) that differs from the target domain (the test data). We propose an in-processing step that adjusts the information-gain split criterion using outside information about the distribution of the target population. We demonstrate DADT on real data and find that it improves accuracy over a standard decision tree when tested on a shifted target population. We also study the change in fairness under demographic parity and equal opportunity, and find that DADT improves fairness as well.
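The adjusted split criterion described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's exact formulation: the function names are hypothetical, and it uses importance weights derived from a known target marginal over a demographic attribute as the "outside information" that reweights the entropy computation.

```python
import numpy as np

def weighted_entropy(labels, weights):
    """Shannon entropy of the labels, with each sample's contribution
    reweighted by its importance weight (target/source density ratio)."""
    total = weights.sum()
    if total == 0:
        return 0.0
    entropy = 0.0
    for c in np.unique(labels):
        p = weights[labels == c].sum() / total
        if p > 0:
            entropy -= p * np.log2(p)
    return entropy

def adjusted_information_gain(feature, labels, weights, threshold):
    """Information gain of a binary split on `feature` at `threshold`,
    with parent and child entropies computed from reweighted counts."""
    left = feature <= threshold
    right = ~left
    n = weights.sum()
    child = (weights[left].sum() / n) * weighted_entropy(labels[left], weights[left]) + \
            (weights[right].sum() / n) * weighted_entropy(labels[right], weights[right])
    return weighted_entropy(labels, weights) - child

# Toy example: the source sample has two demographic groups at roughly
# 50/50, but the target population is assumed to be 20/80. Each sample
# receives the importance weight P_target(group) / P_source(group).
rng = np.random.default_rng(0)
x = rng.normal(size=200)
y = (x > 0).astype(int)
group = rng.integers(0, 2, size=200)
source_marginal = np.bincount(group, minlength=2) / len(group)
target_marginal = np.array([0.2, 0.8])         # assumed outside information
w = (target_marginal / source_marginal)[group]
gain = adjusted_information_gain(x, y, w, threshold=0.0)
```

In a full tree learner this criterion would replace standard information gain when ranking candidate splits. The importance-weighting scheme shown here is one common way to inject target-distribution information into a split criterion; the paper's actual adjustment may differ.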