Fairlex: A multilingual benchmark for evaluating fairness in legal text processing

doi:10.5281/zenodo.6322643

Published March 2, 2022 | Version v1

Dataset | Open access

1. University of Copenhagen, Denmark
2. University of Defense Technology, People's Republic of China

We present a benchmark suite of four datasets for evaluating the fairness of pre-trained legal language models and the techniques used to fine-tune them for downstream tasks. Our benchmarks cover four jurisdictions (Council of Europe, USA, Switzerland, and China), five languages (English, German, French, Italian, and Chinese), and fairness across five attributes (gender, age, nationality/region, language, and legal area). In our experiments, we evaluate pre-trained language models using several group-robust fine-tuning techniques and show that group performance disparities are pronounced in many cases, while none of these techniques guarantees fairness or consistently mitigates group disparities. Furthermore, we provide a quantitative and qualitative analysis of our results, highlighting open challenges in the development of robustness methods in legal NLP.
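To make the notion of a group disparity concrete, the following is a minimal sketch that computes per-group accuracy and the gap between the best and worst group. It is an illustration only: the function names, the toy labels, and the choice of accuracy with a max-min gap are assumptions made here, not the metric or code used by the benchmark authors.

```python
from collections import defaultdict


def group_accuracies(labels, predictions, groups):
    """Return accuracy per group (hypothetical helper, not part of Fairlex)."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for y, p, g in zip(labels, predictions, groups):
        total[g] += 1
        correct[g] += int(y == p)
    return {g: correct[g] / total[g] for g in total}


def accuracy_gap(per_group):
    """Gap between the best and worst group; 0.0 means equal accuracy."""
    return max(per_group.values()) - min(per_group.values())


# Toy data invented for illustration; groups stand in for a protected
# attribute such as the language of the document.
labels = [1, 0, 1, 1, 0, 1]
predictions = [1, 0, 0, 1, 1, 0]
groups = ["de", "de", "de", "fr", "fr", "fr"]

per_group = group_accuracies(labels, predictions, groups)
print(per_group)
print(accuracy_gap(per_group))
```

A gap of zero would mean the model performs equally well on every group under this simple measure; other summary statistics over the per-group scores can be substituted in `accuracy_gap`.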

Files (365.1 MB)

Name	MD5	Size
cail.zip	9685ab4741109b47bb31e1d18e0afcf9	113.0 MB
ecthr.zip	509a903019df018c0619de4b947bc507	31.9 MB
fscs.zip	62bdbc95dbf84af688959b02b3dea3c9	85.4 MB
scotus.zip	741dd0c4495d74510d287b3885cf4f1f	134.8 MB
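The MD5 digests listed with the files can be used to check that a download completed without corruption. The sketch below assumes the four archives were saved under their original names in a single directory; the helper names `md5_of_file` and `verify_downloads` are hypothetical and not part of the dataset.

```python
import hashlib
from pathlib import Path

# MD5 digests as published alongside the archives in this record.
EXPECTED_MD5 = {
    "cail.zip": "9685ab4741109b47bb31e1d18e0afcf9",
    "ecthr.zip": "509a903019df018c0619de4b947bc507",
    "fscs.zip": "62bdbc95dbf84af688959b02b3dea3c9",
    "scotus.zip": "741dd0c4495d74510d287b3885cf4f1f",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Hash a file in chunks so large archives are not loaded into memory."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_downloads(directory="."):
    """Map each archive name to True if it exists and its MD5 matches."""
    results = {}
    for name, expected in EXPECTED_MD5.items():
        path = Path(directory) / name
        results[name] = path.is_file() and md5_of_file(path) == expected
    return results


if __name__ == "__main__":
    for name, ok in verify_downloads().items():
        print(f"{name}: {'OK' if ok else 'missing or checksum mismatch'}")
```

MD5 is adequate here for detecting accidental corruption during transfer; it is not a safeguard against deliberate tampering.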

	All versions	This version
Views	146	146
Downloads	718	714
Data volume	108.0 GB	107.6 GB
