Paper Title
Error Parity Fairness: Testing for Group Fairness in Regression Tasks
Paper Authors
Paper Abstract
The applications of Artificial Intelligence (AI) surround decisions on increasingly many aspects of human lives. Society responds by imposing legal and social expectations for the accountability of such automated decision systems (ADSs). Fairness, a fundamental constituent of AI accountability, is concerned with just treatment of individuals and sensitive groups (e.g., based on sex, race). While many studies focus on fair learning and fairness testing for classification tasks, the literature is rather limited on how to examine fairness in regression tasks. This work presents error parity as a regression fairness notion and introduces a testing methodology to assess group fairness based on a statistical hypothesis testing procedure. The error parity test checks whether prediction errors are distributed similarly across sensitive groups to determine if an ADS is fair. It is followed by a suitable permutation test to compare groups on several statistics to explore disparities and identify impacted groups. The usefulness and applicability of the proposed methodology are demonstrated via a case study on COVID-19 projections in the US at the county level, which revealed race-based differences in forecast errors. Overall, the proposed regression fairness testing methodology fills a gap in the fair machine learning literature and may serve as a part of larger accountability assessments and algorithm audits.
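The abstract describes a two-stage procedure: first test whether prediction errors are distributed similarly across sensitive groups, then run a permutation test on group-level statistics to locate disparities. The paper's exact test statistics are not given in the abstract, so the snippet below is only a minimal illustrative sketch, assuming a two-sample Kolmogorov-Smirnov test for the distributional comparison and a permutation test on the gap in mean absolute error between two groups; the function names, parameters, and choice of statistics are hypothetical, not the authors' implementation.

```python
import numpy as np
from scipy import stats


def error_parity_test(errors, groups, alpha=0.05):
    """Illustrative distributional check: two-sample Kolmogorov-Smirnov test
    of whether prediction errors are distributed similarly across two groups.

    errors: 1-D array of signed prediction errors (y_pred - y_true)
    groups: 1-D array of sensitive-group labels, same length as errors
    """
    labels = np.unique(groups)
    assert len(labels) == 2, "this sketch handles the two-group case only"
    e0 = errors[groups == labels[0]]
    e1 = errors[groups == labels[1]]
    statistic, p_value = stats.ks_2samp(e0, e1)
    return {"ks_statistic": statistic, "p_value": p_value, "parity": p_value >= alpha}


def permutation_test_mae_gap(errors, groups, n_permutations=10_000, seed=0):
    """Illustrative follow-up: permutation test on the gap in mean absolute
    error between two groups, identifying whether one group is worse off."""
    rng = np.random.default_rng(seed)
    labels = np.unique(groups)
    abs_err = np.abs(errors)
    observed = abs_err[groups == labels[0]].mean() - abs_err[groups == labels[1]].mean()
    exceed = 0
    for _ in range(n_permutations):
        perm = rng.permutation(groups)  # shuffle group labels under the null
        diff = abs_err[perm == labels[0]].mean() - abs_err[perm == labels[1]].mean()
        if abs(diff) >= abs(observed):
            exceed += 1
    p_value = (exceed + 1) / (n_permutations + 1)
    return {"observed_gap": observed, "p_value": p_value}
```

The permutation test is used here because it makes no distributional assumptions about the errors, which matches the abstract's framing of comparing groups on summary statistics; other statistics (e.g., median error or error variance) could be swapped into the same resampling loop.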