Paper Title
Understanding Instance-Level Impact of Fairness Constraints
Paper Authors
Paper Abstract
A variety of fairness constraints have been proposed in the literature to mitigate group-level statistical bias. Their impacts have largely been evaluated for different population groups corresponding to a set of sensitive attributes, such as race or gender. Nonetheless, how imposing fairness constraints fares at the instance level has not been sufficiently explored. Building on the concept of the influence function, a measure that characterizes the impact of a training example on the target model and its predictive performance, this work studies the influence of training examples when fairness constraints are imposed. We find that, under certain assumptions, the influence function with respect to fairness constraints can be decomposed into a kernelized combination of training examples. One promising application of the proposed fairness influence function is to identify suspicious training examples that may cause model discrimination by ranking their influence scores. We demonstrate with extensive experiments that training on a subset of weighty data examples leads to lower fairness violations, with a trade-off in accuracy.
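To make the ranking idea described in the abstract concrete, the following self-contained sketch (not the paper's code, and not its exact decomposition) scores each training example with the classical influence-function recipe of Koh and Liang (2017), using a demographic-parity gap as the fairness measure on a toy logistic-regression model. All data, names, and hyperparameters here are hypothetical placeholders.

# Illustrative sketch: influence of each training example on a smoothed
# demographic-parity gap, using the standard influence-function formula
#   I_fair(z_i) = -grad_theta F(theta_hat)^T  H^{-1}  grad_theta loss(z_i, theta_hat)
# where F is the fairness-violation measure and H is the training-loss Hessian.
import numpy as np

rng = np.random.default_rng(0)

# Toy data: features X, binary labels y, binary sensitive attribute a (all synthetic).
n, d = 200, 5
X = rng.normal(size=(n, d))
a = rng.integers(0, 2, size=n)
y = (X[:, 0] + 0.5 * a + 0.1 * rng.normal(size=n) > 0).astype(float)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_logreg(X, y, l2=1e-2, lr=0.1, steps=2000):
    """Plain gradient descent on L2-regularized mean logistic loss."""
    theta = np.zeros(X.shape[1])
    for _ in range(steps):
        p = sigmoid(X @ theta)
        theta -= lr * (X.T @ (p - y) / len(y) + l2 * theta)
    return theta

l2 = 1e-2
theta = fit_logreg(X, y, l2=l2)
p = sigmoid(X @ theta)

def fairness_grad(X, a, theta):
    """Gradient of |E[p | a=1] - E[p | a=0]| with respect to theta."""
    p = sigmoid(X @ theta)
    w = p * (1 - p)                        # derivative of sigmoid w.r.t. its input
    g1 = (w[a == 1, None] * X[a == 1]).mean(axis=0)
    g0 = (w[a == 0, None] * X[a == 0]).mean(axis=0)
    sign = np.sign(p[a == 1].mean() - p[a == 0].mean())
    return sign * (g1 - g0)

# Hessian of the regularized training loss at theta_hat.
W = p * (1 - p)
H = (X * W[:, None]).T @ X / n + l2 * np.eye(d)

# Per-example training-loss gradients and resulting fairness-influence scores.
grad_loss_i = (p - y)[:, None] * X         # shape (n, d)
v = np.linalg.solve(H, fairness_grad(X, a, theta))
influence = -grad_loss_i @ v               # one score per training example

# Rank examples: large positive scores mark examples whose up-weighting would
# most increase the fairness gap, i.e. the "suspicious" candidates.
suspicious = np.argsort(-influence)[:10]
print("Top-10 candidate examples by fairness-influence score:", suspicious)

In this sketch, retraining on the subset that excludes (or down-weights) the highest-scoring examples would be the analogue of the reweighting experiment mentioned in the abstract; the paper's kernelized decomposition is not reproduced here.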