Paper Title
Bias in Machine Learning -- What is it Good for?
Paper Authors
Paper Abstract
In public media as well as in scientific publications, the term \emph{bias} is used in conjunction with machine learning in many different contexts, and with many different meanings. This paper proposes a taxonomy of these different meanings, terminology, and definitions by surveying the, primarily scientific, literature on machine learning. In some cases, we suggest extensions and modifications to promote a clear terminology and completeness. The survey is followed by an analysis and discussion on how different types of biases are connected and depend on each other. We conclude that there is a complex relation between bias occurring in the machine learning pipeline that leads to a model, and the eventual bias of the model (which is typically related to social discrimination). The former bias may or may not influence the latter, in a sometimes bad, and sometimes good, way.