Title
Bias in Automated Speaker Recognition
Authors
Abstract
Automated speaker recognition uses data processing to identify speakers by their voice. Today, automated speaker recognition is deployed on billions of smart devices and in services such as call centres. Despite their wide-scale deployment and known sources of bias in related domains like face recognition and natural language processing, bias in automated speaker recognition has not been studied systematically. We present an in-depth empirical and analytical study of bias in the machine learning development workflow of speaker verification, a voice biometric and core task in automated speaker recognition. Drawing on an established framework for understanding sources of harm in machine learning, we show that bias exists at every development stage in the well-known VoxCeleb Speaker Recognition Challenge, including data generation, model building, and implementation. Most affected are female speakers and non-US nationalities, who experience significant performance degradation. Leveraging the insights from our findings, we make practical recommendations for mitigating bias in automated speaker recognition, and outline future research directions.