Paper Title


Design Guidelines for Inclusive Speaker Verification Evaluation Datasets

Authors

Wiebke Toussaint Hutiri, Lauriane Gorce, Aaron Yi Ding

Abstract

Speaker verification (SV) provides billions of voice-enabled devices with access control, and ensures the security of voice-driven technologies. As a type of biometrics, SV must be unbiased, with consistent and reliable performance across speakers irrespective of their demographic, social and economic attributes. Current SV evaluation practices are insufficient for evaluating bias: they over-simplify and aggregate users, are not representative of real-life usage scenarios, and do not account for the consequences of errors. This paper proposes design guidelines for constructing SV evaluation datasets that address these shortcomings. We propose a schema for grading the difficulty of utterance pairs, and present an algorithm for generating inclusive SV datasets. We empirically validate our proposed method in a set of experiments on the VoxCeleb1 dataset. Our results confirm that the count of utterance pairs per speaker, and the difficulty grading of utterance pairs, have a significant effect on evaluation performance and variability. Our work contributes to the development of SV evaluation practices that are inclusive and fair.
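To make the abstract's idea of graded trial pairs concrete, the sketch below generates genuine (same-speaker) and impostor (different-speaker) utterance pairs, balanced per speaker. The difficulty grade shown is a hypothetical illustration, not the paper's actual schema: impostor pairs drawn from the same demographic group are graded "hard" and cross-group pairs "easy". All function and field names here are assumptions for illustration.

```python
import itertools
import random

def generate_eval_pairs(utterances, pairs_per_speaker=2, seed=0):
    """Generate graded SV trial pairs.

    `utterances` maps speaker_id -> {"group": demographic_label,
    "utts": [utterance ids]}. Returns tuples of
    (utt_a, utt_b, label, grade), where label is 1 for genuine
    (same-speaker) trials and 0 for impostor trials.
    """
    rng = random.Random(seed)  # fixed seed keeps the trial list reproducible
    pairs = []
    speakers = sorted(utterances)
    for spk in speakers:
        utts = utterances[spk]["utts"]
        # Genuine trials: all within-speaker utterance combinations.
        for u1, u2 in itertools.combinations(utts, 2):
            pairs.append((u1, u2, 1, "genuine"))
        # Impostor trials: a fixed count per speaker, so no speaker
        # dominates the evaluation set.
        others = [s for s in speakers if s != spk]
        for _ in range(pairs_per_speaker):
            other = rng.choice(others)
            # Illustrative grading: same-group impostors are harder
            # to reject, so grade them "hard".
            same_group = utterances[spk]["group"] == utterances[other]["group"]
            grade = "hard" if same_group else "easy"
            pairs.append((rng.choice(utts),
                          rng.choice(utterances[other]["utts"]),
                          0, grade))
    return pairs
```

Balancing the impostor count per speaker, rather than sampling pairs globally, keeps every speaker equally represented in the trial list, which is one way an evaluation set can avoid aggregating away per-speaker variability.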
