Paper Title

A Study of Gender Impact in Self-supervised Models for Speech-to-Text Systems

Paper Authors

Marcely Zanon Boito, Laurent Besacier, Natalia Tomashenko, Yannick Estève

Paper Abstract

Self-supervised models for speech processing emerged recently as popular foundation blocks in speech processing pipelines. These models are pre-trained on unlabeled audio data and then used in speech processing downstream tasks such as automatic speech recognition (ASR) or speech translation (ST). Since these models are now used in research and industrial systems alike, it becomes necessary to understand the impact caused by some features such as gender distribution within pre-training data. Using French as our investigation language, we train and compare gender-specific wav2vec 2.0 models against models containing different degrees of gender balance in their pre-training data. The comparison is performed by applying these models to two speech-to-text downstream tasks: ASR and ST. Results show the type of downstream integration matters. We observe lower overall performance using gender-specific pre-training before fine-tuning an end-to-end ASR system. However, when self-supervised models are used as feature extractors, the overall ASR and ST results follow more complex patterns in which the balanced pre-trained model does not necessarily lead to the best results. Lastly, our crude 'fairness' metric, the relative performance difference measured between female and male test sets, does not display a strong variation from balanced to gender-specific pre-trained wav2vec 2.0 models.
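
The paper compares two downstream integration types: fine-tuning the self-supervised model inside an end-to-end ASR system, and using it as a frozen feature extractor. Below is a minimal sketch of the feature-extractor setup via the HuggingFace transformers API; the checkpoint name is an assumption (a public LeBenchmark French wav2vec 2.0 model), not one of the paper's gender-specific models.

```python
# Minimal sketch: a pre-trained wav2vec 2.0 model as a frozen feature
# extractor, one of the two downstream integrations compared in the paper.
# The checkpoint name below is an assumption; swap in the relevant
# gender-specific or gender-balanced French model.
import torch
from transformers import Wav2Vec2FeatureExtractor, Wav2Vec2Model

model_name = "LeBenchmark/wav2vec2-FR-7K-large"  # assumed public French checkpoint
processor = Wav2Vec2FeatureExtractor.from_pretrained(model_name)
model = Wav2Vec2Model.from_pretrained(model_name).eval()

waveform = torch.randn(16000)  # 1 second of dummy 16 kHz audio
inputs = processor(waveform.numpy(), sampling_rate=16000, return_tensors="pt")
with torch.no_grad():
    # Frame-level representations that would feed the downstream ASR/ST model.
    features = model(**inputs).last_hidden_state  # shape: (1, frames, hidden_dim)
```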
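The abstract describes the 'fairness' metric only as the relative performance difference measured between female and male test sets; the exact formula is not spelled out. A minimal sketch under that assumption, normalizing by the male-set score:

```python
# Minimal sketch of the 'fairness' metric described in the abstract:
# the relative performance difference between female and male test sets.
# Normalizing by the male-set score is an assumption, not necessarily
# the authors' exact definition.

def relative_gender_gap(score_female: float, score_male: float) -> float:
    """Relative difference between female and male test-set scores.

    For an error rate such as WER, a positive value means the system
    performs worse on the female test set.
    """
    return (score_female - score_male) / score_male

# Illustrative numbers only, not results from the paper:
wer_female, wer_male = 14.2, 15.8
print(f"relative gap: {relative_gender_gap(wer_female, wer_male):+.1%}")
```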
