Paper Title


Improving Speech Enhancement Performance by Leveraging Contextual Broad Phonetic Class Information

Authors

Lu, Yen-Ju, Chang, Chia-Yu, Yu, Cheng, Liu, Ching-Feng, Hung, Jeih-weih, Watanabe, Shinji, Tsao, Yu

Abstract


Previous studies have confirmed that augmenting acoustic features with the place/manner of articulatory features can guide the speech enhancement (SE) process to consider the broad phonetic properties of the input speech, attaining performance improvements. In this paper, we explore the contextual information of articulatory attributes as additional information to further benefit SE. More specifically, we propose to improve SE performance by leveraging losses from an end-to-end automatic speech recognition (E2E-ASR) model that predicts the sequence of broad phonetic classes (BPCs). We also develop multi-objective training with ASR and perceptual losses to train the SE system based on a BPC-based E2E-ASR. Experimental results on speech denoising, speech dereverberation, and impaired speech enhancement tasks confirm that contextual BPC information improves SE performance. Moreover, the SE model trained with the BPC-based E2E-ASR outperforms the one trained with a phoneme-based E2E-ASR. The results suggest that phoneme misclassifications by the ASR system may yield imperfect feedback, and that BPCs could be a better choice of target. Finally, we note that merging the most confusable phonetic targets into the same BPC when computing the additional objective effectively improves SE performance.
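The multi-objective training described above combines an SE reconstruction loss with a loss from a BPC-predicting ASR model. A minimal sketch of that idea is below; the function and variable names, the loss weights `alpha`/`beta`, and the phoneme-to-BPC grouping are all illustrative assumptions, not the paper's exact formulation.

```python
import math

def combined_se_loss(enhanced, clean, bpc_log_probs, bpc_targets,
                     alpha=1.0, beta=0.1):
    """Illustrative multi-objective SE loss (assumed form, not the paper's exact one):
    a reconstruction term on the enhanced features plus a per-frame
    negative log-likelihood term from a frozen BPC-based ASR model."""
    # Reconstruction term: mean squared error between enhanced and clean features.
    l_se = sum((e - c) ** 2 for e, c in zip(enhanced, clean)) / len(enhanced)
    # ASR term: negative log-likelihood of the target BPC at each frame.
    l_asr = -sum(frame[t] for frame, t in zip(bpc_log_probs, bpc_targets)) \
            / len(bpc_targets)
    # Weighted sum of the two objectives (weights alpha/beta are assumptions).
    return alpha * l_se + beta * l_asr

# The BPC idea: confusable phonemes are merged into one broad class, so a
# phoneme-level ASR mistake within a class does not penalize the SE model.
# This mapping is a hypothetical example, not the paper's grouping.
PHONEME_TO_BPC = {"p": "stop", "b": "stop", "t": "stop",
                  "s": "fricative", "z": "fricative",
                  "m": "nasal", "n": "nasal"}
```

With this grouping, the ASR target for a frame containing /p/ and one containing /b/ is the same "stop" class, which is what the abstract means by combining the most confusable phonetic targets into one BPC.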
