Paper Title


Interpretable Representation Learning for Speech and Audio Signals Based on Relevance Weighting

Authors

Purvi Agrawal, Sriram Ganapathy

Abstract


The learning of interpretable representations from raw data presents significant challenges for time series data like speech. In this work, we propose a relevance weighting scheme that allows the interpretation of the speech representations during the forward propagation of the model itself. The relevance weighting is achieved using a sub-network approach that performs the task of feature selection. A relevance sub-network, applied on the output of the first layer of a convolutional neural network model operating on raw speech signals, acts as an acoustic filterbank (FB) layer with relevance weighting. A similar relevance sub-network applied on the second convolutional layer performs modulation filterbank learning with relevance weighting. The full acoustic model consisting of relevance sub-networks, convolutional layers and feed-forward layers is trained for a speech recognition task on noisy and reverberant speech in the Aurora-4, CHiME-3 and VOiCES datasets. The proposed representation learning framework is also applied for the task of sound classification in the UrbanSound8K dataset. A detailed analysis of the relevance weights learned by the model reveals that the relevance weights capture information regarding the underlying speech/audio content. In addition, speech recognition and sound classification experiments reveal that the incorporation of relevance weighting in the neural network architecture improves the performance significantly.
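The abstract does not specify implementation details, so the following is a minimal PyTorch-style sketch of the general idea only: a small relevance sub-network assigns per-channel weights to the output of a first convolutional (acoustic filterbank) layer operating on raw audio, acting as a soft feature-selection gate. The layer sizes, pooling, and gating design here are illustrative assumptions, not the authors' exact architecture.

```python
import torch
import torch.nn as nn


class RelevanceSubNetwork(nn.Module):
    """Illustrative sketch (not the paper's exact design): a gating
    sub-network that assigns a relevance weight to each channel of the
    preceding convolutional layer, performing soft feature selection."""

    def __init__(self, num_filters, hidden_dim=64):
        super().__init__()
        self.gate = nn.Sequential(
            nn.Linear(num_filters, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, num_filters),
            nn.Sigmoid(),  # relevance weights in (0, 1)
        )

    def forward(self, x):
        # x: (batch, num_filters, time) -- filterbank-like output of a
        # conv layer applied to the raw waveform.
        pooled = x.mean(dim=-1)        # summarize each channel over time
        weights = self.gate(pooled)    # per-channel relevance weights
        return x * weights.unsqueeze(-1), weights


if __name__ == "__main__":
    torch.manual_seed(0)
    raw = torch.randn(8, 1, 16000)  # 1 s of raw audio at 16 kHz (assumed)
    # First conv layer standing in for a learned acoustic filterbank.
    acoustic_fb = nn.Conv1d(1, 80, kernel_size=400, stride=160)
    fb_out = torch.relu(acoustic_fb(raw))          # (8, 80, 98)
    relevance = RelevanceSubNetwork(num_filters=80)
    weighted, w = relevance(fb_out)
    print(weighted.shape, w.shape)                 # (8, 80, 98), (8, 80)
```

Because the weights are produced in the forward pass itself, they can be read out directly for any input to inspect which filterbank channels the model treats as relevant; the same gating pattern could be repeated after a second convolutional layer for modulation filterbank weighting.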
