Paper Title

Deep Multilayer Perceptrons for Dimensional Speech Emotion Recognition

Authors

Bagus Tris Atmaja, Masato Akagi

Abstract

Modern deep learning architectures are ordinarily run on high-performance computing facilities because of the large size of their input features and the complexity of their models. This paper proposes a traditional multilayer perceptron (MLP) with deep layers and a small input size to address this computational constraint. The results show that our proposed deep MLP outperforms modern deep learning architectures, i.e., LSTM and CNN, with the same number of layers and parameters. The deep MLP achieved the highest performance in both speaker-dependent and speaker-independent scenarios on the IEMOCAP and MSP-IMPROV corpora.
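As a rough illustration of the kind of model the abstract describes, below is a minimal Keras sketch of a deep MLP regressor over a small utterance-level acoustic feature vector, predicting continuous emotion dimensions. The input dimension, layer widths, depth, loss function, and the three-dimensional output (valence, arousal, dominance) are illustrative assumptions, not details given in the abstract.

```python
# Minimal sketch of a deep MLP for dimensional speech emotion recognition.
# Feature size, layer widths, depth, and loss are assumptions for illustration;
# the abstract only states "deep layers" and "small input size".
import tensorflow as tf
from tensorflow.keras import layers, models

n_features = 31   # assumed small utterance-level acoustic feature vector
n_targets = 3     # assumed targets: valence, arousal, dominance

model = models.Sequential([
    layers.Input(shape=(n_features,)),
    # several stacked fully connected ("deep") layers
    layers.Dense(256, activation='relu'),
    layers.Dense(256, activation='relu'),
    layers.Dense(256, activation='relu'),
    layers.Dense(256, activation='relu'),
    # linear output for continuous emotion dimensions
    layers.Dense(n_targets, activation='linear'),
])

model.compile(optimizer='adam', loss='mse', metrics=['mae'])
model.summary()
```

Because the input is a fixed-size feature vector rather than a frame-level sequence, such an MLP avoids the recurrent or convolutional computation of LSTM and CNN baselines at a comparable parameter count, which is the computational argument the abstract makes.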
