Paper Title

Exploring Filterbank Learning for Keyword Spotting

Authors

Iván López-Espejo, Zheng-Hua Tan, Jesper Jensen

Abstract

Despite their great performance over the years, handcrafted speech features are not necessarily optimal for any particular speech application. Consequently, with greater or lesser success, optimal filterbank learning has been studied for different speech processing tasks. In this paper, we fill in a gap by exploring filterbank learning for keyword spotting (KWS). Two approaches are examined: filterbank matrix learning in the power spectral domain and parameter learning of a psychoacoustically-motivated gammachirp filterbank. Filterbank parameters are optimized jointly with a modern deep residual neural network-based KWS back-end. Our experimental results reveal that, in general, there are no statistically significant differences, in terms of KWS accuracy, between using a learned filterbank and handcrafted speech features. Thus, while we conclude that the latter are still a wise choice when using modern KWS back-ends, we also hypothesize that this could be a symptom of information redundancy, which opens up new research possibilities in the field of small-footprint KWS.
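To make the first approach concrete, the following is a minimal sketch of what a filterbank matrix applied in the power spectral domain might look like. It is not the authors' implementation: the linearly spaced triangular initialization, the clipping used to keep weights non-negative, the names `init_triangular_filterbank` and `log_filterbank_features`, and the sizes (40 filters, 257 frequency bins) are all assumptions made for illustration. In a learned setup, the matrix `W` would be the trainable parameter updated jointly with the KWS back-end; here it is only initialized and applied.

```python
import numpy as np

def init_triangular_filterbank(n_filters, n_bins):
    """Linearly spaced triangular filters (an assumed stand-in for a handcrafted init)."""
    centers = np.linspace(0, n_bins - 1, n_filters + 2)
    bins = np.arange(n_bins)
    W = np.zeros((n_filters, n_bins))
    for k in range(n_filters):
        left, center, right = centers[k], centers[k + 1], centers[k + 2]
        rising = (bins - left) / (center - left)
        falling = (right - bins) / (right - center)
        # Triangle = pointwise minimum of the two slopes, floored at zero.
        W[k] = np.clip(np.minimum(rising, falling), 0.0, None)
    return W

def log_filterbank_features(power_spec, W, eps=1e-8):
    """Apply a (trainable) filterbank matrix to a power spectrogram.

    power_spec: (n_frames, n_bins) non-negative power spectrum.
    W:          (n_filters, n_bins) filterbank matrix.
    """
    # Assumption: weights are clipped at zero so that filter energies stay non-negative.
    W_nonneg = np.maximum(W, 0.0)
    energies = power_spec @ W_nonneg.T      # (n_frames, n_filters)
    return np.log(energies + eps)           # log compression

# Synthetic input only, to show the shapes involved.
rng = np.random.default_rng(0)
power_spec = rng.random((100, 257))
W = init_triangular_filterbank(n_filters=40, n_bins=257)
features = log_filterbank_features(power_spec, W)
```

With a fixed `W`, this reduces to ordinary handcrafted log-filterbank features, which is the baseline the abstract compares against.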
