Gradient-Adjusted Neuron Activation Profiles for Comprehensive Introspection of Convolutional Speech Recognition Models

Paper title

Gradient-Adjusted Neuron Activation Profiles for Comprehensive Introspection of Convolutional Speech Recognition Models

Authors

Andreas Krug, Sebastian Stober

Abstract

Deep learning-based Automatic Speech Recognition (ASR) models are very successful, but hard to interpret. To gain a better understanding of how Artificial Neural Networks (ANNs) accomplish their tasks, introspection methods have been proposed. Adapting such techniques from computer vision to speech recognition is not straightforward, because speech data is more complex and less interpretable than image data. In this work, we introduce Gradient-adjusted Neuron Activation Profiles (GradNAPs) as a means to interpret features and representations in Deep Neural Networks. GradNAPs are characteristic responses of ANNs to particular groups of inputs, which incorporate the relevance of neurons for prediction. We show how to utilize GradNAPs to gain insight into how data is processed in ANNs. This includes different ways of visualizing features and clustering of GradNAPs to compare embeddings of different groups of inputs in any layer of a given network. We demonstrate our proposed techniques using a fully-convolutional ASR model.
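
The abstract does not spell out how a GradNAP is computed, so the following is only a minimal sketch of one plausible reading: the activations of a layer are weighted element-wise by the gradient of the prediction with respect to them, and the result is averaged over all inputs in a group. The function name, the data layout, and the choice of an element-wise product are assumptions made for illustration, not the authors' definition.

```python
from collections import defaultdict


def grad_adjusted_profiles(activations, gradients, groups):
    """Average gradient-weighted activations per input group.

    Hypothetical layout (an assumption, not taken from the paper):
      activations[i][j]: activation of neuron j for input i
      gradients[i][j]:   gradient of the prediction w.r.t. that activation
      groups[i]:         group label of input i (e.g. a letter or phoneme)
    """
    sums = {}
    counts = defaultdict(int)
    for act, grad, group in zip(activations, gradients, groups):
        # Assumption: neuron relevance is folded in by an element-wise product.
        weighted = [a * g for a, g in zip(act, grad)]
        if group not in sums:
            sums[group] = [0.0] * len(weighted)
        sums[group] = [s + w for s, w in zip(sums[group], weighted)]
        counts[group] += 1
    # One profile per group: the mean weighted response of each neuron.
    return {g: [s / counts[g] for s in total] for g, total in sums.items()}


# Toy data: three inputs, two neurons, two groups.
activations = [[1.0, 2.0], [3.0, 0.0], [0.5, 4.0]]
gradients = [[0.5, -1.0], [0.5, 1.0], [2.0, 0.25]]
groups = ["a", "a", "b"]
profiles = grad_adjusted_profiles(activations, gradients, groups)
```

Profiles obtained this way for one layer could then be compared or clustered across groups, in the spirit of the analysis the abstract describes.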
