A Multi-view CNN-based Acoustic Classification System for Automatic Animal Species Identification

Paper Title

A Multi-view CNN-based Acoustic Classification System for Automatic Animal Species Identification

Authors

Weitao Xu, Xiang Zhang, Lina Yao, Wanli Xue, Bo Wei

Abstract

Automatic identification of animal species by their vocalizations is an important and challenging task. Although many kinds of audio monitoring systems have been proposed in the literature, they suffer from several disadvantages, such as non-trivial feature selection, accuracy degradation caused by environmental noise, and intensive local computation. In this paper, we propose a deep-learning-based acoustic classification framework for Wireless Acoustic Sensor Networks (WASNs). The proposed framework is built on a cloud architecture, which relaxes the computational burden on the wireless sensor nodes. To improve recognition accuracy, we design a multi-view Convolutional Neural Network (CNN) that extracts short-, middle-, and long-term dependencies in parallel. Evaluation on two real datasets shows that the proposed architecture achieves high accuracy and significantly outperforms traditional classification systems when environmental noise dominates the audio signal (low SNR). Moreover, we implement and deploy the proposed system on a testbed and analyse its performance in real-world environments. Both simulation and real-world evaluation demonstrate the accuracy and robustness of the proposed acoustic classification system in distinguishing animal species.
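
The idea of extracting short-, middle-, and long-term dependencies in parallel can be illustrated with a small sketch. It is not the authors' network: the abstract gives no layer sizes, so the kernel widths (3, 9, 27), the number of filters per view, the random untrained weights, and all function names below are assumptions chosen only for illustration. Each "view" slides kernels of one width over the same audio frame, applies a ReLU, keeps the strongest response (global max pooling), and the views are concatenated into one feature vector.

```python
import random


def conv1d_valid(signal, kernel):
    """Slide the kernel over the signal (no padding) and return the dot products."""
    k = len(kernel)
    return [
        sum(signal[i + j] * kernel[j] for j in range(k))
        for i in range(len(signal) - k + 1)
    ]


def relu(values):
    """Element-wise rectifier: negative responses are clipped to zero."""
    return [max(0.0, v) for v in values]


def view_features(signal, kernels):
    """One view: convolve with each kernel, rectify, then global max pool."""
    return [max(relu(conv1d_valid(signal, kernel))) for kernel in kernels]


def multi_view_features(signal, kernel_widths=(3, 9, 27), filters_per_view=4, seed=0):
    """Run one view per kernel width on the same signal and concatenate the results.

    kernel_widths and filters_per_view are hypothetical values, not taken from
    the paper. Weights are random (untrained) and serve only to show data flow.
    The signal must be at least as long as the widest kernel.
    """
    rng = random.Random(seed)
    features = []
    for width in kernel_widths:
        kernels = [
            [rng.uniform(-1.0, 1.0) / width for _ in range(width)]
            for _ in range(filters_per_view)
        ]
        features.extend(view_features(signal, kernels))
    return features
```

Calling `multi_view_features(frame)` on a list of audio samples returns `len(kernel_widths) * filters_per_view` non-negative values. In a trained system, features of this kind would feed a classifier, and the kernels would be learned rather than sampled at random.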
