Paper Title
Sound Event Localization Based on Sound Intensity Vector Refined by DNN-Based Denoising and Source Separation
Paper Authors
Abstract
We propose a direction-of-arrival (DOA) estimation method for sound event localization and detection (SELD). Direct estimation of the DOA using a deep neural network (DNN), i.e., a completely data-driven approach, achieves high accuracy. However, there is a gap in accuracy between DOA estimation for single sources and for overlapping sources, because such data-driven approaches cannot incorporate physical knowledge. Meanwhile, although physics-based approaches are less accurate than DNN-based approaches, they are robust to overlapping sources. In this study, we consider a combination of physics-based and DNN-based approaches: the sound intensity vectors (IVs) used for physics-based DOA estimation are refined by DNN-based denoising and source separation. This method enables accurate DOA estimation for both single and overlapping sources using a spherical microphone array. Experimental results show that the proposed method achieves state-of-the-art DOA estimation accuracy on an open dataset for SELD.
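As an illustration of the physics-based step mentioned in the abstract, the following is a minimal sketch of DOA estimation from sound intensity vectors. It is not the authors' implementation. It assumes first-order Ambisonics STFT coefficients `W`, `X`, `Y`, `Z` derived from the spherical array, with an encoding convention in which `X`, `Y`, `Z` equal the source signal scaled by the direction cosines. The function name `intensity_vector_doa` and the optional `mask` argument are hypothetical; `mask` stands in for whatever time-frequency weighting a denoising or separation network might supply.

```python
import numpy as np


def intensity_vector_doa(W, X, Y, Z, mask=None):
    """Estimate (azimuth, elevation) in radians from FOA STFT coefficients.

    W, X, Y, Z: complex arrays of identical shape (freq, time).
    mask: optional non-negative weights of the same shape. This is a
          hypothetical stand-in for a DNN-derived time-frequency mask.
    """
    # Active intensity vector per time-frequency bin: Re{conj(W) * [X, Y, Z]}.
    iv = np.stack([np.real(np.conj(W) * c) for c in (X, Y, Z)])  # (3, F, T)
    if mask is not None:
        iv = iv * mask  # broadcast the weights over the three components
    # Sum over all bins to obtain a single direction vector.
    v = iv.sum(axis=(1, 2))
    azimuth = np.arctan2(v[1], v[0])
    elevation = np.arctan2(v[2], np.hypot(v[0], v[1]))
    return azimuth, elevation


if __name__ == "__main__":
    # Synthetic single plane-wave source at a known direction (assumed values).
    rng = np.random.default_rng(0)
    s = rng.standard_normal((8, 10)) + 1j * rng.standard_normal((8, 10))
    az, el = np.deg2rad(40.0), np.deg2rad(-20.0)
    W = s
    X = s * np.cos(az) * np.cos(el)
    Y = s * np.sin(az) * np.cos(el)
    Z = s * np.sin(el)
    est_az, est_el = intensity_vector_doa(W, X, Y, Z)
    print(np.rad2deg(est_az), np.rad2deg(est_el))
```

With noise or overlapping sources, the per-bin intensity vectors no longer all point toward one source, which is the situation the abstract's DNN-based refinement is meant to address. In this sketch that refinement is represented only by the optional weighting.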