L3-Net Deep Audio Embeddings to Improve COVID-19 Detection from Smartphone Data

Paper Title

L3-Net Deep Audio Embeddings to Improve COVID-19 Detection from Smartphone Data

Authors

Campana, Mattia Giovanni; Rovati, Andrea; Delmastro, Franca; Pagani, Elena

Abstract

Smartphones and wearable devices, together with Artificial Intelligence, can be a game-changer in pandemic control: they enable low-cost, pervasive solutions that recognize new diseases at an early stage and can potentially prevent new outbreaks. Some recent works show promise in detecting diagnostic signals of COVID-19 from voice and coughs by using machine learning and hand-crafted acoustic features. In this paper, we investigate the capability of the recently proposed deep embedding model L3-Net to automatically extract meaningful features from raw respiratory audio recordings, in order to improve the performance of standard machine learning classifiers in discriminating between COVID-19 positive and negative subjects from smartphone data. We evaluated the proposed model on three datasets, comparing the obtained results with those of two reference works. Results show that the combination of L3-Net with hand-crafted features outperforms the other works by 28.57% in terms of AUC in a set of subject-independent experiments. This result paves the way for further investigation of different deep audio embeddings, including for the automatic detection of other diseases.
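The evaluation idea in the abstract (deep embeddings combined with hand-crafted features, scored by AUC under a subject-independent split) can be illustrated with a minimal sketch. This is not the authors' code: the function names `subject_independent_split`, `concatenate_features`, and `roc_auc` are hypothetical, the feature vectors are assumed to be precomputed elsewhere, and the split shown is a simple hold-out rather than whatever protocol the paper actually uses.

```python
import random
from collections import defaultdict


def subject_independent_split(subject_ids, test_fraction=0.2, seed=0):
    """Split sample indices so that no subject appears in both train and test.

    Hypothetical helper: a single hold-out split by subject.
    """
    by_subject = defaultdict(list)
    for index, subject in enumerate(subject_ids):
        by_subject[subject].append(index)

    subjects = sorted(by_subject)
    random.Random(seed).shuffle(subjects)

    # Hold out at least one whole subject for testing.
    n_test = max(1, round(len(subjects) * test_fraction))
    test = [i for s in subjects[:n_test] for i in by_subject[s]]
    train = [i for s in subjects[n_test:] for i in by_subject[s]]
    return train, test


def concatenate_features(embedding, handcrafted):
    """Join a deep embedding vector with a hand-crafted feature vector."""
    return list(embedding) + list(handcrafted)


def roc_auc(labels, scores):
    """AUC as the probability that a positive sample outscores a negative one.

    Ties between a positive and a negative score count as 0.5.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative sample")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))
```

Splitting by subject rather than by recording matters because several recordings from the same person in both train and test sets would let a classifier score well by recognizing the speaker instead of the disease.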
