Paper Title

Generalized Dilated CNN Models for Depression Detection Using Inverted Vocal Tract Variables

Authors

Nadee Seneviratne, Carol Espy-Wilson

Abstract

Depression detection using vocal biomarkers is an active area of research. Articulatory coordination features (ACFs) are designed to capture the changes in neuromotor coordination caused by psychomotor slowing, a key feature of Major Depressive Disorder. However, the findings of existing studies are mostly validated on a single database, which limits the generalizability of the results. Variability across different depression databases adversely affects the results of cross-corpus evaluations (CCEs). We propose a generalized classifier for depression detection based on a dilated Convolutional Neural Network trained on ACFs extracted from two depression databases. We show that ACFs derived from Vocal Tract Variables (TVs) are a promising, robust set of features for depression detection. Our model achieves relative accuracy improvements of ~10% compared to CCEs performed on models trained on a single database. We extend the study to show that fusing TVs and Mel-Frequency Cepstral Coefficients can further improve the performance of this classifier.
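
The abstract names a dilated Convolutional Neural Network but gives no architectural details. As a rough illustration of what dilation means, the sketch below implements a one-dimensional dilated convolution in plain Python, along with the receptive field of a stack of such layers. The function names, kernel size, and dilation rates are hypothetical choices for illustration only and are not taken from the paper.

```python
def dilated_conv1d(signal, kernel, dilation=1):
    """Valid-mode 1D dilated convolution.

    Computed as cross-correlation (no kernel flip), which is the usual
    convention in deep learning libraries. Illustrative only; this is
    not the authors' model.
    """
    if dilation < 1:
        raise ValueError("dilation must be >= 1")
    # Number of input samples spanned by one output sample.
    span = (len(kernel) - 1) * dilation + 1
    out = []
    for start in range(len(signal) - span + 1):
        acc = 0.0
        for k, w in enumerate(kernel):
            # Kernel taps are spaced `dilation` samples apart.
            acc += w * signal[start + k * dilation]
        out.append(acc)
    return out


def receptive_field(kernel_size, dilations):
    """Receptive field of stacked stride-1 dilated conv layers."""
    rf = 1
    for d in dilations:
        rf += (kernel_size - 1) * d
    return rf


# Hypothetical usage: a 2-tap kernel with dilation 2 skips every other sample.
example = dilated_conv1d([1, 2, 3, 4, 5], [1, 1], dilation=2)
# Hypothetical stack of three layers with kernel size 3.
stack_rf = receptive_field(3, [1, 2, 4])
```

Increasing the dilation rate widens the span of input frames each output depends on without adding parameters, which is one common reason to use dilated layers on time-series features.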
