Domain Generalization in Biosignal Classification

Paper Title

Domain Generalization in Biosignal Classification

Authors

Theekshana Dissanayake, Tharindu Fernando, Simon Denman, Houman Ghaemmaghami, Sridha Sridharan, Clinton Fookes

Abstract

Objective: When training machine learning models, we often assume that the training data and evaluation data are sampled from the same distribution. However, this assumption is violated when the model is evaluated on another unseen but similar database, even if that database contains the same classes. This problem is caused by domain shift and can be addressed using two approaches: domain adaptation and domain generalization. Put simply, domain adaptation methods can access data from the unseen domains during training, whereas in domain generalization the unseen data is not available during training. Hence, domain generalization concerns models that perform well on inaccessible, domain-shifted data. Method: Our proposed domain generalization method represents an unseen domain using a set of known basis domains, after which we classify the unseen domain using classifier fusion. To demonstrate our system, we employ a collection of heart sound databases that contain normal and abnormal sounds (classes). Results: Our proposed classifier fusion method achieves accuracy gains of up to 16% for four completely unseen domains. Conclusion: Recognizing the complexity induced by the inherent temporal nature of biosignal data, the two-stage method proposed in this study effectively simplifies the whole process of domain generalization while demonstrating good results on both the unseen domains and the adopted basis domains. Significance: To the best of our knowledge, this is the first study that investigates domain generalization for biosignal data. Our proposed learning strategy can be used to effectively learn domain-relevant features while being aware of the class differences in the data.
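
The abstract does not specify how the classifier fusion is computed. As a rough illustration only, the two-stage idea (first relate an unseen sample to the known basis domains, then fuse the per-domain classifiers) could be sketched as a similarity-weighted average of class probabilities. Everything below is an assumption for illustration and not the authors' implementation: the function names, the softmax weighting, and the example numbers are all hypothetical.

```python
import math


def softmax(scores):
    """Turn arbitrary similarity scores into weights that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_predictions(domain_scores, class_probs):
    """Similarity-weighted fusion of per-domain classifier outputs.

    domain_scores[k]: hypothetical similarity of the unseen sample to
                      basis domain k (stage 1: domain representation).
    class_probs[k]:   class probabilities from a classifier trained on
                      basis domain k (stage 2: classification).
    """
    if len(domain_scores) != len(class_probs):
        raise ValueError("need one score and one prediction per basis domain")
    weights = softmax(domain_scores)
    n_classes = len(class_probs[0])
    fused = [0.0] * n_classes
    for w, probs in zip(weights, class_probs):
        for c, p in enumerate(probs):
            fused[c] += w * p
    return fused


# Made-up numbers: three basis domains, two classes (0 = normal, 1 = abnormal).
domain_scores = [2.0, 0.5, -1.0]
class_probs = [[0.2, 0.8], [0.6, 0.4], [0.9, 0.1]]

fused = fuse_predictions(domain_scores, class_probs)
predicted = max(range(len(fused)), key=fused.__getitem__)
```

In this sketch the basis domain most similar to the unseen sample dominates the fused decision, while less similar domains still contribute a little. Other fusion rules (majority voting, learned weights) would fit the abstract's description equally well.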
