Semi-supervised sequence classification through change point detection

Paper title

Semi-supervised sequence classification through change point detection

Authors

Nauman Ahad, Mark A. Davenport

Abstract

Sequential sensor data is generated in a wide variety of practical applications. A fundamental challenge is learning effective classifiers for such data. While deep learning has led to impressive performance gains in recent years in domains such as speech, these gains have relied on the availability of large datasets of sequences with high-quality labels. In many applications, however, class labels are extremely limited, because precise labeling and segmentation are too expensive to perform at high volume. Even so, large amounts of unlabeled data may still be available. In this paper we propose a novel framework for semi-supervised learning in such contexts. Change point detection methods can be used, in an unsupervised manner, to identify points within a sequence that correspond to likely class changes. We show that change points provide examples of similar/dissimilar pairs of sequences which, when coupled with labeled data, can be used in a semi-supervised classification setting. Leveraging the change points and the labeled data, we form examples of similar/dissimilar sequences to train a neural network to learn improved representations for classification. We provide extensive synthetic simulations, show that the learned representations are superior to those learned through an autoencoder, and obtain improved results on both simulated and real-world human activity recognition datasets.
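
The abstract does not spell out how similar/dissimilar pairs are built from change points, so the following is only a minimal sketch under stated assumptions: windows that sit next to each other inside one segment are treated as similar, and the two windows on either side of a detected change point are treated as dissimilar. The function name `pairs_from_change_points`, its parameters, and the fixed window length are hypothetical choices made for illustration and are not taken from the paper.

```python
from typing import List, Sequence, Tuple

Pair = Tuple[int, int, int]  # (start of window A, start of window B, label)


def pairs_from_change_points(
    n_samples: int, change_points: Sequence[int], window: int
) -> List[Pair]:
    """Build window pairs from detected change points (illustrative only).

    Assumption: label 1 means "similar" (both windows lie in one segment),
    label 0 means "dissimilar" (the windows straddle a change point).
    """
    # Keep only change points strictly inside the sequence, in order.
    cps = sorted(c for c in change_points if 0 < c < n_samples)
    bounds = [0] + cps + [n_samples]
    pairs: List[Pair] = []

    # Similar pairs: adjacent, non-overlapping windows within one segment.
    for lo, hi in zip(bounds[:-1], bounds[1:]):
        for start in range(lo, hi - 2 * window + 1, window):
            pairs.append((start, start + window, 1))

    # Dissimilar pairs: one window ending at the change point and one
    # starting at it, kept only if neither window crosses another boundary.
    for i, c in enumerate(cps):
        prev_bound, next_bound = bounds[i], bounds[i + 2]
        if c - window >= prev_bound and c + window <= next_bound:
            pairs.append((c - window, c, 0))

    return pairs


if __name__ == "__main__":
    # Hypothetical sequence of 100 samples with one change point at index 40.
    for a, b, label in pairs_from_change_points(100, [40], window=10):
        print(a, b, "similar" if label else "dissimilar")
```

Pairs of this kind, together with pairs derived from the limited labeled data, could then be fed to a pairwise (for example contrastive) loss to train the representation network. The choice of loss is likewise an assumption here, since the abstract does not name one.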
