Paper Title
Semi-supervised learning method based on predefined evenly-distributed class centroids
Paper Authors
Paper Abstract
Compared to supervised learning, semi-supervised learning reduces the dependence of deep learning on large numbers of labeled samples. In this work, we use a small number of labeled samples and perform data augmentation on unlabeled samples to achieve image classification. Our method constrains all samples to predefined evenly-distributed class centroids (PEDCC) through corresponding loss functions. Specifically, the PEDCC-Loss for labeled samples and a maximum mean discrepancy (MMD) loss for unlabeled samples are used to bring the feature distribution closer to the PEDCC distribution. Our method ensures that the inter-class distance is large and the intra-class distance is small enough, making the classification boundaries between different classes clearer. Meanwhile, for unlabeled samples, we also use KL divergence to constrain the consistency of the network predictions between unlabeled samples and their augmented versions. Our semi-supervised learning method achieves state-of-the-art results: with 4000 labeled samples on CIFAR10 and 1000 labeled samples on SVHN, the accuracy is 95.10% and 97.58%, respectively.
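The abstract itself contains no implementation, but as a rough illustration, the following PyTorch sketch shows how the three loss terms described above might be combined. All names here (generate_centroids, mmd_loss, semi_supervised_loss) and the weights lam_mmd, lam_kl, and temp are hypothetical; the centroid construction is a simple orthonormal stand-in rather than the paper's actual PEDCC construction, and the PEDCC-Loss is approximated by a cosine-similarity cross-entropy.

```python
import torch
import torch.nn.functional as F

def generate_centroids(num_classes, feat_dim):
    # Hypothetical stand-in: the paper's PEDCC centroids are predefined,
    # evenly distributed points on the unit hypersphere. Here we simply
    # orthonormalize random vectors as a rough approximation.
    assert num_classes <= feat_dim
    q, _ = torch.linalg.qr(torch.randn(feat_dim, num_classes))
    return q.t()  # (num_classes, feat_dim), rows are unit-norm

def mmd_loss(x, y, sigma=1.0):
    # Simple RBF-kernel maximum mean discrepancy between two sets of features.
    def kernel(a, b):
        d = torch.cdist(a, b) ** 2
        return torch.exp(-d / (2 * sigma ** 2))
    return kernel(x, x).mean() + kernel(y, y).mean() - 2 * kernel(x, y).mean()

def semi_supervised_loss(feat_l, labels, feat_u, feat_aug, centroids,
                         lam_mmd=1.0, lam_kl=1.0, temp=1.0):
    # (1) PEDCC-style supervised term: pull normalized labeled features
    #     toward their predefined class centroid (cosine-similarity logits).
    feat_l = F.normalize(feat_l, dim=1)
    logits_l = feat_l @ centroids.t() / temp
    loss_sup = F.cross_entropy(logits_l, labels)

    # (2) MMD term: push the unlabeled feature distribution toward the
    #     PEDCC (centroid) distribution.
    feat_u = F.normalize(feat_u, dim=1)
    loss_mmd = mmd_loss(feat_u, centroids)

    # (3) KL consistency between predictions on unlabeled samples and
    #     their augmented versions; the target distribution is detached,
    #     as is common in consistency regularization.
    feat_aug = F.normalize(feat_aug, dim=1)
    p_u = F.softmax(feat_u @ centroids.t() / temp, dim=1).detach()
    log_p_aug = F.log_softmax(feat_aug @ centroids.t() / temp, dim=1)
    loss_kl = F.kl_div(log_p_aug, p_u, reduction='batchmean')

    return loss_sup + lam_mmd * loss_mmd + lam_kl * loss_kl
```

The relative weights of the three terms are a tuning choice; the paper does not specify them in the abstract, so the defaults above are placeholders.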