Paper Title
Conformal Credal Self-Supervised Learning
Paper Authors
Abstract
In semi-supervised learning, the paradigm of self-training refers to the idea of learning from pseudo-labels suggested by the learner itself. Across various domains, corresponding methods have proven effective and achieve state-of-the-art performance. However, pseudo-labels typically stem from ad hoc heuristics that rely on the quality of the predictions but do not guarantee their validity. One such method, so-called credal self-supervised learning, maintains pseudo-supervision in the form of sets of (instead of single) probability distributions over labels, thereby allowing for flexible yet uncertainty-aware labeling. Again, however, there is no justification beyond empirical effectiveness. To address this deficiency, we make use of conformal prediction, an approach that comes with guarantees on the validity of set-valued predictions. As a result, the construction of credal sets of labels is supported by a rigorous theoretical foundation, leading to better calibrated and less error-prone supervision for unlabeled data. Along with this, we present effective algorithms for learning from credal self-supervision. An empirical study demonstrates the excellent calibration properties of the pseudo-supervision, as well as the competitiveness of our method on several benchmark datasets.
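To make the role of conformal prediction concrete, the following is a minimal sketch of split conformal prediction used to derive set-valued pseudo-labels for unlabeled data. It is not the paper's exact credal construction: the function name `conformal_label_sets`, its arguments, and the choice of nonconformity score (one minus the probability assigned to the true class) are illustrative assumptions.

```python
import numpy as np

def conformal_label_sets(cal_probs, cal_labels, unl_probs, alpha=0.1):
    """Minimal split-conformal sketch for set-valued pseudo-labeling.

    cal_probs:  (n_cal, K) predicted class probabilities on a held-out
                calibration split of the labeled data
    cal_labels: (n_cal,)   true labels of the calibration split
    unl_probs:  (n_unl, K) predicted class probabilities on unlabeled data
    alpha:      target miscoverage level, e.g. 0.1 for 90% coverage

    Returns a boolean matrix of shape (n_unl, K); row i marks the set of
    labels retained as pseudo-supervision for unlabeled instance i.
    """
    n_cal = len(cal_labels)
    # Nonconformity score: 1 - probability assigned to the true class.
    scores = 1.0 - cal_probs[np.arange(n_cal), cal_labels]
    # Conformal quantile with the usual finite-sample correction.
    q_level = min(np.ceil((n_cal + 1) * (1 - alpha)) / n_cal, 1.0)
    q_hat = np.quantile(scores, q_level, method="higher")
    # Keep every label whose nonconformity does not exceed the threshold.
    return (1.0 - unl_probs) <= q_hat
```

Under exchangeability of calibration and test data, such label sets cover the true label with probability at least 1 - alpha, which is the validity guarantee the abstract refers to; the resulting sets can then serve as (credal) targets when training on the unlabeled instances.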