On Initial Pools for Deep Active Learning

Paper Title

On Initial Pools for Deep Active Learning

Authors

Chandra, Akshay L; Desai, Sai Vikas; Devaguptapu, Chaitanya; Balasubramanian, Vineeth N

Abstract

Active Learning (AL) techniques aim to minimize the training data required to train a model for a given task. Pool-based AL techniques start with a small initial labeled pool and then iteratively pick batches of the most informative samples for labeling. Generally, the initial pool is sampled randomly and labeled to seed the AL iterations. While recent studies have focused on evaluating the robustness of various query functions in AL, little to no attention has been given to the design of the initial labeled pool for deep active learning. Given the recent successes of learning representations in self-supervised/unsupervised ways, we study if an intelligently sampled initial labeled pool can improve deep AL performance. We investigate the effect of intelligently sampled initial labeled pools, including the use of self-supervised and unsupervised strategies, on deep AL methods. The setup, hypotheses, methodology, and implementation details were evaluated by peer review before experiments were conducted. Experimental results could not conclusively prove that intelligently sampled initial pools are better for AL than random initial pools in the long run, although a Variational Autoencoder-based initial pool sampling strategy showed interesting trends that merit deeper investigation.
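
The pool-based loop described in the abstract can be sketched roughly as follows. This is only an illustrative toy, not the authors' implementation: the function names (`random_initial_pool`, `fit_centroids`, `uncertainty`, `active_learning`), the synthetic 1-D data, the nearest-centroid model, and the margin-based query function are all assumptions chosen to keep the example self-contained. The paper's question corresponds to swapping `random_initial_pool` for a smarter seeding strategy while leaving the rest of the loop unchanged.

```python
import random

def make_data(n, rng):
    """Synthetic 2-class, 1-D data: class 0 around -1, class 1 around +1."""
    data = []
    for _ in range(n):
        label = rng.randrange(2)
        x = rng.gauss(2.0 * label - 1.0, 1.0)
        data.append((x, label))
    return data

def random_initial_pool(n_total, size, rng):
    """Baseline seeding: sample indices uniformly at random."""
    return set(rng.sample(range(n_total), size))

def fit_centroids(data, labeled):
    """'Train' a nearest-centroid model on the labeled indices."""
    sums, counts = {}, {}
    for i in labeled:
        x, y = data[i]
        sums[y] = sums.get(y, 0.0) + x
        counts[y] = counts.get(y, 0) + 1
    return {y: sums[y] / counts[y] for y in sums}

def uncertainty(x, centroids):
    """Higher when x is nearly equidistant from the two nearest centroids."""
    d = sorted(abs(x - c) for c in centroids.values())
    if len(d) < 2:
        return 0.0  # only one class seen so far: no margin to measure
    return -(d[1] - d[0])

def active_learning(data, initial_pool, batch_size, rounds):
    """Iteratively grow the labeled pool with the most uncertain samples."""
    labeled = set(initial_pool)
    for _ in range(rounds):
        centroids = fit_centroids(data, labeled)
        unlabeled = [i for i in range(len(data)) if i not in labeled]
        # Query function: rank unlabeled samples by uncertainty, highest first.
        unlabeled.sort(key=lambda i: uncertainty(data[i][0], centroids),
                       reverse=True)
        labeled.update(unlabeled[:batch_size])  # the oracle "labels" these
    return labeled

rng = random.Random(0)
data = make_data(200, rng)
seed_pool = random_initial_pool(len(data), 10, rng)
final_pool = active_learning(data, seed_pool, batch_size=5, rounds=4)
print(len(seed_pool), len(final_pool))  # 10 30
```

Starting from 10 seed labels, four rounds of 5 queries each yield a labeled pool of 30 samples. Comparing seeding strategies would mean running this loop with different initial pools and tracking model accuracy after each round.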
