Paper Title

How Low Can We Go? Pixel Annotation for Semantic Segmentation

Paper Authors

Daniel Kigli, Ariel Shamir, Shai Avidan

Paper Abstract

How many labeled pixels are needed to segment an image, without any prior knowledge? We conduct an experiment to answer this question. In our experiment, an Oracle uses Active Learning to train a network from scratch. The Oracle has access to the entire label map of the image, but the goal is to reveal as few pixel labels to the network as possible. We find that, on average, the Oracle needs to reveal (i.e., annotate) less than 0.1% of the pixels in order to train a network. The network can then label all pixels in the image at an accuracy of more than 98%. Based on this single-image-annotation experiment, we design an experiment to quickly annotate an entire data set. In the data-set-level experiment, the Oracle trains a new network for each image from scratch. The network is then used to create pseudo-labels, i.e., the network's predicted labels for the unlabeled pixels, for the entire image. Only then is a data-set-level network trained from scratch on all the pseudo-labeled images at once. We repeat both the image-level and data-set-level experiments on two very different real-world data sets, and find that it is possible to reach the performance of a fully annotated data set using a fraction of the annotation cost.
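
The abstract outlines a two-stage pipeline: an image-level active-learning loop in which the Oracle reveals a handful of pixel labels per round, followed by a data-set-level stage that trains one network on the resulting pseudo-label maps. The sketch below illustrates the image-level loop only; the uncertainty measure (lowest max-softmax probability), the query and training budgets, and the stopping criterion are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn.functional as F

def annotate_single_image(image, full_label_map, model, optimizer,
                          rounds=10, pixels_per_round=20, target_acc=0.98):
    """Hypothetical image-level loop: the Oracle reveals a few ground-truth
    pixel labels per round; the network trains on the revealed pixels only."""
    revealed = torch.zeros(full_label_map.shape, dtype=torch.bool)  # annotated-pixel mask

    for _ in range(rounds):
        # 1. Score every pixel's uncertainty with the current network.
        with torch.no_grad():
            probs = model(image.unsqueeze(0))[0].softmax(dim=0)  # (C, H, W)
        uncertainty = 1.0 - probs.max(dim=0).values              # (H, W)
        uncertainty[revealed] = -1.0                             # never re-query

        # 2. Oracle reveals the labels of the most uncertain pixels.
        query = uncertainty.flatten().topk(pixels_per_round).indices
        revealed.view(-1)[query] = True

        # 3. Train on the revealed pixels only (sparse cross-entropy).
        for _ in range(50):                                      # inner SGD steps (assumed)
            logits = model(image.unsqueeze(0))[0]                # (C, H, W)
            loss = F.cross_entropy(logits.permute(1, 2, 0)[revealed],
                                   full_label_map[revealed])
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()

        # 4. Stop once the predicted label map is accurate enough (~98%).
        with torch.no_grad():
            pred = model(image.unsqueeze(0))[0].argmax(dim=0)
        if (pred == full_label_map).float().mean() >= target_acc:
            break

    return pred  # pseudo-label map for the whole image
```

In the data-set-level experiment described above, the pseudo-label map returned for each image would then be pooled with the others to train a single network from scratch.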
