Paper Title
Improving Semantic Segmentation through Spatio-Temporal Consistency Learned from Videos
Paper Authors
Paper Abstract
We leverage unsupervised learning of depth, egomotion, and camera intrinsics to improve the performance of single-image semantic segmentation, by enforcing 3D-geometric and temporal consistency of segmentation masks across video frames. The predicted depth, egomotion, and camera intrinsics are used to provide an additional supervision signal to the segmentation model, significantly enhancing its quality, or, alternatively, reducing the number of labels the segmentation model needs. Our experiments were performed on the ScanNet dataset.
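The core consistency signal described above can be sketched as a rigid warp: backproject each pixel of a source frame with its predicted depth and the camera intrinsics, move the 3D points by the predicted egomotion, and reproject into the target frame to carry segmentation labels along. The function below is a minimal NumPy illustration of this idea under assumed conventions (pinhole intrinsics `K`, egomotion as rotation `R` and translation `t`); the names are hypothetical and this is not the authors' implementation.

```python
import numpy as np

def warp_segmentation(seg_src, depth_src, K, R, t):
    """Warp a per-pixel label map from a source frame into a target frame
    using predicted depth, egomotion (R, t), and intrinsics K.
    Illustrative sketch only, not the paper's code."""
    H, W = depth_src.shape
    # Pixel grid in homogeneous coordinates, shape (3, H*W).
    u, v = np.meshgrid(np.arange(W), np.arange(H))
    pix = np.stack([u, v, np.ones_like(u)]).reshape(3, -1).astype(np.float64)
    # Backproject to 3D with the predicted depth, apply egomotion,
    # and reproject into the target camera.
    pts = np.linalg.inv(K) @ pix * depth_src.reshape(1, -1)
    pts = R @ pts + t.reshape(3, 1)
    proj = K @ pts
    uv = np.rint(proj[:2] / proj[2:]).astype(int)
    # Scatter labels into the target frame; out-of-frame pixels stay -1.
    warped = -np.ones((H, W), dtype=seg_src.dtype)
    inb = (uv[0] >= 0) & (uv[0] < W) & (uv[1] >= 0) & (uv[1] < H)
    warped[uv[1, inb], uv[0, inb]] = seg_src.reshape(-1)[inb]
    return warped
```

Comparing `warp_segmentation(seg_t, ...)` against the segmentation predicted directly on the next frame yields the extra supervision signal: disagreements between the warped and predicted masks can be penalized without any manual labels.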