Paper Title

Noisy-LSTM: Improving Temporal Awareness for Video Semantic Segmentation

Paper Authors

Bowen Wang, Liangzhi Li, Yuta Nakashima, Ryo Kawasaki, Hajime Nagahara, Yasushi Yagi

Paper Abstract

Semantic video segmentation is a key challenge for various applications. This paper presents a new model named Noisy-LSTM, which is trainable in an end-to-end manner, with convolutional LSTMs (ConvLSTMs) to leverage the temporal coherency in video frames. We also present a simple yet effective training strategy, which replaces a frame in a video sequence with noise. This strategy spoils the temporal coherency in video frames during training and thus makes the temporal links in ConvLSTMs unreliable, which may consequently improve feature extraction from video frames, as well as serve as a regularizer to avoid overfitting, without requiring extra data annotation or computational cost. Experimental results demonstrate that the proposed model can achieve state-of-the-art performance on both the CityScapes and EndoVis2018 datasets.
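The noise-replacement strategy described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function name `replace_frame_with_noise`, the `(T, H, W, C)` clip layout, and the use of Gaussian noise are assumptions for the sketch.

```python
import numpy as np

def replace_frame_with_noise(frames, rng=None):
    """Randomly substitute one frame of a video clip with noise.

    Hypothetical sketch of the Noisy-LSTM training strategy: spoiling
    temporal coherency so the ConvLSTM's temporal links become unreliable.
    `frames` is assumed to be a (T, H, W, C) float array.
    """
    rng = rng or np.random.default_rng(0)
    noisy = frames.copy()
    t = int(rng.integers(len(frames)))  # pick one frame index at random
    # replace that frame entirely with Gaussian noise (assumed noise model)
    noisy[t] = rng.normal(0.0, 1.0, size=frames.shape[1:])
    return noisy, t

# toy clip: 4 frames of 2x2 RGB, all zeros
clip = np.zeros((4, 2, 2, 3), dtype=np.float32)
noisy_clip, idx = replace_frame_with_noise(clip)
```

During training, the perturbed clip would be fed to the ConvLSTM-based segmentation model in place of the original sequence, with the loss still computed on the unmodified ground-truth labels.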
