Diagnosing and Preventing Instabilities in Recurrent Video Processing

Paper title

Diagnosing and Preventing Instabilities in Recurrent Video Processing

Authors

Thomas Tanay, Aivar Sootla, Matteo Maggioni, Puneet K. Dokania, Philip Torr, Ales Leonardis, Gregory Slabaugh

Abstract

Recurrent models are a popular choice for video enhancement tasks such as video denoising or super-resolution. In this work, we focus on their stability as dynamical systems and show that they tend to fail catastrophically at inference time on long video sequences. To address this issue, we (1) introduce a diagnostic tool which produces input sequences optimized to trigger instabilities and that can be interpreted as visualizations of temporal receptive fields, and (2) propose two approaches to enforce the stability of a model during training: constraining the spectral norm or constraining the stable rank of its convolutional layers. We then introduce Stable Rank Normalization for Convolutional layers (SRN-C), a new algorithm that enforces these constraints. Our experimental results suggest that SRN-C successfully enforces stability in recurrent video processing models without a significant performance loss.
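The abstract names two quantities, the spectral norm and the stable rank, without defining them. As a rough illustration only, the sketch below computes both for a small dense matrix using their standard definitions: the spectral norm is the largest singular value, and the stable rank is the squared Frobenius norm divided by the squared spectral norm. This is not the SRN-C algorithm and it does not handle convolutional layers; the function names (`spectral_norm`, `stable_rank`, `scale_to_spectral_norm`) and the use of plain power iteration are assumptions made for the example.

```python
import math


def spectral_norm(matrix, iterations=200):
    """Estimate the largest singular value of `matrix` by power iteration on M^T M.

    Assumption: the uniform start vector is not orthogonal to the top right
    singular vector; a random start would be the safer choice in practice.
    """
    rows, cols = len(matrix), len(matrix[0])
    v = [1.0 / math.sqrt(cols)] * cols  # deterministic unit start vector
    for _ in range(iterations):
        # u = M v
        u = [sum(matrix[i][j] * v[j] for j in range(cols)) for i in range(rows)]
        # w = M^T u
        w = [sum(matrix[i][j] * u[i] for i in range(rows)) for j in range(cols)]
        norm_w = math.sqrt(sum(x * x for x in w))
        if norm_w == 0.0:
            return 0.0  # v is mapped to zero, so the estimate is zero
        v = [x / norm_w for x in w]
    # The norm of M v approximates the largest singular value once v has converged.
    u = [sum(matrix[i][j] * v[j] for j in range(cols)) for i in range(rows)]
    return math.sqrt(sum(x * x for x in u))


def stable_rank(matrix):
    """Squared Frobenius norm divided by squared spectral norm (nonzero matrix assumed)."""
    frobenius_sq = sum(x * x for row in matrix for x in row)
    sigma = spectral_norm(matrix)
    return frobenius_sq / (sigma * sigma)


def scale_to_spectral_norm(matrix, target):
    """Rescale `matrix` so that its spectral norm equals `target` (nonzero matrix assumed)."""
    sigma = spectral_norm(matrix)
    return [[x * target / sigma for x in row] for row in matrix]


if __name__ == "__main__":
    weights = [[3.0, 0.0], [0.0, 1.0]]
    print("spectral norm:", spectral_norm(weights))
    print("stable rank:", stable_rank(weights))
    scaled = scale_to_spectral_norm(weights, 0.5)
    print("spectral norm after rescaling:", spectral_norm(scaled))
```

Uniform rescaling changes the spectral norm but leaves the stable rank unchanged, which is one way to see that the two constraints mentioned in the abstract are distinct.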
