Paper Title
Learning a Condensed Frame for Memory-Efficient Video Class-Incremental Learning
Paper Authors
Paper Abstract
Recent incremental learning methods for action recognition usually store representative videos to mitigate catastrophic forgetting. However, only a few bulky videos can be stored due to limited memory. To address this problem, we propose FrameMaker, a memory-efficient video class-incremental learning approach that learns to produce a condensed frame for each selected video. Specifically, FrameMaker consists of two crucial components: Frame Condensing and Instance-Specific Prompt. The former reduces memory cost by preserving only one condensed frame instead of the whole video, while the latter compensates for the spatio-temporal details lost during Frame Condensing. In this way, FrameMaker achieves a remarkable reduction in memory while retaining enough information to be applied to the following incremental tasks. Experimental results on multiple challenging benchmarks, i.e., HMDB51, UCF101 and Something-Something V2, demonstrate that FrameMaker achieves better performance than recent advanced methods while consuming only 20% of the memory. Additionally, under the same memory budget, FrameMaker significantly outperforms existing state-of-the-art methods by a convincing margin.
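To make the idea concrete, below is a minimal sketch of how a condensed frame plus an instance-specific prompt could be optimized for one exemplar video. It assumes a frozen feature extractor `backbone`, an additive pixel-space prompt, and a simple feature-matching objective; these are illustrative assumptions, not the paper's actual losses or prompt design.

```python
import torch
import torch.nn.functional as F

def condense_video(video, backbone, steps=200, lr=0.01):
    """Sketch: distill one video into (condensed frame, instance-specific prompt).

    video:    (T, C, H, W) tensor of frames from one selected exemplar video.
    backbone: frozen feature extractor mapping (N, C, H, W) -> (N, D) features
              (assumed interface, not defined by the paper).
    """
    with torch.no_grad():
        # Aggregated video-level feature used as the matching target.
        target = backbone(video).mean(dim=0)

    # Frame Condensing: a single learnable frame, initialized as the mean frame.
    frame = video.mean(dim=0, keepdim=True).clone().requires_grad_(True)
    # Instance-Specific Prompt: a small learnable residual added to the frame,
    # intended to recover details lost when collapsing the video to one frame.
    prompt = torch.zeros_like(frame, requires_grad=True)

    opt = torch.optim.Adam([frame, prompt], lr=lr)
    for _ in range(steps):
        feat = backbone(frame + prompt).squeeze(0)  # feature of the prompted frame
        loss = F.mse_loss(feat, target)             # match the full-video feature
        opt.zero_grad()
        loss.backward()
        opt.step()

    # Only (frame, prompt) would be kept in rehearsal memory, not the whole video.
    return frame.detach(), prompt.detach()
```

At rehearsal time, the stored pair (frame, prompt) would stand in for the original video as an exemplar, which is where the memory saving over storing all T frames comes from.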