Paper Title
Learning a Condensed Frame for Memory-Efficient Video Class-Incremental Learning
Paper Authors
Paper Abstract
Recent incremental learning methods for action recognition usually store representative videos to mitigate catastrophic forgetting. However, only a few bulky videos can be stored due to limited memory. To address this problem, we propose FrameMaker, a memory-efficient video class-incremental learning approach that learns to produce a condensed frame for each selected video. Specifically, FrameMaker consists of two crucial components: Frame Condensing and Instance-Specific Prompt. The former reduces memory cost by preserving only one condensed frame instead of the whole video, while the latter compensates for the spatio-temporal details lost during Frame Condensing. In this way, FrameMaker achieves a remarkable reduction in memory while retaining enough information to be applied to the following incremental tasks. Experimental results on multiple challenging benchmarks, i.e., HMDB51, UCF101 and Something-Something V2, demonstrate that FrameMaker achieves better performance than recent advanced methods while consuming only 20% of the memory. Additionally, under the same memory budget, FrameMaker significantly outperforms existing state-of-the-art methods by a convincing margin.
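To make the idea concrete, below is a minimal sketch of how a condensed frame plus an instance-specific prompt could be optimized for one exemplar video. It assumes a frozen feature extractor `backbone`, an additive pixel-space prompt, and a simple feature-matching objective; these are illustrative assumptions, not the paper's actual losses or prompt design.

```python
import torch
import torch.nn.functional as F

def condense_video(video, backbone, steps=200, lr=0.01):
    """Sketch: distill one video into (condensed frame, instance-specific prompt).

    video:    (T, C, H, W) tensor of frames from one selected exemplar video.
    backbone: frozen feature extractor mapping (N, C, H, W) -> (N, D) features
              (assumed interface, not defined by the paper).
    """
    with torch.no_grad():
        # Aggregated video-level feature used as the matching target.
        target = backbone(video).mean(dim=0)

    # Frame Condensing: a single learnable frame, initialized as the mean frame.
    frame = video.mean(dim=0, keepdim=True).clone().requires_grad_(True)
    # Instance-Specific Prompt: a small learnable residual added to the frame,
    # intended to recover details lost when collapsing the video to one frame.
    prompt = torch.zeros_like(frame, requires_grad=True)

    opt = torch.optim.Adam([frame, prompt], lr=lr)
    for _ in range(steps):
        feat = backbone(frame + prompt).squeeze(0)  # feature of the prompted frame
        loss = F.mse_loss(feat, target)             # match the full-video feature
        opt.zero_grad()
        loss.backward()
        opt.step()

    # Only (frame, prompt) would be kept in rehearsal memory, not the whole video.
    return frame.detach(), prompt.detach()
```

At rehearsal time, the stored pair (frame, prompt) would stand in for the original video as an exemplar, which is where the memory saving over storing all T frames comes from.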