Paper Title

Concentrated Multi-Grained Multi-Attention Network for Video Based Person Re-Identification

Paper Authors

Panwen Hu, Jiazhen Liu, Rui Huang

Paper Abstract

Occlusion is still a severe problem in the video-based Re-IDentification (Re-ID) task, which has a great impact on the success rate. The attention mechanism has been proved to be helpful in solving the occlusion problem by a large number of existing methods. However, their attention mechanisms still lack the capability to extract sufficient discriminative information into the final representations from the videos. The single attention module scheme employed by existing methods cannot exploit multi-scale spatial cues, and the attention of the single module will be dispersed by multiple salient parts of the person. In this paper, we propose a Concentrated Multi-grained Multi-Attention Network (CMMANet) where two multi-attention modules are designed to extract multi-grained information through processing multi-scale intermediate features. Furthermore, multiple attention submodules in each multi-attention module can automatically discover multiple discriminative regions of the video frames. To achieve this goal, we introduce a diversity loss to diversify the submodules in each multi-attention module, and a concentration loss to integrate their attention responses so that each submodule can strongly focus on a specific meaningful part. The experimental results show that the proposed approach outperforms the state-of-the-art methods by large margins on multiple public datasets.
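The abstract names two auxiliary objectives: a diversity loss that keeps the attention submodules from collapsing onto the same region, and a concentration loss that keeps each submodule's response spatially compact. It does not give their formulas, so the short PyTorch sketch below only illustrates one plausible way to realize both terms on a stack of per-submodule attention maps. The function names, the pairwise-cosine formulation of the diversity term, and the centroid-spread formulation of the concentration term are assumptions for illustration, not the paper's exact definitions.

```python
import torch
import torch.nn.functional as F

def diversity_loss(attn_maps, eps=1e-8):
    """Assumed diversity term: mean pairwise cosine similarity between the
    attention maps of the K submodules (lower = more diverse).

    attn_maps: (B, K, H, W) non-negative attention responses.
    """
    B, K, H, W = attn_maps.shape
    flat = F.normalize(attn_maps.view(B, K, -1), dim=-1, eps=eps)  # unit-norm each map
    sim = torch.bmm(flat, flat.transpose(1, 2))                    # (B, K, K) cosine similarities
    diag = torch.diag_embed(torch.diagonal(sim, dim1=1, dim2=2))   # self-similarities to subtract
    # Average over the K*(K-1) off-diagonal (cross-submodule) pairs and the batch.
    return (sim - diag).sum(dim=(1, 2)).mean() / (K * (K - 1))

def concentration_loss(attn_maps, eps=1e-8):
    """Assumed concentration term: expected squared distance of each
    submodule's attention mass from its own spatial centroid, so a low value
    means the response is tightly focused on one region."""
    B, K, H, W = attn_maps.shape
    ys = torch.arange(H, dtype=attn_maps.dtype, device=attn_maps.device)
    xs = torch.arange(W, dtype=attn_maps.dtype, device=attn_maps.device)
    grid_y, grid_x = torch.meshgrid(ys, xs, indexing="ij")          # (H, W) coordinate grids
    mass = attn_maps.sum(dim=(2, 3), keepdim=True) + eps
    p = attn_maps / mass                                            # normalize to a spatial distribution
    cy = (p * grid_y).sum(dim=(2, 3), keepdim=True)                 # centroid row per (B, K)
    cx = (p * grid_x).sum(dim=(2, 3), keepdim=True)                 # centroid column per (B, K)
    spread = (p * ((grid_y - cy) ** 2 + (grid_x - cx) ** 2)).sum(dim=(2, 3))
    return spread.mean()
```

In training, such terms would typically be added to the identification objective with small weights, e.g. `loss = id_loss + lambda_d * diversity_loss(A) + lambda_c * concentration_loss(A)`, where `A` collects the attention maps of one multi-attention module; the weights and the combination rule are likewise assumptions here.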
