Paper Title


Parsing-based View-aware Embedding Network for Vehicle Re-Identification

Authors

Dechao Meng, Liang Li, Xuejing Liu, Yadong Li, Shijie Yang, Zhengjun Zha, Xingyu Gao, Shuhui Wang, Qingming Huang

Abstract


Vehicle re-identification (ReID) aims to find images of the same vehicle captured from various views in a cross-camera scenario. The main challenges of this task are the large intra-instance distance caused by different views and the subtle inter-instance discrepancy caused by similar vehicles. In this paper, we propose a parsing-based view-aware embedding network (PVEN) to achieve view-aware feature alignment and enhancement for vehicle ReID. First, we introduce a parsing network that parses a vehicle into four different views, and then align the features by mask average pooling. This alignment provides a fine-grained representation of the vehicle. Second, to enhance the view-aware features, we design a common-visible attention mechanism that focuses on the commonly visible views, which not only shortens the intra-instance distance but also enlarges the inter-instance discrepancy. PVEN helps capture stable discriminative information of vehicles under different views. Experiments conducted on three datasets show that our model outperforms state-of-the-art methods by a large margin.
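The two core operations described in the abstract, mask average pooling over parsed views and a common-visible weighting of per-view distances, can be sketched as follows. This is a minimal NumPy illustration of the general idea, not the authors' implementation; the function names, the use of mask area as a visibility score, and the L2 per-view distance are all assumptions for illustration.

```python
import numpy as np

def mask_average_pooling(feat, masks, eps=1e-6):
    """Pool a feature map into one vector per parsed view.

    feat:  (C, H, W) backbone feature map
    masks: (V, H, W) soft view masks (e.g. front/back/side/top), values in [0, 1]
    returns (V, C) view-aligned features
    """
    # weighted sum of features over spatial positions, per view
    num = np.einsum('vhw,chw->vc', masks, feat)
    # normalize by each mask's area so the result is a mean, not a sum
    area = masks.sum(axis=(1, 2))[:, None]
    return num / (area + eps)

def covisible_distance(f1, m1, f2, m2, eps=1e-6):
    """Distance between two vehicles weighted by commonly visible views.

    f1, f2: (V, C) view features; m1, m2: (V, H, W) view masks.
    A view contributes more when it is visible in *both* images, so the
    comparison focuses on parts that both cameras actually see.
    """
    v1 = m1.sum(axis=(1, 2))          # per-view visibility in image 1
    v2 = m2.sum(axis=(1, 2))          # per-view visibility in image 2
    w = v1 * v2                       # high only when visible in both
    w = w / (w.sum() + eps)           # normalized attention weights
    d = np.linalg.norm(f1 - f2, axis=1)  # per-view L2 distance
    return float((w * d).sum())
```

Down-weighting views that one camera cannot see is what shortens the intra-instance distance across viewpoints while keeping the inter-instance discrepancy on shared views intact.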
