Paper Title
Coarse- and Fine-grained Attention Network with Background-aware Loss for Crowd Density Map Estimation
Authors
Abstract
In this paper, we present a novel method, the Coarse- and Fine-grained Attention Network (CFANet), for generating high-quality crowd density maps and estimating people counts by incorporating attention maps to better focus on crowd regions. Because generating accurate fine-grained attention maps directly is normally difficult, we devise a coarse-to-fine progressive attention mechanism that integrates a Crowd Region Recognizer (CRR) branch and a Density Level Estimator (DLE) branch. This mechanism suppresses the influence of irrelevant background and assigns attention weights according to crowd density levels. We also employ a multi-level supervision mechanism to assist gradient backpropagation and reduce overfitting. In addition, we propose a Background-aware Structural Loss (BSL) to reduce the false recognition ratio while improving structural similarity to the ground truth. Extensive experiments on commonly used datasets show that our method not only outperforms previous state-of-the-art methods in count accuracy but also improves the image quality of the density maps and reduces the false recognition ratio.
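The coarse-to-fine idea can be illustrated with a toy sketch. The abstract does not say how the CRR and DLE outputs are combined, so the multiplicative gating below, the function names, and the per-level weights are all assumptions made for illustration and are not the authors' implementation. The only standard fact used is that the people count is obtained by summing the density map.

```python
def coarse_to_fine_attention(coarse_mask, level_probs, level_weights):
    """Toy coarse-to-fine attention (assumed form, not the paper's method).

    coarse_mask:   H x W values in [0, 1], a hypothetical CRR-style crowd mask.
    level_probs:   H x W x K probabilities over K density levels,
                   a hypothetical DLE-style output.
    level_weights: K attention weights, one per density level (assumed values).
    """
    fine = []
    for mask_row, prob_row in zip(coarse_mask, level_probs):
        fine_row = []
        for m, probs in zip(mask_row, prob_row):
            # Expected weight under the density-level distribution,
            # gated by the coarse mask so background pixels are suppressed.
            expected = sum(p * w for p, w in zip(probs, level_weights))
            fine_row.append(m * expected)
        fine.append(fine_row)
    return fine


def apply_attention(values, attention):
    """Element-wise product of a 2D map with a 2D attention map."""
    return [[v * a for v, a in zip(v_row, a_row)]
            for v_row, a_row in zip(values, attention)]


def estimate_count(density_map):
    """The people count is the sum of all density map values."""
    return sum(sum(row) for row in density_map)


# Toy 2 x 2 example with K = 2 density levels (all numbers are made up).
coarse_mask = [[0.0, 1.0], [1.0, 1.0]]           # top-left pixel is background
level_probs = [[[1.0, 0.0], [1.0, 0.0]],
               [[0.0, 1.0], [0.5, 0.5]]]
level_weights = [0.5, 1.0]                        # sparse level, dense level
raw_density = [[2.0, 2.0], [2.0, 4.0]]            # hypothetical raw prediction

fine_attention = coarse_to_fine_attention(coarse_mask, level_probs, level_weights)
refined_density = apply_attention(raw_density, fine_attention)
count = estimate_count(refined_density)
```

In this toy setup the background pixel contributes nothing to the count, because the coarse mask zeroes it out before the density-level weighting is applied.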