Deep Learning Techniques for Visual Counting

Paper Title

Deep Learning Techniques for Visual Counting

Author

Ciampi, Luca

Abstract

In this dissertation, we investigated and enhanced Deep Learning (DL) techniques for counting objects, such as pedestrians, cells, or vehicles, in still images or video frames. In particular, we tackled the challenge posed by the lack of data needed to train current DL-based solutions. Because labeling budgets are limited, data scarcity remains an open problem: it prevents existing solutions based on the supervised learning of neural networks from scaling, and it causes a significant drop in performance at inference time when these algorithms are presented with new scenarios. We introduced solutions that address this issue from several complementary sides: collecting automatically labeled datasets from virtual environments; proposing Domain Adaptation strategies that mitigate the domain gap between the training and test data distributions; and presenting a counting strategy for a weakly labeled data scenario, i.e., one with non-negligible disagreement between multiple annotators. Moreover, we tackled the non-trivial engineering challenges that arise when adopting Convolutional Neural Network-based techniques in environments with limited power resources, introducing solutions for counting vehicles and pedestrians directly onboard embedded vision systems, i.e., devices with constrained computational capabilities that can capture images and process them.
