Paper Title
YOLOSA: Object detection based on 2D local feature superimposed self-attention
Paper Authors
Paper Abstract
We analyzed the network structure of real-time object detection models and found that the features in the feature concatenation stage are very rich. Applying an attention module here can effectively improve the detection accuracy of the model. However, commonly used attention and self-attention modules perform poorly in terms of both detection accuracy and inference efficiency. Therefore, we propose a novel self-attention module, called 2D local feature superimposed self-attention, for the feature concatenation stage of the neck network. This self-attention module reflects global features through local features and local receptive fields. We also propose and optimize an efficient decoupled head and AB-OTA, and achieve SOTA results. Average precisions of 49.0% (71 FPS, 14 ms), 46.1% (85 FPS, 11.7 ms), and 39.1% (107 FPS, 9.3 ms) were obtained for the large, medium, and small-scale models built using our proposed improvements. Our models exceeded YOLOv5 by 0.8%-3.1% in average precision.
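The abstract only describes the idea at a high level: self-attention restricted to local receptive fields on the concatenated 2D feature map of the neck. As a rough, hedged illustration of that general technique (not the paper's actual 2D local feature superimposed self-attention), the following minimal PyTorch sketch partitions a feature map into non-overlapping windows and applies self-attention within each window. The class name LocalWindowSelfAttention, the window_size parameter, and the window-partition scheme are assumptions introduced here for illustration only.

# Illustrative sketch only: local-window self-attention on a 2D feature map,
# loosely following the abstract's idea of approximating global context via
# local receptive fields. Names and hyperparameters are hypothetical, not the
# paper's actual module.
import torch
import torch.nn as nn


class LocalWindowSelfAttention(nn.Module):
    def __init__(self, channels: int, window_size: int = 8, num_heads: int = 4):
        super().__init__()
        self.window_size = window_size
        self.norm = nn.LayerNorm(channels)
        self.attn = nn.MultiheadAttention(channels, num_heads, batch_first=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (B, C, H, W), e.g. the output of a neck feature-concatenation stage.
        b, c, h, w = x.shape
        ws = self.window_size
        # Pad so H and W are divisible by the window size.
        pad_h = (ws - h % ws) % ws
        pad_w = (ws - w % ws) % ws
        x = nn.functional.pad(x, (0, pad_w, 0, pad_h))
        hp, wp = x.shape[2], x.shape[3]
        # Partition into non-overlapping ws x ws windows (local receptive fields).
        x = x.view(b, c, hp // ws, ws, wp // ws, ws)
        x = x.permute(0, 2, 4, 3, 5, 1).reshape(-1, ws * ws, c)  # (B*nWin, ws*ws, C)
        # Self-attention restricted to each window, with a residual connection.
        y = self.norm(x)
        y, _ = self.attn(y, y, y)
        x = x + y
        # Reverse the window partition back to (B, C, Hp, Wp) and drop the padding.
        x = x.view(b, hp // ws, wp // ws, ws, ws, c)
        x = x.permute(0, 5, 1, 3, 2, 4).reshape(b, c, hp, wp)
        return x[:, :, :h, :w]


if __name__ == "__main__":
    feat = torch.randn(2, 256, 40, 40)  # a concatenated neck feature map
    block = LocalWindowSelfAttention(channels=256)
    print(block(feat).shape)  # torch.Size([2, 256, 40, 40])

Restricting attention to fixed-size windows keeps the cost linear in the number of spatial positions rather than quadratic, which is one plausible reason such a design can preserve real-time inference speed while still injecting attention at the feature concatenation stage.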