Unitail: Detecting, Reading, and Matching in Retail Scene

Paper Title

Unitail: Detecting, Reading, and Matching in Retail Scene

Authors

Chen, Fangyi; Zhang, Han; Li, Zaiwang; Dou, Jiachen; Mo, Shentong; Chen, Hao; Zhang, Yongxin; Ahmed, Uzair; Zhu, Chenchen; Savvides, Marios

Abstract

To make full use of computer vision technology in stores, one must consider the practical needs that fit the characteristics of the retail scene. Pursuing this goal, we introduce the United Retail Datasets (Unitail), a large-scale benchmark of basic visual tasks on products that challenges algorithms for detecting, reading, and matching. With 1.8M quadrilateral-shaped instances annotated, Unitail offers a detection dataset to align product appearance better. Furthermore, it provides a gallery-style OCR dataset containing 1454 product categories, 30k text regions, and 21k transcriptions to enable robust reading on products and motivate enhanced product matching. Besides benchmarking the datasets using various state-of-the-art methods, we customize a new detector for product detection and provide a simple OCR-based matching solution that verifies its effectiveness.
