UNITS: Unsupervised Intermediate Training Stage for Scene Text Detection

Paper title

UNITS: Unsupervised Intermediate Training Stage for Scene Text Detection

Authors

Guo, Youhui; Zhou, Yu; Qin, Xugong; Xie, Enze; Wang, Weiping

Abstract

Recent scene text detection methods are almost all deep-learning-based and data-driven. Because annotation is expensive, synthetic data is commonly adopted for pre-training. However, there are obvious domain discrepancies between synthetic data and real-world data, so directly adopting a model initialized on synthetic data in the fine-tuning stage may lead to sub-optimal performance. In this paper, we propose a new training paradigm for scene text detection, which introduces an UNsupervised Intermediate Training Stage (UNITS) that builds a buffer path to real-world data and can alleviate the gap between the pre-training stage and the fine-tuning stage. Three training strategies are further explored to perceive information from real-world data in an unsupervised way. With UNITS, scene text detectors are improved without introducing any additional parameters or computation during inference. Extensive experimental results show consistent performance improvements on three public datasets.
