Chart-RCNN: Efficient Line Chart Data Extraction from Camera Images

Authors

Li, Shufan; Lu, Congxi; Li, Linkai; Zhou, Haoshuai

Abstract

Line Chart Data Extraction is a natural extension of Optical Character Recognition where the objective is to recover the underlying numerical information a chart image represents. Some recent works such as ChartOCR approach this problem using multi-stage networks combining OCR models with object detection frameworks. However, most of the existing datasets and models are based on "clean" images such as screenshots that drastically differ from camera photos. In addition, creating domain-specific new datasets requires extensive labeling which can be time-consuming. Our main contributions are as follows: we propose a synthetic data generation framework and a one-stage model that outputs text labels, mark coordinates, and perspective estimation simultaneously. We collected two datasets consisting of real camera photos for evaluation. Results show that our model trained only on synthetic data can be applied to real photos without any fine-tuning and is feasible for real-world application.
