Title

Joint Layout Analysis, Character Detection and Recognition for Historical Document Digitization

Authors

Weihong Ma, Hesuo Zhang, Lianwen Jin, Sihang Wu, Jiapeng Wang, Yongpan Wang

Abstract

In this paper, we propose an end-to-end trainable framework for restoring the content of historical documents in the correct reading order. In this framework, two branches, a character branch and a layout branch, are added behind the feature extraction network. The character branch localizes individual characters in a document image and recognizes them simultaneously; a post-processing method then groups them into text lines. The layout branch, based on a fully convolutional network, outputs a binary mask. We then apply the Hough transform to detect lines on the binary mask and combine the character results with the layout information to restore the document content. The two branches can be trained in parallel and are easy to train. Furthermore, we propose a re-score mechanism to minimize recognition error. Experimental results on the extended Chinese historical document dataset MTHv2 demonstrate the effectiveness of the proposed framework.
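The abstract names the Hough transform as the line detection step on the layout mask but gives no implementation details. The following is a minimal sketch of a standard Hough transform over a small binary mask, not the authors' code. The function name `hough_lines`, the parameters `num_thetas` and `min_votes`, and the representation of the mask as a list of 0/1 rows are all assumptions made for illustration.

```python
import math
from collections import Counter


def hough_lines(mask, num_thetas=180, min_votes=5):
    """Detect straight lines in a binary mask with a basic Hough transform.

    Hypothetical helper for illustration only. `mask` is a list of rows of
    0/1 values. Each line is parameterized as
        rho = x * cos(theta) + y * sin(theta)
    Returns (rho, theta_in_degrees, votes) tuples, strongest first.
    """
    thetas = [math.pi * t / num_thetas for t in range(num_thetas)]
    votes = Counter()
    for y, row in enumerate(mask):
        for x, value in enumerate(row):
            if not value:
                continue
            # Every foreground pixel votes for all lines passing through it.
            for t_idx, theta in enumerate(thetas):
                rho = round(x * math.cos(theta) + y * math.sin(theta))
                votes[(rho, t_idx)] += 1
    lines = [
        (rho, t_idx * 180 / num_thetas, count)
        for (rho, t_idx), count in votes.items()
        if count >= min_votes
    ]
    # Strongest lines first; ties are broken by angle, then by rho.
    lines.sort(key=lambda line: (-line[2], line[1], line[0]))
    return lines


# Toy 10x10 mask with one vertical stroke at x = 3 (an assumed stand-in
# for a column separator in a layout mask).
mask = [[1 if x == 3 else 0 for x in range(10)] for _ in range(10)]
lines = hough_lines(mask)
strongest = lines[0]  # (rho, theta in degrees, votes)
```

On this toy mask the strongest line has theta 0 and rho 3, that is, the vertical line x = 3. Nearby angles receive equally many votes because of the coarse rounding of rho, so a practical pipeline would typically add non-maximum suppression over the accumulator; how the paper handles this is not stated in the abstract.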
