Paper Title
A Masked Bounding-Box Selection Based ResNet Predictor for Text Rotation Prediction
Authors
Abstract
Existing Optical Character Recognition (OCR) systems can recognize text in images when the text is horizontal. As the rotation of the text increases, however, recognition becomes harder and OCR performance degrades. Predicting the rotation of the text and correcting the image accordingly is therefore important. Previous work mainly uses traditional computer vision methods such as the Hough transform and deep learning methods such as convolutional neural networks. However, all of these methods are sensitive to the background noise that commonly appears in general images containing text. To tackle this problem, we introduce a new masked bounding-box selection method that incorporates bounding-box information into the system. By training a ResNet predictor to focus on the bounding box as the region of interest (ROI), the predictor learns to ignore background noise. Evaluations on the text rotation prediction task show that our method improves performance by a large margin.
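The abstract does not say how the bounding-box information is applied to the image, so the following is only a minimal sketch of one plausible reading: pixels outside every text bounding box are replaced with a constant before the image reaches the predictor. The function name `mask_outside_boxes`, the `(x_min, y_min, x_max, y_max)` box format, and the zero fill value are illustrative assumptions, not details taken from the paper.

```python
import numpy as np


def mask_outside_boxes(image, boxes, fill_value=0):
    """Keep pixels inside any bounding box and overwrite the rest.

    Hypothetical helper: the box format (x_min, y_min, x_max, y_max) in
    pixel coordinates and the constant fill are assumptions of this sketch.
    """
    height, width = image.shape[:2]
    keep = np.zeros((height, width), dtype=bool)

    for x_min, y_min, x_max, y_max in boxes:
        # Clip each box to the image so out-of-range boxes are handled safely.
        x0, x1 = max(0, x_min), min(width, x_max)
        y0, y1 = max(0, y_min), min(height, y_max)
        if x0 < x1 and y0 < y1:
            keep[y0:y1, x0:x1] = True

    # Start from a constant background and copy back only the ROI pixels.
    masked = np.full_like(image, fill_value)
    masked[keep] = image[keep]
    return masked


# Example: a 4x4 RGB image with one 2x2 text box in the centre.
image = np.arange(48).reshape(4, 4, 3) + 1
masked = mask_outside_boxes(image, [(1, 1, 3, 3)])
```

Under this reading, the masked image would replace the raw image as input to the rotation predictor, so background content outside the boxes cannot influence the prediction.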