Paper Title
PuzzleNet: Scene Text Detection by Segment Context Graph Learning
Paper Authors
Abstract
Recently, a series of decomposition-based scene text detection methods have achieved impressive progress by decomposing challenging text regions into pieces and linking them in a bottom-up manner. However, most of them focus only on linking independent text pieces, while context information is underexploited. In a jigsaw puzzle, the solver puts pieces together logically according to the contextual information of each piece in order to arrive at the correct solution. Inspired by this, we propose a novel decomposition-based method, termed Puzzle Networks (PuzzleNet), to address the challenging scene text detection task. PuzzleNet consists of a Segment Proposal Network (SPN), which predicts candidate text segments that fit text regions of arbitrary shape, and a two-branch Multiple-Similarity Graph Convolutional Network (MSGCN), which models both the appearance and the geometry correlations between each segment and its contextual segments. By building context graphs over segments, MSGCN effectively uses segment context to predict combinations of segments. Final polygon-shaped detections are produced by merging segments according to the predicted combinations. Evaluations on three benchmark datasets, ICDAR15, MSRA-TD500 and SCUT-CTW1500, demonstrate that our method achieves performance better than or comparable to the current state of the art, a result that benefits from exploiting the segment context graph.
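The abstract does not specify how the context graph is built or how predicted combinations are merged, so the following is only a minimal sketch under stated assumptions, not the authors' implementation. It assumes each segment is summarized by a 2D center point, that the context of a segment is its k nearest segments, and that the network's output has already been reduced to a list of index pairs predicted to belong together. The names `build_context_graph` and `group_segments` are hypothetical, and the learned MSGCN is not modeled here.

```python
from math import dist


def build_context_graph(centers, k=2):
    """For each segment, return the indices of its k nearest segments.

    Assumption: context is defined by Euclidean distance between segment
    centers; the abstract does not state how contextual segments are chosen.
    """
    neighbors = []
    for i, c in enumerate(centers):
        others = sorted(
            (j for j in range(len(centers)) if j != i),
            key=lambda j: dist(c, centers[j]),
        )
        neighbors.append(others[:k])
    return neighbors


def group_segments(num_segments, links):
    """Group segments into text instances.

    Assumption: `links` is a list of (i, j) pairs that a model predicted
    should be combined. Groups are the connected components of these links,
    computed with union-find.
    """
    parent = list(range(num_segments))

    def find(x):
        # Walk to the root, shortening the path along the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in links:
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[rb] = ra

    groups = {}
    for i in range(num_segments):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


if __name__ == "__main__":
    # Hypothetical segment centers: three on one text line, two on another.
    centers = [(0, 0), (1, 0), (2, 0), (10, 5), (11, 5)]
    print(build_context_graph(centers, k=1))
    # Hypothetical predicted links between segments.
    print(group_segments(len(centers), [(0, 1), (1, 2), (3, 4)]))
```

In a full pipeline, each resulting group would then be converted into a polygon covering its member segments; that step depends on the segment representation and is omitted here.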