Re-ranking for Writer Identification and Writer Retrieval

Paper title

Re-ranking for Writer Identification and Writer Retrieval

Authors

Jordan, Simon; Seuret, Mathias; Král, Pavel; Lenc, Ladislav; Martínek, Jiří; Wiermann, Barbara; Schwinger, Tobias; Maier, Andreas; Christlein, Vincent

Abstract

Automatic writer identification is a common problem in document analysis. State-of-the-art methods typically focus on the feature extraction step with traditional or deep-learning-based techniques. In retrieval problems, re-ranking is a commonly used technique to improve the results. Re-ranking refines an initial ranking result by using the knowledge contained in the ranked result, e.g., by exploiting nearest neighbor relations. To the best of our knowledge, re-ranking has not been used for writer identification/retrieval. A possible reason might be that publicly available benchmark datasets contain only a few samples per writer, which makes re-ranking less promising. We show that a re-ranking step based on k-reciprocal nearest neighbor relationships is advantageous for writer identification, even if only a few samples per writer are available. We use these reciprocal relationships in two ways: we either encode them into new vectors, as originally proposed, or integrate them in terms of query expansion. We show that both techniques outperform the baseline results in terms of mAP on three writer identification datasets.
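The k-reciprocal neighbor relation mentioned in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names `k_reciprocal_neighbors` and `reciprocal_query_expansion` are hypothetical, and the use of cosine similarity, non-zero feature vectors, and plain mean averaging for the expansion step are assumptions made only for illustration.

```python
import numpy as np


def k_reciprocal_neighbors(features, k):
    """Return, for each sample, the set of its k-reciprocal nearest neighbors.

    Sample j is a k-reciprocal neighbor of sample i if j is among the k
    nearest neighbors of i AND i is among the k nearest neighbors of j.
    Assumption: cosine similarity on non-zero feature vectors.
    """
    # L2-normalize so that the dot product equals cosine similarity.
    x = features / np.linalg.norm(features, axis=1, keepdims=True)
    sim = x @ x.T
    np.fill_diagonal(sim, -np.inf)  # a sample is not its own neighbor
    # Indices of the k most similar samples for every row.
    knn = np.argsort(-sim, axis=1)[:, :k]
    knn_sets = [set(row.tolist()) for row in knn]
    return [
        {j for j in knn_sets[i] if i in knn_sets[j]}
        for i in range(len(knn_sets))
    ]


def reciprocal_query_expansion(features, k):
    """Replace each descriptor by the mean of itself and its k-reciprocal
    neighbors, then re-normalize (assumed averaging scheme)."""
    x = features / np.linalg.norm(features, axis=1, keepdims=True)
    neighbors = k_reciprocal_neighbors(features, k)
    expanded = np.empty_like(x)
    for i, nbrs in enumerate(neighbors):
        idx = [i] + sorted(nbrs)
        expanded[i] = x[idx].mean(axis=0)
    return expanded / np.linalg.norm(expanded, axis=1, keepdims=True)


# Toy data: two "writers" with two samples each (illustrative values only).
toy = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]])
toy_neighbors = k_reciprocal_neighbors(toy, k=1)
toy_expanded = reciprocal_query_expansion(toy, k=1)
```

Ranking the database by similarity to the expanded descriptors, rather than to the original ones, is one way such a re-ranking step could be used. Because the neighbor relation must hold in both directions, it stays selective even when each writer contributes only a few samples.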
