Paper title
Re-ranking for Writer Identification and Writer Retrieval
Paper authors
Paper abstract
Automatic writer identification is a common problem in document analysis. State-of-the-art methods typically focus on the feature extraction step, using traditional or deep-learning-based techniques. In retrieval problems, re-ranking is a commonly used technique to improve the results. Re-ranking refines an initial ranking by using the knowledge contained in the ranked result, e.g., by exploiting nearest neighbor relations. To the best of our knowledge, re-ranking has not been used for writer identification/retrieval. A possible reason is that publicly available benchmark datasets contain only a few samples per writer, which makes re-ranking less promising. We show that a re-ranking step based on k-reciprocal nearest neighbor relationships is advantageous for writer identification, even if only a few samples per writer are available. We use these reciprocal relationships in two ways: we either encode them into new vectors, as originally proposed, or integrate them by means of query expansion. We show that both techniques outperform the baseline results in terms of mAP on three writer identification datasets.
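The idea of k-reciprocal nearest neighbors and their use for query expansion can be illustrated with a minimal sketch. This is not the authors' implementation: the cosine distance, the plain averaging of descriptors, and all function names (`cosine_distance`, `knn`, `k_reciprocal_neighbors`, `expand_queries`) are assumptions made for illustration only. Two samples are k-reciprocal neighbors when each appears in the other's k-nearest-neighbor list; the expansion step here simply replaces each descriptor by the mean of itself and its reciprocal neighbors.

```python
import math


def cosine_distance(a, b):
    """Cosine distance between two descriptor vectors (assumed metric)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def knn(vectors, k):
    """For each sample, return the set of indices of its k nearest neighbors."""
    n = len(vectors)
    result = []
    for i in range(n):
        others = [j for j in range(n) if j != i]  # exclude the sample itself
        others.sort(key=lambda j: cosine_distance(vectors[i], vectors[j]))
        result.append(set(others[:k]))
    return result


def k_reciprocal_neighbors(vectors, k):
    """Keep neighbor j of sample i only if i is also among j's k nearest."""
    nn = knn(vectors, k)
    return [sorted(j for j in nn[i] if i in nn[j]) for i in range(len(vectors))]


def expand_queries(vectors, k):
    """Average each descriptor with its k-reciprocal neighbors (assumed rule)."""
    reciprocal = k_reciprocal_neighbors(vectors, k)
    expanded = []
    for i, neighbors in enumerate(reciprocal):
        group = [vectors[i]] + [vectors[j] for j in neighbors]
        expanded.append([sum(column) / len(group) for column in zip(*group)])
    return expanded


# Toy descriptors: samples 0 and 1 mimic one writer, samples 2 and 3 another.
descriptors = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
print(k_reciprocal_neighbors(descriptors, 1))
print(expand_queries(descriptors, 1)[0])
```

In a retrieval setting, the expanded descriptors would then be used to recompute the distances and produce the re-ranked list. With only a few samples per writer, a small k keeps the reciprocal sets from absorbing samples of other writers.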