Paper Title

Zero-shot visual reasoning through probabilistic analogical mapping

Authors

Webb, Taylor W., Fu, Shuhao, Bihl, Trevor, Holyoak, Keith J., Lu, Hongjing

Abstract

Human reasoning is grounded in an ability to identify highly abstract commonalities governing superficially dissimilar visual inputs. Recent efforts to develop algorithms with this capacity have largely focused on approaches that require extensive direct training on visual reasoning tasks, and yield limited generalization to problems with novel content. In contrast, a long tradition of research in cognitive science has focused on elucidating the computational principles underlying human analogical reasoning; however, this work has generally relied on manually constructed representations. Here we present visiPAM (visual Probabilistic Analogical Mapping), a model of visual reasoning that synthesizes these two approaches. VisiPAM employs learned representations derived directly from naturalistic visual inputs, coupled with a similarity-based mapping operation derived from cognitive theories of human reasoning. We show that without any direct training, visiPAM outperforms a state-of-the-art deep learning model on an analogical mapping task. In addition, visiPAM closely matches the pattern of human performance on a novel task involving mapping of 3D objects across disparate categories.
