Paper Title
Multiple interaction learning with question-type prior knowledge for constraining answer search space in visual question answering
Paper Authors
Paper Abstract
Different approaches have been proposed for Visual Question Answering (VQA). However, few works consider how different joint-modality methods behave with respect to question-type prior knowledge extracted from data, which constrains the answer search space and provides a reliable cue for reasoning about answers to questions asked about input images. In this paper, we propose a novel VQA model that utilizes question-type prior information to improve VQA by leveraging the multiple interactions between different joint-modality methods, based on their behaviors when answering questions of different types. Extensive experiments on two benchmark datasets, i.e., VQA 2.0 and TDIUC, indicate that the proposed method yields the best performance compared with the most competitive approaches.
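The core idea of constraining the answer search space with a question-type prior can be illustrated with a minimal sketch. The answer vocabulary, the type-to-answer mapping, and the masking scheme below are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

# Hypothetical answer vocabulary and question-type prior (assumptions
# for illustration; the paper's actual vocabulary and mapping differ).
ANSWER_VOCAB = ["red", "blue", "two", "three", "yes", "no"]
TYPE_TO_ANSWERS = {
    "color":    {"red", "blue"},
    "counting": {"two", "three"},
    "yes/no":   {"yes", "no"},
}

def constrain_logits(logits, question_type):
    """Mask answer logits that fall outside the subset of answers
    plausible for the predicted question type."""
    allowed = TYPE_TO_ANSWERS[question_type]
    mask = np.array([0.0 if a in allowed else -np.inf for a in ANSWER_VOCAB])
    return logits + mask

# Raw scores over the full answer vocabulary; "two" scores highest overall.
logits = np.array([0.2, 0.1, 1.5, 0.3, 0.9, 0.4])

# With the "yes/no" prior applied, the argmax is restricted to {"yes", "no"}.
constrained = constrain_logits(logits, "yes/no")
best = ANSWER_VOCAB[int(np.argmax(constrained))]
# → "yes"
```

The prior acts as a hard mask here for clarity; a learned model would more likely weight the joint-modality predictions softly by question type rather than zero out candidates outright.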