Paper title
Pragmatic Issue-Sensitive Image Captioning
Authors
Abstract
Image captioning systems have recently improved dramatically, but they still tend to produce captions that are insensitive to the communicative goals that captions should meet. To address this, we propose Issue-Sensitive Image Captioning (ISIC). In ISIC, a captioning system is given a target image and an issue, which is a set of images partitioned in a way that specifies what information is relevant. The goal of the captioner is to produce a caption that resolves this issue. To model this task, we use an extension of the Rational Speech Acts model of pragmatic language use. Our extension is built on top of state-of-the-art pretrained neural image captioners and explicitly reasons about issues in our sense. We establish experimentally that these models generate captions that are both highly descriptive and issue-sensitive, and we show how ISIC can complement and enrich the related task of Visual Question Answering.
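To make the idea of an issue as a partition over images concrete, here is a minimal toy sketch in the spirit of the Rational Speech Acts framework. It is not the authors' model: the abstract gives no formulas, so the scoring rule (base caption probability times the listener's belief mass on the target's cell, raised to `alpha`), the uniform image prior, the hand-written probability table `S0`, and every name below are assumptions made for illustration only. In a real system the base probabilities would come from a pretrained neural captioner.

```python
# Toy issue-sensitive pragmatic speaker (illustrative assumptions throughout).

CAPTIONS = ["a bird", "a red bird", "a bird on a branch"]

# S0[i][u]: hypothetical base-captioner probability of caption u for image i.
# Images: 0 = red bird on a branch, 1 = red bird in flight,
#         2 = blue bird on a branch, 3 = blue bird in flight.
S0 = [
    [0.4, 0.3, 0.3],
    [0.6, 0.4, 0.0],
    [0.6, 0.0, 0.4],
    [1.0, 0.0, 0.0],
]

# An issue is a partition of the image set into cells.
COLOR_ISSUE = [{0, 1}, {2, 3}]      # red vs. blue
LOCATION_ISSUE = [{0, 2}, {1, 3}]   # on a branch vs. in flight


def normalize(weights):
    """Scale non-negative weights so they sum to 1."""
    total = sum(weights)
    return [w / total for w in weights]


def listener(s0):
    """Return L1[u][i], the belief in image i after hearing caption u.

    Assumes a uniform prior over images and that every caption has
    nonzero probability for at least one image.
    """
    n_images, n_captions = len(s0), len(s0[0])
    return [
        normalize([s0[i][u] for i in range(n_images)])
        for u in range(n_captions)
    ]


def issue_sensitive_speaker(s0, cells, target, alpha=1.0):
    """Return a distribution over captions for `target` under an issue.

    A caption scores highly when the listener, on hearing it, places its
    belief on the cell containing the target image (assumed scoring rule).
    """
    l1 = listener(s0)
    cell = next(c for c in cells if target in c)
    scores = []
    for u, l1_u in enumerate(l1):
        cell_mass = sum(l1_u[i] for i in cell)
        scores.append(s0[target][u] * cell_mass ** alpha)
    return normalize(scores)


def best_caption(s0, cells, target):
    """Return the highest-probability caption string for the target."""
    probs = issue_sensitive_speaker(s0, cells, target)
    return CAPTIONS[max(range(len(probs)), key=probs.__getitem__)]


if __name__ == "__main__":
    print(best_caption(S0, COLOR_ISSUE, target=0))
    print(best_caption(S0, LOCATION_ISSUE, target=0))
```

With this toy table, the same target image (the red bird on a branch) receives the color caption under the color partition and the location caption under the location partition, which is the behavior the abstract calls issue-sensitivity. Raising `alpha` in this sketch would weight the issue more heavily relative to the base captioner's preferences.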