Paper Title


Unsupervised Natural Language Inference via Decoupled Multimodal Contrastive Learning

Authors

Wanyun Cui, Guangyu Zheng, Wei Wang

Abstract


We propose to solve the natural language inference problem without any supervision from the inference labels, via task-agnostic multimodal pretraining. Although recent studies of multimodal self-supervised learning also represent the linguistic and visual context, their encoders for different modalities are coupled. Thus they cannot incorporate visual information when encoding plain text alone. In this paper, we propose the Multimodal Aligned Contrastive Decoupled learning (MACD) network. MACD forces the decoupled text encoder to represent the visual information via contrastive learning. Therefore, it embeds visual knowledge even for plain text inference. We conducted comprehensive experiments over plain text inference datasets (i.e., SNLI and STS-B). The unsupervised MACD even outperforms the fully-supervised BiLSTM and BiLSTM+ELMo on STS-B.
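The abstract describes aligning a decoupled text encoder with visual representations via contrastive learning, but does not spell out the objective. A common choice for this kind of cross-modal alignment is an InfoNCE-style loss that pulls matched text–image pairs together and pushes mismatched pairs apart. The sketch below is illustrative only and is not the paper's exact formulation; the function name, temperature value, and NumPy implementation are assumptions.

```python
import numpy as np

def info_nce_loss(text_emb, image_emb, temperature=0.07):
    """Illustrative InfoNCE-style contrastive loss between the outputs of
    two decoupled encoders. Row i of text_emb is assumed to be paired
    with row i of image_emb; all other rows serve as negatives.
    (Hypothetical sketch, not the MACD paper's exact objective.)"""
    # L2-normalize each embedding so the dot product is cosine similarity.
    t = text_emb / np.linalg.norm(text_emb, axis=1, keepdims=True)
    v = image_emb / np.linalg.norm(image_emb, axis=1, keepdims=True)
    # (N, N) similarity matrix; matched pairs lie on the diagonal.
    logits = (t @ v.T) / temperature
    # Numerically stable log-softmax over the negatives in each row.
    logits = logits - logits.max(axis=1, keepdims=True)
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    # Cross-entropy against the diagonal (the true text-image pairing).
    return -np.mean(np.diag(log_probs))
```

Because only the embeddings enter the loss, the two encoders can remain fully decoupled: at inference time the text encoder runs alone, yet its representations have been shaped by the visual side during pretraining.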
