Probing Task-Oriented Dialogue Representation from Language Models

Paper Title

Probing Task-Oriented Dialogue Representation from Language Models

Authors

Chien-Sheng Wu, Caiming Xiong

Abstract

This paper investigates pre-trained language models to find out which model intrinsically carries the most informative representation for task-oriented dialogue tasks. We approach the problem from two aspects: supervised classifier probe and unsupervised mutual information probe. We fine-tune a feed-forward layer as the classifier probe on top of a fixed pre-trained language model with annotated labels in a supervised way. Meanwhile, we propose an unsupervised mutual information probe to evaluate the mutual dependence between a real clustering and a representation clustering. The goals of this empirical paper are to 1) investigate probing techniques, especially from the unsupervised mutual information aspect, 2) provide guidelines of pre-trained language model selection for the dialogue research community, 3) find insights of pre-training factors for dialogue application that may be the key to success.
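
The abstract does not say how the mutual dependence between the two clusterings is computed, so the following is only a minimal sketch under stated assumptions. It assumes each utterance already has a ground-truth label (the "real clustering") and a cluster id obtained by clustering the frozen model's representations, and it scores their agreement with normalized mutual information. The function names, the geometric-mean normalization, and the toy labels are illustrative assumptions, not the authors' implementation.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in nats) of a clustering given as a list of labels."""
    n = len(labels)
    return -sum((c / n) * math.log(c / n) for c in Counter(labels).values())


def mutual_information(labels_a, labels_b):
    """Mutual information (in nats) between two clusterings of the same items."""
    if len(labels_a) != len(labels_b):
        raise ValueError("both clusterings must cover the same items")
    n = len(labels_a)
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    joint = Counter(zip(labels_a, labels_b))
    mi = 0.0
    for (a, b), n_ab in joint.items():
        # p(a, b) * log( p(a, b) / (p(a) * p(b)) )
        mi += (n_ab / n) * math.log(n * n_ab / (count_a[a] * count_b[b]))
    return mi


def normalized_mutual_information(labels_a, labels_b):
    """MI divided by the geometric mean of the two entropies (an assumed choice)."""
    h_a, h_b = entropy(labels_a), entropy(labels_b)
    if h_a == 0.0 or h_b == 0.0:
        # Degenerate case: at least one clustering puts everything in one cluster.
        return 1.0 if h_a == h_b else 0.0
    return mutual_information(labels_a, labels_b) / math.sqrt(h_a * h_b)


# Hypothetical toy data: annotated intents vs. cluster ids of model representations.
real_clustering = ["book", "book", "cancel", "cancel"]
aligned_clusters = [0, 0, 1, 1]    # clusters match the annotation exactly
unrelated_clusters = [0, 1, 0, 1]  # clusters carry no information about it

score_aligned = normalized_mutual_information(real_clustering, aligned_clusters)
score_unrelated = normalized_mutual_information(real_clustering, unrelated_clusters)
```

Under this sketch, a model whose representation clusters line up with the annotated labels scores close to 1, and one whose clusters are unrelated to them scores close to 0. No labels are used to train anything, which is what makes the probe unsupervised.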
