mForms: Multimodal Form-Filling with Question Answering

Paper Title

mForms : Multimodal Form-Filling with Question Answering

Authors

Larry Heck, Simon Heck, Anirudh Sundar

Abstract

This paper presents a new approach to form-filling by reformulating the task as multimodal natural language Question Answering (QA). The reformulation is achieved by first translating the elements on the GUI form (text fields, buttons, icons, etc.) to natural language questions, where these questions capture the element's multimodal semantics. After a match is determined between the form element (Question) and the user utterance (Answer), the form element is filled through a pre-trained extractive QA system. By leveraging pre-trained QA models and not requiring form-specific training, this approach to form-filling is zero-shot. The paper also presents an approach to further refine the form-filling by using multi-task training to incorporate a potentially large number of successive tasks. Finally, the paper introduces a multimodal natural language form-filling dataset Multimodal Forms (mForms), as well as a multimodal extension of the popular ATIS dataset to support future research and experimentation. Results show the new approach not only maintains robust accuracy for sparse training conditions but achieves state-of-the-art F1 of 0.97 on ATIS with approximately 1/10th of the training data.
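The pipeline described in the abstract (form element to question, then extractive QA over the user utterance) can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration and does not come from the paper: the `FormElement` fields, the question template (which uses only the visible label, not the full multimodal semantics), and `toy_extractive_qa`, a regex stand-in that takes the place of a pre-trained extractive QA model so the example runs without downloading one.

```python
import re
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class FormElement:
    """Hypothetical description of one GUI form element."""
    name: str   # slot identifier used as the key in the filled form
    label: str  # visible text shown next to the element
    kind: str   # e.g. "text_field" or "button" (unused by this simple template)


def element_to_question(element: FormElement) -> str:
    """Turn a form element into a natural language question (assumed template)."""
    return f"What is the {element.label.lower()}?"


def toy_extractive_qa(question: str, context: str) -> Optional[str]:
    """Stand-in for a pre-trained extractive QA model.

    Returns a span copied from `context`, or None when no answer is found.
    A real system would call a QA model here instead of regex rules.
    """
    patterns = {
        "departure city": r"\bfrom ([A-Z]\w+)",
        "arrival city": r"\bto ([A-Z]\w+)",
    }
    for key, pattern in patterns.items():
        if key in question:
            match = re.search(pattern, context)
            return match.group(1) if match else None
    return None


def fill_form(
    elements: List[FormElement],
    utterance: str,
    qa: Callable[[str, str], Optional[str]] = toy_extractive_qa,
) -> Dict[str, str]:
    """Ask one question per form element and keep only the answered slots."""
    filled: Dict[str, str] = {}
    for element in elements:
        answer = qa(element_to_question(element), utterance)
        if answer is not None:
            filled[element.name] = answer
    return filled


form = [
    FormElement("from_city", "Departure city", "text_field"),
    FormElement("to_city", "Arrival city", "text_field"),
    FormElement("date", "Departure date", "text_field"),
]
result = fill_form(form, "Show me flights from Boston to Denver")
print(result)
```

Because `fill_form` takes the QA function as a parameter, the toy extractor can be swapped for any callable that maps a question and a context to an answer span. No form-specific training is involved in this sketch, which is the sense in which the abstract calls the approach zero-shot.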
