Paper Title
Compositionality as Lexical Symmetry
Paper Authors
Paper Abstract
In tasks like semantic parsing, instruction following, and question answering, standard deep networks fail to generalize compositionally from small datasets. Many existing approaches overcome this limitation with model architectures that enforce a compositional process of sentence interpretation. In this paper, we present a domain-general and model-agnostic formulation of compositionality as a constraint on symmetries of data distributions rather than models. Informally, we prove that whenever a task can be solved by a compositional model, there is a corresponding data augmentation scheme -- a procedure for transforming examples into other well-formed examples -- that imparts a compositional inductive bias to any model trained to solve the same task. We describe a procedure called LEXSYM that discovers these transformations automatically, then applies them to training data for ordinary neural sequence models. Unlike existing compositional data augmentation procedures, LEXSYM can be deployed agnostically across text, structured data, and even images. It matches or surpasses state-of-the-art, task-specific models on the COGS semantic parsing, SCAN and ALCHEMY instruction following, and CLEVR-COGENT visual question answering datasets.
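To make the idea of "transforming examples into other well-formed examples" concrete, the following is a minimal Python sketch of lexicon-based augmentation in the spirit described above. It is not the paper's LEXSYM algorithm; the toy SCAN-style lexicon, the example pair, and the function names are all hypothetical. The sketch swaps one aligned lexical item for another, applying the same substitution to both the input command and the output action sequence, so the augmented pair stays well formed.

# Minimal illustrative sketch; NOT the paper's LEXSYM implementation.
# Lexicon, data, and names below are hypothetical.
import random

# Hypothetical alignment of input words to output symbols (SCAN-style).
LEXICON = {
    "jump": "JUMP",
    "walk": "WALK",
    "run": "RUN",
}

def augment(example, lexicon=LEXICON, rng=random):
    """Swap one aligned lexical item, consistently on both sides.

    `example` is a (command, action_sequence) pair of whitespace-separated
    strings. Returns a new pair obtained by applying the same word/symbol
    substitution to the input and the output.
    """
    command, actions = example
    swappable = [w for w in command.split() if w in lexicon]
    if not swappable:
        return example  # nothing to swap
    old_word = rng.choice(swappable)
    new_word = rng.choice([w for w in lexicon if w != old_word])
    new_command = command.replace(old_word, new_word)
    new_actions = actions.replace(lexicon[old_word], lexicon[new_word])
    return new_command, new_actions

if __name__ == "__main__":
    ex = ("jump twice", "JUMP JUMP")
    print(augment(ex))  # e.g. ("walk twice", "WALK WALK")

In this toy setting, applying `augment` to a training pair yields another pair consistent with the same compositional mapping, which is the kind of symmetry of the data distribution the abstract refers to.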