Paper Title

Syntax Controlled Knowledge Graph-to-Text Generation with Order and Semantic Consistency

Paper Authors

Jin Liu, Chongfeng Fan, Fengyu Zhou, Huijuan Xu

Paper Abstract

The knowledge graph (KG) stores a large amount of structural knowledge, but it is not easy for humans to understand directly. Knowledge graph-to-text (KG-to-text) generation aims to generate easy-to-understand sentences from the KG while maintaining semantic consistency between the generated sentences and the KG. Existing KG-to-text generation methods frame this task as a sequence-to-sequence generation task with the linearized KG as input, and address the consistency between the generated text and the KG through a simple selection between a decoded sentence word and a KG node word at each time step. However, the linearized KG order is commonly obtained through a heuristic search without data-driven optimization. In this paper, we optimize knowledge description order prediction under order supervision extracted from the caption, and further enhance the consistency between the generated sentences and the KG through syntactic and semantic regularization. We incorporate Part-of-Speech (POS) syntactic tags to constrain the positions at which words may be copied from the KG, and employ a semantic context scoring function to evaluate the semantic fitness of each word in its local context when decoding each word of the generated sentence. Extensive experiments on two datasets, WebNLG and DART, show that our method achieves state-of-the-art performance.
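
The abstract mentions two mechanisms that can be illustrated concretely: feeding the decoder a linearized KG whose triple order comes from an order-prediction module, and using POS tags to restrict the decoding positions at which copying from the KG is allowed. The sketch below is not the authors' code; the function names (linearize_kg, copy_allowed), the special tokens, and the COPYABLE_POS tag set are hypothetical choices made only to show the shape of the idea.

```python
# Minimal sketch (assumed interfaces, not the paper's implementation).
from typing import List, Tuple

Triple = Tuple[str, str, str]  # (subject, relation, object)

# POS tags at which copying a KG word is permitted; content-word tags are a
# plausible choice here, the paper's exact constraint set may differ.
COPYABLE_POS = {"NOUN", "PROPN", "NUM", "ADJ"}


def linearize_kg(triples: List[Triple], order: List[int]) -> List[str]:
    """Flatten KG triples into a token sequence following a predicted order
    (in the paper this order would come from the order-prediction module
    trained with supervision extracted from the reference caption)."""
    tokens: List[str] = []
    for idx in order:
        s, r, o = triples[idx]
        tokens += ["<S>"] + s.split() + ["<R>"] + r.split() + ["<O>"] + o.split()
    return tokens


def copy_allowed(pos_tag: str) -> bool:
    """Syntactic constraint: only let the copy mechanism fire when the
    predicted POS tag of the current decoding position is a content-word tag."""
    return pos_tag in COPYABLE_POS


if __name__ == "__main__":
    triples = [
        ("Alan Bean", "occupation", "test pilot"),
        ("Alan Bean", "birthPlace", "Wheeler , Texas"),
    ]
    # Suppose the order-prediction module ranks the birthPlace fact first.
    print(linearize_kg(triples, order=[1, 0]))
    # At a step predicted as PROPN, copying "Alan Bean" is allowed;
    # at a step predicted as VERB, the model must generate from the vocabulary.
    print(copy_allowed("PROPN"), copy_allowed("VERB"))
```

In a full model, copy_allowed would gate the copy probability inside the decoder's output distribution, and a semantic context scoring function would additionally score each candidate word against its local context before it is emitted.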
