Paper Title
Learning a natural-language to LTL executable semantic parser for grounded robotics
Paper Authors
Paper Abstract
Children acquire their native language with apparent ease by observing how language is used in context and attempting to use it themselves. They do so without laborious annotations, negative examples, or even direct corrections. We take a step toward robots that can do the same by training a grounded semantic parser, which discovers latent linguistic representations that can be used for the execution of natural-language commands. In particular, we focus on the difficult domain of commands with a temporal aspect, whose semantics we capture with Linear Temporal Logic (LTL). Our parser is trained on pairs of sentences and executions, together with an executor. At training time, the parser hypothesizes a meaning representation for the input as a formula in LTL. Three competing pressures allow the parser to discover meaning from language. First, any hypothesized meaning for a sentence must be permissive enough to reflect all the annotated execution trajectories. Second, the executor -- a pretrained end-to-end LTL planner -- must find that the observed trajectories are likely executions of the meaning. Finally, a generator, which reconstructs the original input, encourages the model to find representations that preserve knowledge about the command. Together these ensure that the meaning is neither too general nor too specific. Our model generalizes well, being able to parse and execute both machine-generated and human-generated commands with near-equal accuracy, despite the fact that the human-generated sentences are much more varied and complex, with an open lexicon. The approach presented here is not specific to LTL: it can be applied to any domain where sentence meanings can be hypothesized and an executor can verify these meanings, thus opening the door to many applications for robotic agents.
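To make the abstract's setup concrete, the sketch below shows what it means for an observed execution trajectory to satisfy a hypothesized LTL meaning of a temporal command. This is a minimal finite-trace LTL checker written for illustration only; the tuple-based formula encoding, the trace representation, and the example command are assumptions of this sketch, not the paper's actual parser, planner, or formula syntax.

```python
# Minimal finite-trace LTL checker (illustrative sketch, not the paper's system).
# A trace is a list of sets; each set holds the atomic propositions true at that
# step. Formulas are nested tuples, e.g. ("F", ("atom", "red")) for "eventually red".

def holds(formula, trace, i=0):
    """Return True iff `formula` holds at position i of the finite trace."""
    op = formula[0]
    if op == "atom":                      # atomic proposition
        return formula[1] in trace[i]
    if op == "not":
        return not holds(formula[1], trace, i)
    if op == "and":
        return holds(formula[1], trace, i) and holds(formula[2], trace, i)
    if op == "F":                         # eventually: holds at some j >= i
        return any(holds(formula[1], trace, j) for j in range(i, len(trace)))
    if op == "G":                         # always: holds at every j >= i
        return all(holds(formula[1], trace, j) for j in range(i, len(trace)))
    if op == "U":                         # until: right side eventually holds,
        for j in range(i, len(trace)):    # left side holds at every step before
            if holds(formula[2], trace, j):
                return True
            if not holds(formula[1], trace, j):
                return False
        return False
    raise ValueError(f"unknown operator: {op}")

# Hypothetical meaning for "avoid the blue room until you reach the red room":
command = ("U", ("not", ("atom", "blue")), ("atom", "red"))

trace_ok = [{"start"}, {"hall"}, {"red"}]    # never enters blue, reaches red
trace_bad = [{"start"}, {"blue"}, {"red"}]   # passes through blue first

print(holds(command, trace_ok))   # True
print(holds(command, trace_bad))  # False
```

In the training scheme the abstract describes, a check of this kind runs in the opposite direction: the formula is the unknown, and the parser must hypothesize one that all annotated trajectories satisfy (pressure one) while the planner still judges those trajectories likely executions (pressure two).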