Learning Neuro-symbolic Programs for Language Guided Robot Manipulation

Paper title

Learning Neuro-symbolic Programs for Language Guided Robot Manipulation

Authors

Kalithasan, Namasivayam; Singh, Himanshu; Bindal, Vishal; Tuli, Arnav; Agrawal, Vishwajeet; Jain, Rahul; Singla, Parag; Paul, Rohan

Abstract

Given a natural language instruction and an input scene, our goal is to train a model to output a manipulation program that can be executed by the robot. Prior approaches for this task possess one of the following limitations: (i) they rely on hand-coded symbols for concepts, limiting generalization beyond those seen during training [1]; (ii) they infer action sequences from instructions but require dense sub-goal supervision [2]; or (iii) they lack the semantics required for the deeper object-centric reasoning inherent in interpreting complex instructions [3]. In contrast, our approach can handle linguistic as well as perceptual variations, is end-to-end trainable, and requires no intermediate supervision. The proposed model uses symbolic reasoning constructs that operate on a latent neural object-centric representation, allowing for deeper reasoning over the input scene. Central to our approach is a modular structure consisting of a hierarchical instruction parser and an action simulator to learn disentangled action representations. Our experiments on a simulated environment with a 7-DOF manipulator, consisting of instructions with a varying number of steps and scenes with different numbers of objects, demonstrate that our model is robust to such variations and significantly outperforms baselines, particularly in the generalization settings. The code, dataset, and experiment videos are available at https://nsrmp.github.io.
