Paper Title
Language Models as Zero-Shot Planners: Extracting Actionable Knowledge for Embodied Agents
Paper Authors
Paper Abstract
Can world knowledge learned by large language models (LLMs) be used to act in interactive environments? In this paper, we investigate the possibility of grounding high-level tasks, expressed in natural language (e.g. "make breakfast"), to a chosen set of actionable steps (e.g. "open fridge"). While prior work focused on learning from explicit step-by-step examples of how to act, we surprisingly find that if pre-trained LMs are large enough and prompted appropriately, they can effectively decompose high-level tasks into mid-level plans without any further training. However, the plans produced naively by LLMs often cannot map precisely to admissible actions. We propose a procedure that conditions on existing demonstrations and semantically translates the plans to admissible actions. Our evaluation in the recent VirtualHome environment shows that the resulting method substantially improves executability over the LLM baseline. The conducted human evaluation reveals a trade-off between executability and correctness but shows a promising sign towards extracting actionable knowledge from language models. Website at https://huangwl18.github.io/language-planner
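The translation step described in the abstract, mapping a free-form plan step generated by an LLM to the closest admissible action, can be sketched as a nearest-neighbor lookup in an embedding space. The paper conditions on demonstrations and uses a pretrained sentence-embedding model; the toy bag-of-words `embed` function below is a self-contained stand-in for that model, and the action list is hypothetical:

```python
from collections import Counter
import math

def embed(text):
    # Toy bag-of-words "embedding"; the paper instead uses a pretrained
    # sentence-embedding LM to score semantic similarity.
    return Counter(text.lower().split())

def cosine(a, b):
    # Cosine similarity between two sparse count vectors.
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def translate_step(generated_step, admissible_actions):
    # Map a free-form LLM plan step to the most similar admissible action.
    step_vec = embed(generated_step)
    return max(admissible_actions, key=lambda a: cosine(step_vec, embed(a)))

# Hypothetical admissible action set for a VirtualHome-style environment.
admissible = ["open fridge", "grab milk", "walk to kitchen", "switch on stove"]
print(translate_step("open the fridge door", admissible))  # -> open fridge
```

This illustrates why naive LLM output "often cannot map precisely to admissible actions": the model may say "open the fridge door" when the environment only accepts "open fridge", so a semantic matching step is needed in between.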