Title

LLM-Planner: Few-Shot Grounded Planning for Embodied Agents with Large Language Models

Authors

Chan Hee Song, Jiaman Wu, Clayton Washington, Brian M. Sadler, Wei-Lun Chao, Yu Su

Abstract

This study focuses on using large language models (LLMs) as a planner for embodied agents that can follow natural language instructions to complete complex tasks in a visually-perceived environment. The high data cost and poor sample efficiency of existing methods hinder the development of versatile agents that are capable of many tasks and can learn new tasks quickly. In this work, we propose a novel method, LLM-Planner, that harnesses the power of large language models to do few-shot planning for embodied agents. We further propose a simple but effective way to enhance LLMs with physical grounding to generate and update plans that are grounded in the current environment. Experiments on the ALFRED dataset show that our method can achieve very competitive few-shot performance: despite using less than 0.5% of paired training data, LLM-Planner achieves competitive performance with recent baselines that are trained using the full training data. Existing methods can barely complete any task successfully under the same few-shot setting. Our work opens the door for developing versatile and sample-efficient embodied agents that can quickly learn many tasks. Website: https://dki-lab.github.io/LLM-Planner
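The abstract outlines the core loop: prompt an LLM with a few paired (instruction, plan) exemplars, include the objects the agent has observed so far for physical grounding, and regenerate the plan as the environment state changes. Below is a minimal Python sketch of that loop. The prompt template, the exemplar dictionary format, the llm_complete callable, and the comma-separated output parsing are all illustrative assumptions, not the authors' actual implementation.

```python
# Minimal sketch (under the assumptions stated above) of few-shot
# grounded planning with an LLM, as described in the abstract.
from typing import Callable, Dict, List

def build_prompt(instruction: str,
                 exemplars: List[Dict[str, str]],
                 visible_objects: List[str],
                 completed_subgoals: List[str]) -> str:
    """Assemble a few-shot prompt: paired (instruction, plan) exemplars,
    then the current task plus the objects observed so far -- the
    physical grounding the abstract mentions."""
    parts = [f"Task: {ex['instruction']}\nPlan: {ex['plan']}"
             for ex in exemplars]
    parts.append(
        f"Task: {instruction}\n"
        f"Visible objects: {', '.join(visible_objects) or 'none'}\n"
        f"Completed subgoals: {', '.join(completed_subgoals) or 'none'}\n"
        "Plan:"
    )
    return "\n\n".join(parts)

def plan(instruction: str,
         exemplars: List[Dict[str, str]],
         visible_objects: List[str],
         completed_subgoals: List[str],
         llm_complete: Callable[[str], str]) -> List[str]:
    """Ask the LLM for a high-level plan. Calling this again with updated
    observations implements the plan updating described in the abstract."""
    prompt = build_prompt(instruction, exemplars,
                          visible_objects, completed_subgoals)
    completion = llm_complete(prompt)  # any text-completion LLM endpoint
    # Hypothetical output format: comma-separated high-level steps.
    return [step.strip() for step in completion.split(",") if step.strip()]
```

In this sketch, a low-level controller would execute each high-level step; whenever it fails or new objects come into view, calling plan() again with updated visible_objects and completed_subgoals yields a plan grounded in the current environment.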
