Paper Title

3D-Aware Object Goal Navigation via Simultaneous Exploration and Identification

Authors

Jiazhao Zhang, Liu Dai, Fanpeng Meng, Qingnan Fan, Xuelin Chen, Kai Xu, He Wang

Abstract

Object goal navigation (ObjectNav) in unseen environments is a fundamental task for Embodied AI. Agents in existing works learn ObjectNav policies based on 2D maps, scene graphs, or image sequences. Considering that this task takes place in 3D space, a 3D-aware agent can advance its ObjectNav capability by learning from fine-grained spatial information. However, leveraging 3D scene representations for policy learning can be prohibitively impractical in this floor-level task, due to low sample efficiency and expensive computational cost. In this work, we propose a framework for the challenging 3D-aware ObjectNav based on two straightforward sub-policies. The two sub-policies, namely a corner-guided exploration policy and a category-aware identification policy, perform simultaneously by utilizing online-fused 3D points as observations. Through extensive experiments, we show that this framework can dramatically improve ObjectNav performance by learning from 3D scene representations. Our framework achieves the best performance among all modular-based methods on the Matterport3D and Gibson datasets, while requiring up to 30x less computational cost for training.
