Paper Title
AirCapRL: Autonomous Aerial Human Motion Capture using Deep Reinforcement Learning
Paper Authors
Paper Abstract
In this letter, we introduce a deep reinforcement learning (RL) based multi-robot formation controller for the task of autonomous aerial human motion capture (MoCap). We focus on vision-based MoCap, where the objective is to estimate the trajectory of body pose and shape of a single moving person using multiple micro aerial vehicles. State-of-the-art solutions to this problem are based on classical control methods, which depend on hand-crafted system and observation models. Such models are difficult to derive and do not generalize across different systems. Moreover, the non-linearities and non-convexities of these models lead to sub-optimal controls. In our work, we formulate this problem as a sequential decision-making task to achieve the vision-based motion capture objectives, and solve it using a deep neural network-based RL method. We leverage proximal policy optimization (PPO) to train a stochastic decentralized control policy for formation control. The neural network is trained in a parallelized setup in synthetic environments. We performed extensive simulation experiments to validate our approach. Finally, real-robot experiments demonstrate that our policies generalize to real-world conditions. Video Link: https://bit.ly/38SJfjo Supplementary: https://bit.ly/3evfo1O
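The abstract names proximal policy optimization (PPO) as the training algorithm for the decentralized formation-control policy. As a minimal illustration of PPO's core idea (not the paper's actual training code), the sketch below computes the clipped surrogate loss on a batch of transitions; all function and variable names here are hypothetical, and the policy/advantage inputs are assumed to come from an external rollout and advantage estimator.

```python
import numpy as np

def ppo_clip_loss(new_logp, old_logp, advantages, eps=0.2):
    """Clipped surrogate objective from PPO (Schulman et al., 2017).

    new_logp, old_logp: per-action log-probabilities under the current
    and rollout-time policies. advantages: advantage estimates (e.g.
    from GAE). These names are illustrative, not from the paper.
    """
    ratio = np.exp(new_logp - old_logp)  # pi_new(a|s) / pi_old(a|s)
    unclipped = ratio * advantages
    # Clipping removes the incentive to move the ratio outside [1-eps, 1+eps].
    clipped = np.clip(ratio, 1.0 - eps, 1.0 + eps) * advantages
    # PPO maximizes the elementwise minimum; negate to get a loss.
    return -np.mean(np.minimum(unclipped, clipped))

# When the new and old policies agree (ratio = 1), the loss reduces to
# minus the mean advantage.
loss = ppo_clip_loss(np.zeros(3), np.zeros(3), np.array([1.0, 2.0, 3.0]))
```

In a full training loop, this loss would be combined with a value-function loss and an entropy bonus, and minimized with a stochastic gradient method over minibatches of rollout data; the paper additionally parallelizes rollouts across synthetic environments.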