A Deep Multi-Agent Reinforcement Learning Approach to Autonomous Separation Assurance

Paper Title

A Deep Multi-Agent Reinforcement Learning Approach to Autonomous Separation Assurance

Authors

Marc Brittain, Xuxi Yang, Peng Wei

Abstract

A novel deep multi-agent reinforcement learning framework is proposed to identify and resolve conflicts among a variable number of aircraft in a high-density, stochastic, and dynamic sector. Currently, sector capacity is constrained by the cognitive limitations of human air traffic controllers. We investigate the feasibility of a new concept (autonomous separation assurance) and a new approach to push sector capacity beyond human cognitive limitations. We propose using distributed vehicle autonomy to ensure separation, instead of a centralized sector air traffic controller. Our proposed framework uses Proximal Policy Optimization (PPO), which we modify to incorporate an attention network. This gives the agents scalable, efficient access to information about a variable number of aircraft in the sector, so that high traffic throughput can be achieved under uncertainty. Agents are trained using a centralized learning, decentralized execution scheme in which one neural network is learned and shared by all agents. The proposed framework is validated on three challenging case studies in the BlueSky air traffic control environment. Numerical results show that the proposed framework significantly reduces offline training time, increases performance, and results in a more efficient policy.
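To make the two architectural ideas in the abstract concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes a single-head dot-product attention layer that pools a variable number of intruder-aircraft states into one fixed-length vector, and a single parameter set that every agent evaluates on its own local observation (decentralized execution with a shared network). All names (`attention_pool`, `shared_policy`, `make_params`), the state dimensions, and the three-action output are hypothetical choices for illustration; the weights are random, and PPO training is not shown.

```python
import numpy as np


def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    e = np.exp(x - np.max(x))
    return e / e.sum()


def attention_pool(own_state, intruder_states, W_q, W_k, W_v):
    """Summarize any number of intruders as one fixed-length vector.

    own_state:       (own_dim,) state of the aircraft making the decision
    intruder_states: (n, intruder_dim) states of nearby aircraft, n may be 0
    """
    if len(intruder_states) == 0:
        # Assumption: an empty sector maps to an all-zero context vector.
        return np.zeros(W_v.shape[1])
    query = own_state @ W_q                          # (d,)
    keys = intruder_states @ W_k                     # (n, d)
    values = intruder_states @ W_v                   # (n, d)
    scores = keys @ query / np.sqrt(query.shape[0])  # (n,)
    weights = softmax(scores)                        # attention over intruders
    return weights @ values                          # (d,)


def shared_policy(own_state, intruder_states, params):
    """Action probabilities for one agent, using parameters shared by all agents."""
    context = attention_pool(
        own_state, intruder_states, params["W_q"], params["W_k"], params["W_v"]
    )
    features = np.concatenate([own_state, context])
    return softmax(features @ params["W_pi"])


def make_params(own_dim=4, intruder_dim=5, d=8, n_actions=3, seed=0):
    """Random (untrained) weights; all dimensions are illustrative assumptions."""
    rng = np.random.default_rng(seed)
    return {
        "W_q": rng.normal(size=(own_dim, d)),
        "W_k": rng.normal(size=(intruder_dim, d)),
        "W_v": rng.normal(size=(intruder_dim, d)),
        "W_pi": rng.normal(size=(own_dim + d, n_actions)),
    }


if __name__ == "__main__":
    params = make_params()
    rng = np.random.default_rng(1)
    for n in (0, 2, 10):  # the same network handles any number of intruders
        probs = shared_policy(np.ones(4), rng.normal(size=(n, 5)), params)
        print(n, probs.shape)
```

Because the attention weights are computed per intruder and then summed, the policy input has the same size regardless of how many aircraft are in the sector, and the result does not depend on the order in which intruders are listed.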

Paper Title