Paper Title
Communication-Enabled Deep Reinforcement Learning to Optimise Energy-Efficiency in UAV-Assisted Networks
Paper Authors
Abstract
Unmanned aerial vehicles (UAVs) are increasingly deployed to provide wireless connectivity to static and mobile ground users in situations of increased network demand or points of failure in existing terrestrial cellular infrastructure. However, UAVs are energy-constrained and experience interference from nearby UAV cells sharing the same frequency spectrum, which impacts the system's energy efficiency (EE). Recent approaches optimise the system's EE by optimising the trajectories of UAVs serving only static ground users, neglecting mobile users. Several others neglect the impact of interference from nearby UAV cells by assuming an interference-free network environment. Despite growing research interest in decentralised over centralised UAV control, direct collaboration among UAVs to improve coordination while optimising the system's EE has not been adequately explored. To address this, we propose a direct collaborative communication-enabled multi-agent decentralised double deep Q-network (CMAD-DDQN) approach. CMAD-DDQN is a collaborative algorithm that allows UAVs to explicitly share their telemetry with their nearest neighbours via existing 3GPP guidelines. This allows the agent-controlled UAVs to fill knowledge gaps, converge to optimal policies, and optimise their 3D flight trajectories. Simulation results show that the proposed approach outperforms existing baselines in maximising the system's EE without degrading coverage performance in the network. The CMAD-DDQN approach outperforms the MAD-DDQN approach, which neglects direct collaboration among UAVs, as well as the multi-agent deep deterministic policy gradient (MADDPG) and random policy approaches, which consider a 2D UAV deployment design and neglect interference from nearby UAV cells, by about 15%, 65% and 85%, respectively.
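The two ingredients the abstract names are (i) double DQN per agent and (ii) each UAV augmenting its own observation with telemetry from its nearest neighbour before acting. The following is a minimal sketch of those two steps only, not the paper's implementation: the observation layout, network sizes, and toy linear Q-functions (`W_online`, `W_target`) are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

N_UAVS = 3     # number of agent-controlled UAV cells (illustrative)
OBS_DIM = 4    # assumed per-UAV telemetry: x, y, z, remaining energy
N_ACTIONS = 7  # assumed 3D action set: 6 movement directions + hover
GAMMA = 0.99   # discount factor

def augment_with_neighbour(obs, positions, i):
    """Append the nearest neighbour's telemetry to agent i's own
    observation -- the direct-collaboration step of CMAD-DDQN."""
    dists = np.linalg.norm(positions - positions[i], axis=1)
    dists[i] = np.inf                 # exclude self from the search
    j = int(np.argmin(dists))         # index of nearest neighbour
    return np.concatenate([obs[i], obs[j]])

def double_dqn_target(q_online, q_target, next_state, reward):
    """Double DQN: the online network selects the greedy action,
    the target network evaluates it (reduces overestimation bias)."""
    a_star = int(np.argmax(q_online(next_state)))
    return reward + GAMMA * q_target(next_state)[a_star]

# Toy linear stand-ins for the per-agent online and target networks.
W_online = rng.normal(size=(N_ACTIONS, 2 * OBS_DIM))
W_target = W_online + 0.01 * rng.normal(size=W_online.shape)
q_online = lambda s: W_online @ s
q_target = lambda s: W_target @ s

obs = rng.normal(size=(N_UAVS, OBS_DIM))  # per-UAV telemetry snapshot
positions = obs[:, :3]                    # first 3 dims as 3D position
s_next = augment_with_neighbour(obs, positions, i=0)
y = double_dqn_target(q_online, q_target, s_next, reward=1.0)
print(f"augmented state dim: {s_next.shape[0]}, TD target: {y:.3f}")
```

In a full training loop each agent would regress its online network's Q-value toward `y` and periodically copy the online weights into the target network; the augmented state is what lets a UAV account for a neighbour sharing its spectrum when choosing a 3D move.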