Reconfigurable Intelligent Surface Assisted Multiuser MISO Systems Exploiting Deep Reinforcement Learning

Paper title

Reconfigurable Intelligent Surface Assisted Multiuser MISO Systems Exploiting Deep Reinforcement Learning

Authors

Chongwen Huang, Ronghong Mo, Chau Yuen

Abstract

Recently, the reconfigurable intelligent surface (RIS), benefiting from breakthroughs in the fabrication of programmable metamaterials, has been speculated to be one of the key enabling technologies for future sixth-generation (6G) wireless communication systems, scaling beyond massive multiple-input multiple-output (Massive-MIMO) technology to achieve smart radio environments. Employed as a reflecting array, an RIS can assist MIMO transmissions without radio frequency chains, resulting in a considerable reduction in power consumption. In this paper, we investigate the joint design of the transmit beamforming matrix at the base station and the phase shift matrix at the RIS by leveraging recent advances in deep reinforcement learning (DRL). We first develop a DRL-based algorithm in which the joint design is obtained through trial-and-error interactions with the environment by observing predefined rewards, in the context of continuous states and actions. Unlike most reported works, which use alternating optimization to obtain the transmit beamforming and the phase shifts alternately, the proposed DRL-based algorithm obtains both simultaneously as the output of the DRL neural network. Simulation results show that the proposed algorithm not only learns from the environment and gradually improves its behavior, but also achieves performance comparable to two state-of-the-art benchmarks. It is also observed that appropriate neural network parameter settings significantly improve the performance and convergence rate of the proposed algorithm.
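
As a rough illustration of the joint design described in the abstract, the sketch below maps one continuous action vector to a beamforming matrix and a set of RIS phase shifts, then scores the pair with a sum-rate reward. The abstract does not specify the reward, the action encoding, or the channel model, so everything here is an assumption: the sum-rate reward, the flat-fading model with no direct base-station-to-user link, the tanh phase mapping, and the hypothetical names `action_to_design` and `sum_rate`.

```python
import numpy as np


def action_to_design(action, M, K, N, max_power=1.0):
    """Split a real action vector into (G, theta). Assumed encoding, not from the paper.

    The first 2*M*K entries are the real and imaginary parts of the M x K
    beamforming matrix G; the last N entries are squashed into RIS phases.
    """
    action = np.asarray(action, dtype=float)
    if action.size != 2 * M * K + N:
        raise ValueError("action must have 2*M*K + N entries")
    g_real = action[: M * K].reshape(M, K)
    g_imag = action[M * K : 2 * M * K].reshape(M, K)
    G = g_real + 1j * g_imag
    # Scale G down if it exceeds the transmit power budget tr(G G^H) <= max_power.
    power = np.linalg.norm(G) ** 2
    if power > max_power:
        G = G * np.sqrt(max_power / power)
    # Map unbounded outputs to phases in (-pi, pi).
    theta = np.pi * np.tanh(action[2 * M * K :])
    return G, theta


def sum_rate(H1, H2, theta, G, noise_power=1.0):
    """Sum rate in bits/s/Hz, used here as an assumed reward.

    H1: (N, M) base-station-to-RIS channel.
    H2: (K, N) RIS-to-user channels, one row per user.
    theta: (N,) RIS phase shifts in radians.
    G: (M, K) transmit beamforming matrix, one column per user.
    """
    Phi = np.diag(np.exp(1j * theta))
    # Entry [k, n] is the gain of user n's stream as received by user k.
    effective = H2 @ Phi @ H1 @ G
    received = np.abs(effective) ** 2
    signal = np.diag(received)
    interference = received.sum(axis=1) - signal
    sinr = signal / (interference + noise_power)
    return float(np.sum(np.log2(1.0 + sinr)))
```

In a DRL loop of the kind the abstract outlines, an actor network would emit `action`, `action_to_design` would turn it into a feasible design, and `sum_rate` would supply the scalar reward observed by the agent. Both quantities come out of a single forward pass, which is the contrast with alternating optimization that the abstract draws.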

Paper title