Paper Title

Stacked Auto Encoder Based Deep Reinforcement Learning for Online Resource Scheduling in Large-Scale MEC Networks

Authors

Feibo Jiang, Kezhi Wang, Li Dong, Cunhua Pan, Kun Yang

Abstract

An online resource scheduling framework is proposed for minimizing the sum of weighted task latency for all the Internet of Things (IoT) users, by optimizing offloading decisions, transmission power, and resource allocation in the large-scale mobile edge computing (MEC) system. Towards this end, a deep reinforcement learning (DRL) based solution is proposed, which includes the following components. Firstly, a related and regularized stacked auto encoder (2r-SAE) with unsupervised learning is applied to perform data compression and representation for high-dimensional channel quality information (CQI) data, which can reduce the state space for DRL. Secondly, we present an adaptive simulated annealing based approach (ASA) as the action search method of DRL, in which an adaptive h-mutation is used to guide the search direction and an adaptive iteration is proposed to enhance the search efficiency during the DRL process. Thirdly, a preserved and prioritized experience replay (2p-ER) is introduced to assist the DRL agent in training the policy network and finding the optimal offloading policy. Numerical results are provided to demonstrate that the proposed algorithm can achieve near-optimal performance while significantly decreasing the computational time compared with existing benchmarks.
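As a rough illustration of the first component, the following is a minimal PyTorch sketch of a stacked auto encoder that compresses a high-dimensional CQI vector into a low-dimensional state for the DRL agent. All layer sizes, the loss weight, and the plain L2 penalty on the code are illustrative assumptions; the specific "related" and "regularized" terms of the paper's 2r-SAE are defined in the full text and are not reproduced here.

```python
import torch
import torch.nn as nn

class StackedAutoEncoder(nn.Module):
    """Sketch of a stacked auto encoder for CQI compression.
    Layer sizes are illustrative, not taken from the paper."""
    def __init__(self, cqi_dim=1000, hidden_dims=(256, 64), code_dim=16):
        super().__init__()
        dims = [cqi_dim, *hidden_dims, code_dim]
        enc, dec = [], []
        for d_in, d_out in zip(dims[:-1], dims[1:]):
            enc += [nn.Linear(d_in, d_out), nn.ReLU()]
        for d_in, d_out in zip(reversed(dims[1:]), reversed(dims[:-1])):
            dec += [nn.Linear(d_in, d_out), nn.ReLU()]
        dec[-1] = nn.Sigmoid()  # reconstruct normalized CQI values in [0, 1]
        self.encoder = nn.Sequential(*enc)
        self.decoder = nn.Sequential(*dec)

    def forward(self, x):
        code = self.encoder(x)          # low-dimensional DRL state
        return self.decoder(code), code

def reconstruction_loss(model, x, l2_weight=1e-4):
    """Unsupervised objective: reconstruction error plus a simple L2
    regularizer on the code (a stand-in for the paper's specific
    'related and regularized' terms)."""
    x_hat, code = model(x)
    return nn.functional.mse_loss(x_hat, x) + l2_weight * code.pow(2).mean()

# Usage: encode a batch of normalized CQI vectors into compressed states.
x = torch.rand(32, 1000)                 # hypothetical batch of CQI data
_, state = StackedAutoEncoder()(x)       # 16-dim states fed to the DRL agent
```

Once trained offline on historical CQI data, only the encoder is needed at run time, so the DRL policy network operates on the compact code rather than the raw CQI vector, which is the state-space reduction the abstract refers to.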
