DMRO: A Deep Meta Reinforcement Learning-based Task Offloading Framework for Edge-Cloud Computing

Paper Title

DMRO: A Deep Meta Reinforcement Learning-based Task Offloading Framework for Edge-Cloud Computing

Authors

Guanjin Qu, Huaming Wu

Abstract

With the continuous growth of mobile data and the unprecedented demand for computing power, resource-constrained edge devices cannot effectively meet the requirements of Internet of Things (IoT) applications and Deep Neural Network (DNN) computing. As a distributed computing paradigm, edge offloading, which migrates complex tasks from IoT devices to edge-cloud servers, can break through the resource limitations of IoT devices, reduce their computing burden, and improve the efficiency of task processing. However, finding the optimal offloading decision is NP-hard, and traditional optimization methods struggle to produce results efficiently. Moreover, existing deep learning methods still have shortcomings, e.g., slow learning and previously trained network parameters that become ineffective when the environment changes. To tackle these challenges, we propose a Deep Meta Reinforcement Learning-based Offloading (DMRO) algorithm, which combines multiple parallel DNNs with Q-learning to make fine-grained offloading decisions. By aggregating the perceptive ability of deep learning, the decision-making ability of reinforcement learning, and the rapid environment-learning ability of meta-learning, it can quickly and flexibly obtain the optimal offloading strategy from the IoT environment. Simulation results demonstrate that the proposed algorithm achieves a clear improvement over the Deep Q-Learning algorithm and has strong portability in making real-time offloading decisions, even in time-varying IoT environments.
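To make the idea of learning offloading decisions with Q-learning more concrete, the following is a minimal sketch only. It is not the DMRO algorithm: it swaps the paper's parallel DNNs and meta-learning for a plain tabular, single-step Q-learning update, and every name and number in it (the three targets, the cost coefficients, the task sizes, `train_q_table`, `best_action`) is a hypothetical assumption chosen for illustration, not taken from the paper.

```python
import random

# Hypothetical offloading targets: 0 = run locally, 1 = edge server, 2 = cloud.
ACTIONS = (0, 1, 2)

# Assumed cost coefficients (invented for illustration, not from the paper).
COMPUTE_COST = {0: 1.0, 1: 0.5, 2: 0.1}   # cost per unit of workload
TRANSFER_COST = {0: 0.0, 1: 0.2, 2: 0.7}  # cost per unit of data uploaded

# A task (state) is a pair (workload, data_size); sizes are assumed values.
STATES = [(w, d) for w in (1, 3, 5) for d in (1, 3, 5)]


def task_cost(state, action):
    """Combined compute + transfer cost of running `state` on `action`."""
    workload, data_size = state
    return COMPUTE_COST[action] * workload + TRANSFER_COST[action] * data_size


def train_q_table(episodes=20000, alpha=0.1, epsilon=0.1, seed=0):
    """Tabular Q-learning where each episode is a single task (terminal step)."""
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in STATES for a in ACTIONS}
    for _ in range(episodes):
        state = rng.choice(STATES)
        # Epsilon-greedy exploration over the three offloading targets.
        if rng.random() < epsilon:
            action = rng.choice(ACTIONS)
        else:
            action = max(ACTIONS, key=lambda a: q[(state, a)])
        reward = -task_cost(state, action)
        # The episode ends after one decision, so the Q-learning target
        # reduces to the immediate reward (no bootstrapped next-state term).
        q[(state, action)] += alpha * (reward - q[(state, action)])
    return q


def best_action(q, state):
    """Greedy offloading decision for `state` under the learned Q-table."""
    return max(ACTIONS, key=lambda a: q[(state, a)])


if __name__ == "__main__":
    q_table = train_q_table()
    for s in STATES:
        print(s, "->", best_action(q_table, s))
```

In this toy setting the learned greedy policy should match the brute-force cheapest target for each task size. A table like this has to be relearned whenever the cost coefficients change, which is the kind of slow adaptation the abstract says meta-learning is meant to address.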
