Double Q-Learning for Citizen Relocation During Natural Hazards

Paper title

Double Q-Learning for Citizen Relocation During Natural Hazards

Authors

da Silva, Alysson Ribeiro

Abstract

Natural disasters can cause substantial negative socio-economic impacts around the world, due to mortality, relocation, rates, and reconstruction decisions. Robotics has been successfully applied to identify and rescue victims during natural hazards. However, little effort has gone into solutions in which an autonomous robot saves a citizen's life by relocating them itself, without waiting for a human rescue team. Reinforcement learning can be used to build such a solution, but one of its best-known algorithms, Q-learning, suffers from biased results produced by its learning routine. This research adopts a citizen relocation solution based on Partially Observable Markov Decision Processes and evaluates the capability of Double Q-learning to relocate citizens during a natural hazard, using a proposed hazard simulation engine based on a grid world. Performance was measured as the success rate of the citizen relocation procedure; the results show that the technique portrays a performance above 100% for easy scenarios and near 50% for hard ones.
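
The bias the abstract attributes to Q-learning comes from using the same value table both to pick the best next action and to score it. The following is a minimal sketch of the generic tabular Double Q-learning update, not the paper's implementation: the function name `double_q_update`, the dictionary-based tables, and the default `alpha` and `gamma` values are all assumptions made for illustration.

```python
import random


def double_q_update(qa, qb, s, a, r, s_next, actions,
                    alpha=0.1, gamma=0.95, done=False, rng=random):
    """Apply one tabular Double Q-learning step in place (hypothetical helper).

    qa and qb are dicts mapping (state, action) to a value; missing
    entries are treated as 0.0.
    """
    # Pick at random which table is updated; the other one evaluates.
    if rng.random() < 0.5:
        update, evaluate = qa, qb
    else:
        update, evaluate = qb, qa

    if done:
        # Terminal transition: no bootstrapped future value.
        target = r
    else:
        # Select the greedy next action with the table being updated...
        best = max(actions, key=lambda b: update.get((s_next, b), 0.0))
        # ...but score that action with the other table, which decouples
        # action selection from value estimation.
        target = r + gamma * evaluate.get((s_next, best), 0.0)

    old = update.get((s, a), 0.0)
    update[(s, a)] = old + alpha * (target - old)
```

In a grid world, `s` and `s_next` could be cell coordinates and `actions` the four movement directions; an agent would typically act greedily with respect to the sum of the two tables.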
