Paper Title

VacSIM: Learning Effective Strategies for COVID-19 Vaccine Distribution using Reinforcement Learning

Paper Authors

Raghav Awasthi, Keerat Kaur Guliani, Saif Ahmad Khan, Aniket Vashishtha, Mehrab Singh Gill, Arshita Bhatt, Aditya Nagori, Aniket Gupta, Ponnurangam Kumaraguru, Tavpritesh Sethi

Paper Abstract

A COVID-19 vaccine is our best bet for mitigating the ongoing onslaught of the pandemic. However, the vaccine is also expected to be a limited resource. An optimal allocation strategy, especially in countries with access inequities and temporal separation of hot-spots, might be an effective way of halting the disease spread. We approach this problem by proposing a novel pipeline, VacSIM, that dovetails Deep Reinforcement Learning models into a Contextual Bandits approach for optimizing the distribution of the COVID-19 vaccine. Whereas the Reinforcement Learning models suggest better actions and rewards, Contextual Bandits allow online modifications that may need to be implemented on a day-to-day basis in real-world scenarios. We evaluate this framework against a naive allocation approach of distributing the vaccine in proportion to the incidence of COVID-19 cases in five different states across India (Assam, Delhi, Jharkhand, Maharashtra and Nagaland) and demonstrate up to 9039 potential infections prevented and a significant increase in the efficacy of limiting the spread over a period of 45 days through the VacSIM approach. Our models and the platform are extensible to all states of India and potentially across the globe. We also propose novel evaluation strategies, including standard compartmental model-based projections and a causality-preserving evaluation of our model. Since all models carry assumptions that may need to be tested in various contexts, we open source our model VacSIM and contribute a new reinforcement learning environment compatible with OpenAI gym to make it extensible for real-world applications across the globe. (http://vacsim.tavlab.iiitd.edu.in:8000/)
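As a rough illustration of what an OpenAI gym-compatible allocation environment can look like, the sketch below defines a toy environment in which an agent splits a fixed daily vaccine supply across five states based on their current case counts. The class name `VaccineAllocationEnv`, the toy transmission dynamics, and the infections-prevented reward are illustrative assumptions and are not taken from the released VacSIM environment.

```python
# Hypothetical sketch of an OpenAI gym-compatible vaccine-allocation
# environment; names, dynamics, and reward are illustrative assumptions,
# not the released VacSIM environment.
import gym
import numpy as np
from gym import spaces


class VaccineAllocationEnv(gym.Env):
    """Allocate a fixed daily vaccine supply across several states."""

    def __init__(self, n_states=5, daily_supply=10000, horizon=45):
        super().__init__()
        self.n_states = n_states
        self.daily_supply = daily_supply
        self.horizon = horizon
        # Observation: current active cases per state (the "context").
        self.observation_space = spaces.Box(
            low=0.0, high=np.inf, shape=(n_states,), dtype=np.float32)
        # Action: unnormalized allocation weights, one per state.
        self.action_space = spaces.Box(
            low=0.0, high=1.0, shape=(n_states,), dtype=np.float32)
        self.reset()

    def reset(self):
        self.day = 0
        self.active_cases = np.random.uniform(1e3, 1e5, self.n_states)
        return self.active_cases.astype(np.float32)

    def step(self, action):
        # Normalize the action into fractions of the daily supply.
        weights = np.clip(action, 0.0, 1.0)
        fractions = weights / (weights.sum() + 1e-8)
        doses = fractions * self.daily_supply

        # Toy transmission dynamics (assumption): vaccination dampens growth.
        growth = 1.02 - 0.5 * doses / (self.active_cases + doses + 1e-8)
        prevented = self.active_cases * (1.02 - growth)
        self.active_cases = np.maximum(self.active_cases * growth, 0.0)

        # Reward: infections prevented relative to unmitigated growth.
        reward = float(prevented.sum())
        self.day += 1
        done = self.day >= self.horizon
        return self.active_cases.astype(np.float32), reward, done, {}
```

The naive baseline described in the abstract can be run in the same loop by allocating doses in proportion to observed incidence, e.g. `action = obs / obs.sum()` at every step, which gives a reference point against which a learned allocation policy can be compared.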
