A Conservative Q-Learning approach for handling distribution shift in sepsis treatment strategies

Paper Title

A Conservative Q-Learning approach for handling distribution shift in sepsis treatment strategies

Authors

Pramod Kaushik, Sneha Kummetha, Perusha Moodley, Raju S. Bapi

Abstract

Sepsis is a leading cause of mortality, and its treatment is very expensive. Treatment is also challenging because there is no consensus on which interventions work best, and different patients respond very differently to the same treatment. Deep reinforcement learning (RL) methods can be used to derive optimal policies for treatment strategies that mirror physician actions. In healthcare, the available data is mostly collected offline with no interaction with the environment, which necessitates offline RL techniques. The offline RL paradigm suffers from action distribution shift, which in turn negatively affects learning an optimal treatment policy. In this work, a Conservative Q-Learning (CQL) algorithm is used to mitigate this shift, and the resulting policy comes closer to the physicians' policy than that of conventional deep Q-learning. The learned policy could help clinicians in intensive care units make better decisions when treating septic patients and improve survival rates.
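The abstract does not spell out the objective, so the following is only a minimal sketch of how a conservative penalty can counter action distribution shift. It assumes the discrete-action CQL regularizer from the general CQL literature (log-sum-exp over all actions minus the Q-value of the logged action). The function names, the batch layout, and the weight `alpha` are hypothetical and are not taken from the paper.

```python
import math


def logsumexp(values):
    """Numerically stable log(sum(exp(v))) over a list of floats."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def cql_penalty(q_values, data_action):
    """Conservative term for one state: logsumexp_a Q(s, a) - Q(s, a_data).

    The term is small when the action seen in the offline data already has
    the highest Q-value, and large when unseen actions are valued higher.
    """
    return logsumexp(q_values) - q_values[data_action]


def cql_loss(batch, alpha=1.0):
    """Mean of (squared TD error + alpha * conservative penalty).

    batch: list of (q_values, data_action, td_target) tuples (assumed layout).
    alpha: hypothetical weight on the penalty; alpha = 0 gives plain Q-learning.
    """
    total = 0.0
    for q_values, action, td_target in batch:
        td_error = 0.5 * (q_values[action] - td_target) ** 2
        total += td_error + alpha * cql_penalty(q_values, action)
    return total / len(batch)
```

With `alpha = 0` the loss reduces to the ordinary TD error, so the penalty is the only difference from conventional deep Q-learning in this sketch. Larger values of `alpha` push the learned Q-function to favor actions that appear in the logged data.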
