Deep Reinforcement Learning for Closed-Loop Blood Glucose Control

Paper Title

Deep Reinforcement Learning for Closed-Loop Blood Glucose Control

Authors

Ian Fox, Joyce Lee, Rodica Pop-Busui, Jenna Wiens

Abstract

People with type 1 diabetes (T1D) lack the ability to produce the insulin their bodies need. As a result, they must continually decide how much insulin to self-administer to adequately control their blood glucose levels. Longitudinal data streams captured from wearables, such as continuous glucose monitors, can help these individuals manage their health, but currently most of the decision burden remains on the user. To relieve this burden, researchers are working on closed-loop solutions that combine a continuous glucose monitor and an insulin pump with a control algorithm in an "artificial pancreas." Such systems aim to estimate and deliver the appropriate amount of insulin. Here, we develop reinforcement learning (RL) techniques for automated blood glucose control. Through a series of experiments, we compare the performance of different deep RL approaches to non-RL approaches. We highlight the flexibility of RL approaches, demonstrating how they can adapt to new individuals with little additional data. On over 2.1 million hours of data from 30 simulated patients, our RL approach outperforms baseline control algorithms: it decreases median glycemic risk by nearly 50%, from 8.34 to 4.24, and decreases total time spent hypoglycemic by 99.8%, from 4,610 days to 6. Moreover, these approaches are able to adapt to predictable meal times, decreasing average risk by an additional 24% as meals increase in predictability. This work demonstrates the potential of deep RL to help people with T1D manage their blood glucose levels without requiring expert knowledge. All of our code is publicly available, allowing for replication and extension.
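The headline reductions quoted in the abstract can be checked with simple arithmetic. The sketch below is only an illustration: the helper name `relative_reduction` is hypothetical and is not taken from the authors' code, and the inputs are the four figures stated in the abstract.

```python
def relative_reduction(before: float, after: float) -> float:
    """Return the fractional decrease going from `before` to `after`."""
    if before == 0:
        raise ValueError("before must be non-zero")
    return (before - after) / before


# Figures quoted in the abstract (baseline -> RL approach).
risk_drop = relative_reduction(8.34, 4.24)   # median glycemic risk
hypo_drop = relative_reduction(4610, 6)      # total days hypoglycemic

print(f"median glycemic risk reduction: {risk_drop:.1%}")  # 49.2%
print(f"time hypoglycemic reduction:    {hypo_drop:.2%}")  # 99.87%
```

The first value matches the "nearly 50%" in the abstract. The second comes out at about 99.87%, which agrees with the reported 99.8% if that figure was truncated rather than rounded.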
