Paper Title
Near-optimal Deep Reinforcement Learning Policies from Data for Zone Temperature Control
Paper Authors
Paper Abstract
Replacing poorly performing existing controllers with smarter solutions will decrease the energy intensity of the building sector. Recently, controllers based on Deep Reinforcement Learning (DRL) have been shown to be more effective than conventional baselines. However, since the optimal solution is usually unknown, it is still unclear whether DRL agents attain near-optimal performance in general or whether a large gap remains to be bridged. In this paper, we investigate the performance of DRL agents compared to the theoretically optimal solution. To that end, we leverage Physically Consistent Neural Networks (PCNNs) as simulation environments, for which optimal control inputs are easy to compute. Furthermore, PCNNs rely solely on data for training, avoiding the difficult physics-based modeling phase while retaining physical consistency. Our results hint that DRL agents not only clearly outperform conventional rule-based controllers but also attain near-optimal performance.
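Since PCNNs are central to the setup, a minimal sketch may help illustrate the idea the abstract describes: the one-step zone temperature update is split into a black-box network for unmodeled dynamics and a sign-constrained physics term for heating power and ambient losses. This is an illustrative assumption about the structure, not the authors' implementation; all names here (PCNNSketch, raw_a, raw_b, the hidden size) are hypothetical.

```python
# Minimal sketch of a PCNN-like zone temperature model (assumed structure,
# not the paper's exact architecture). A black-box network handles
# unmodeled dynamics; a separate physics term with nonnegative gains
# handles heating input and losses to ambient.
import torch
import torch.nn as nn


class PCNNSketch(nn.Module):
    def __init__(self, n_exo: int, hidden: int = 32):
        super().__init__()
        # Black-box branch: sees the current temperature and exogenous
        # inputs (weather, time features), but NOT the control input,
        # so it cannot produce a non-physical response to heating.
        self.black_box = nn.Sequential(
            nn.Linear(1 + n_exo, hidden), nn.ReLU(), nn.Linear(hidden, 1)
        )
        # Raw parameters mapped through softplus to keep physical gains >= 0.
        self.raw_a = nn.Parameter(torch.tensor(0.0))  # heating power gain
        self.raw_b = nn.Parameter(torch.tensor(0.0))  # loss to ambient

    def forward(self, T: torch.Tensor, u: torch.Tensor,
                T_amb: torch.Tensor, exo: torch.Tensor) -> torch.Tensor:
        a = nn.functional.softplus(self.raw_a)  # >= 0: heating warms the zone
        b = nn.functional.softplus(self.raw_b)  # >= 0: heat flows toward ambient
        unmodeled = self.black_box(torch.cat([T, exo], dim=-1))
        physics = a * u - b * (T - T_amb)
        return T + unmodeled + physics  # next-step zone temperature


# Example usage with dummy data: a batch of 8 zones at 21 C,
# ambient at 5 C, three exogenous features.
model = PCNNSketch(n_exo=3)
T = torch.full((8, 1), 21.0)
u = torch.rand(8, 1)            # normalized heating power in [0, 1]
T_amb = torch.full((8, 1), 5.0)
exo = torch.randn(8, 3)
T_next = model(T, u, T_amb, exo)
```

The point of this decomposition, in the spirit of the abstract, is that the model is trained purely from data, yet the nonnegative gains and the exclusion of the control input from the black-box branch keep the predicted temperature response to heating and to the indoor-ambient gradient physically consistent.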