Paper Title
Solving the single-track train scheduling problem via Deep Reinforcement Learning
Paper Authors
Abstract
Every day, railways experience disturbances and disruptions, both on the network and the fleet side, that affect the stability of rail traffic. Induced delays propagate through the network, leading to a mismatch between demand and supply for goods and passengers and, in turn, to a loss in service quality. In these cases, it is the duty of human traffic controllers, the so-called dispatchers, to do their best to minimize the impact on traffic. However, dispatchers inevitably have a limited perception of the knock-on effects of their decisions, particularly of how those decisions affect areas of the network outside their direct control. In recent years, much work in Decision Science has been devoted to developing methods that solve the problem automatically and support dispatchers in this challenging task. This paper investigates Machine Learning-based methods for tackling this problem, proposing two different Deep Q-Learning approaches (Decentralized and Centralized). Numerical results show the superiority of these techniques over classical linear Q-Learning based on matrices. Moreover, the Centralized approach is compared with a MILP (Mixed-Integer Linear Programming) formulation, showing interesting results. The experiments are inspired by data provided by a U.S. Class 1 railroad.
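For context, the "classical linear Q-Learning based on matrices" used as a baseline above maintains a table of state-action values updated with the Bellman rule; the Deep Q-Learning variants proposed in the paper replace this table with a neural network. Below is a minimal, illustrative sketch of the tabular baseline. The states, actions, and rewards are hypothetical toy values chosen for illustration, not the paper's actual train-scheduling environment:

```python
# Tabular Q-Learning sketch (toy example; the "siding"/"main_track" states,
# the "hold"/"proceed" actions, and the rewards are hypothetical, not the
# environment used in the paper).
from collections import defaultdict

ALPHA, GAMMA = 0.5, 0.9  # learning rate and discount factor (example values)

Q = defaultdict(float)   # Q[(state, action)] -> estimated long-run value


def q_update(state, action, reward, next_state, actions):
    """One Bellman update:
    Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))."""
    best_next = max(Q[(next_state, a)] for a in actions)
    Q[(state, action)] += ALPHA * (reward + GAMMA * best_next - Q[(state, action)])


# Toy dispatching decision: a train waiting at a siding chooses whether to
# hold or proceed onto the main track; each minute of delay costs reward -1.
actions = ["hold", "proceed"]
q_update("siding", "proceed", reward=-1.0, next_state="main_track", actions=actions)
q_update("siding", "proceed", reward=-1.0, next_state="main_track", actions=actions)
print(Q[("siding", "proceed")])  # value estimate after two updates
```

In a Deep Q-Learning setup, the `Q` table above would be replaced by a function approximator (a neural network mapping states to action values), which is what makes the approach scale to the large state spaces of real rail networks.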