Paper Title
A Constraint Programming-based Job Dispatcher for Modern HPC Systems and Applications
Authors
Abstract
Constraint Programming (CP) is a well-established area in AI as a programming paradigm for modelling and solving discrete optimization problems, and it has been successfully applied to tackle the on-line job dispatching problem in HPC systems, including those running modern applications. However, the limitations of the available CP-based job dispatchers may hinder their practical use in today's systems, which are becoming larger in size and more demanding in resource allocation. In an attempt to bring basic AI research closer to a deployed application, we present a new CP-based on-line job dispatcher for modern HPC systems and applications. Unlike its predecessors, our new dispatcher tackles the entire problem in CP, and its model size is independent of the system size. Experimental results based on a simulation study show that our approach significantly improves dispatching performance in a large system and in a system where allocation is nontrivial.