Paper Title

A flexible framework for communication-efficient machine learning: from HPC to IoT

Paper Authors

Sarit Khirirat, Sindri Magnússon, Arda Aytekin, Mikael Johansson

Paper Abstract

With the increasing scale of machine learning tasks, it has become essential to reduce the communication between computing nodes. Early work on gradient compression focused on the bottleneck between CPUs and GPUs, but communication efficiency is now needed in a variety of different system architectures, from high-performance clusters to energy-constrained IoT devices. In current practice, compression levels are typically chosen before training, and settings that work well for one task may be vastly suboptimal for another dataset on another architecture. In this paper, we propose a flexible framework that adapts the compression level to the true gradient at each iteration, maximizing the improvement in the objective function that is achieved per communicated bit. By modeling how the communication cost depends on the compression level for the specific technology, our framework is easy to adapt from one technology to the next. Theoretical results and practical experiments indicate that the automatic tuning strategies significantly increase the communication efficiency of several state-of-the-art compression schemes.
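The core idea of the abstract can be illustrated with a minimal sketch (not the paper's actual algorithm; the cost model and function names are illustrative assumptions). For top-k sparsification, the first-order improvement from a compressed gradient Q(g) is proportional to ⟨g, Q(g)⟩ = ‖Q(g)‖², while sending k entries costs roughly k·(value_bits + log₂ d) bits for the values and indices, so at each iteration one can pick the k that maximizes improvement per bit:

```python
import numpy as np

def topk_compress(g, k):
    """Keep the k largest-magnitude entries of g, zero the rest."""
    q = np.zeros_like(g)
    idx = np.argsort(np.abs(g))[-k:]
    q[idx] = g[idx]
    return q

def choose_compression_level(g, candidate_ks, value_bits=32):
    """Pick the sparsification level k maximizing the (approximate)
    objective improvement per communicated bit.

    Assumed cost model (illustrative, not from the paper):
    sending k entries costs k * (value_bits + ceil(log2 d)) bits."""
    d = g.size
    index_bits = np.ceil(np.log2(d))
    sq = np.sort(np.abs(g))[::-1] ** 2   # squared magnitudes, descending
    best_k, best_ratio = None, -np.inf
    for k in candidate_ks:
        improvement = sq[:k].sum()       # equals ||Q(g)||^2 for top-k
        bits = k * (value_bits + index_bits)
        ratio = improvement / bits
        if ratio > best_ratio:
            best_k, best_ratio = k, ratio
    return best_k
```

A gradient dominated by a single coordinate would favor k = 1 under this model, while a flat gradient would favor larger k; a real system would replace the bit-cost formula with a measured model of the target architecture's communication cost.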
