论文标题
双变量多项式编码用于有效的分布式矩阵乘法
Bivariate Polynomial Coding for Efficient Distributed Matrix Multiplication
论文作者
论文摘要
编码计算是一种有效的技术,可以减轻大规模和分布式矩阵乘法中的“散滴”。特别是,通过使计算时间仅取决于最快的工人,单变量多项式代码已被证明可以有效缓解散曲。但是,这些方案完全忽略了散乱的工人所做的工作,从而浪费了计算资源。为了减少工人未完成的工作量,可以将矩阵乘法任务进一步分解为较小的子任务,并将多个子任务分配给每个工人(可能是异质),以更好地适合其特定的存储和计算能力。在这项工作中,我们提出了一个新型的双变量多项式代码系列,以有效利用散乱的工人进行的工作。我们表明,双变量多项式代码在上传通信成本和存储效率方面具有显着优势,该代码根据可以计算的每个工人的子任务数量来衡量。我们提出了两个双变量多项式编码方案。第一个利用了以下事实:在评估点的矩形网格上,双变量插值始终是可能的。我们以添加一些冗余计算为代价获得了此类点。对于第二个方案,我们放宽解码约束,几乎需要解释性,以便几乎所有评估点的选择。我们提出插值集可满足工人某些存储配置的这种可分解性条件。我们的数值结果表明,双变量多项式编码大大减少了分布式矩阵乘法的平均计算时间。我们认为,这项工作为有效的编码分布式计算打开了一类新的未开发的编码方案。
Coded computing is an effective technique to mitigate "stragglers" in large-scale and distributed matrix multiplication. In particular, univariate polynomial codes have been shown to be effective in straggler mitigation by making the computation time depend only on the fastest workers. However, these schemes completely ignore the work done by the straggling workers resulting in a waste of computational resources. To reduce the amount of work left unfinished at workers, one can further decompose the matrix multiplication task into smaller sub-tasks, and assign multiple sub-tasks to each worker, possibly heterogeneously, to better fit their particular storage and computation capacities. In this work, we propose a novel family of bivariate polynomial codes to efficiently exploit the work carried out by straggling workers. We show that bivariate polynomial codes bring significant advantages in terms of upload communication costs and storage efficiency, measured in terms of the number of sub-tasks that can be computed per worker. We propose two bivariate polynomial coding schemes. The first one exploits the fact that bivariate interpolation is always possible on a rectangular grid of evaluation points. We obtain such points at the cost of adding some redundant computations. For the second scheme, we relax the decoding constraints and require decodability for almost all choices of the evaluation points. We present interpolation sets satisfying such decodability conditions for certain storage configurations of workers. Our numerical results show that bivariate polynomial coding considerably reduces the average computation time of distributed matrix multiplication. We believe this work opens up a new class of previously unexplored coding schemes for efficient coded distributed computation.