Paper Title
High-Performance Routing with Multipathing and Path Diversity in Ethernet and HPC Networks
Paper Authors
Abstract
Recent research into topology design focuses on lowering network diameter. Many low-diameter topologies, such as Slim Fly or Jellyfish, have been proposed that substantially reduce cost, power consumption, and latency. A key challenge in realizing the benefits of these topologies is routing. On one hand, these networks provide shorter path lengths than established topologies such as Clos or torus, leading to performance improvements. On the other hand, the number of shortest paths between each pair of endpoints is much smaller than in Clos, although there is a large number of non-minimal paths between router pairs. This hampers, or even makes impossible, the use of established multipath routing schemes such as ECMP. In this work, to facilitate high-performance routing in modern networks, we analyze existing routing protocols and architectures, focusing on how well they exploit the diversity of minimal and non-minimal paths. We first develop a taxonomy of different forms of support for multipathing and overall path diversity. Then, we analyze how existing routing schemes support this diversity. Among others, we consider multipathing with both shortest and non-shortest paths, support for disjoint paths, and support for adaptivity. To address the ongoing convergence of the HPC and "Big Data" domains, we consider routing protocols developed for HPC systems, for data centers, and for general clusters. Thus, we cover architectures and protocols based on Ethernet, InfiniBand, and other HPC networks such as Myrinet. Our review will foster the development of future high-performance multipath routing protocols for supercomputers and data centers.