Paper Title
MGPU-TSM: A Multi-GPU System with Truly Shared Memory
Paper Authors
Paper Abstract
The sizes of GPU applications are rapidly growing. They are exhausting the compute and memory resources of a single GPU and demanding a move to multiple GPUs. However, the performance of these applications scales sub-linearly with GPU count because of the overhead of data movement across GPUs. Moreover, the lack of hardware support for coherency exacerbates the problem, because a programmer must either replicate data across GPUs or fetch remote data over high-overhead off-chip links. To address these problems, we propose a multi-GPU system with truly shared memory (MGPU-TSM), where main memory is physically shared across all the GPUs. With MGPU-TSM, we eliminate remote accesses and avoid data replication, thereby simplifying the memory hierarchy. Our preliminary analysis shows that MGPU-TSM with 4 GPUs performs, on average, 3.9x better than the current best-performing multi-GPU configuration on standard application benchmarks.