Paper Title
Accelerating Large-Scale Graph-based Nearest Neighbor Search on a Computational Storage Platform
Authors
Abstract
K-nearest neighbor search is a fundamental task in many applications, and the hierarchical navigable small world (HNSW) algorithm has recently drawn attention in large-scale cloud services because it scales easily with database size while offering fast search. Meanwhile, computational storage devices (CSDs), which combine programmable logic and storage modules on a single board, have become popular as a way to address the data bandwidth bottleneck of modern computing systems. In this paper, we propose a computational storage platform, based on the SmartSSD CSD, that accelerates a large-scale graph-based nearest neighbor search algorithm. To this end, we modify the algorithm to make it more amenable to hardware and implement two types of accelerators, using HLS- and RTL-based methodologies, with various optimizations. In addition, we scale the proposed platform up to 4 SmartSSDs and apply graph parallelism to further boost system performance. As a result, the proposed computational storage platform achieves a throughput of 75.59 queries per second on the SIFT1B dataset at 258.66 W power dissipation, which is 12.83x and 17.91x faster and 10.43x and 24.33x more energy efficient than conventional CPU-based and GPU-based server platforms, respectively. With multi-terabyte storage and custom acceleration capability, we believe that the proposed computational storage platform is a promising solution for cost-sensitive cloud datacenters.
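The reported figures can be related to one another with a little arithmetic. The sketch below assumes that "energy efficiency" means throughput per watt (queries per joule); the abstract does not define the metric, so this is an assumption. The function names are illustrative, and the baseline numbers it derives are implied by the quoted ratios under that assumption, not values stated in the abstract.

```python
# Figures quoted in the abstract for the 4-SmartSSD platform on SIFT1B.
PLATFORM_QPS = 75.59      # queries per second
PLATFORM_WATTS = 258.66   # power dissipation in watts


def queries_per_joule(qps: float, watts: float) -> float:
    """Throughput per watt, i.e. queries per joule (assumed efficiency metric)."""
    return qps / watts


def implied_baseline(speedup: float, efficiency_gain: float) -> tuple[float, float]:
    """Back out a baseline's throughput and power from the quoted ratios.

    Only valid if efficiency is queries per joule, which is an assumption.
    """
    base_qps = PLATFORM_QPS / speedup
    base_efficiency = queries_per_joule(PLATFORM_QPS, PLATFORM_WATTS) / efficiency_gain
    base_watts = base_qps / base_efficiency
    return base_qps, base_watts


if __name__ == "__main__":
    efficiency = queries_per_joule(PLATFORM_QPS, PLATFORM_WATTS)
    print(f"platform: {efficiency:.4f} queries per joule")
    # (speedup, efficiency gain) pairs quoted in the abstract.
    for name, speedup, gain in [("CPU", 12.83, 10.43), ("GPU", 17.91, 24.33)]:
        qps, watts = implied_baseline(speedup, gain)
        print(f"{name} baseline (implied): {qps:.2f} QPS at about {watts:.0f} W")
```

Under this reading, a baseline whose efficiency gap is smaller than its speed gap must draw less power than the proposed platform, and one whose efficiency gap is larger must draw more.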