Paper Title
Sibyl: Adaptive and Extensible Data Placement in Hybrid Storage Systems Using Online Reinforcement Learning
Paper Authors
Paper Abstract
Hybrid storage systems (HSS) use multiple different storage devices to provide high and scalable storage capacity at high performance. Recent research proposes various techniques that aim to accurately identify performance-critical data to place it in a "best-fit" storage device. Unfortunately, most of these techniques are rigid, which (1) limits their adaptivity to perform well for a wide range of workloads and storage device configurations, and (2) makes it difficult for designers to extend these techniques to different storage system configurations (e.g., with a different number or different types of storage devices) than the configuration they are designed for. We introduce Sibyl, the first technique that uses reinforcement learning for data placement in hybrid storage systems. Sibyl observes different features of the running workload as well as the storage devices to make system-aware data placement decisions. For every decision it makes, Sibyl receives a reward from the system that it uses to evaluate the long-term performance impact of its decision and continuously optimizes its data placement policy online. We implement Sibyl on real systems with various HSS configurations. Our results show that Sibyl provides 21.6%/19.9% performance improvement in a performance-oriented/cost-oriented HSS configuration compared to the best previous data placement technique. Our evaluation using an HSS configuration with three different storage devices shows that Sibyl outperforms the state-of-the-art data placement policy by 23.9%-48.2%, while significantly reducing the system architect's burden in designing a data placement mechanism that can simultaneously incorporate three storage devices. We show that Sibyl achieves 80% of the performance of an oracle policy that has complete knowledge of future access patterns while incurring a very modest storage overhead of only 124.4 KiB.
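The observe/decide/reward/update loop described in the abstract can be illustrated with a small sketch. This is not Sibyl's implementation: it assumes plain tabular Q-learning with epsilon-greedy exploration, a single hypothetical state feature (whether the requested data is "hot"), two hypothetical devices (0 = fast, 1 = slow), and a made-up reward function. The names `PlacementAgent`, `toy_reward`, and `train` are invented for illustration only.

```python
import random
from collections import defaultdict


class PlacementAgent:
    """Illustrative tabular Q-learning agent that picks a device per request."""

    def __init__(self, num_devices, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
        self.num_devices = num_devices
        self.alpha = alpha        # learning rate (assumed value)
        self.gamma = gamma        # discount factor for long-term reward (assumed value)
        self.epsilon = epsilon    # exploration probability (assumed value)
        self.rng = random.Random(seed)
        # Q-table: state -> one estimated value per device, initialised to 0.
        self.q = defaultdict(lambda: [0.0] * num_devices)

    def greedy(self, state):
        """Return the device with the highest estimated value for this state."""
        values = self.q[state]
        return max(range(self.num_devices), key=values.__getitem__)

    def choose(self, state):
        """Epsilon-greedy choice: mostly exploit, occasionally explore."""
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(self.num_devices)
        return self.greedy(state)

    def update(self, state, action, reward, next_state):
        """One online Q-learning update from a single observed transition."""
        target = reward + self.gamma * max(self.q[next_state])
        self.q[state][action] += self.alpha * (target - self.q[state][action])


def toy_reward(state, device):
    """Hypothetical reward: hot data belongs on the fast device (0),
    cold data on the slow device (1); other placements are penalised."""
    (is_hot,) = state
    if is_hot:
        return 1.0 if device == 0 else -1.0
    return 0.5 if device == 1 else -1.0


def train(steps, seed=0):
    """Run the online loop on a synthetic stream of hot/cold requests."""
    agent = PlacementAgent(num_devices=2, seed=seed)
    workload = random.Random(seed + 1)
    state = (workload.random() < 0.5,)
    for _ in range(steps):
        device = agent.choose(state)              # placement decision
        reward = toy_reward(state, device)        # feedback from the "system"
        next_state = (workload.random() < 0.5,)   # next request arrives
        agent.update(state, device, reward, next_state)
        state = next_state
    return agent


if __name__ == "__main__":
    agent = train(5000)
    print("hot data  -> device", agent.greedy((True,)))
    print("cold data -> device", agent.greedy((False,)))
```

In a real hybrid storage system the state would carry workload and device features, and the reward would be derived from measured request latency; both are simplified away here to keep the example self-contained.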