Paper Title

MLProxy: SLA-Aware Reverse Proxy for Machine Learning Inference Serving on Serverless Computing Platforms

Authors

Nima Mahmoudi, Hamzeh Khazaei

Abstract

Serving machine learning inference workloads on the cloud remains a challenging task at the production level. Configuring an inference workload to meet SLA requirements while optimizing infrastructure costs is highly complicated because of the complex interaction between batch configuration, resource configuration, and a variable arrival process. Serverless computing has emerged in recent years to automate most infrastructure management tasks. Workload batching has shown the potential to improve the response time and cost-effectiveness of machine learning serving workloads, but serverless computing platforms do not yet support it out of the box. Our experiments show that, for various machine learning workloads, batching can substantially improve system efficiency by reducing the processing overhead per request. In this work, we present MLProxy, an adaptive reverse proxy that supports efficient machine learning serving workloads on serverless computing systems. MLProxy supports adaptive batching to ensure SLA compliance while optimizing serverless costs. We performed rigorous experiments on Knative to demonstrate the effectiveness of MLProxy. We show that MLProxy can reduce the cost of serverless deployment by up to 92% while reducing SLA violations by up to 99%, and that these results can be generalized across state-of-the-art model serving frameworks.
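
To make the idea of SLA-aware adaptive batching more concrete, the following is a minimal sketch of how a proxy might decide when to flush a batch and how large a batch to allow. It is not the algorithm used by MLProxy, which the abstract does not describe. The class name `AdaptiveBatcher`, its fields, the additive-increase/halving rule, and all constants are hypothetical choices made purely for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class AdaptiveBatcher:
    """Toy SLA-aware batcher. Every name and constant here is an assumption."""
    sla_s: float                 # per-request response-time target, in seconds
    max_batch: int = 8           # current upper bound on batch size
    batch_cap: int = 64          # hard limit that max_batch may never exceed
    est_service_s: float = 0.05  # running estimate of batch processing time
    pending: List[float] = field(default_factory=list)  # arrival timestamps

    def add(self, arrival_s: float) -> None:
        """Queue a request, identified here only by its arrival time."""
        self.pending.append(arrival_s)

    def should_dispatch(self, now_s: float) -> bool:
        """Flush when the batch is full or the oldest request is at risk."""
        if not self.pending:
            return False
        if len(self.pending) >= self.max_batch:
            return True
        # Waiting longer would push the oldest request past its deadline
        # once the estimated processing time is added.
        deadline_s = self.pending[0] + self.sla_s
        return now_s + self.est_service_s >= deadline_s

    def take_batch(self) -> List[float]:
        """Remove and return up to max_batch queued requests."""
        n = self.max_batch
        batch, self.pending = self.pending[:n], self.pending[n:]
        return batch

    def record(self, batch: List[float], start_s: float, service_s: float) -> None:
        """Update the service-time estimate and adapt the batch-size bound."""
        # Exponential moving average of observed batch processing time.
        self.est_service_s = 0.8 * self.est_service_s + 0.2 * service_s
        # Latency of the request that waited longest in this batch.
        worst_latency_s = (start_s + service_s) - min(batch)
        if worst_latency_s > self.sla_s:
            self.max_batch = max(1, self.max_batch // 2)  # back off on violation
        else:
            self.max_batch = min(self.batch_cap, self.max_batch + 1)
```

In this sketch, a proxy loop would call `add` for each incoming request, poll `should_dispatch` with the current time, forward the result of `take_batch` to the serverless backend, and then call `record` with the measured processing time. Larger batches amortize per-request overhead, while the deadline check and the halving rule are one simple way to keep latency within the SLA.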
