Paper Title

Towards A Domain-Customized Automated Machine Learning Framework For Networks and Systems

Authors

Behnaz Arzani, Bita Rouhani

Abstract

Clouds gather a vast volume of telemetry from their networked systems, which contains valuable information that can help solve many of the problems that continue to plague them. However, it is hard to extract useful information from such raw data. Machine Learning (ML) models are useful tools that enable operators either to leverage this data to solve such problems or to develop intuition about whether and how they can be solved. Building practical ML models is time-consuming and requires experts in both ML and networked systems to tailor the model to the system or network (a.k.a. "domain-customize" it). The number of applications we deploy exacerbates the problem. The speed with which our systems evolve, and with which new monitoring systems are deployed (or deprecated), means these models often need to be adapted to keep up. Today, the lack of individuals with both sets of expertise is becoming one of the bottlenecks for adopting ML in cloud operations. This paper argues that it is possible to build a domain-customized automated ML framework for networked systems that can help save valuable operator time and effort.
