Paper title
Online feature selection for rapid, low-overhead learning in networked systems
Authors
Abstract
Data-driven functions for operation and management often require measurements collected through monitoring for model training and prediction. The number of data sources can be very large, which incurs significant communication and computing overhead to continuously extract and collect this data, as well as to train and update the machine-learning models. We present an online algorithm, called OSFS, that selects a small feature set from a large number of available data sources, which allows for rapid, low-overhead, and effective learning and prediction. OSFS is instantiated with a feature ranking algorithm and applies the concept of a stable feature set, which we introduce in the paper. We perform an extensive experimental evaluation of our method on data from an in-house testbed. We find that OSFS requires several hundred measurements to reduce the number of data sources by two orders of magnitude, and that models trained on the reduced set achieve acceptable prediction accuracy. While our method is heuristic and can be improved in many ways, the results clearly suggest that many learning tasks do not require a lengthy monitoring phase and expensive offline training.
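To make the idea of online selection with a stability check concrete, here is a minimal Python sketch. It is not the OSFS algorithm from the paper: the abstract does not say which ranking algorithm is used or how a stable feature set is defined. The sketch assumes, purely for illustration, that features are ranked by absolute Pearson correlation with the target and that the top-k set counts as stable once the Jaccard similarity between two consecutive rankings reaches a threshold. All names (`rank_features`, `select_stable_features`, `synthetic_stream`) and parameters (`k`, `batch`, `threshold`, `max_samples`) are hypothetical.

```python
import math
import random


def rank_features(samples, targets):
    """Rank feature indices by absolute Pearson correlation with the target.

    Assumption: correlation ranking stands in for whatever ranking
    algorithm the method is instantiated with.
    """
    n = len(samples)
    num_features = len(samples[0])
    mean_y = sum(targets) / n
    var_y = sum((y - mean_y) ** 2 for y in targets)
    scores = []
    for j in range(num_features):
        col = [row[j] for row in samples]
        mean_x = sum(col) / n
        var_x = sum((x - mean_x) ** 2 for x in col)
        cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(col, targets))
        denom = math.sqrt(var_x * var_y)
        scores.append(abs(cov) / denom if denom > 0 else 0.0)
    return sorted(range(num_features), key=lambda j: scores[j], reverse=True)


def select_stable_features(stream, k=2, batch=100, threshold=1.0,
                           max_samples=2000):
    """Read samples until the top-k set stops changing between two rankings.

    Assumption: "stable" means the Jaccard similarity of two consecutive
    top-k sets is at least `threshold`.
    Returns (sorted selected feature indices, number of samples read).
    """
    samples, targets = [], []
    previous = None
    for x, y in stream:
        samples.append(x)
        targets.append(y)
        n = len(samples)
        if n % batch == 0:
            # Re-rank on all samples seen so far and keep the top k.
            current = set(rank_features(samples, targets)[:k])
            if previous is not None:
                jaccard = len(current & previous) / len(current | previous)
                if jaccard >= threshold:
                    return sorted(current), n
            previous = current
        if n >= max_samples:
            break
    return (sorted(previous) if previous else []), len(samples)


def synthetic_stream(num_features=100, informative=(3, 7), seed=0):
    """Yield (features, target) pairs; only `informative` features matter."""
    rng = random.Random(seed)
    while True:
        x = [rng.gauss(0, 1) for _ in range(num_features)]
        y = sum(x[j] for j in informative) + rng.gauss(0, 0.1)
        yield x, y


if __name__ == "__main__":
    selected, used = select_stable_features(synthetic_stream(), k=2)
    print("selected features:", selected, "after", used, "samples")
```

In this toy setup, collection stops as soon as two consecutive rankings agree, so only the selected sources would need to be monitored afterwards. The synthetic data and the stopping rule are placeholders and say nothing about the results reported for the testbed.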