Paper Title

Rethinking Streaming Machine Learning Evaluation

Authors

Shreya Shankar, Bernease Herman, Aditya G. Parameswaran

Abstract

While most work on evaluating machine learning (ML) models focuses on computing accuracy on batches of data, tracking accuracy alone in a streaming setting (i.e., unbounded, timestamp-ordered datasets) fails to appropriately identify when models are performing unexpectedly. In this position paper, we discuss how the nature of streaming ML problems introduces new real-world challenges (e.g., delayed arrival of labels) and recommend additional metrics to assess streaming ML performance.
