Paper Title
Lifelong Learning Metrics
Paper Authors
Abstract
The DARPA Lifelong Learning Machines (L2M) program seeks to advance artificial intelligence (AI) systems so that they can learn (and improve) continuously, leverage data from one task to improve performance on another, and do so in a computationally sustainable way. Performers on this program developed systems capable of a diverse range of functions, including autonomous driving, real-time strategy, and drone simulation. These systems varied widely in their characteristics (e.g., task structure, lifetime duration), and an immediate challenge for the program's testing and evaluation team was measuring system performance across these different settings. This document, developed in close collaboration with DARPA and the program performers, outlines a formalism for constructing and characterizing the performance of agents in lifelong learning scenarios.