Title
Learning to Cooperate with Completely Unknown Teammates
Authors
Abstract
A key goal of ad hoc teamwork is to develop a learning agent that cooperates with unknown teams without resorting to any pre-coordination protocol. Despite the vast number of ad hoc teamwork algorithms in the literature, most cannot cooperate with a completely unknown team unless they learn from scratch. This article presents a novel approach that uses transfer learning alongside the state-of-the-art PLASTIC-Policy to adapt quickly to completely unknown teammates. We test our solution in the Half Field Offense simulator with five different teammates, which were designed independently by developers from different countries and at different times. Our empirical evaluation shows that it is advantageous for an ad hoc agent to leverage its past knowledge when adapting to a new team instead of learning how to cooperate with it from scratch.