Paper title
Agent Environment Cycle Games
Paper authors
Paper abstract
Partially Observable Stochastic Games (POSGs) are the most general and common model of games used in Multi-Agent Reinforcement Learning (MARL). We argue that the POSG model is conceptually ill-suited to software MARL environments, and offer case studies from the literature where this mismatch has led to severely unexpected behavior. In response, we introduce the Agent Environment Cycle Games (AEC Games) model, which is more representative of software implementation. We then prove that it is equivalent to the POSG model. The AEC Games model is also uniquely useful in that it can elegantly represent all forms of MARL environments, whereas POSGs, for example, cannot elegantly represent strictly turn-based games like chess.