Paper title
Using Graph Convolutional Networks and TD($λ$) to play the game of Risk
Paper authors
Paper abstract
Risk is a 6 player game with significant randomness and a large game-tree complexity, which makes it challenging to create an agent that plays the game effectively. Previous AIs rely on high-level handcrafted features to determine agent decision making. In this project, I create D.A.D, a Risk agent that uses temporal difference reinforcement learning to train a deep neural network, including a graph convolutional network, to evaluate player positions. This evaluation is used in a game-tree search to select optimal moves. The approach requires minimal handcrafting of knowledge into the AI: input features are kept as low-level as possible so that the network can extract useful and sophisticated features itself, even when it starts from a random initialisation. I also tackle the issue of non-determinism in Risk by introducing a new method of interpreting attack moves, which is necessary for the search. The result is an AI that wins 35% of the time versus 5 of the best inbuilt AIs in Lux Delux, a Risk variant.
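For readers unfamiliar with TD($λ$), the following is a minimal sketch of the update rule with accumulating eligibility traces. It is not the D.A.D implementation: the abstract describes a deep network with a graph convolutional component, whereas this sketch swaps in a linear value function so that the gradient is simply the feature vector. The function name `td_lambda_episode`, its parameters, and the toy feature vectors are all hypothetical and chosen only for illustration.

```python
import numpy as np


def td_lambda_episode(features, rewards, w, alpha=0.1, gamma=1.0, lam=0.7):
    """Run TD(lambda) over one episode with a linear value function v(s) = w . x(s).

    features: feature vectors x(s_0), ..., x(s_{T-1}) of the non-terminal states
    rewards:  rewards r_1, ..., r_T, one per transition (same length as features)
    w:        initial weight vector (not modified in place)
    Assumption: the terminal state has value 0.
    """
    w = np.asarray(w, dtype=float).copy()
    z = np.zeros_like(w)  # eligibility trace
    last = len(rewards) - 1
    for t, (x, r) in enumerate(zip(features, rewards)):
        v = w @ x
        v_next = 0.0 if t == last else w @ features[t + 1]
        delta = r + gamma * v_next - v   # TD error for this transition
        z = gamma * lam * z + x          # decay old credit, add current gradient
        w += alpha * delta * z           # update every state that is still eligible
    return w


if __name__ == "__main__":
    # Two one-hot states; the only reward (a win) arrives on the final transition.
    states = [np.array([1.0, 0.0]), np.array([0.0, 1.0])]
    print(td_lambda_episode(states, [0.0, 1.0], np.zeros(2), alpha=0.5, lam=1.0))
    print(td_lambda_episode(states, [0.0, 1.0], np.zeros(2), alpha=0.5, lam=0.0))
```

With `lam=1.0` the final reward is credited to both visited states, while with `lam=0.0` only the last state is updated. Intermediate values of $λ$ trade off between these two extremes.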