Paper Title

Multi-agent Reinforcement Learning for Decentralized Stable Matching

Authors

Kshitija Taywade, Judy Goldsmith, Brent Harrison

Abstract

In the real world, people and entities usually find matches independently and autonomously, such as when seeking jobs, partners, or roommates. This search for matches may start with no initial knowledge of the environment. We propose the use of a multi-agent reinforcement learning (MARL) paradigm for a spatially formulated, decentralized two-sided matching market with independent and autonomous agents. Having autonomous agents acting independently makes our environment highly dynamic and uncertain. Moreover, agents lack knowledge of other agents' preferences and must explore the environment and interact with other agents to discover their own preferences through noisy rewards. We believe such a setting better approximates the real world, and we study the usefulness of our MARL approach in it. Along with the conventional stable matching case, in which agents have strictly ordered preferences, we also examine the applicability of our approach to stable matching with incomplete lists and ties. We evaluate our results for stability, level of instability (for unstable outcomes), and fairness. Our MARL approach mostly yields stable and fair outcomes.
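The stability criterion the abstract evaluates has a standard definition: a two-sided matching is stable if it admits no blocking pair, i.e., no pair of agents who each prefer one another over their assigned partners. A minimal sketch of that check (the function name, data layout, and example preferences are illustrative, not from the paper):

```python
def is_stable(matching, left_prefs, right_prefs):
    """Return True if `matching` has no blocking pair.

    matching:    dict mapping each left-side agent to its partner.
    left_prefs:  dict mapping each left-side agent to a preference
                 list ordered from most to least preferred.
    right_prefs: same, for right-side agents.
    """
    # Invert the matching to look up a right-side agent's partner.
    right_partner = {r: l for l, r in matching.items()}
    for l, r in matching.items():
        # Consider every right-side agent l strictly prefers to r ...
        for r2 in left_prefs[l][:left_prefs[l].index(r)]:
            l2 = right_partner[r2]
            # ... (l, r2) blocks if r2 also prefers l to its partner l2.
            if right_prefs[r2].index(l) < right_prefs[r2].index(l2):
                return False
    return True

left_prefs = {"a": ["x", "y"], "b": ["x", "y"]}
right_prefs = {"x": ["a", "b"], "y": ["a", "b"]}
print(is_stable({"a": "x", "b": "y"}, left_prefs, right_prefs))  # True
print(is_stable({"a": "y", "b": "x"}, left_prefs, right_prefs))  # False: (a, x) blocks
```

In the paper's decentralized setting, agents do not have these preference lists up front; they must discover them through exploration and noisy rewards, which is what makes reaching a stable outcome nontrivial. Counting blocking pairs in the final matching also gives the "level of instability" metric mentioned above.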
