Paper Title
Offline Meta-level Model-based Reinforcement Learning Approach for Cold-Start Recommendation
Paper Authors
Paper Abstract
Reinforcement learning (RL) has shown great promise in optimizing long-term user interest in recommender systems. However, existing RL-based recommendation methods need a large number of interactions for each user to learn a robust recommendation policy. The challenge becomes more critical when recommending to new users who have a limited number of interactions. To that end, in this paper, we address the cold-start challenge in RL-based recommender systems by proposing a meta-level model-based reinforcement learning approach for fast user adaptation. In our approach, we learn to infer each user's preference with a user context variable that enables the recommender system to better adapt to new users with few interactions. To improve adaptation efficiency, we learn to recover the user policy and reward from only a few interactions via an inverse reinforcement learning method to assist a meta-level recommendation agent. Moreover, we model the interaction relationship between the user model and the recommendation agent from an information-theoretic perspective. Empirical results show the effectiveness of the proposed method when adapting to new users with only a single interaction sequence. We further provide a theoretical analysis of the recommendation performance bound.
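The abstract gives no implementation details, so the following is only a minimal sketch of the core idea it describes: inferring a user context variable from a short interaction history and conditioning the recommendation policy on it for fast adaptation. All module names, dimensions, and the mean-pooling encoder below are illustrative assumptions, not the authors' method.

```python
# Minimal sketch (not the paper's code): infer a user context variable z
# from a few logged interactions, then condition the recommendation policy
# on z so it can adapt to a new user from a single interaction sequence.
# All names and dimensions are illustrative assumptions.
import torch
import torch.nn as nn


class ContextEncoder(nn.Module):
    """Encodes a short interaction history into a user context variable z."""

    def __init__(self, interaction_dim: int, z_dim: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(interaction_dim, 64), nn.ReLU(), nn.Linear(64, z_dim)
        )

    def forward(self, interactions: torch.Tensor) -> torch.Tensor:
        # interactions: (num_interactions, interaction_dim)
        # Mean-pool per-interaction embeddings into a single context vector.
        return self.net(interactions).mean(dim=0)


class ContextConditionedPolicy(nn.Module):
    """Scores candidate items given the current state and the user context z."""

    def __init__(self, state_dim: int, z_dim: int, num_items: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim + z_dim, 64), nn.ReLU(), nn.Linear(64, num_items)
        )

    def forward(self, state: torch.Tensor, z: torch.Tensor) -> torch.Tensor:
        # Concatenate state and context, return one logit per candidate item.
        return self.net(torch.cat([state, z], dim=-1))


if __name__ == "__main__":
    encoder = ContextEncoder(interaction_dim=8, z_dim=4)
    policy = ContextConditionedPolicy(state_dim=8, z_dim=4, num_items=10)

    # A new user's single interaction sequence (here, 5 logged interactions).
    history = torch.randn(5, 8)
    z = encoder(history)          # fast adaptation: only infer z, no retraining
    state = torch.randn(8)
    logits = policy(state, z)
    print(f"recommended item id: {logits.argmax().item()}")
```

In this sketch, adapting to a new user requires only a forward pass through the encoder rather than retraining the policy, which mirrors the abstract's claim of adaptation from a single interaction sequence; the paper's additional components (inverse RL reward recovery and the information-theoretic interaction model) are not represented here.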