Personalized Reward Learning with Interaction-Grounded Learning (IGL)

Paper Title

Personalized Reward Learning with Interaction-Grounded Learning (IGL)

Authors

Jessica Maghakian, Paul Mineiro, Kishan Panaganti, Mark Rucker, Akanksha Saran, Cheng Tan

Abstract

In an era of countless content offerings, recommender systems alleviate information overload by providing users with personalized content suggestions. Due to the scarcity of explicit user feedback, modern recommender systems typically optimize for the same fixed combination of implicit feedback signals across all users. However, this approach disregards a growing body of work highlighting that (i) implicit signals can be used by users in diverse ways, signaling anything from satisfaction to active dislike, and (ii) different users communicate preferences in different ways. We propose applying the recent Interaction Grounded Learning (IGL) paradigm to address the challenge of learning representations of diverse user communication modalities. Rather than requiring a fixed, human-designed reward function, IGL is able to learn personalized reward functions for different users and then optimize directly for the latent user satisfaction. We demonstrate the success of IGL with experiments using simulations as well as with real-world production traces.
