Paper Title
Deploying Lifelong Open-Domain Dialogue Learning
Paper Authors
Paper Abstract
Much of NLP research has focused on crowdsourced static datasets and the supervised learning paradigm of training once and then evaluating test performance. As argued in de Vries et al. (2020), crowdsourced data has the issues of lack of naturalness and relevance to real-world use cases, while the static dataset paradigm does not allow for a model to learn from its experiences of using language (Silver et al., 2013). In contrast, one might hope for machine learning systems that become more useful as they interact with people. In this work, we build and deploy a role-playing game, whereby human players converse with learning agents situated in an open-domain fantasy world. We show that by training models on the conversations they have with humans in the game, the models progressively improve, as measured by automatic metrics and online engagement scores. This learning is shown to be more efficient than crowdsourced data when applied to conversations with real users, as well as being far cheaper to collect.
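To make the deploy-collect-retrain cycle described in the abstract more concrete, below is a minimal Python sketch of such a lifelong-learning loop. It is only an illustration under assumed names: DialogueModel, collect_game_dialogues, and engagement_score are hypothetical placeholders and do not correspond to the paper's actual code or metrics; the real system gathers conversations from human players in the deployed fantasy game, retrains on them, and tracks online engagement.

# Illustrative sketch of a lifelong-learning loop: deploy a dialogue model,
# collect the conversations it has with human players, retrain on them, and
# redeploy. All bodies below are hypothetical placeholders, not the paper's
# implementation.

from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class DialogueModel:
    """Stand-in for a retrieval or generative dialogue model."""
    training_examples: List[Tuple[str, str]] = field(default_factory=list)

    def respond(self, context: str) -> str:
        # Placeholder: a real model would condition on the game setting/persona.
        return "model reply conditioned on: " + context

    def retrain(self, new_examples: List[Tuple[str, str]]) -> None:
        # Placeholder: a real system would fine-tune on the accumulated data.
        self.training_examples.extend(new_examples)


def collect_game_dialogues(model: DialogueModel, n_turns: int) -> List[Tuple[str, str]]:
    """Simulate collecting (human message, model reply) pairs from deployment."""
    collected = []
    for t in range(n_turns):
        human_msg = f"player utterance {t}"  # would come from real players in the game
        collected.append((human_msg, model.respond(human_msg)))
    return collected


def engagement_score(model: DialogueModel) -> float:
    """Placeholder online metric; the paper measures player engagement scores."""
    return float(len(model.training_examples))


if __name__ == "__main__":
    model = DialogueModel()
    for deployment_round in range(3):
        dialogues = collect_game_dialogues(model, n_turns=100)
        model.retrain(dialogues)
        print(f"round {deployment_round}: engagement proxy = {engagement_score(model):.1f}")

Each pass through the loop corresponds to one deployment round: the model converses with players, the new dialogues are added to its training data, and the updated model is deployed again, which is the sense in which learning continues after the initial supervised training.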