Paper Title
A Multimodal Dialogue System for Conversational Image Editing
Paper Authors
Paper Abstract
In this paper, we present a multimodal dialogue system for Conversational Image Editing. We formulate the system as a Partially Observable Markov Decision Process (POMDP) and train it with a Deep Q-Network (DQN) and a user simulator. Our evaluation shows that the DQN policy outperforms a rule-based baseline policy, achieving a 90% success rate under high error rates. We also conduct a real user study and analyze real user behavior.