Paper Title

Learning with Weak Supervision for Email Intent Detection

Authors

Kai Shu, Subhabrata Mukherjee, Guoqing Zheng, Ahmed Hassan Awadallah, Milad Shokouhi, Susan Dumais

Abstract

Email remains one of the most frequently used means of online communication. People spend a significant amount of time every day on emails to exchange information, manage tasks and schedule events. Previous work has studied different ways for improving email productivity by prioritizing emails, suggesting automatic replies or identifying intents to recommend appropriate actions. The problem has been mostly posed as a supervised learning problem where models of different complexities were proposed to classify an email message into a predefined taxonomy of intents or classes. The need for labeled data has always been one of the largest bottlenecks in training supervised models. This is especially the case for many real-world tasks, such as email intent classification, where large scale annotated examples are either hard to acquire or unavailable due to privacy or data access constraints. Email users often take actions in response to intents expressed in an email (e.g., setting up a meeting in response to an email with a scheduling request). Such actions can be inferred from user interaction logs. In this paper, we propose to leverage user actions as a source of weak supervision, in addition to a limited set of annotated examples, to detect intents in emails. We develop an end-to-end robust deep neural network model for email intent identification that leverages both clean annotated data and noisy weak supervision along with a self-paced learning mechanism. Extensive experiments on three different intent detection tasks show that our approach can effectively leverage the weakly supervised data to improve intent detection in emails.
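
The abstract mentions two ideas that a small example may make more concrete: deriving weak labels from user actions, and a self-paced mechanism that decides which noisy examples to trust. The sketch below is only an illustration under stated assumptions and is not the authors' model. The action name `meeting_created`, the hard-threshold weighting rule, and the way the clean and weak loss terms are summed are all hypothetical choices made for this example; the abstract does not specify any of them.

```python
from typing import List


def weak_label_from_action(action: str) -> int:
    """Hypothetical rule: treat a logged 'meeting_created' action as a
    noisy positive label for a scheduling intent, anything else as negative."""
    return 1 if action == "meeting_created" else 0


def self_paced_weights(losses: List[float], threshold: float) -> List[float]:
    """Generic hard self-paced weighting (an assumption, not the paper's rule):
    keep a weakly labeled example only if its current loss is below threshold."""
    return [1.0 if loss < threshold else 0.0 for loss in losses]


def combined_loss(clean_losses: List[float],
                  weak_losses: List[float],
                  threshold: float) -> float:
    """Mean loss on clean examples plus mean loss on the weak examples
    that the self-paced weights currently select."""
    weights = self_paced_weights(weak_losses, threshold)
    clean_term = sum(clean_losses) / len(clean_losses)
    kept = sum(weights)
    if kept == 0:
        # No weak example is trusted yet, so only the clean term contributes.
        return clean_term
    weak_term = sum(w * l for w, l in zip(weights, weak_losses)) / kept
    return clean_term + weak_term


if __name__ == "__main__":
    actions = ["meeting_created", "email_archived", "meeting_created"]
    print([weak_label_from_action(a) for a in actions])
    print(combined_loss([0.3, 0.5], [0.2, 1.5, 0.4], threshold=0.5))
```

In a scheme like this, the threshold is usually raised as training progresses, so the model starts with the weak examples it already fits well and gradually admits harder, possibly noisier ones.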
