Paper Title
Reinforcement Guided Multi-Task Learning Framework for Low-Resource Stereotype Detection
Paper Authors
Paper Abstract
As large Pre-trained Language Models (PLMs) trained on large amounts of data in an unsupervised manner become more ubiquitous, identifying various types of bias in text has come into sharp focus. Existing "Stereotype Detection" datasets mainly adopt a diagnostic approach toward large PLMs. Blodgett et al. (2021a) show that there are significant reliability issues with the existing benchmark datasets. Annotating a reliable dataset requires a precise understanding of the subtle nuances of how stereotypes manifest in text. In this paper, we annotate a focused evaluation set for "Stereotype Detection" that addresses those pitfalls by deconstructing various ways in which stereotypes manifest in text. Further, we present a multi-task model that leverages the abundance of data-rich neighboring tasks such as hate speech detection, offensive language detection, and misogyny detection to improve the empirical performance on "Stereotype Detection". We then propose a reinforcement-learning agent that guides the multi-task learning model by learning to identify the training examples from the neighboring tasks that help the target task the most. We show that the proposed models achieve significant empirical gains over existing baselines on all the tasks.
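The core idea of the reinforcement-learning agent can be illustrated with a minimal sketch: a REINFORCE-style policy keeps one selection score per candidate example from the neighboring tasks, samples a batch, and reinforces the selections that improved a target-task reward. All class and variable names, the softmax policy, and the toy reward below are illustrative assumptions for exposition, not the paper's actual implementation.

```python
import math
import random

class ExampleSelector:
    """Toy REINFORCE-style selector over neighboring-task examples (a sketch,
    not the paper's architecture)."""

    def __init__(self, n_examples, lr=0.1):
        # One selection logit per candidate example from the neighboring tasks.
        self.scores = [0.0] * n_examples
        self.lr = lr

    def probs(self):
        # Softmax over logits gives the current selection policy.
        m = max(self.scores)
        exps = [math.exp(s - m) for s in self.scores]
        z = sum(exps)
        return [e / z for e in exps]

    def select(self, k):
        # Sample k candidate indices according to the current policy.
        p = self.probs()
        return random.choices(range(len(self.scores)), weights=p, k=k)

    def update(self, chosen, reward, baseline):
        # REINFORCE update: raise the logits of selections whose batch
        # improved the target-task reward over the baseline, lower otherwise.
        advantage = reward - baseline
        for i in chosen:
            self.scores[i] += self.lr * advantage
```

In practice the reward would be a validation metric on the target task; the toy loop below stands in with a synthetic reward where examples with index below 5 are "helpful", and the policy's probability mass shifts toward them:

```python
random.seed(0)
sel = ExampleSelector(10)
for _ in range(200):
    chosen = sel.select(4)
    reward = sum(1 for i in chosen if i < 5) / 4  # synthetic target-task reward
    sel.update(chosen, reward, baseline=0.5)
p = sel.probs()
# mass concentrates on the helpful examples: sum(p[:5]) > sum(p[5:])
```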