Paper Title
A Domain-adaptive Pre-training Approach for Language Bias Detection in News
Paper Authors
Paper Abstract
Media bias is a multi-faceted construct influencing individual behavior and collective decision-making. Slanted news reporting is the result of one-sided and polarized writing, which can occur in various forms. In this work, we focus on an important form of media bias, i.e., bias by word choice. Detecting biased word choices is a challenging task due to its linguistic complexity and the lack of representative gold-standard corpora. We present DA-RoBERTa, a new state-of-the-art transformer-based model adapted to the media bias domain, which identifies sentence-level bias with an F1 score of 0.814. In addition, we train DA-BERT and DA-BART, two further transformer models adapted to the bias domain. Our proposed domain-adapted models outperform prior bias detection approaches on the same data.
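As a rough illustration of the two-stage recipe the abstract describes, the sketch below continues masked-language-model pre-training of a RoBERTa encoder on unlabeled in-domain news text and then loads the adapted weights with a fresh sentence-classification head. It uses the Hugging Face transformers API; the tiny corpus, hyperparameters, output paths, and two-label setup are placeholder assumptions, not the authors' exact configuration.

```python
# Sketch of domain-adaptive pre-training (DAPT) followed by sentence-level
# bias classification. Corpus, paths, and hyperparameters are placeholders.
import torch
from transformers import (
    AutoModelForMaskedLM,
    AutoModelForSequenceClassification,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

tokenizer = AutoTokenizer.from_pretrained("roberta-base")

# Stage 1: continue masked-language-model training on unlabeled news text
# so the encoder adapts to the vocabulary and style of the target domain.
mlm_model = AutoModelForMaskedLM.from_pretrained("roberta-base")
news_sentences = [  # placeholder for a large unlabeled in-domain corpus
    "The committee approved the measure after a lengthy debate.",
    "Critics slammed the reckless plan as a giveaway to donors.",
]
enc = tokenizer(news_sentences, truncation=True, max_length=128)
train_set = [
    {"input_ids": ids, "attention_mask": mask}
    for ids, mask in zip(enc["input_ids"], enc["attention_mask"])
]
# The collator dynamically masks 15% of tokens for the MLM objective.
collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm_probability=0.15)
Trainer(
    model=mlm_model,
    args=TrainingArguments(output_dir="da-roberta-news", num_train_epochs=1),
    train_dataset=train_set,
    data_collator=collator,
).train()
mlm_model.save_pretrained("da-roberta-news")

# Stage 2: load the domain-adapted weights with a fresh two-label head;
# this model would then be fine-tuned on biased/neutral sentence labels.
clf = AutoModelForSequenceClassification.from_pretrained("da-roberta-news", num_labels=2)
inputs = tokenizer("Critics slammed the reckless plan.", return_tensors="pt")
with torch.no_grad():
    logits = clf(**inputs).logits  # [neutral, biased] scores after fine-tuning
```

The intuition behind the intermediate MLM stage is that exposing the encoder to domain vocabulary and style before fine-tuning gives the classifier better-suited representations than fine-tuning the generic checkpoint directly; the same pattern would apply to the BERT and BART variants named in the abstract.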