Paper Title

DeepHateExplainer: Explainable Hate Speech Detection in Under-resourced Bengali Language

Authors

Karim, Md. Rezaul, Dey, Sumon Kanti, Islam, Tanhim, Sarker, Sagor, Menon, Mehadi Hasan, Hossain, Kabir, Chakravarthi, Bharathi Raja, Hossain, Md. Azam, Decker, Stefan

Abstract

The exponential growth of social media and micro-blogging sites not only provides platforms for empowering freedom of expression and individual voices, but also enables people to express anti-social behaviour like online harassment, cyberbullying, and hate speech. Numerous works have been proposed to utilize textual data for social and anti-social behaviour analysis by predicting the contexts, mostly for highly-resourced languages like English. However, some languages are under-resourced, e.g., South Asian languages like Bengali, which lack computational resources for accurate natural language processing (NLP). In this paper, we propose an explainable approach for hate speech detection from the under-resourced Bengali language, which we call DeepHateExplainer. Bengali texts are first comprehensively preprocessed, before classifying them into political, personal, geopolitical, and religious hates using a neural ensemble method of transformer-based neural architectures (i.e., monolingual Bangla BERT-base, multilingual BERT-cased/uncased, and XLM-RoBERTa). Important (most and least) terms are then identified using sensitivity analysis and layer-wise relevance propagation (LRP), before providing human-interpretable explanations. Finally, we compute comprehensiveness and sufficiency scores to measure the quality of explanations w.r.t. faithfulness. Evaluations against machine learning (linear and tree-based models) and neural network (i.e., CNN, Bi-LSTM, and Conv-LSTM with word embeddings) baselines yield F1-scores of 78%, 91%, 89%, and 84% for political, personal, geopolitical, and religious hates, respectively, outperforming both ML and DNN baselines.
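The comprehensiveness and sufficiency scores mentioned in the abstract are standard faithfulness metrics: comprehensiveness measures how much the predicted class probability drops when the explanation's rationale tokens are removed, while sufficiency measures how well the rationale tokens alone preserve the prediction. A minimal sketch of these two metrics is shown below; the `model_prob` callable is a hypothetical stand-in for the paper's ensemble classifier, not its actual API.

```python
# Sketch of explanation-faithfulness metrics (comprehensiveness and
# sufficiency). `model_prob(tokens, label)` is assumed to return the
# model's probability for `label` given a token sequence.

def comprehensiveness(model_prob, tokens, rationale_idx, label):
    """Probability drop when rationale tokens are deleted.

    Higher is better: removing truly important tokens should
    substantially reduce the predicted class probability.
    """
    full = model_prob(tokens, label)
    without_rationale = [t for i, t in enumerate(tokens)
                         if i not in rationale_idx]
    return full - model_prob(without_rationale, label)


def sufficiency(model_prob, tokens, rationale_idx, label):
    """Probability drop when only rationale tokens are kept.

    Lower is better: the rationale alone should be enough to
    reproduce the original prediction.
    """
    full = model_prob(tokens, label)
    only_rationale = [t for i, t in enumerate(tokens)
                      if i in rationale_idx]
    return full - model_prob(only_rationale, label)
```

In practice these are often averaged over several rationale sizes (e.g., the top 1%, 5%, 10%, ... of tokens ranked by LRP relevance) to get a single aggregated score per example.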
