Paper Title
UnQovering Stereotyping Biases via Underspecified Questions
Paper Authors
Paper Abstract
While language embeddings have been shown to have stereotyping biases, how these biases affect downstream question answering (QA) models remains unexplored. We present UNQOVER, a general framework to probe and quantify biases through underspecified questions. We show that a naive use of model scores can lead to incorrect bias estimates due to two forms of reasoning errors: positional dependence and question independence. We design a formalism that isolates the aforementioned errors. As case studies, we use this metric to analyze four important classes of stereotypes: gender, nationality, ethnicity, and religion. We probe five transformer-based QA models trained on two QA datasets, along with their underlying language models. Our broad study reveals that (1) all these models, with and without fine-tuning, have notable stereotyping biases in these classes; (2) larger models often have higher bias; and (3) the effect of fine-tuning on bias varies strongly with the dataset and the model size.
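To make the probing setup concrete, the following is a minimal Python sketch (not the authors' released implementation) of how an off-the-shelf extractive QA model can be queried with an underspecified question over a two-subject template: scores are averaged over the two subject orderings to discount positional dependence, and the score under a negated question is subtracted to discount question independence. The template wording, the helper functions subject_score and bias_toward, and the choice of the HuggingFace distilbert-base-cased-distilled-squad model are illustrative assumptions, not the paper's exact setup.

from transformers import pipeline  # assumes the HuggingFace `transformers` QA pipeline

qa = pipeline("question-answering", model="distilbert-base-cased-distilled-squad")

def subject_score(context: str, question: str, subject: str) -> float:
    """Highest span score the model assigns to an answer mentioning `subject`."""
    candidates = qa(question=question, context=context, top_k=10)
    if isinstance(candidates, dict):  # the pipeline returns a single dict when only one span is found
        candidates = [candidates]
    return max((c["score"] for c in candidates if subject in c["answer"]), default=0.0)

def bias_toward(subj1: str, subj2: str, attribute: str, negated: str) -> float:
    """Preference for `subj1` under `attribute`, averaged over the two subject
    orderings (positional dependence) and discounted by the score under a
    negated question (question independence)."""
    def avg_score(question: str) -> float:
        ctx_a = f"{subj1} got off the flight to visit {subj2}."
        ctx_b = f"{subj2} got off the flight to visit {subj1}."
        return 0.5 * (subject_score(ctx_a, question, subj1)
                      + subject_score(ctx_b, question, subj1))
    return avg_score(f"Who {attribute}?") - avg_score(f"Who {negated}?")

# The context never says who holds which occupation, so any systematic gap
# between the two calls below reflects a stereotyping preference of the model.
print(bias_toward("John", "Mary", "was a doctor", "was never a doctor"))
print(bias_toward("Mary", "John", "was a doctor", "was never a doctor"))

This sketch only illustrates the probing idea on a single hypothetical template; the paper's formalism aggregates such order-corrected, negation-corrected scores over many templates, subject pairs, and attributes to produce its bias metrics.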