Paper Title
Mitigating Gender Bias in Machine Translation through Adversarial Learning
Paper Authors
Paper Abstract
Machine translation and other NLP systems often contain significant biases regarding sensitive attributes, such as gender or race, that worsen system performance and perpetuate harmful stereotypes. Recent preliminary research suggests that adversarial learning can be used as part of a model-agnostic bias mitigation method that requires no data modifications. However, adapting this strategy for machine translation and other modern NLP domains requires (1) restructuring training objectives in the context of fine-tuning pretrained large language models and (2) developing measures for gender or other protected variables for tasks in which these attributes must be deduced from the data itself. We present an adversarial learning framework that addresses these challenges to mitigate gender bias in seq2seq machine translation. Our framework reduces the disparity in translation quality between sentences with male and female entities by 86% for English-German translation and 91% for English-French translation, with minimal effect on translation quality. The results suggest that adversarial learning is a promising technique for mitigating gender bias in machine translation.