Aligning Eyes between Humans and Deep Neural Network through Interactive Attention Alignment

Paper Title

Aligning Eyes between Humans and Deep Neural Network through Interactive Attention Alignment

Authors

Gao, Yuyang; Sun, Tong; Zhao, Liang; Hong, Sungsoo

Abstract

While Deep Neural Networks (DNNs) are driving major innovations in nearly every field through their powerful automation, we are also witnessing the peril behind automation in the form of bias, such as automated racism, gender bias, and adversarial bias. As the societal impact of DNNs grows, finding an effective way to steer DNNs to align their behavior with the human mental model has become indispensable in realizing fair and accountable models. We propose a novel framework of Interactive Attention Alignment (IAA) that aims at realizing human-steerable DNNs. IAA leverages a DNN model explanation method as an interactive medium that humans can use to unveil cases of biased model attention and directly adjust the attention. To improve the DNN using human-adjusted attention, we introduce GRADIA, a novel computational pipeline that jointly maximizes attention quality and prediction accuracy. We evaluated the IAA framework in Study 1 and GRADIA in Study 2 on a gender classification problem. Study 1 found that applying IAA can significantly improve the perceived quality of model attention from human eyes. In Study 2, we found that using GRADIA can (1) significantly improve the perceived quality of model attention and (2) significantly improve model performance in scenarios where the training samples are limited. We present implications for future interactive user interface design towards human-alignable AI.
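
The abstract says GRADIA jointly maximizes attention quality and prediction accuracy but does not give the objective. One common way to express such a joint goal is a prediction loss plus a weighted penalty on the gap between the model's attention map and the human-adjusted one. The sketch below illustrates only that general idea. The function names, the mean-absolute-difference penalty, and the `weight` parameter are assumptions made for illustration and are not taken from the paper.

```python
import math


def cross_entropy(probs, label):
    """Negative log-likelihood of the true class for a single sample."""
    return -math.log(max(probs[label], 1e-12))


def attention_distance(model_attn, human_attn):
    """Mean absolute difference between two same-sized 2D maps with values in [0, 1]."""
    total, count = 0.0, 0
    for m_row, h_row in zip(model_attn, human_attn):
        for m, h in zip(m_row, h_row):
            total += abs(m - h)
            count += 1
    return total / count


def joint_loss(probs, label, model_attn, human_attn, weight=1.0):
    """Hypothetical joint objective: prediction loss plus weighted attention penalty."""
    penalty = attention_distance(model_attn, human_attn)
    return cross_entropy(probs, label) + weight * penalty


# Toy example: identical predictions, but one attention map misses the
# region that the human annotator marked as relevant.
probs = [0.5, 0.5]
human_attn = [[0.0, 0.0], [0.0, 1.0]]
model_attn = [[1.0, 0.0], [0.0, 0.0]]

aligned = joint_loss(probs, 1, human_attn, human_attn)
misaligned = joint_loss(probs, 1, model_attn, human_attn)
```

With equal predictions, the misaligned map yields the larger loss, so minimizing this kind of objective pushes the model's attention toward the human-adjusted map while still rewarding correct predictions.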

Paper Title