Paper Title
Topological Representations of Local Explanations
Paper Authors
Paper Abstract
Local explainability methods -- those which seek to generate an explanation for each prediction -- are becoming increasingly prevalent due to the need for practitioners to rationalize their model outputs. However, comparing local explainability methods is difficult since they each generate outputs at various scales and in various dimensions. Furthermore, due to the stochastic nature of some explainability methods, it is possible for different runs of a method to produce contradictory explanations for a given observation. In this paper, we propose a topology-based framework to extract a simplified representation from a set of local explanations. We do so by first modeling the relationship between the explanation space and the model predictions as a scalar function. Then, we compute the topological skeleton of this function. This topological skeleton acts as a signature for such functions, which we use to compare different explanation methods. We demonstrate that our framework can not only reliably identify differences between explainability techniques but also provide stable representations. Then, we show how our framework can be used to identify appropriate parameters for local explainability methods. Our framework is simple, does not require complex optimizations, and can be broadly applied to most local explanation methods. We believe the practicality and versatility of our approach will help promote topology-based approaches as a tool for understanding and comparing explanation methods.
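The abstract describes modeling the explanation space together with the model predictions as a scalar function and then computing its topological skeleton. One common instance of such a skeleton is the merge tree (equivalently, 0-dimensional sublevel-set persistence) of a scalar function sampled on a neighborhood graph. The sketch below is not the authors' implementation; it is a minimal, self-contained illustration under the assumption that each vertex is a local explanation, its scalar value is the model prediction, and edges connect nearby explanations. The function name `merge_tree` and the toy data are illustrative only.

```python
def merge_tree(values, edges):
    """Sweep vertices from low to high scalar value, tracking connected
    components with union-find; return (birth, death) pairs for the
    components that merge (0-dim persistence, trivial pairs dropped).
    Illustrative sketch only -- not the paper's implementation."""
    parent = list(range(len(values)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Adjacency list of the neighborhood graph.
    adj = {i: [] for i in range(len(values))}
    for a, b in edges:
        adj[a].append(b)
        adj[b].append(a)

    order = sorted(range(len(values)), key=lambda i: values[i])
    birth = {}   # component root -> scalar value at which it was born
    pairs = []   # (birth value, merge value) of components that die
    seen = set()

    for i in order:
        seen.add(i)
        birth[i] = values[i]
        for j in adj[i]:
            if j not in seen:
                continue
            ri, rj = find(i), find(j)
            if ri == rj:
                continue
            # The younger component (higher birth value) dies here.
            young, old = (ri, rj) if birth[ri] >= birth[rj] else (rj, ri)
            pairs.append((birth[young], values[i]))
            parent[young] = old

    # Keep only pairs with nonzero persistence.
    return [(b, d) for b, d in pairs if d > b]

# Toy example: a chain of four explanations with two local minima
# (predictions 0.1 and 0.2); the shallower basin merges at 0.5.
vals = [0.1, 0.5, 0.2, 0.9]
edges = [(0, 1), (1, 2), (2, 3)]
print(merge_tree(vals, edges))  # → [(0.2, 0.5)]
```

Two such persistence summaries can then be compared across explanation methods (or across runs of a stochastic method), which is the role the abstract assigns to the topological skeleton as a signature.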