Paper Title

Cardinality-Minimal Explanations for Monotonic Neural Networks

Authors

Ouns El Harzli, Bernardo Cuenca Grau, Ian Horrocks

Abstract

In recent years, there has been increasing interest in explanation methods for neural model predictions that offer precise formal guarantees. These include abductive (respectively, contrastive) methods, which aim to compute minimal subsets of input features that are sufficient for a given prediction to hold (respectively, to change a given prediction). The corresponding decision problems are, however, known to be intractable. In this paper, we investigate whether tractability can be regained by focusing on neural models implementing a monotonic function. Although the relevant decision problems remain intractable, we can show that they become solvable in polynomial time by means of greedy algorithms if we additionally assume that the activation functions are continuous everywhere and differentiable almost everywhere. Our experiments suggest favourable performance of our algorithms.
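To illustrate why monotonicity helps, here is a minimal sketch of a greedy procedure for an abductive (sufficient-reason) explanation. It assumes a binary classifier `f` that is non-decreasing in every input feature, known lower bounds on the feature domain, and a 0.5 decision threshold; the toy sigmoid model, function names, and parameters are hypothetical and not taken from the paper. The key point it demonstrates is that, under monotonicity, checking whether a fixed subset of features guarantees the prediction reduces to a single forward pass at the worst-case corner of the free features; the paper's actual algorithms additionally exploit continuity and almost-everywhere differentiability of the activations to obtain cardinality-minimal explanations in polynomial time.

```python
import numpy as np

def is_sufficient(f, x, lower, fixed, threshold=0.5):
    """Check whether fixing the features in `fixed` to their values in x
    guarantees a positive prediction, for f non-decreasing in every feature.

    Monotonicity means the worst case over the free features is attained
    at their lower bounds, so one forward pass suffices."""
    worst = lower.copy()
    worst[list(fixed)] = x[list(fixed)]
    return f(worst) >= threshold

def greedy_abductive_explanation(f, x, lower, threshold=0.5):
    """Greedily release features while the prediction remains guaranteed.
    A sketch only, not the authors' exact algorithm."""
    assert f(x) >= threshold, "expects a positively classified instance"
    fixed = set(range(len(x)))
    for i in range(len(x)):                  # try to free each feature in turn
        candidate = fixed - {i}
        if is_sufficient(f, x, lower, candidate, threshold):
            fixed = candidate                # feature i is not needed
    return sorted(fixed)

if __name__ == "__main__":
    # Hypothetical monotonic "network": non-negative weighted sum + sigmoid.
    w = np.array([0.9, 0.1, 0.8])
    f = lambda v: 1.0 / (1.0 + np.exp(-(v @ w - 1.0)))
    x = np.array([1.0, 1.0, 1.0])            # instance to explain
    lower = np.zeros(3)                      # lower bounds of the feature domain
    print(greedy_abductive_explanation(f, x, lower))   # -> [0, 2]
```

In this toy run the second feature can be released because even at its lower bound the score stays above the threshold, while dropping either of the other two would let the prediction flip; monotonicity is what makes each such check a single model evaluation.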
