Model interpretation using improved local regression with variable importance

Title

Model interpretation using improved local regression with variable importance

Authors

Shimizu, Gilson Y., Izbicki, Rafael, de Carvalho, Andre C. P. L. F.

Abstract

A fundamental question on the use of ML models concerns the explanation of their predictions for increasing transparency in decision-making. Although several interpretability methods have emerged, some gaps regarding the reliability of their explanations have been identified. For instance, most methods are unstable (meaning that they give very different explanations with small changes in the data) and do not cope well with irrelevant features (that is, features not related to the label). This article introduces two new interpretability methods, namely VarImp and SupClus, that overcome these issues by using local regression fits with a weighted distance that takes into account variable importance. Whereas VarImp generates explanations for each instance and can be applied to datasets with more complex relationships, SupClus interprets clusters of instances with similar explanations and can be applied to simpler datasets where clusters can be found. We compare our methods with state-of-the-art approaches and show that they yield better explanations according to several metrics, particularly in high-dimensional problems with irrelevant features and when the relationship between features and target is non-linear.
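
The abstract's central idea, a local regression fit whose neighborhood is defined by a distance weighted by variable importance, can be illustrated with a minimal sketch. This is not the authors' VarImp or SupClus implementation: the function name `local_linear_explanation`, the Gaussian kernel, the `bandwidth` parameter, and the assumption that importance scores are supplied by the caller are all choices made here for illustration only.

```python
import numpy as np


def local_linear_explanation(X, y_model, x0, importance, bandwidth=1.0):
    """Fit a kernel-weighted linear surrogate around the instance x0.

    Hypothetical helper, for illustration only.
    X          : (n, d) sample of inputs
    y_model    : (n,) predictions of the black-box model on X
    x0         : (d,) instance to explain
    importance : (d,) non-negative feature importance scores (assumed given)
    Returns (intercept, slopes); the slopes serve as the local explanation.
    """
    X = np.asarray(X, dtype=float)
    y_model = np.asarray(y_model, dtype=float)
    x0 = np.asarray(x0, dtype=float)
    w_feat = np.asarray(importance, dtype=float)

    # Importance-weighted squared distance: features with zero importance
    # do not influence which points count as neighbors of x0.
    d2 = ((X - x0) ** 2 * w_feat).sum(axis=1)

    # Gaussian kernel weights (an assumed kernel choice).
    k = np.exp(-d2 / (2.0 * bandwidth ** 2))

    # Weighted least squares on a design matrix centered at x0, so the
    # intercept approximates the model's prediction at x0.
    A = np.hstack([np.ones((X.shape[0], 1)), X - x0])
    sw = np.sqrt(k)
    coef, *_ = np.linalg.lstsq(A * sw[:, None], y_model * sw, rcond=None)
    return coef[0], coef[1:]
```

In this sketch an irrelevant feature with importance 0 drops out of the distance entirely, so it cannot shrink or distort the neighborhood used for the fit; how the importance scores themselves are obtained is left open here.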
