Paper Title
General Pitfalls of Model-Agnostic Interpretation Methods for Machine Learning Models
Paper Authors
Paper Abstract
An increasing number of model-agnostic interpretation techniques for machine learning (ML) models such as partial dependence plots (PDP), permutation feature importance (PFI) and Shapley values provide insightful model interpretations, but can lead to wrong conclusions if applied incorrectly. We highlight many general pitfalls of ML model interpretation, such as using interpretation techniques in the wrong context, interpreting models that do not generalize well, ignoring feature dependencies, interactions, uncertainty estimates and issues in high-dimensional settings, or making unjustified causal interpretations, and illustrate them with examples. We focus on pitfalls for global methods that describe the average model behavior, but many pitfalls also apply to local methods that explain individual predictions. Our paper addresses ML practitioners by raising awareness of pitfalls and identifying solutions for correct model interpretation, but also addresses ML researchers by discussing open issues for further research.
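To make one of the named methods concrete, below is a minimal sketch of permutation feature importance (PFI): the importance of a feature is measured as the increase in model loss after randomly permuting that feature's column. The linear model, synthetic data, and mean-squared-error loss are illustrative assumptions, not taken from the paper.

```python
import numpy as np

# Illustrative data: feature 0 carries almost all of the signal.
rng = np.random.default_rng(0)
n = 500
X = rng.normal(size=(n, 2))
y = 3.0 * X[:, 0] + 0.1 * X[:, 1] + rng.normal(scale=0.1, size=n)

def model(X):
    # A stand-in "trained" model that mirrors the data-generating process.
    return 3.0 * X[:, 0] + 0.1 * X[:, 1]

def mse(y_true, y_pred):
    return float(np.mean((y_true - y_pred) ** 2))

def permutation_importance(model, X, y, feature, n_repeats=10, rng=None):
    """PFI: mean increase in loss after permuting one feature column."""
    rng = rng or np.random.default_rng()
    base = mse(y, model(X))
    increases = []
    for _ in range(n_repeats):
        Xp = X.copy()
        Xp[:, feature] = rng.permutation(Xp[:, feature])
        increases.append(mse(y, model(Xp)) - base)
    return float(np.mean(increases))

pfi = [permutation_importance(model, X, y, j, rng=rng) for j in range(2)]
# pfi[0] is much larger than pfi[1], reflecting feature 0's dominant effect.
```

Note that, as the abstract warns, this permute-and-measure scheme silently assumes feature independence: permuting one column of correlated features creates unrealistic data points, one of the pitfalls the paper discusses.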