Paper title
Metafeatures-based Rule-Extraction for Classifiers on Behavioral and Textual Data
Paper authors
Paper abstract
Machine learning models on behavioral and textual data can result in highly accurate prediction models, but are often very difficult to interpret. Rule-extraction techniques have been proposed to combine the desired predictive accuracy of complex "black-box" models with global explainability. However, rule-extraction in the context of high-dimensional, sparse data, where many features are relevant to the predictions, can be challenging, as replacing the black-box model by many rules leaves the user again with an incomprehensible explanation. To address this problem, we develop and test a rule-extraction methodology based on higher-level, less-sparse metafeatures. A key finding of our analysis is that metafeatures-based explanations are better at mimicking the behavior of the black-box prediction model, as measured by the fidelity of explanations.
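To make the fidelity measure mentioned in the abstract concrete, here is a minimal sketch. It assumes fidelity is the fraction of instances on which the extracted rules assign the same label as the black-box model; the function name `fidelity` and the toy label vectors are hypothetical and are not taken from the paper.

```python
from typing import Sequence


def fidelity(black_box_preds: Sequence[int], rule_preds: Sequence[int]) -> float:
    """Fraction of instances where the rule-based explanation agrees with the black box."""
    if len(black_box_preds) != len(rule_preds):
        raise ValueError("prediction sequences must have the same length")
    if not black_box_preds:
        raise ValueError("at least one prediction is required")
    # Count the positions where both models output the same label.
    agree = sum(b == r for b, r in zip(black_box_preds, rule_preds))
    return agree / len(black_box_preds)


# Toy labels, made up for illustration only (assumption, not data from the paper).
black_box = [1, 0, 1, 1, 0, 0, 1, 0]
rules = [1, 0, 1, 0, 0, 0, 1, 1]
score = fidelity(black_box, rules)
```

Note that fidelity compares the explanation against the black-box predictions, not against the true labels, so a rule set can have high fidelity even where the black-box model itself is wrong.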