Model family selection for classification using Neural Decision Trees

Paper title

Model family selection for classification using Neural Decision Trees

Authors

Anthea Mérida Montes de Oca, Argyris Kalogeratos, Mathilde Mougeot

Abstract

Model selection consists in comparing several candidate models according to a metric to be optimized. The process often involves a grid search or a similar procedure, together with cross-validation, which can be time-consuming and provides little information about the dataset itself. In this paper we propose a method to reduce the scope of exploration needed for the task. The idea is to quantify how far one would need to depart from trained instances of a given family, the reference models (RMs), which carry 'rigid' decision boundaries (e.g. decision trees), in order to obtain an equivalent or better model. In our approach, this is realized by progressively relaxing the decision boundaries of the initial decision trees (the RMs) for as long as this improves the performance measured on the analyzed dataset. More specifically, this relaxation is performed by means of a neural decision tree, which is a neural network built from decision trees (DTs). The final model produced by our method carries non-linear decision boundaries. Measuring the performance of the final model, and its agreement with its seeding RM, can help the user decide which family of models to focus on.
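
To make the notions of a 'rigid' boundary, its relaxation, and agreement with the seeding RM more concrete, the following is a minimal illustrative sketch and not the authors' implementation. It assumes a single scalar feature, a depth-1 tree (a stump) as the RM, and a sigmoid as the relaxed split. The function names, the toy data, and the shifted threshold (standing in for what training a relaxed model might produce) are all hypothetical; no training is performed here.

```python
import math

def hard_stump(x, threshold):
    """Rigid reference model (hypothetical): a depth-1 decision tree on one feature."""
    return 1 if x > threshold else 0

def soft_stump_prob(x, threshold, sharpness):
    """Relaxed split: a sigmoid around the threshold.
    A large `sharpness` approaches the hard split; a small one gives a smoother boundary."""
    return 1.0 / (1.0 + math.exp(-sharpness * (x - threshold)))

def agreement(preds_a, preds_b):
    """Fraction of samples on which two models predict the same class."""
    return sum(a == b for a, b in zip(preds_a, preds_b)) / len(preds_a)

# Toy inputs (assumed for illustration only).
xs = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0]

# Predictions of the rigid reference model.
rm_threshold = 1.25
rm_preds = [hard_stump(x, rm_threshold) for x in xs]

# A relaxed model whose threshold has moved away from the RM's.
# The value 1.75 is an assumption, not a trained parameter.
relaxed_threshold = 1.75
relaxed_preds = [
    1 if soft_stump_prob(x, relaxed_threshold, sharpness=4.0) > 0.5 else 0
    for x in xs
]

agree = agreement(rm_preds, relaxed_preds)
print(f"agreement with reference model: {agree:.3f}")
```

In this toy setting, high agreement would suggest that the relaxed model stays close to the tree family, while low agreement combined with better performance would point towards a different model family.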
