What can we Learn by Predicting Accuracy?

Paper title

What can we Learn by Predicting Accuracy?

Authors

Olivier Risser-Maroix, Benjamin Chamand

Abstract

This paper seeks to answer the following question: "What can we learn by predicting accuracy?" Classification is one of the most popular tasks in machine learning, and many loss functions have been developed to maximize accuracy, which is a non-differentiable objective. Past work on loss function design was guided mainly by intuition and theory and only then validated by experiment. Here we propose the opposite approach: we seek to extract knowledge from experiments. This data-driven approach is similar to the one used in physics to discover general laws from data. We use a symbolic regression method to automatically find a mathematical expression that is highly correlated with a linear classifier's accuracy. The formula, discovered on more than 260 datasets of embeddings, has a Pearson's correlation of 0.96 and an $r^2$ of 0.93. More interestingly, this formula is highly explainable and confirms insights from various previous papers on loss design. We hope this work will open new perspectives in the search for new heuristics leading to a deeper understanding of machine learning theory.
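The abstract reports two agreement metrics between the discovered formula and the measured accuracy. As a rough illustration of how such metrics can be computed, here is a minimal sketch in plain Python. The data is synthetic and randomly generated, not the paper's. The names `pearson` and `r_squared` are hypothetical helpers. Treating $r^2$ as the coefficient of determination of the predictions is an assumption, since the abstract does not say how it was computed.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Synthetic stand-in data (assumption): one "measured accuracy" per dataset,
# and a noisy "formula output" playing the role of the predicted accuracy.
rng = random.Random(0)
measured_accuracy = [rng.uniform(0.3, 0.95) for _ in range(260)]
predicted_accuracy = [a + rng.gauss(0.0, 0.03) for a in measured_accuracy]

r = pearson(predicted_accuracy, measured_accuracy)
r2 = r_squared(measured_accuracy, predicted_accuracy)
print(f"Pearson r = {r:.3f}, R^2 = {r2:.3f}")
```

Note that the two metrics answer different questions: Pearson's correlation is unchanged by rescaling or shifting the predictions, while the coefficient of determination penalizes any systematic offset between predicted and measured values.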
