Paper title
Auto-Sklearn 2.0: Hands-free AutoML via Meta-Learning
Authors
Abstract
Automated Machine Learning (AutoML) supports practitioners and researchers with the tedious task of designing machine learning pipelines and has recently achieved substantial success. In this paper, we introduce new AutoML approaches motivated by our winning submission to the second ChaLearn AutoML challenge. We develop PoSH Auto-sklearn, which enables AutoML systems to work well on large datasets under rigid time limits by using a new, simple and meta-feature-free meta-learning technique and by employing a successful bandit strategy for budget allocation. However, PoSH Auto-sklearn introduces even more ways of running AutoML and might make it harder for users to set it up correctly. Therefore, we also go one step further and study the design space of AutoML itself, proposing a solution towards truly hands-free AutoML. Together, these changes give rise to the next generation of our AutoML system, Auto-sklearn 2.0. We verify the improvements by these additions in an extensive experimental study on 39 AutoML benchmark datasets. We conclude the paper by comparing to other popular AutoML frameworks and Auto-sklearn 1.0, reducing the relative error by up to a factor of 4.5, and yielding a performance in 10 minutes that is substantially better than what Auto-sklearn 1.0 achieves within an hour.