Paper title

Kaggle forecasting competitions: An overlooked learning opportunity

Paper authors

Casper Solheim Bojer, Jens Peder Meldgaard

Paper abstract

Competitions play an invaluable role in the field of forecasting, as exemplified by the recent M4 competition. The competition received attention from both academics and practitioners and sparked discussions around the representativeness of the data for business forecasting. Several competitions featuring real-life business forecasting tasks on the Kaggle platform have, however, been largely ignored by the academic community. We believe the learnings from these competitions have much to offer the forecasting community, and we provide a review of the results from six Kaggle competitions. We find that most of the Kaggle datasets are characterized by higher intermittence and entropy than the M-competitions and that global ensemble models tend to outperform local single models. Furthermore, we find strong performance of gradient boosted decision trees, increasing success of neural networks for forecasting, and a variety of techniques for adapting machine learning models to the forecasting task.
