Paper Title
Scaling Down Deep Learning with MNIST-1D
Paper Authors
Paper Abstract
Although deep learning models have taken on commercial and political relevance, key aspects of their training and operation remain poorly understood. This has sparked interest in science-of-deep-learning projects, many of which require large amounts of time, money, and electricity. But how much of this research really needs to occur at scale? In this paper, we introduce MNIST-1D: a minimalist, procedurally generated, low-memory, and low-compute alternative to classic deep learning benchmarks. Although the dimensionality of MNIST-1D is only 40 and its default training set size only 4000, MNIST-1D can be used to study the inductive biases of different deep architectures, find lottery tickets, observe deep double descent, metalearn an activation function, and demonstrate guillotine regularization in self-supervised learning. All these experiments can be conducted on a GPU, or often even on a CPU, within minutes, allowing for fast prototyping, educational use cases, and cutting-edge research on a low budget.
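To make the phrase "procedurally generated" concrete, the following is a minimal NumPy sketch in the spirit of MNIST-1D, not the paper's actual generator: each class is a fixed 1-D template that gets zero-padded into a length-40 signal, randomly translated, and corrupted with noise. All names and parameters here (`make_toy_1d_dataset`, the template length, the noise level) are illustrative assumptions.

```python
import numpy as np

def make_toy_1d_dataset(n_per_class=400, n_classes=10,
                        template_len=12, dim=40, noise=0.25, seed=0):
    """Hypothetical MNIST-1D-style generator sketch (not the real one):
    place a per-class 1-D template at a random offset in a length-`dim`
    signal and add Gaussian noise."""
    rng = np.random.default_rng(seed)
    # One random smooth template per class (stand-ins for 1-D "digit" shapes).
    templates = rng.standard_normal((n_classes, template_len)).cumsum(axis=1)
    xs, ys = [], []
    for label, tpl in enumerate(templates):
        for _ in range(n_per_class):
            x = np.zeros(dim)
            start = rng.integers(0, dim - template_len + 1)  # random translation
            x[start:start + template_len] = tpl
            x += noise * rng.standard_normal(dim)            # additive noise
            xs.append(x)
            ys.append(label)
    x, y = np.stack(xs), np.array(ys)
    perm = rng.permutation(len(y))                           # shuffle samples
    return x[perm], y[perm]

x, y = make_toy_1d_dataset()
print(x.shape, y.shape)  # (4000, 40) (4000,)
```

Because the data are generated on the fly from a handful of parameters, a full 4000-example, 40-dimensional training set fits comfortably in memory and regenerates in milliseconds, which is what enables the minutes-scale experiments described above.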