Paper Title
Hyper-Representations for Pre-Training and Transfer Learning
Paper Authors
Paper Abstract
Learning representations of neural network weights given a model zoo is an emerging and challenging area with many potential applications, from model inspection to neural architecture search and knowledge distillation. Recently, an autoencoder trained on a model zoo was able to learn a hyper-representation, which captures intrinsic and extrinsic properties of the models in the zoo. In this work, we extend hyper-representations for generative use, sampling new model weights to serve as pre-training. We propose layer-wise loss normalization, which we demonstrate is key to generating high-performing models, as well as a sampling method based on the empirical density of hyper-representations. The models generated using our methods are diverse, performant, and capable of outperforming conventional baselines for transfer learning. Our results indicate the potential of aggregating knowledge from model zoos into new models via hyper-representations, thereby paving the way for novel research directions.
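As a rough illustration of the two ingredients named in the abstract, the sketch below assumes a trained weight autoencoder (encoder, decoder operating on flattened zoo weights) and per-layer index slices; all names, shapes, and hyperparameters are hypothetical and do not reflect the authors' implementation. It shows (a) a reconstruction loss normalized per layer and (b) sampling new latent codes from a kernel density estimate fit to the zoo's hyper-representations, which are then decoded into weights usable as pre-training initializations.

# Hedged sketch: layer-wise loss normalization and density-based sampling of
# new weights from a hyper-representation space. `encoder`, `decoder`,
# `zoo_weights`, and `layer_slices` are assumed placeholders, not the paper's API.
import numpy as np
from sklearn.neighbors import KernelDensity


def layerwise_normalized_mse(pred, target, layer_slices, eps=1e-8):
    """Reconstruction loss where each layer's squared error is scaled by that
    layer's weight standard deviation, so no single layer dominates training."""
    losses = []
    for sl in layer_slices:  # slices into the flattened weight vector
        std = target[:, sl].std() + eps
        losses.append((((pred[:, sl] - target[:, sl]) / std) ** 2).mean())
    return float(np.mean(losses))


def sample_pretraining_weights(encoder, decoder, zoo_weights, n_samples=10, bandwidth=0.1):
    """Fit a KDE on the zoo's latent codes, draw new codes from that empirical
    density, and decode them into weight vectors usable as initializations."""
    latents = encoder(zoo_weights)                      # (n_models, latent_dim)
    kde = KernelDensity(bandwidth=bandwidth).fit(latents)
    new_latents = kde.sample(n_samples)                 # codes drawn near the zoo's density
    return decoder(new_latents)                         # (n_samples, n_weights)

In this reading, the per-layer normalization keeps layers with very different weight scales comparable in the autoencoder's loss, and sampling from the empirical latent density (rather than, say, an isotropic prior) keeps generated weights close to regions occupied by well-trained zoo models.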