Paper title
Learning Modular Structures That Generalize Out-of-Distribution
Paper authors
Paper abstract
Out-of-distribution (O.O.D.) generalization remains a key challenge for real-world machine learning systems. We describe a method for O.O.D. generalization that, through training, encourages models to preserve only those features in the network that are well reused across multiple training domains. Our method combines two complementary neuron-level regularizers with a probabilistic differentiable binary mask over the network to extract a modular sub-network that achieves better O.O.D. performance than the original network. Preliminary evaluation on two benchmark datasets corroborates the promise of our method.
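To make the idea of a probabilistic differentiable binary mask concrete, the following is a minimal sketch, not the authors' implementation. The abstract does not say which relaxation is used, so the sketch assumes a binary concrete (Gumbel-sigmoid) relaxation with one learnable logit per neuron. All function names, the `temperature` parameter, and the 0.5 threshold are hypothetical choices made for illustration, and the two neuron-level regularizers are not modeled because the abstract does not define them.

```python
import math
import random


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def sample_relaxed_mask(logits, temperature=1.0, rng=None):
    """Sample a soft mask in [0, 1], one entry per neuron.

    Assumption: binary concrete relaxation. Logistic noise is added to each
    logit and the sum is squashed by a tempered sigmoid, so the sample is a
    smooth function of the logit and gradients could flow through it.
    """
    rng = rng or random.Random()
    soft = []
    for logit in logits:
        # Clamp the uniform draw away from 0 and 1 to keep the logs finite.
        u = min(max(rng.random(), 1e-9), 1.0 - 1e-9)
        noise = math.log(u) - math.log(1.0 - u)
        soft.append(sigmoid((logit + noise) / temperature))
    return soft


def harden(soft_mask, threshold=0.5):
    """Turn a soft mask into a hard 0/1 mask (used to extract a sub-network)."""
    return [1 if m > threshold else 0 for m in soft_mask]


def apply_mask(activations, mask):
    """Multiply each neuron activation by its mask entry."""
    return [a * m for a, m in zip(activations, mask)]


def keep_probabilities(logits):
    """Probability that each neuron is kept after hardening at 0.5."""
    return [sigmoid(logit) for logit in logits]
```

Under this assumption, a neuron with a large positive logit is almost always kept and one with a large negative logit is almost always dropped. Hardening the mask after training yields a fixed sub-network, which is one way to read the "modular sub-network" described in the abstract.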