Paper Title
A new role for circuit expansion for learning in neural networks
Paper Authors
Paper Abstract
Many sensory pathways in the brain rely on sparsely active populations of neurons downstream of the input stimuli. The biological reason for this expanded structure is unclear, but it may arise because expansion increases the expressive power of a neural network. In this work, we show that expanding a neural network can improve its generalization performance even when the expanded structure is pruned after the learning period. To study this setting, we use a teacher-student framework in which a perceptron teacher network generates labels that are corrupted with a small amount of noise. We then train a student network that is structurally matched to the teacher and that could achieve optimal accuracy if given the teacher's synaptic weights. We find that sparse expansion of the input of a student perceptron network both increases its capacity and improves its generalization performance when learning a noisy rule from a teacher perceptron, even when these expansions are pruned after learning. We find similar behavior when the expanded units are stochastic and uncorrelated with the input, and we analyze this network in the mean-field limit. By solving the mean-field equations, we show that the generalization error of the stochastically expanded student network continues to drop as the size of the network increases. This improvement in generalization occurs despite the increased complexity of the student network relative to the teacher it is trying to learn. We show that this effect is closely related to the addition of slack variables in artificial neural networks and suggest possible implications for artificial and biological neural networks.
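As a rough illustration of the setup described above, the following is a minimal sketch of the teacher-student experiment with a stochastic expansion that is pruned after learning. It assumes Gaussian inputs, a plain mistake-driven perceptron update, and expansion units drawn independently of the stimulus; all names and parameter values (N, M, P, flip_prob, the epoch count) are illustrative assumptions, not values or methods taken from the paper.

```python
# Sketch: student perceptron trained on an expanded input, then pruned
# back to the teacher's dimensionality before measuring generalization.
import numpy as np

rng = np.random.default_rng(0)

N = 100           # teacher / core student input dimension (assumed)
M = 200           # number of stochastic expansion units (assumed)
P = 500           # number of training examples (assumed)
flip_prob = 0.05  # label noise: probability of flipping the teacher's output (assumed)

# Teacher perceptron with fixed random synaptic weights.
w_teacher = rng.standard_normal(N)

def teacher_labels(X, noisy=True):
    """Labels from the teacher rule, optionally corrupted by flip noise."""
    y = np.sign(X @ w_teacher)
    if noisy:
        flips = rng.random(len(y)) < flip_prob
        y[flips] *= -1
    return y

# Training inputs plus stochastic expansion units uncorrelated with the input.
X_train = rng.standard_normal((P, N))
Z_train = rng.standard_normal((P, M))
y_train = teacher_labels(X_train, noisy=True)
X_expanded = np.hstack([X_train, Z_train])

# Student perceptron over the expanded input [x, z], trained with the
# classic mistake-driven perceptron rule.
w_student = np.zeros(N + M)
for epoch in range(200):
    for x, y in zip(X_expanded, y_train):
        if np.sign(x @ w_student) != y:
            w_student += y * x

# Prune the expansion after learning: keep only the weights on the original
# N inputs, then test against the noiseless teacher rule.
w_pruned = w_student[:N]
X_test = rng.standard_normal((10_000, N))
y_test = teacher_labels(X_test, noisy=False)
gen_acc = np.mean(np.sign(X_test @ w_pruned) == y_test)
print(f"generalization accuracy after pruning: {gen_acc:.3f}")
```

Comparing `gen_acc` against a baseline student trained with `M = 0` (no expansion) is one way to probe the effect the abstract describes; the mean-field analysis and the connection to slack variables in the paper are not reproduced by this sketch.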