Paper Title

Triple Memory Networks: a Brain-Inspired Method for Continual Learning

Paper Authors

Liyuan Wang, Bo Lei, Qian Li, Hang Su, Jun Zhu, Yi Zhong

Paper Abstract

Continual acquisition of novel experience without interfering with previously learned knowledge, i.e. continual learning, is critical for artificial neural networks but is limited by catastrophic forgetting. A neural network adjusts its parameters when learning a new task, but then fails to perform the old tasks well. By contrast, the brain has a powerful ability to continually learn new experiences without catastrophic interference. The underlying neural mechanisms are possibly attributable to the interplay of the hippocampus-dependent and neocortex-dependent memory systems, mediated by the prefrontal cortex. Specifically, the two memory systems develop specialized mechanisms to consolidate information in more specific and more generalized forms, respectively, and complement each other's form of information through their interplay. Inspired by this brain strategy, we propose a novel approach named triple memory networks (TMNs) for continual learning. TMNs model the interplay of the hippocampus, prefrontal cortex and sensory cortex (a neocortical region) as a triple-network architecture of generative adversarial networks (GANs). Input information is encoded as a specific representation of the data distributions in a generator, or as generalized knowledge for solving tasks in a discriminator and a classifier, and appropriate brain-inspired algorithms are implemented to alleviate catastrophic forgetting in each module. In particular, the generator replays generated data of the learned tasks to the discriminator and the classifier, both of which are implemented with a weight consolidation regularizer to complement the information lost in the generation process. TMNs achieve new state-of-the-art performance on a variety of class-incremental learning benchmarks on MNIST, SVHN, CIFAR-10 and ImageNet-50, compared with strong baseline methods.
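The abstract only sketches the training mechanics, so below is a minimal PyTorch sketch (not the authors' released code) of the two ingredients it describes for the classifier module: generative replay of previously learned tasks mixed into the current batch, and an EWC-style weight-consolidation penalty that keeps important parameters near their values after earlier tasks. The network sizes, the diagonal-Fisher importance estimate, the helper names (`estimate_importance`, `consolidation_penalty`, `train_step`), and the coefficient `lam` are illustrative assumptions, not details taken from the paper.

```python
# Illustrative sketch only: generative replay + weight consolidation for a classifier.
import torch
import torch.nn as nn
import torch.nn.functional as F

class Classifier(nn.Module):
    def __init__(self, in_dim=784, n_classes=10):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(in_dim, 256), nn.ReLU(),
                                 nn.Linear(256, n_classes))

    def forward(self, x):
        return self.net(x)

def estimate_importance(model, data, targets):
    """Diagonal-Fisher-style importance: squared gradients of the task loss."""
    model.zero_grad()
    F.cross_entropy(model(data), targets).backward()
    return {n: p.grad.detach() ** 2 for n, p in model.named_parameters()}

def consolidation_penalty(model, old_params, importance):
    """Quadratic penalty keeping parameters close to their post-old-task values."""
    return sum((importance[n] * (p - old_params[n]) ** 2).sum()
               for n, p in model.named_parameters())

def train_step(model, optimizer, real_x, real_y, replay_x, replay_y,
               old_params, importance, lam=100.0):
    """One update on new-task data plus generator-replayed data of old tasks."""
    optimizer.zero_grad()
    x = torch.cat([real_x, replay_x])   # mix current-task and replayed batches
    y = torch.cat([real_y, replay_y])   # replay_y: labels for the replayed samples
    loss = F.cross_entropy(model(x), y)
    loss = loss + lam * consolidation_penalty(model, old_params, importance)
    loss.backward()
    optimizer.step()
    return loss.item()

# Toy usage with random tensors standing in for real and generator-replayed data.
model = Classifier()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
old_x, old_y = torch.randn(32, 784), torch.randint(0, 5, (32,))
importance = estimate_importance(model, old_x, old_y)
old_params = {n: p.detach().clone() for n, p in model.named_parameters()}
new_x, new_y = torch.randn(32, 784), torch.randint(5, 10, (32,))
replay_x, replay_y = torch.randn(32, 784), torch.randint(0, 5, (32,))  # would come from the generator
print(train_step(model, opt, new_x, new_y, replay_x, replay_y, old_params, importance))
```

Per the abstract, in the full TMN setup the replayed inputs would come from the consolidated generator, and the same weight-consolidation regularizer would also be applied to the discriminator; the sketch above only shows the classifier side.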
