Paper Title
A Computational Model of Learning Flexible Navigation in a Maze by Layout-Conforming Replay of Place Cells
Paper Authors
Paper Abstract
Recent experimental observations have shown that the reactivation of hippocampal place cells (PCs) during sleep or immobility depicts trajectories that can go around barriers and can flexibly adapt to a changing maze layout. Such layout-conforming replay sheds light on how the activity of place cells supports an animal's learning of flexible navigation in a dynamically changing maze. However, existing computational models of replay fall short of generating layout-conforming replay, restricting their usage to simple environments such as linear tracks or open fields. In this paper, we propose a computational model that generates layout-conforming replay and explains how such replay drives the learning of flexible navigation in a maze. First, we propose a Hebbian-like rule to learn the inter-PC synaptic strengths while the animal explores a maze. Then we use a continuous attractor network (CAN) with feedback inhibition to model the interaction between place cells and hippocampal interneurons. The activity bump of place cells drifts along a path in the maze, which models layout-conforming replay. During replay at rest, the synaptic strengths from place cells to striatal medium spiny neurons (MSNs) are learned by a novel dopamine-modulated three-factor rule to store place-reward associations. During goal-directed navigation, the CAN periodically generates replay trajectories from the animal's location for path planning, and the animal follows the trajectory that leads to the maximal MSN activity. We have implemented our model in a high-fidelity virtual rat in the MuJoCo physics simulator. Extensive experiments have demonstrated that its superior flexibility during navigation in a maze is due to a continuous re-learning of inter-PC and PC-MSN synaptic strengths.
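The abstract names two learning rules: a Hebbian-like rule for inter-PC synapses learned during exploration, and a dopamine-modulated three-factor rule for PC-to-MSN synapses learned during replay. The sketch below illustrates the general shape of such rules on a toy 1-D path of place cells; all parameter values, the Gaussian bump, and the specific update forms are illustrative assumptions, not the paper's actual equations.

```python
import numpy as np

rng = np.random.default_rng(0)

n_pc = 50   # number of place cells along a toy 1-D path (assumed)
n_msn = 5   # number of striatal medium spiny neurons (assumed)

W_pc = np.zeros((n_pc, n_pc))    # inter-PC synaptic weights
W_msn = np.zeros((n_msn, n_pc))  # PC -> MSN synaptic weights

def hebbian_update(W, pre, post, lr=0.1, w_max=1.0):
    """Hebbian-like rule (illustrative form): strengthen a synapse when
    pre- and postsynaptic cells are co-active, with a soft bound w_max."""
    return W + lr * np.outer(post, pre) * (w_max - W)

def three_factor_update(W, pc_rate, msn_rate, dopamine, lr=0.05):
    """Three-factor rule (illustrative form): the Hebbian pre x post term
    is gated by a dopamine signal, so PC->MSN synapses grow only when
    reward-related dopamine is present."""
    return W + lr * dopamine * np.outer(msn_rate, pc_rate)

# Toy exploration: a Gaussian activity bump sweeps along the path,
# standing in for the animal traversing the maze.
for center in range(n_pc):
    rates = np.exp(-0.5 * ((np.arange(n_pc) - center) / 2.0) ** 2)
    # Pairing the current bump (pre) with a slightly shifted bump (post)
    # makes the learned weights asymmetric along the traversed direction.
    post = np.roll(rates, 1)
    W_pc = hebbian_update(W_pc, rates, post)

# Toy replay at the rewarded end of the path: dopamine = 1 tags the
# place cells active near the reward site.
reward_rates = np.exp(-0.5 * ((np.arange(n_pc) - (n_pc - 1)) / 2.0) ** 2)
msn_rates = rng.random(n_msn)
W_msn = three_factor_update(W_msn, reward_rates, msn_rates, dopamine=1.0)

# PCs near the reward now project more strongly to MSNs than distant PCs,
# which is the place-reward association the abstract describes.
print(W_msn[:, -1].mean() > W_msn[:, 0].mean())  # True
```

The gating by `dopamine` is what distinguishes the three-factor rule from plain Hebbian learning: without a reward signal, PC and MSN co-activity alone leaves the PC-MSN weights unchanged.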