Paper title
Don't stop the training: continuously-updating self-supervised algorithms best account for auditory responses in the cortex
Paper authors
Paper abstract
Over the last decade, numerous studies have shown that deep neural networks exhibit sensory representations similar to those of the mammalian brain, in that their activations linearly map onto cortical responses to the same sensory inputs. However, it remains unknown whether these artificial networks also learn like the brain. To address this issue, we analyze the brain responses of two ferrets' auditory cortices recorded with functional UltraSound imaging (fUS) while the animals were presented with 320 10\,s sounds. We compare these brain responses to the activations of Wav2Vec 2.0, a self-supervised neural network pretrained on 960\,h of speech, when fed the same 320 sounds. Critically, we evaluate Wav2Vec 2.0 under two distinct modes: (i) "Pretrained", where the same model is used for all sounds, and (ii) "Continuous Update", where the weights of the pretrained model are modified with back-propagation after every sound, presented in the same order as the ferrets heard them. Our results show that the Continuous-Update mode leads Wav2Vec 2.0 to generate activations that are more similar to the brain than those of a Pretrained Wav2Vec 2.0 or of other control models using different training modes. These results suggest that the trial-by-trial modifications of self-supervised algorithms induced by back-propagation align with the corresponding fluctuations of cortical responses to sounds. Our findings thus provide empirical evidence of a common learning mechanism between self-supervised models and the mammalian cortex during sound processing.
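The two evaluation modes can be sketched as follows. This is a minimal illustration with a toy linear model and a hypothetical next-frame prediction objective standing in for Wav2Vec 2.0 and its contrastive loss; the model, objective, data shapes, and learning rate are all assumptions, not the paper's actual setup.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for a self-supervised network: one linear layer trained to
# predict the next feature frame from the current one (hypothetical
# objective; the paper uses Wav2Vec 2.0 and its contrastive loss).
W = rng.normal(scale=0.1, size=(8, 8))  # "pretrained" weights

def activations(W, x):
    # Hidden activations for one sound's feature frames
    return x @ W

def ssl_gradient(W, x, y):
    # Gradient of the mean squared prediction error ||xW - y||^2 w.r.t. W
    pred = x @ W
    return 2 * x.T @ (pred - y) / len(x)

sounds = [rng.normal(size=(20, 8)) for _ in range(5)]  # 5 toy "sounds"
targets = [np.roll(s, -1, axis=0) for s in sounds]     # next-frame targets

# Mode (i): Pretrained -- the same frozen weights for every sound
pretrained_acts = [activations(W, s) for s in sounds]

# Mode (ii): Continuous Update -- record activations, then take one
# back-propagation step after each sound, in the animals' presentation order
W_cu, lr = W.copy(), 1e-3
continuous_acts = []
for x, y in zip(sounds, targets):
    continuous_acts.append(activations(W_cu, x))
    W_cu -= lr * ssl_gradient(W_cu, x, y)  # weights drift trial by trial
```

Both lists of activations would then be linearly mapped onto the fUS responses; the paper's claim is that the drifting `continuous_acts` predict the cortical data better than the frozen `pretrained_acts`.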