Paper Title
Comparing SNNs and RNNs on Neuromorphic Vision Datasets: Similarities and Differences
Paper Authors
Paper Abstract
Neuromorphic data, which record frameless spike events, have attracted considerable attention for their spatiotemporal information components and event-driven processing fashion. Spiking neural networks (SNNs) represent a family of event-driven models with spatiotemporal dynamics for neuromorphic computing, and they are widely benchmarked on neuromorphic data. Interestingly, researchers in the machine learning community may argue that recurrent (artificial) neural networks (RNNs) can also extract spatiotemporal features, even though they are not event-driven. Thus, the question of what happens if these two kinds of models are benchmarked together on neuromorphic data arises naturally but remains open. In this work, we conduct a systematic study comparing SNNs and RNNs on neuromorphic data, taking vision datasets as a case study. First, we identify the similarities and differences between SNNs and RNNs (including vanilla RNNs and LSTM) from the modeling and learning perspectives. To improve comparability and fairness, we unify the supervised learning algorithm based on backpropagation through time (BPTT), the loss function that exploits the outputs at all timesteps, the network structure with stacked fully connected or convolutional layers, and the hyperparameters used during training. In particular, we modify the mainstream loss function used in RNNs, inspired by the rate coding scheme, to approximate that of SNNs. Furthermore, we tune the temporal resolution of the datasets to test model robustness and generalization. Finally, a series of contrast experiments is conducted on two types of neuromorphic datasets: DVS-converted (N-MNIST) and DVS-captured (DVS Gesture).
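To make the "similarities and differences from the modeling perspective" concrete, the following is a minimal sketch (not the paper's code) contrasting one discrete-time update of a leaky integrate-and-fire (LIF) spiking layer with one step of a vanilla RNN cell. The leak factor `tau`, threshold `v_th`, and hard reset are common textbook choices and are assumptions here, not necessarily the exact formulation used by the authors.

```python
import torch

def lif_step(x_t, v, w_in, tau=0.5, v_th=1.0):
    """One timestep of an LIF layer: leaky membrane integration, a binary spike
    when the membrane potential crosses the threshold, then a hard reset."""
    v = tau * v + x_t @ w_in          # leaky integration of the weighted input
    spike = (v >= v_th).float()       # event-driven, binary output
    v = v * (1.0 - spike)             # reset the neurons that fired
    return spike, v

def rnn_step(x_t, h, w_in, w_rec):
    """One timestep of a vanilla RNN cell: dense, real-valued recurrence."""
    return torch.tanh(x_t @ w_in + h @ w_rec)

# Toy usage: batch of 2, 5 inputs, 3 hidden units.
x_t = torch.rand(2, 5)
spike, v = lif_step(x_t, torch.zeros(2, 3), torch.randn(5, 3))
h = rnn_step(x_t, torch.zeros(2, 3), torch.randn(5, 3), torch.randn(3, 3))
```

Both models keep an internal state across timesteps; the visible differences are the binary (spiking) versus continuous output and the explicit recurrent weight matrix in the RNN.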
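The abstract also mentions modifying the mainstream RNN loss, inspired by rate coding, so that outputs at all timesteps are exploited. Below is a hedged sketch of one way this could look: averaging the per-timestep logits before the cross-entropy instead of classifying from the last timestep only. The tensor layout and the choice of averaging are assumptions for illustration, not the authors' exact implementation.

```python
import torch
import torch.nn.functional as F

def last_step_loss(outputs, labels):
    """Conventional RNN readout: classify from the last timestep's logits only.
    outputs: (T, batch, num_classes); labels: (batch,)."""
    return F.cross_entropy(outputs[-1], labels)

def rate_coding_loss(outputs, labels):
    """Rate-coding-style readout: average the outputs over all T timesteps
    before the cross-entropy, so every timestep contributes to the decision."""
    return F.cross_entropy(outputs.mean(dim=0), labels)

# Toy usage: T = 10 timesteps, batch of 4, 11 classes (as in DVS Gesture).
outputs = torch.randn(10, 4, 11)
labels = torch.randint(0, 11, (4,))
print(last_step_loss(outputs, labels).item(), rate_coding_loss(outputs, labels).item())
```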