Paper title
SimVPv2: Towards Simple yet Powerful Spatiotemporal Predictive Learning
Paper authors
Abstract
Recent years have witnessed remarkable advances in spatiotemporal predictive learning, with methods incorporating auxiliary inputs, complex neural architectures, and sophisticated training strategies. While SimVP introduced a simpler, CNN-based baseline for this task, it still relies on heavy UNet-like architectures for spatial and temporal modeling, which still suffer from high complexity and computational overhead. In this paper, we propose SimVPv2, a streamlined model that eliminates the need for UNet architectures and demonstrates that plain stacks of convolutional layers, enhanced with an efficient Gated Spatiotemporal Attention mechanism, can deliver state-of-the-art performance. SimVPv2 not only simplifies the model architecture but also improves both performance and computational efficiency. On the standard Moving MNIST benchmark, SimVPv2 outperforms SimVP with fewer FLOPs, about half the training time, and 60% faster inference. Extensive experiments across eight diverse datasets, including real-world tasks such as traffic forecasting and climate prediction, further demonstrate that SimVPv2 offers a powerful yet straightforward solution that generalizes robustly across various spatiotemporal learning scenarios. We believe the proposed SimVPv2 can serve as a solid baseline to benefit the spatiotemporal predictive learning community.
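The abstract names a Gated Spatiotemporal Attention mechanism but gives no implementation details. The following is only a minimal NumPy sketch of one way such a gated block could look, not the authors' implementation. The function names, kernel sizes, and dilation value are all hypothetical. It also rests on three assumptions not stated in the abstract: that frames are folded into the channel axis, that a large receptive field is approximated by a depthwise convolution followed by a dilated depthwise convolution, and that a 1x1 convolution produces a sigmoid gate alongside the features it modulates.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def depthwise_conv2d(x, kernels, dilation=1):
    """Per-channel 'same' cross-correlation.
    x: (C, H, W); kernels: (C, k, k) with k odd."""
    _, h, w = x.shape
    k = kernels.shape[-1]
    pad = dilation * (k - 1) // 2
    xp = np.pad(x, ((0, 0), (pad, pad), (pad, pad)))
    out = np.zeros_like(x)
    for i in range(k):
        for j in range(k):
            di, dj = i * dilation, j * dilation
            # Each channel is filtered only by its own kernel.
            out += kernels[:, i, j, None, None] * xp[:, di:di + h, dj:dj + w]
    return out

def pointwise_conv(x, weight):
    """1x1 convolution. x: (C_in, H, W); weight: (C_out, C_in)."""
    return np.einsum("oc,chw->ohw", weight, x)

def gated_attention(x, dw_kernels, dilated_kernels, pw_weight, dilation=3):
    """Hypothetical gated block: sigmoid(gate) * features."""
    c = x.shape[0]
    a = depthwise_conv2d(x, dw_kernels)                          # local context
    a = depthwise_conv2d(a, dilated_kernels, dilation=dilation)  # wider context
    a = pointwise_conv(a, pw_weight)                             # (2C, H, W)
    gate, feat = a[:c], a[c:]                                    # split channels
    return sigmoid(gate) * feat

# Assumed layout: T frames with C channels each, folded to (T*C, H, W).
rng = np.random.default_rng(0)
frames = rng.standard_normal((10, 1, 16, 16))   # (T, C, H, W)
x = frames.reshape(-1, 16, 16)                   # (T*C, H, W)
n = x.shape[0]
dw = rng.standard_normal((n, 5, 5)) * 0.1        # illustrative sizes only
dil = rng.standard_normal((n, 7, 7)) * 0.1
pw = rng.standard_normal((2 * n, n)) * 0.1
out = gated_attention(x, dw, dil, pw)
print(out.shape)
```

Under these assumptions, folding the time axis into channels lets ordinary 2D convolutions mix information across frames, so no recurrent unit is needed. This fits the abstract's emphasis on plain convolutional stacks.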