Paper title
Stable and expressive recurrent vision models
Paper authors
Paper abstract
Primate vision depends on recurrent processing for reliable perception. A growing body of literature also suggests that recurrent connections improve the learning efficiency and generalization of vision models on classic computer vision challenges. Why, then, are current large-scale challenges dominated by feedforward networks? We posit that the effectiveness of recurrent vision models is bottlenecked by the standard algorithm used for training them, "back-propagation through time" (BPTT), which has O(N) memory complexity for training an N-step model. Thus, recurrent vision model design is bounded by memory constraints, forcing a choice between rivaling the enormous capacity of leading feedforward models or trying to compensate for this deficit through granular and complex dynamics. Here, we develop a new learning algorithm, "contractor recurrent back-propagation" (C-RBP), which alleviates these issues by achieving constant O(1) memory complexity regardless of the number of steps of recurrent processing. We demonstrate that recurrent vision models trained with C-RBP can detect long-range spatial dependencies in a synthetic contour tracing task that BPTT-trained models cannot. We further show that recurrent vision models trained with C-RBP to solve the large-scale Panoptic Segmentation MS-COCO challenge outperform the leading feedforward approach, with fewer free parameters. C-RBP is a general-purpose learning algorithm for any application that can benefit from expansive recurrent dynamics. Code and data are available at https://github.com/c-rbp.
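The memory contrast in the abstract can be illustrated with a toy example. The sketch below is not the authors' C-RBP implementation, which the abstract does not specify. It uses classic recurrent back-propagation on a scalar recurrence, h_next = tanh(w * h + x), chosen here purely for illustration. The function names, the squared-error loss, and the parameter values are all assumptions. BPTT keeps every intermediate state, so its storage grows with the number of steps N. The fixed-point approach keeps only the final state and recovers the gradient by iterating a Neumann series, which converges when the update is contractive (the state Jacobian has magnitude below 1).

```python
import math


def step(h, w, x):
    """One recurrent update: h_next = tanh(w * h + x)."""
    return math.tanh(w * h + x)


def bptt_grad(w, x, y, n_steps):
    """Gradient of 0.5 * (h_N - y)**2 w.r.t. w via BPTT.

    Stores all N + 1 states, so memory grows linearly with n_steps.
    """
    states = [0.0]
    for _ in range(n_steps):
        states.append(step(states[-1], w, x))

    grad_h = states[-1] - y  # dLoss/dh_N
    grad_w = 0.0
    for t in range(n_steps, 0, -1):
        d_pre = grad_h * (1.0 - states[t] ** 2)  # back through tanh
        grad_w += d_pre * states[t - 1]          # contribution of step t to dLoss/dw
        grad_h = d_pre * w                       # dLoss/dh_{t-1}
    return grad_w, len(states)


def rbp_grad(w, x, y, n_steps, n_neumann=200):
    """Same gradient at the fixed point, keeping only the final state.

    Assumes the update is contractive (|dF/dh| < 1) so that both the
    forward iteration and the Neumann series below converge.
    """
    h = 0.0
    for _ in range(n_steps):
        h = step(h, w, x)  # overwrite in place: no history is kept

    jac = w * (1.0 - h ** 2)  # dF/dh evaluated at the fixed point
    dl_dh = h - y
    z = dl_dh
    for _ in range(n_neumann):
        z = jac * z + dl_dh  # converges to dl_dh / (1 - jac)
    return z * h * (1.0 - h ** 2)  # multiply by dF/dw at the fixed point


g_bptt, n_stored = bptt_grad(w=0.5, x=0.3, y=1.0, n_steps=100)
g_rbp = rbp_grad(w=0.5, x=0.3, y=1.0, n_steps=100)
print(n_stored, abs(g_bptt - g_rbp))
```

With these assumed values the recurrence is contractive, so the two gradients agree closely, while `bptt_grad` holds 101 states and `rbp_grad` holds one.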