Paper Title
Siamese Box Adaptive Network for Visual Tracking
Paper Authors
Paper Abstract
Most of the existing trackers usually rely on either a multi-scale searching scheme or pre-defined anchor boxes to accurately estimate the scale and aspect ratio of a target. Unfortunately, they typically call for tedious and heuristic configurations. To address this issue, we propose a simple yet effective visual tracking framework (named Siamese Box Adaptive Network, SiamBAN) by exploiting the expressive power of the fully convolutional network (FCN). SiamBAN views the visual tracking problem as a parallel classification and regression problem, and thus directly classifies objects and regresses their bounding boxes in a unified FCN. The no-prior box design avoids hyper-parameters associated with the candidate boxes, making SiamBAN more flexible and general. Extensive experiments on visual tracking benchmarks including VOT2018, VOT2019, OTB100, NFS, UAV123, and LaSOT demonstrate that SiamBAN achieves state-of-the-art performance and runs at 40 FPS, confirming its effectiveness and efficiency. The code will be available at https://github.com/hqucv/siamban.
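To make the "no-prior box design" concrete: instead of matching pre-defined anchors, each location on the correlation feature map directly regresses four distances (left, top, right, bottom) from its own pixel position to the box edges, so no anchor scales or aspect ratios need to be configured. The sketch below illustrates this decoding step in NumPy; the function name, the stride value, and the toy inputs are illustrative assumptions, not the paper's exact implementation.

```python
import numpy as np

def decode_boxes(offsets, stride=8):
    """Anchor-free box decoding (illustrative sketch, not SiamBAN's exact code).

    offsets: (H, W, 4) array of (l, t, r, b) distances in pixels, one
             prediction per feature-map location.
    Returns: (H, W, 4) boxes as (x1, y1, x2, y2) in search-image coordinates.
    """
    h, w, _ = offsets.shape
    ys, xs = np.mgrid[0:h, 0:w].astype(np.float64)
    # map each feature-map cell back to a pixel location in the search image
    px, py = xs * stride, ys * stride
    l, t, r, b = np.moveaxis(offsets, -1, 0)
    return np.stack([px - l, py - t, px + r, py + b], axis=-1)

# toy example: one cell predicts a 40x30 box centred on its own pixel location
offs = np.zeros((25, 25, 4))
offs[12, 12] = (20, 15, 20, 15)
boxes = decode_boxes(offs)
print(boxes[12, 12])  # -> [ 76.  81. 116. 111.]
```

Because the box is parameterised by per-location edge distances rather than anchor deltas, the tracker needs no hand-tuned anchor hyper-parameters, which is the flexibility the abstract refers to.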