Paper Title

Implicit Motion-Compensated Network for Unsupervised Video Object Segmentation

Paper Authors

Lin Xi, Weihai Chen, Xingming Wu, Zhong Liu, Zhengguo Li

Paper Abstract

Unsupervised video object segmentation (UVOS) aims at automatically separating the primary foreground object(s) from the background in a video sequence. Existing UVOS methods either lack robustness when there are visually similar surroundings (appearance-based) or suffer from deterioration in the quality of their predictions because of dynamic background and inaccurate flow (flow-based). To overcome the limitations, we propose an implicit motion-compensated network (IMCNet) combining complementary cues ($\textit{i.e.}$, appearance and motion) with aligned motion information from the adjacent frames to the current frame at the feature level without estimating optical flows. The proposed IMCNet consists of an affinity computing module (ACM), an attention propagation module (APM), and a motion compensation module (MCM). The light-weight ACM extracts commonality between neighboring input frames based on appearance features. The APM then transmits global correlation in a top-down manner. Through coarse-to-fine iterative inspiring, the APM will refine object regions from multiple resolutions so as to efficiently avoid losing details. Finally, the MCM aligns motion information from temporally adjacent frames to the current frame which achieves implicit motion compensation at the feature level. We perform extensive experiments on $\textit{DAVIS}_{\textit{16}}$ and $\textit{YouTube-Objects}$. Our network achieves favorable performance while running at a faster speed compared to the state-of-the-art methods.
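The abstract describes an affinity computing module (ACM) that extracts commonality between neighboring frames from appearance features, and a motion compensation module (MCM) that aligns adjacent-frame features to the current frame without optical flow. A minimal, hypothetical sketch of this idea is a dot-product affinity between the flattened spatial features of two frames, followed by affinity-weighted aggregation of the reference features into current-frame coordinates. The function names, shapes, and NumPy implementation below are illustrative assumptions, not the authors' code:

```python
import numpy as np

def affinity(feat_ref, feat_cur):
    """Dot-product affinity between flattened spatial positions of a
    reference frame and the current frame, softmax-normalized over
    reference positions (illustrative stand-in for the ACM)."""
    C, H, W = feat_ref.shape
    ref = feat_ref.reshape(C, H * W)             # C x N
    cur = feat_cur.reshape(C, H * W)             # C x N
    scores = cur.T @ ref                         # N x N similarity
    scores -= scores.max(axis=1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=1, keepdims=True)
    return weights                               # each row sums to 1

def align_features(feat_ref, weights):
    """Implicit motion compensation (illustrative stand-in for the MCM):
    aggregate reference-frame features into current-frame coordinates
    using the affinity weights, with no explicit flow estimation."""
    C, H, W = feat_ref.shape
    ref = feat_ref.reshape(C, H * W)             # C x N
    aligned = ref @ weights.T                    # C x N, per-position mix
    return aligned.reshape(C, H, W)

# Toy features for two adjacent frames: 8 channels on a 4x4 grid.
rng = np.random.default_rng(0)
f_ref = rng.standard_normal((8, 4, 4))
f_cur = rng.standard_normal((8, 4, 4))
W_aff = affinity(f_ref, f_cur)
f_aligned = align_features(f_ref, W_aff)
print(f_aligned.shape)  # (8, 4, 4)
```

The softmax-weighted aggregation is a standard attention-style alignment; the paper's actual modules (including the top-down attention propagation of the APM) are more involved than this sketch.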
