Paper Title

Self-distilled Feature Aggregation for Self-supervised Monocular Depth Estimation

Paper Authors

Zhengming Zhou, Qiulei Dong

Paper Abstract

Self-supervised monocular depth estimation has received much attention recently in computer vision. Most of the existing works in the literature aggregate multi-scale features for depth prediction via either straightforward concatenation or element-wise addition; however, such feature aggregation operations generally neglect the contextual consistency between multi-scale features. To address this problem, we propose the Self-Distilled Feature Aggregation (SDFA) module for simultaneously aggregating a pair of low-scale and high-scale features while maintaining their contextual consistency. The SDFA module employs three branches to learn three feature offset maps: one offset map for refining the input low-scale feature, and the other two for refining the input high-scale feature in a designed self-distillation manner. Then, we propose an SDFA-based network for self-supervised monocular depth estimation and design a self-distilled training strategy to train the proposed network with the SDFA module. Experimental results on the KITTI dataset demonstrate that the proposed method outperforms the comparative state-of-the-art methods in most cases. The code is available at https://github.com/ZM-Zhou/SDFA-Net_pytorch.
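To make the aggregation scheme concrete, below is a minimal PyTorch sketch of the idea described in the abstract: three branches predict offset maps from the concatenated feature pair, and each offset map is used to resample (refine) one of the inputs before fusion. The helper `warp_by_offset`, the grid_sample-based warping, the branch layer sizes, and the assumption that both inputs share the same channel count are illustrative assumptions, not the authors' implementation; see the linked repository for the actual code.

```python
# A minimal sketch of offset-based feature aggregation in the spirit of SDFA.
# All names and architectural details here are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F


def warp_by_offset(feat, offset):
    """Resample `feat` with a per-pixel offset map (offsets in pixels; channel 0 = x, channel 1 = y)."""
    n, _, h, w = feat.shape
    # Base sampling grid in normalized [-1, 1] coordinates, shape (n, h, w, 2).
    ys, xs = torch.meshgrid(
        torch.linspace(-1.0, 1.0, h, device=feat.device),
        torch.linspace(-1.0, 1.0, w, device=feat.device),
        indexing="ij",
    )
    grid = torch.stack((xs, ys), dim=-1).unsqueeze(0).expand(n, -1, -1, -1)
    # Convert pixel offsets to the normalized coordinate scale and shift the grid.
    scale = torch.tensor([2.0 / max(w - 1, 1), 2.0 / max(h - 1, 1)], device=feat.device)
    grid = grid + offset.permute(0, 2, 3, 1) * scale
    return F.grid_sample(feat, grid, align_corners=True)


class SDFA(nn.Module):
    """Sketch of the aggregation step: three branches predict offset maps,
    one refining the (upsampled) low-scale feature and two refining the
    high-scale feature; the paper's self-distillation scheme concerns the
    latter pair, which this sketch only exposes as a switch."""

    def __init__(self, channels):
        super().__init__()

        def offset_branch():
            return nn.Sequential(
                nn.Conv2d(2 * channels, channels, 3, padding=1),
                nn.ReLU(inplace=True),
                nn.Conv2d(channels, 2, 3, padding=1),  # 2-channel (x, y) offset map
            )

        self.offset_low = offset_branch()     # refines the low-scale feature
        self.offset_high_a = offset_branch()  # refines the high-scale feature (branch A)
        self.offset_high_b = offset_branch()  # refines the high-scale feature (branch B)

    def forward(self, low, high, use_branch_b=False):
        # Upsample the low-scale feature to the high-scale resolution,
        # then predict offsets from the concatenated pair.
        low_up = F.interpolate(low, size=high.shape[-2:], mode="bilinear", align_corners=True)
        x = torch.cat((low_up, high), dim=1)
        low_ref = warp_by_offset(low_up, self.offset_low(x))
        high_branch = self.offset_high_b if use_branch_b else self.offset_high_a
        high_ref = warp_by_offset(high, high_branch(x))
        return low_ref + high_ref  # aligned features fused by addition


# Example usage with hypothetical shapes:
sdfa = SDFA(channels=64)
low = torch.randn(1, 64, 12, 40)   # low-scale feature
high = torch.randn(1, 64, 24, 80)  # high-scale feature
out = sdfa(low, high)              # -> (1, 64, 24, 80)
```

Resampling with learned offsets, rather than plain concatenation or addition, is what lets the module re-align the two scales before fusing them, which is the contextual-consistency point the abstract makes.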
