Title
Consistent Direct Time-of-Flight Video Depth Super-Resolution
Authors
Abstract
Direct time-of-flight (dToF) sensors are promising for next-generation on-device 3D sensing. However, limited by manufacturing capabilities in a compact module, the dToF data has a low spatial resolution (e.g., $\sim 20\times30$ for iPhone dToF), and it requires a super-resolution step before being passed to downstream tasks. In this paper, we solve this super-resolution problem by fusing the low-resolution dToF data with the corresponding high-resolution RGB guidance. Unlike the conventional RGB-guided depth enhancement approaches, which perform the fusion in a per-frame manner, we propose the first multi-frame fusion scheme to mitigate the spatial ambiguity resulting from the low-resolution dToF imaging. In addition, dToF sensors provide unique depth histogram information for each local patch, and we incorporate this dToF-specific feature in our network design to further alleviate spatial ambiguity. To evaluate our models on complex dynamic indoor environments and to provide a large-scale dToF sensor dataset, we introduce DyDToF, the first synthetic RGB-dToF video dataset that features dynamic objects and a realistic dToF simulator following the physical imaging process. We believe the methods and dataset are beneficial to a broad community as dToF depth sensing is becoming mainstream on mobile devices. Our code and data are publicly available: https://github.com/facebookresearch/DVSR/
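To make the RGB-guided super-resolution setting concrete, below is a minimal sketch of a classical per-frame baseline: joint bilateral upsampling, which lifts a low-resolution depth map (e.g., ~20×30 dToF) to the RGB guide's resolution by weighting neighboring depth samples with both spatial distance and guide-intensity similarity. This is *not* the paper's learned multi-frame network (which also uses dToF histograms); it only illustrates the kind of fusion the abstract describes. All function and parameter names here are illustrative assumptions.

```python
import numpy as np

def joint_bilateral_upsample(depth_lr, guide_hr, sigma_spatial=2.0,
                             sigma_range=0.1, radius=4):
    """Upsample a low-res depth map to the guide image's resolution.

    Each high-res output pixel averages nearby low-res depth samples,
    weighted by (a) spatial distance on the low-res grid and (b) how
    similar the guide intensity is, so depth edges snap to RGB edges
    instead of being blurred by plain bilinear interpolation.
    """
    H, W = guide_hr.shape
    h, w = depth_lr.shape
    sy, sx = H / h, W / w  # upsampling factors per axis
    out = np.zeros((H, W), dtype=np.float64)
    for y in range(H):
        for x in range(W):
            cy, cx = y / sy, x / sx           # position in low-res coords
            y0, x0 = round(cy), round(cx)     # nearest low-res sample
            wsum, vsum = 0.0, 0.0
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    yy, xx = y0 + dy, x0 + dx
                    if 0 <= yy < h and 0 <= xx < w:
                        # spatial weight on the low-res grid
                        ws = np.exp(-((yy - cy) ** 2 + (xx - cx) ** 2)
                                    / (2 * sigma_spatial ** 2))
                        # range weight: guide similarity between the output
                        # pixel and the low-res sample's high-res location
                        gy = min(int(yy * sy), H - 1)
                        gx = min(int(xx * sx), W - 1)
                        wr = np.exp(-(guide_hr[y, x] - guide_hr[gy, gx]) ** 2
                                    / (2 * sigma_range ** 2))
                        wsum += ws * wr
                        vsum += ws * wr * depth_lr[yy, xx]
            out[y, x] = vsum / wsum
    return out
```

A key limitation of such per-frame guided filters, which the paper's multi-frame scheme targets, is that a single low-resolution frame is spatially ambiguous: many fine structures map to the same coarse depth samples, so no single-frame prior can fully disambiguate them.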