Paper Title

Uncertainty-Aware Deep Multi-View Photometric Stereo

Authors

Berk Kaya, Suryansh Kumar, Carlos Oliveira, Vittorio Ferrari, Luc Van Gool

Abstract

This paper presents a simple and effective solution to the longstanding classical multi-view photometric stereo (MVPS) problem. It is well-known that photometric stereo (PS) is excellent at recovering high-frequency surface details, whereas multi-view stereo (MVS) can help remove the low-frequency distortion due to PS and retain the global geometry of the shape. This paper proposes an approach that can effectively utilize such complementary strengths of PS and MVS. Our key idea is to combine them suitably while considering the per-pixel uncertainty of their estimates. To this end, we estimate per-pixel surface normals and depth using an uncertainty-aware deep-PS network and deep-MVS network, respectively. Uncertainty modeling helps select reliable surface normal and depth estimates at each pixel which then act as a true representative of the dense surface geometry. At each pixel, our approach either selects or discards deep-PS and deep-MVS network prediction depending on the prediction uncertainty measure. For dense, detailed, and precise inference of the object's surface profile, we propose to learn the implicit neural shape representation via a multilayer perceptron (MLP). Our approach encourages the MLP to converge to a natural zero-level set surface using the confident prediction from deep-PS and deep-MVS networks, providing superior dense surface reconstruction. Extensive experiments on the DiLiGenT-MV benchmark dataset show that our method provides high-quality shape recovery with a much lower memory footprint while outperforming almost all of the existing approaches.
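The abstract describes a per-pixel selection step: predictions from the deep-PS and deep-MVS networks are kept or discarded based on their estimated uncertainty, and only the confident estimates supervise the implicit surface. A minimal sketch of that selection idea, with illustrative array names and thresholds (not taken from the paper), might look like:

```python
import numpy as np

# Hypothetical sketch of uncertainty-based selection: keep only the
# per-pixel normal and depth predictions whose uncertainty is below a
# threshold. The thresholds tau_n / tau_d and all array names are
# illustrative assumptions, not values from the paper.
rng = np.random.default_rng(0)
H, W = 4, 4

normals = rng.normal(size=(H, W, 3))         # stand-in deep-PS normals
normals /= np.linalg.norm(normals, axis=-1, keepdims=True)
normal_unc = rng.uniform(size=(H, W))        # per-pixel normal uncertainty
depth = rng.uniform(1.0, 2.0, size=(H, W))   # stand-in deep-MVS depths
depth_unc = rng.uniform(size=(H, W))         # per-pixel depth uncertainty

tau_n, tau_d = 0.5, 0.5                      # illustrative thresholds

normal_mask = normal_unc < tau_n             # confident normals kept
depth_mask = depth_unc < tau_d               # confident depths kept

# The confident estimates would then act as sparse supervision for the
# MLP's zero-level set surface.
confident_normals = normals[normal_mask]     # shape (K_n, 3)
confident_depths = depth[depth_mask]         # shape (K_d,)
print(confident_normals.shape, confident_depths.shape)
```

Each pixel is filtered independently, so a pixel can contribute a normal, a depth, both, or neither, matching the select-or-discard behavior the abstract describes.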
