Paper Title
Level-S$^2$fM: Structure from Motion on Neural Level Set of Implicit Surfaces
Authors
Abstract
This paper presents a neural incremental Structure-from-Motion (SfM) approach, Level-S$^2$fM, which estimates camera poses and scene geometry from a set of uncalibrated images by learning coordinate MLPs for the implicit surfaces and the radiance fields from established keypoint correspondences. This formulation poses new challenges because two-view and few-view configurations are inevitable in the incremental SfM pipeline, and they complicate the optimization of coordinate MLPs for volumetric neural rendering with unknown camera poses. Nevertheless, we demonstrate that the strong inductive bias conveyed by the 2D correspondences is promising for tackling those challenges by exploiting the relationship between the ray sampling schemes. Based on this, we revisit the pipeline of incremental SfM and renew its key components, including two-view geometry initialization, camera pose registration, 3D point triangulation, and Bundle Adjustment, from a fresh perspective based on neural implicit surfaces. By unifying the scene geometry in small coordinate MLP networks, Level-S$^2$fM treats the zero-level set of the implicit surface as an informative top-down regularization to manage the reconstructed 3D points, reject outliers in the correspondences by querying the SDF, and refine the estimated geometry by neural Bundle Adjustment (NBA). Level-S$^2$fM not only achieves promising results on camera pose estimation and scene geometry reconstruction, but also suggests a viable route to neural implicit rendering without knowing the camera extrinsics beforehand.
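As a rough illustration of the idea of rejecting outliers by querying the SDF, the sketch below keeps only the 3D points that lie close to the zero-level set of a signed distance function. It is a minimal sketch and not the authors' implementation: the analytic `sphere_sdf` is a hypothetical stand-in for the learned coordinate MLP, and the function name `filter_by_sdf` and the `threshold` value are assumptions chosen for the example.

```python
import numpy as np


def sphere_sdf(points, radius=1.0):
    """Hypothetical stand-in SDF: signed distance to a sphere at the origin.

    In the paper the SDF is a learned coordinate MLP; an analytic sphere
    is used here only so that the example is self-contained.
    """
    return np.linalg.norm(points, axis=-1) - radius


def filter_by_sdf(points, sdf, threshold=0.05):
    """Keep points whose absolute SDF value is below `threshold`.

    A point with |SDF| near zero lies close to the zero-level set
    (the surface); points far from it are treated as outliers.
    The threshold value is an assumption for illustration.
    """
    distances = np.abs(sdf(points))
    inlier_mask = distances < threshold
    return points[inlier_mask], inlier_mask


# Four candidate 3D points, e.g. triangulated from 2D correspondences.
points = np.array([
    [1.0, 0.0, 0.0],   # exactly on the unit sphere
    [0.0, 1.02, 0.0],  # slightly off the surface
    [0.0, 0.0, 2.0],   # far outside the surface
    [0.3, 0.0, 0.0],   # deep inside the surface
])

inliers, mask = filter_by_sdf(points, sphere_sdf)
print(mask)
```

With the unit sphere as the surface, the first two points are kept and the last two are rejected. In a real pipeline the threshold would presumably be tuned to the scene scale and to how well the SDF has converged.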