Paper Title
GAUDI: A Neural Architect for Immersive 3D Scene Generation
Paper Authors
Paper Abstract
We introduce GAUDI, a generative model capable of capturing the distribution of complex and realistic 3D scenes that can be rendered immersively from a moving camera. We tackle this challenging problem with a scalable yet powerful approach, where we first optimize a latent representation that disentangles radiance fields and camera poses. This latent representation is then used to learn a generative model that enables both unconditional and conditional generation of 3D scenes. Our model generalizes previous works that focus on single objects by removing the assumption that the camera pose distribution can be shared across samples. We show that GAUDI obtains state-of-the-art performance in the unconditional generative setting across multiple datasets and allows for conditional generation of 3D scenes given conditioning variables like sparse image observations or text that describes the scene.
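To make the two-stage pipeline described in the abstract concrete, below is a minimal PyTorch sketch. All names (`RadianceFieldDecoder`, `PoseDecoder`, `LatentPrior`, `scene_latents`) and architectural details are illustrative assumptions, not the authors' released implementation: stage one optimizes one latent per training scene, split into a radiance-field part and a camera-pose part, against a reconstruction objective; stage two fits a generative prior over the optimized latents, optionally conditioned on image or text embeddings.

```python
# Minimal sketch of a GAUDI-style two-stage pipeline.
# Assumption: module names, latent splits, and sizes are illustrative only.
import torch
import torch.nn as nn

class RadianceFieldDecoder(nn.Module):
    """Maps a scene latent plus a 3D point and view direction to (rgb, density)."""
    def __init__(self, latent_dim=128, hidden=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(latent_dim + 3 + 3, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 4),          # rgb (3) + density (1)
        )

    def forward(self, z_field, points, dirs):
        h = torch.cat([z_field.expand(points.shape[0], -1), points, dirs], dim=-1)
        out = self.net(h)
        return torch.sigmoid(out[..., :3]), torch.relu(out[..., 3:])

class PoseDecoder(nn.Module):
    """Maps a pose latent plus a normalized time along the camera path to a 3x4
    camera pose, so every scene keeps its own camera-pose distribution."""
    def __init__(self, latent_dim=128, hidden=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(latent_dim + 1, hidden), nn.ReLU(),
            nn.Linear(hidden, 12),
        )

    def forward(self, z_pose, t):
        return self.net(torch.cat([z_pose, t], dim=-1)).view(3, 4)

# Stage 1: auto-decoder-style latent optimization. One latent per training scene,
# split into a radiance-field part and a camera-pose part (the disentanglement).
num_scenes, field_dim, pose_dim = 1000, 128, 128
scene_latents = nn.Parameter(0.01 * torch.randn(num_scenes, field_dim + pose_dim))
field_dec, pose_dec = RadianceFieldDecoder(field_dim), PoseDecoder(pose_dim)
opt = torch.optim.Adam(
    [scene_latents, *field_dec.parameters(), *pose_dec.parameters()], lr=1e-3)

def stage1_step(scene_idx, reconstruction_loss):
    """One optimization step; `reconstruction_loss` is a user-supplied callable that
    renders the radiance field from the decoded poses and compares it with the
    observed frames (volume-rendering details omitted in this sketch)."""
    z = scene_latents[scene_idx]
    z_field, z_pose = z[:field_dim], z[field_dim:]
    loss = reconstruction_loss(field_dec, pose_dec, z_field, z_pose)
    opt.zero_grad()
    loss.backward()
    opt.step()

# Stage 2: fit a generative prior over the optimized latents (e.g. a denoising
# model), optionally conditioned on image or text embeddings.
class LatentPrior(nn.Module):
    def __init__(self, latent_dim=256, cond_dim=0, hidden=512):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(latent_dim + cond_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, latent_dim),
        )

    def forward(self, z_noisy, cond=None):
        x = z_noisy if cond is None else torch.cat([z_noisy, cond], dim=-1)
        return self.net(x)   # predicts a denoised / generated scene latent
```

Because each scene carries its own pose latent in this sketch, no camera-pose distribution has to be shared across samples, which mirrors the generalization over single-object works that the abstract highlights.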