Paper Title

Understanding Pure CLIP Guidance for Voxel Grid NeRF Models

Authors

Han-Hung Lee, Angel X. Chang

Abstract

We explore the task of text to 3D object generation using CLIP. Specifically, we use CLIP for guidance without access to any datasets, a setting we refer to as pure CLIP guidance. While prior work has adopted this setting, there is no systematic study of mechanics for preventing adversarial generations within CLIP. We illustrate how different image-based augmentations prevent the adversarial generation problem, and how the generated results are impacted. We test different CLIP model architectures and show that ensembling different models for guidance can prevent adversarial generations within bigger models and generate sharper results. Furthermore, we implement an implicit voxel grid model to show how neural networks provide an additional layer of regularization, resulting in better geometrical structure and coherency of generated objects. Compared to prior work, we achieve more coherent results with higher memory efficiency and faster training speeds.
