Paper Title
Disentangling 3D Attributes from a Single 2D Image: Human Pose, Shape and Garment
Paper Authors
Paper Abstract
For visual manipulation tasks, we aim to represent image content with semantically meaningful features. However, learning implicit representations from images often lacks interpretability, especially when attributes are intertwined. We focus on the challenging task of extracting disentangled 3D attributes from 2D image data only. Specifically, we focus on human appearance and learn implicit pose, shape, and garment representations of dressed humans from RGB images. Our method learns an embedding with disentangled latent representations of these three image properties and enables meaningful re-assembly of features and control over each property through a 2D-to-3D encoder-decoder structure. The 3D model is inferred solely from the feature map in the learned embedding space. To the best of our knowledge, our method is the first to achieve cross-domain disentanglement for this highly under-constrained problem. We qualitatively and quantitatively demonstrate our framework's ability to transfer pose, shape, and garments in 3D reconstruction on virtual data, and show how an implicit shape loss can benefit the model's ability to recover fine-grained reconstruction details.
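
To make the described pipeline concrete, below is a minimal sketch of a 2D-to-3D encoder-decoder that maps an RGB image to three separate latent codes (pose, shape, garment) and decodes them as an implicit occupancy field, so attribute transfer becomes swapping codes between two images. This is not the authors' implementation; all module names, layer sizes, and the occupancy-style decoder are illustrative assumptions.

# Minimal sketch (assumed architecture, not the paper's code).
import torch
import torch.nn as nn

class DisentangledEncoder(nn.Module):
    """Encodes an RGB image into separate pose / shape / garment latents."""
    def __init__(self, latent_dim=128):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 4, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        # One head per attribute keeps the three codes structurally separate.
        self.pose_head = nn.Linear(64, latent_dim)
        self.shape_head = nn.Linear(64, latent_dim)
        self.garment_head = nn.Linear(64, latent_dim)

    def forward(self, image):
        feat = self.backbone(image)
        return self.pose_head(feat), self.shape_head(feat), self.garment_head(feat)

class ImplicitDecoder(nn.Module):
    """Predicts occupancy at 3D query points conditioned on the three latents."""
    def __init__(self, latent_dim=128):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(3 * latent_dim + 3, 256), nn.ReLU(),
            nn.Linear(256, 1),
        )

    def forward(self, points, pose_z, shape_z, garment_z):
        # points: (B, N, 3) query locations in 3D space.
        cond = torch.cat([pose_z, shape_z, garment_z], dim=-1)    # (B, 3*latent_dim)
        cond = cond.unsqueeze(1).expand(-1, points.shape[1], -1)  # (B, N, 3*latent_dim)
        return torch.sigmoid(self.mlp(torch.cat([points, cond], dim=-1)))

# Attribute transfer by swapping latent codes between two input images.
encoder, decoder = DisentangledEncoder(), ImplicitDecoder()
img_a, img_b = torch.rand(1, 3, 128, 128), torch.rand(1, 3, 128, 128)
pose_a, shape_a, _ = encoder(img_a)
_, _, garment_b = encoder(img_b)
query = torch.rand(1, 1024, 3)
occupancy = decoder(query, pose_a, shape_a, garment_b)  # subject A wearing B's garment

In this reading of the abstract, the implicit shape loss would supervise the occupancy predictions against ground-truth 3D geometry, while the per-attribute heads and code-swapping supervision are what enforce the disentanglement.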