Title
Efficient conditioned face animation using frontally-viewed embedding
Authors
Abstract
As the quality of few-shot facial animation from landmarks increases, new applications become possible, such as ultra-low-bandwidth video chat compression with a high degree of realism. However, some important challenges must be tackled to improve the experience in real-world conditions. In particular, current approaches fail to represent profile views without distortion while running in a low-compute regime. We focus on this key problem by introducing a multi-frame embedding, dubbed Frontalizer, to improve profile-view rendering. Beyond this core improvement, we also explore learning a latent code that conditions generation alongside landmarks to better convey facial expressions. Our dense model achieves a 22% improvement in perceptual quality and a 73% reduction in landmark error over the first-order model baseline on a subset of DFDC videos containing head movements. Adapted to mobile architectures, our models outperform the previous state of the art (improving perceptual quality by more than 16% and reducing landmark error by more than 47% on two datasets) while running in real time on an iPhone 8 with very low bandwidth requirements.