VoxelPose: Towards Multi-Camera 3D Human Pose Estimation in Wild Environment

Paper Title

VoxelPose: Towards Multi-Camera 3D Human Pose Estimation in Wild Environment

Authors

Hanyue Tu, Chunyu Wang, Wenjun Zeng

Abstract

We present an approach to estimate the 3D poses of multiple people from multiple camera views. In contrast to previous efforts, which require establishing cross-view correspondence based on noisy and incomplete 2D pose estimates, we present an end-to-end solution that operates directly in 3D space and therefore avoids making incorrect decisions in 2D space. To achieve this goal, the features in all camera views are warped and aggregated in a common 3D space and fed into a Cuboid Proposal Network (CPN) to coarsely localize all people. We then propose a Pose Regression Network (PRN) to estimate a detailed 3D pose for each proposal. The approach is robust to occlusion, which occurs frequently in practice. Without bells and whistles, it outperforms the state of the art on public datasets. Code will be released at https://github.com/microsoft/multiperson-pose-estimation-pytorch.
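
The abstract says per-view features are warped and aggregated in a common 3D space, but does not spell out the operator. The following is a minimal NumPy sketch of one plausible reading, not the authors' implementation. It assumes pinhole cameras given as 3x4 projection matrices, nearest-neighbor sampling of per-joint 2D heatmaps, and averaging over the views in which a voxel is visible. The function names (`make_voxel_grid`, `project`, `aggregate_heatmaps`) are hypothetical and chosen for illustration only.

```python
import numpy as np


def make_voxel_grid(center, size, resolution):
    """Return (N, 3) world coordinates of voxel centers in an axis-aligned cuboid."""
    axes = [np.linspace(c - s / 2, c + s / 2, r)
            for c, s, r in zip(center, size, resolution)]
    gx, gy, gz = np.meshgrid(*axes, indexing="ij")
    return np.stack([gx, gy, gz], axis=-1).reshape(-1, 3)


def project(points, P):
    """Project (N, 3) world points with a 3x4 camera matrix P.

    Returns pixel coordinates (N, 2) and depth along the optical axis (N,).
    """
    homo = np.hstack([points, np.ones((len(points), 1))])
    uvw = homo @ P.T
    depth = uvw[:, 2]
    # Guard against division by zero for points on the camera plane.
    safe = np.where(np.abs(depth) < 1e-9, 1e-9, depth)
    return uvw[:, :2] / safe[:, None], depth


def aggregate_heatmaps(heatmaps, cameras, grid):
    """Average per-view heatmap values sampled at each voxel's projection.

    heatmaps: list of (J, H, W) arrays, one per camera (J joints).
    cameras:  list of 3x4 projection matrices, one per camera.
    grid:     (N, 3) voxel centers in world coordinates.
    Returns a (J, N) feature volume (assumption: mean over visible views).
    """
    num_joints = heatmaps[0].shape[0]
    total = np.zeros((num_joints, len(grid)))
    count = np.zeros(len(grid))
    for hm, P in zip(heatmaps, cameras):
        _, H, W = hm.shape
        uv, depth = project(grid, P)
        u = np.rint(uv[:, 0]).astype(int)  # nearest-neighbor column index
        v = np.rint(uv[:, 1]).astype(int)  # nearest-neighbor row index
        visible = (depth > 0) & (u >= 0) & (u < W) & (v >= 0) & (v < H)
        total[:, visible] += hm[:, v[visible], u[visible]]
        count += visible
    # Voxels seen by no camera keep a value of zero.
    return total / np.maximum(count, 1)
```

In this sketch, a voxel scores highly only when its projections land on heatmap peaks in several views at once, which is why cross-view matching of 2D detections is not needed. A 3D network such as the CPN would then operate on the resulting volume reshaped to the grid resolution.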
