Paper Title
VoxelPose: Towards Multi-Camera 3D Human Pose Estimation in Wild Environment
Paper Authors
Paper Abstract
We present an approach to estimate the 3D poses of multiple people from multiple camera views. In contrast to previous efforts, which require establishing cross-view correspondence based on noisy and incomplete 2D pose estimates, we present an end-to-end solution that operates directly in 3D space and thereby avoids making incorrect decisions in 2D space. To achieve this goal, the features in all camera views are warped and aggregated in a common 3D space, then fed into a Cuboid Proposal Network (CPN) to coarsely localize all people. We then propose a Pose Regression Network (PRN) to estimate a detailed 3D pose for each proposal. The approach is robust to occlusion, which occurs frequently in practice. Without bells and whistles, it outperforms the state of the art on the public datasets. Code will be released at https://github.com/microsoft/multiperson-pose-estimation-pytorch.
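The step of warping per-view features into a common 3D space can be illustrated with a minimal NumPy sketch. It is not the released code: the function names (`make_voxel_grid`, `project`, `aggregate_views`), the 3x4 pinhole camera matrices, nearest-pixel sampling, and fusion by averaging over the views that see each voxel are all assumptions made for illustration.

```python
import numpy as np


def make_voxel_grid(center, size, resolution):
    """Return (N, 3) world coordinates of voxel centers inside a cuboid."""
    axes = [np.linspace(c - s / 2, c + s / 2, r)
            for c, s, r in zip(center, size, resolution)]
    gx, gy, gz = np.meshgrid(*axes, indexing="ij")
    return np.stack([gx, gy, gz], axis=-1).reshape(-1, 3)


def project(points, P):
    """Project (N, 3) world points with a 3x4 camera matrix P.

    Returns (N, 2) pixel coordinates and the (N,) depth along the camera axis.
    """
    homo = np.hstack([points, np.ones((len(points), 1))])
    uvw = homo @ P.T
    depth = uvw[:, 2]
    # Guard against division by zero for points on the camera plane.
    safe = np.where(np.abs(depth) < 1e-9, 1e-9, depth)
    return uvw[:, :2] / safe[:, None], depth


def aggregate_views(heatmaps, cameras, grid):
    """Sample each (H, W) heatmap at every voxel's projection and average.

    Voxels behind a camera or outside its image do not contribute for that
    view (assumed handling; nearest-pixel sampling is a simplification).
    """
    total = np.zeros(len(grid))
    count = np.zeros(len(grid))
    for hm, P in zip(heatmaps, cameras):
        h, w = hm.shape
        uv, depth = project(grid, P)
        u = np.rint(uv[:, 0]).astype(int)
        v = np.rint(uv[:, 1]).astype(int)
        visible = (depth > 0) & (u >= 0) & (u < w) & (v >= 0) & (v < h)
        total[visible] += hm[v[visible], u[visible]]
        count[visible] += 1
    return total / np.maximum(count, 1)
```

Reshaping the returned vector to the grid resolution gives a 3D volume of the kind a proposal network could consume. A voxel scores highly only when several views agree on it, so no explicit cross-view matching of 2D detections is needed in this sketch.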