Paper Title
Transformer-based Global 3D Hand Pose Estimation in Two Hands Manipulating Objects Scenarios
Paper Authors
Paper Abstract
This report describes our 1st place solution to the ECCV 2022 challenge on Human Body, Hands, and Activities (HBHA) from Egocentric and Multi-view Cameras (hand pose estimation track). In this challenge, we aim to estimate global 3D hand poses from an input image in which two hands and an object interact, viewed from an egocentric viewpoint. Our proposed method performs end-to-end multi-hand pose estimation via a transformer architecture. In particular, it robustly estimates hand poses in scenarios where the two hands interact with each other. Additionally, we propose an algorithm that accounts for hand scale to robustly estimate absolute depth; it works well even though hand size varies from person to person. Our method attains errors of 14.4 mm (left hand) and 15.9 mm (right hand) on the test set.
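The abstract's scale-aware absolute depth idea can be illustrated with a standard pinhole-camera argument: if a bone of known metric length projects to a measurable pixel length, similar triangles give the hand's absolute depth. The sketch below is a minimal illustration of that general technique, not the authors' exact algorithm; the joint indexing, the choice of reference bone (wrist to middle-finger MCP), and the function name are assumptions.

```python
import numpy as np

def absolute_root_depth(kpt_2d, kpt_3d_rel, focal_px, ref_bone=(0, 9)):
    """Illustrative sketch: estimate absolute depth of the hand root joint.

    kpt_2d     : (21, 2) detected 2D keypoints in pixels
    kpt_3d_rel : (21, 3) predicted root-relative 3D keypoints in mm
    focal_px   : camera focal length in pixels
    ref_bone   : index pair of the reference bone (wrist to middle-finger
                 MCP here -- an assumed convention, not from the paper)
    """
    i, j = ref_bone
    # Metric length of the reference bone from the 3D prediction (mm).
    len_3d = np.linalg.norm(kpt_3d_rel[i] - kpt_3d_rel[j])
    # Projected length of the same bone in the image (pixels).
    len_2d = np.linalg.norm(kpt_2d[i] - kpt_2d[j])
    # Pinhole similar triangles: depth ~= focal * metric_len / pixel_len.
    return focal_px * len_3d / len_2d
```

Normalizing by a per-person bone length is what makes such an estimate robust to hand-size variation: a larger hand projects proportionally larger, so the ratio cancels the scale difference.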