Hand-Centric Motion Refinement for 3D Hand-Object Interaction via Hierarchical Spatial-Temporal Modeling
DOI:
https://doi.org/10.1609/aaai.v38i3.27979
Keywords:
CV: Biometrics, Face, Gesture & Pose, CV: 3D Computer Vision, CV: Motion & Tracking
Abstract
Hands are the main medium when people interact with the world. Generating proper 3D motion for hand-object interaction is vital for applications such as virtual reality and robotics. Although grasp tracking or object manipulation synthesis can produce coarse hand motion, this kind of motion is inevitably noisy and full of jitter. To address this problem, we propose a data-driven method for coarse motion refinement. First, we design a hand-centric representation to describe the dynamic spatial-temporal relation between hands and objects. Compared to the object-centric representation, our hand-centric representation is straightforward and does not require an ambiguous projection process that converts object-based prediction into hand motion. Second, to capture the dynamic clues of hand-object interaction, we propose a new architecture that models the spatial and temporal structure in a hierarchical manner. Extensive experiments demonstrate that our method outperforms previous methods by a noticeable margin.
Published
2024-03-24
How to Cite
Hao, Y., Zhang, J., Zhuo, T., Wen, F., & Fan, H. (2024). Hand-Centric Motion Refinement for 3D Hand-Object Interaction via Hierarchical Spatial-Temporal Modeling. Proceedings of the AAAI Conference on Artificial Intelligence, 38(3), 2076-2084. https://doi.org/10.1609/aaai.v38i3.27979
Issue
Section
AAAI Technical Track on Computer Vision II