CoEvoer: Collaborative Evolution Transformer for Upper-Body Expressive Human Pose and Shape Estimation

Authors

  • Yuxiang Zhao, Shenzhen Campus of Sun Yat-sen University; Alibaba Group
  • Wei Huang, Shenzhen Campus of Sun Yat-sen University
  • Yujie Song, Shenzhen Campus of Sun Yat-sen University
  • Liu Wang, Shenzhen Campus of Sun Yat-sen University
  • Huan Zhao, Shenzhen Campus of Sun Yat-sen University

DOI:

https://doi.org/10.1609/aaai.v40i22.38952

Abstract

Expressive Human Pose and Shape Estimation (EHPS) plays a crucial role in various AR/VR applications and has witnessed significant progress in recent years. However, current state-of-the-art methods still struggle with accurate parameter estimation for facial and hand regions and exhibit limited generalization to wild images. To address these challenges, we present CoEvoer, a novel one-stage synergistic cross-dependency transformer framework tailored for upper-body EHPS. CoEvoer enables explicit feature-level interaction across different body parts, allowing for mutual enhancement through contextual information exchange. Specifically, larger and more easily estimated regions such as the torso provide global semantics and positional priors to guide the estimation of finer, more complex regions like the face and hands. Conversely, the localized details captured in facial and hand regions help refine and calibrate adjacent body parts. To the best of our knowledge, CoEvoer is the first framework designed specifically for upper-body EHPS, with the goal of capturing the strong coupling and semantic dependencies among the face, hands, and torso through joint parameter regression. Extensive experiments demonstrate that CoEvoer achieves state-of-the-art performance on upper-body benchmarks and exhibits strong generalization capability even on unseen wild images.

Published

2026-03-14

How to Cite

Zhao, Y., Huang, W., Song, Y., Wang, L., & Zhao, H. (2026). CoEvoer: Collaborative Evolution Transformer for Upper-Body Expressive Human Pose and Shape Estimation. Proceedings of the AAAI Conference on Artificial Intelligence, 40(22), 18827–18835. https://doi.org/10.1609/aaai.v40i22.38952

Issue

Vol. 40 No. 22 (2026)

Section

AAAI Technical Track on Intelligent Robotics