HyperSign: Hierarchical Hypergraph-based Co-occurrence Modeling for Sign Language Recognition and Translation

Authors

  • Qianren Guo, Jilin University
  • Yuehang Wang, Jilin University
  • Yongji Zhang, Jilin University
  • Qi Chu, Jilin University
  • Sen Liu, Jilin University
  • Yu Jiang, Jilin University

DOI

https://doi.org/10.1609/aaai.v40i6.42440

Abstract

Effectively capturing co-occurrence signals, such as hand shapes, facial expressions, and body postures, is critical for semantic understanding in sign language recognition (SLR) and translation (SLT). Although skeleton data offer greater efficiency and robustness than RGB inputs, existing methods typically rely on pairwise graph structures, limiting their ability to model complex high-order interactions across body regions. To address this limitation, we propose HyperSign, a hierarchical hypergraph neural network that systematically captures high-order co-occurrence patterns among diverse body parts. The Co-occurrence Graph Perception Module jointly learns relational structures via three complementary pathways: (1) traditional graph convolutions for modeling physical joint connections, (2) dynamic geometric hypergraphs constructed via k-nearest neighbors to encode local spatial patterns, and (3) soft hypergraphs generated by learnable prototypes to reveal latent semantic associations. To further enhance structural modeling and semantic consistency, a Meta-Part Hypergraph Fusion Module abstracts feature streams from the hands, face, and body into unified hypergraph nodes, while leveraging empirically derived co-occurrence priors to model high-order cross-part dependencies. Moreover, an uncertainty-aware collaborative distillation mechanism guides the model to focus on critical body regions. Extensive experiments on standard SLR and SLT benchmarks (e.g., PHOENIX-2014, PHOENIX-2014T, and CSL-Daily) demonstrate that HyperSign not only outperforms existing skeleton-based approaches in both speed and accuracy but also achieves competitive or superior results compared to several state-of-the-art RGB-based methods across multiple evaluation metrics.
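
The two hypergraph pathways named in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names (`knn_incidence`, `soft_incidence`, `hypergraph_conv`), the dot-product-plus-softmax prototype assignment, and the degree-normalised mean aggregation are all assumptions chosen for illustration, since the abstract does not specify these details.

```python
import numpy as np

def knn_incidence(x, k):
    """Hard incidence matrix H of shape (N, N) (hypothetical helper).

    Hyperedge e groups joint e with its k-1 nearest joints in feature space,
    so every hyperedge contains exactly k nodes.
    """
    n = x.shape[0]
    dist = np.linalg.norm(x[:, None, :] - x[None, :, :], axis=-1)  # (N, N)
    nearest = np.argsort(dist, axis=1)[:, :k]  # row e: k closest joints, itself included
    h = np.zeros((n, n))
    for e in range(n):
        h[nearest[e], e] = 1.0
    return h

def soft_incidence(x, prototypes):
    """Soft incidence matrix of shape (N, M) (hypothetical helper).

    Each node is softly assigned to M prototype hyperedges through a softmax
    over dot-product similarities (an assumed similarity function).
    """
    logits = x @ prototypes.T
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    weights = np.exp(logits)
    return weights / weights.sum(axis=1, keepdims=True)

def hypergraph_conv(x, h, theta):
    """One node -> hyperedge -> node pass: Dv^-1 H De^-1 H^T X Theta."""
    node_degree = h.sum(axis=1)   # total membership weight per node
    edge_degree = h.sum(axis=0)   # total membership weight per hyperedge
    edge_feat = (h.T @ x) / edge_degree[:, None]          # average nodes into hyperedges
    node_feat = (h @ edge_feat) / node_degree[:, None]    # average hyperedges back to nodes
    return node_feat @ theta
```

Called on an array of joint coordinates of shape `(N, 3)`, `knn_incidence` yields a geometric hypergraph that can be recomputed per frame, while `soft_incidence` needs a prototype matrix of shape `(M, 3)` that would be a learned parameter in a trained model. Either incidence matrix can be passed to `hypergraph_conv`, which is what lets a single hyperedge mix information from more than two joints at once.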

Published

2026-03-14

How to Cite

Guo, Q., Wang, Y., Zhang, Y., Chu, Q., Liu, S., & Jiang, Y. (2026). HyperSign: Hierarchical Hypergraph-based Co-occurrence Modeling for Sign Language Recognition and Translation. Proceedings of the AAAI Conference on Artificial Intelligence, 40(6), 4421–4429. https://doi.org/10.1609/aaai.v40i6.42440

Section

AAAI Technical Track on Computer Vision III