MotivDance: Fine-Grained Text-Guided Motivation Choreography with Music Synchronization
DOI:
https://doi.org/10.1609/aaai.v40i8.37528
Abstract
Realistic choreography demands simultaneous attention to rhythm and motivation. Prevailing automated dance generation methods depend mainly on musical input and overlook the motivations that drive meaningful dance creation. Inspired by motivation choreography, we aim to articulate dance motivations through textual guidance. However, the absence of high-quality datasets that jointly contain music, textual descriptions, and motion data makes accurate fine-grained textual control difficult to achieve. To address this limitation, we present MotivDance, a novel framework that integrates fine-grained textual guidance with music to synthesize semantically coherent dance sequences. Our approach first synthesizes text-guided key poses as motivations. We then introduce an Adaptive Keyframe Locator that dynamically positions these motivations within the musical context through beat-aware synchronization and cross-modal latent space alignment. Finally, a Transformer-based U-Net diffusion model performs motion in-betweening while preserving motivational integrity. Extensive qualitative and quantitative experiments demonstrate that MotivDance effectively integrates music with fine-grained text control to generate high-fidelity dance motions.
Published
2026-03-14
How to Cite
Li, C., Wen, Y.-H., & Jing, L. (2026). MotivDance: Fine-Grained Text-Guided Motivation Choreography with Music Synchronization. Proceedings of the AAAI Conference on Artificial Intelligence, 40(8), 6046–6054. https://doi.org/10.1609/aaai.v40i8.37528
Section
AAAI Technical Track on Computer Vision V