Image Conductor: Precision Control for Interactive Video Synthesis

Authors

  • Yaowei Li, School of Electronic and Computer Engineering, Peking University; Guangdong Provincial Key Laboratory of Ultra High Definition Immersive Media Technology, Peking University Shenzhen Graduate School
  • Xintao Wang, ARC Lab, Tencent PCG
  • Zhaoyang Zhang, ARC Lab, Tencent PCG
  • Zhouxia Wang, Nanyang Technological University
  • Ziyang Yuan, Tsinghua University
  • Liangbin Xie, University of Macau; Shenzhen Institute of Advanced Technology (SIAT)
  • Ying Shan, ARC Lab, Tencent PCG
  • Yuexian Zou, School of Electronic and Computer Engineering, Peking University; Guangdong Provincial Key Laboratory of Ultra High Definition Immersive Media Technology, Peking University Shenzhen Graduate School

DOI: https://doi.org/10.1609/aaai.v39i5.32533

Abstract

Filmmaking and animation production often require sophisticated techniques for coordinating camera transitions and object movements, typically involving labor-intensive real-world capture. Despite advancements in generative AI for video creation, achieving precise control over motion for interactive video asset generation remains challenging. To this end, we propose Image Conductor, a method for precise control of camera transitions and object movements to generate video assets from a single image. A carefully designed training strategy is proposed to separate camera motion from object motion using distinct camera LoRA weights and object LoRA weights. To further eliminate motion ambiguity from ill-posed trajectories, we introduce a camera-free guidance technique during inference, enhancing object movements while eliminating camera transitions. Additionally, we develop a trajectory-oriented video motion data curation pipeline for training. Quantitative and qualitative experiments demonstrate our method's precision and fine-grained control in generating motion-controllable videos from images, advancing the practical application of interactive video synthesis.
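
The abstract does not give the formula for camera-free guidance. As a rough illustration only, the sketch below assumes it behaves like classifier-free guidance: the noise prediction made with the camera LoRA weights is treated as the direction to move away from, and the prediction made with the object LoRA weights as the direction to move toward. The function name, the argument names, and the extrapolation rule are all hypothetical and are not taken from the paper.

```python
def camera_free_guidance(eps_object, eps_camera, scale):
    """Hypothetical guidance step, by analogy with classifier-free guidance.

    eps_object: noise prediction from the object-LoRA branch (assumed input)
    eps_camera: noise prediction from the camera-LoRA branch (assumed input)
    scale:      guidance strength; 1.0 returns the object prediction unchanged,
                larger values push further away from the camera prediction
    """
    if len(eps_object) != len(eps_camera):
        raise ValueError("predictions must have the same length")
    # Start at the camera prediction and extrapolate toward the object one.
    return [c + scale * (o - c) for o, c in zip(eps_object, eps_camera)]


if __name__ == "__main__":
    guided = camera_free_guidance([1.0, 2.0], [0.5, 2.0], scale=2.0)
    print(guided)
```

Under this assumption, components on which the two branches agree are left unchanged, while components on which they differ are amplified in the object branch's favor, which matches the stated goal of enhancing object movement while suppressing camera transitions.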

Published

2025-04-11

How to Cite

Li, Y., Wang, X., Zhang, Z., Wang, Z., Yuan, Z., Xie, L., … Zou, Y. (2025). Image Conductor: Precision Control for Interactive Video Synthesis. Proceedings of the AAAI Conference on Artificial Intelligence, 39(5), 5031–5038. https://doi.org/10.1609/aaai.v39i5.32533

Section

AAAI Technical Track on Computer Vision IV