Infinite-Canvas: Higher-Resolution Video Outpainting with Extensive Content Generation

Authors

  • Qihua Chen, University of Science and Technology of China; Tencent
  • Yue Ma, The Hong Kong University of Science and Technology; Tencent
  • Hongfa Wang, Tsinghua University; Tencent
  • Junkun Yuan, Tencent
  • Wenzhe Zhao, Tencent
  • Qi Tian, Tencent
  • Hongmei Wang, Tencent
  • Shaobo Min, Tencent
  • Qifeng Chen, The Hong Kong University of Science and Technology
  • Wei Liu, Tencent

DOI:

https://doi.org/10.1609/aaai.v39i2.32213

Abstract

This paper explores higher-resolution video outpainting with extensive content generation. We identify two common issues that existing methods face when outpainting videos by a large factor: low-quality generated content and the limits imposed by GPU memory. To address these challenges, we propose a diffusion-based method called Infinite-Canvas, which builds on two core designs. First, instead of following the common practice of "single-shot" outpainting, we distribute the task across spatial windows and seamlessly merge them. This allows us to outpaint videos of any size and resolution without being constrained by GPU memory. Second, the source video and its relative positional relation are injected into the generation process of each window, so that the spatial layout generated within each window harmonizes with the source video. Coupling these two designs enables us to generate higher-resolution outpainted videos with rich content while maintaining spatial and temporal consistency. Infinite-Canvas excels in large-scale video outpainting, e.g., from 512 × 512 to 1152 × 2048 (9× the pixel area), while producing high-quality and aesthetically pleasing results. It achieves the best quantitative results across various resolution and scale setups. The code is available at https://github.com/mayuelala/FollowYourCanvas.
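
To make the window-based idea in the abstract concrete, the following is a minimal sketch of how a target canvas might be split into overlapping spatial windows, together with the position of the source video relative to each window. It is an illustration only, not the authors' implementation: the function names, the 512-pixel window size, the 64-pixel minimum overlap, and the centered placement of the source video are all assumptions made for this example. Only the 512 × 512 source size and the 1152 × 2048 target size come from the abstract.

```python
from math import ceil


def window_starts(total, window, min_overlap):
    """Start offsets of windows that cover [0, total) along one axis."""
    if not 0 <= min_overlap < window:
        raise ValueError("min_overlap must be in [0, window)")
    if window >= total:
        return [0]  # a single window already covers the axis
    stride = window - min_overlap
    n = ceil((total - window) / stride) + 1
    # Spread the n windows evenly so the last one ends exactly at `total`.
    return [round(i * (total - window) / (n - 1)) for i in range(n)]


def plan_windows(target_h, target_w, window=512, min_overlap=64):
    """Top-left corners (top, left) of all windows on the target canvas."""
    tops = window_starts(target_h, window, min_overlap)
    lefts = window_starts(target_w, window, min_overlap)
    return [(top, left) for top in tops for left in lefts]


def relative_offset(window_corner, source_corner):
    """Offset (dy, dx) of the source video's corner relative to a window's corner."""
    return (source_corner[0] - window_corner[0],
            source_corner[1] - window_corner[1])


# Sizes taken from the abstract's example.
source_h = source_w = 512
target_h, target_w = 1152, 2048

# Pixel-area ratio between target and source.
area_ratio = (target_h * target_w) / (source_h * source_w)

# Assumption for this sketch: the source video sits at the center of the canvas.
source_corner = ((target_h - source_h) // 2, (target_w - source_w) // 2)

windows = plan_windows(target_h, target_w)
offsets = [relative_offset(corner, source_corner) for corner in windows]

print(area_ratio)    # pixel-area ratio of target to source
print(len(windows))  # number of windows to generate
```

Under these assumed settings the plan contains 3 × 5 = 15 windows, each no larger than the source video, so the memory needed per window does not grow with the target resolution. The per-window offsets stand in for the "relative positional relation" mentioned in the abstract; how that relation is actually encoded and injected, and how overlapping windows are merged, is not specified here.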

Published

2025-04-11

How to Cite

Chen, Q., Ma, Y., Wang, H., Yuan, J., Zhao, W., Tian, Q., Wang, H., Min, S., Chen, Q., & Liu, W. (2025). Infinite-Canvas: Higher-Resolution Video Outpainting with Extensive Content Generation. Proceedings of the AAAI Conference on Artificial Intelligence, 39(2), 2150-2158. https://doi.org/10.1609/aaai.v39i2.32213

Issue

Vol. 39 No. 2 (2025)

Section

AAAI Technical Track on Computer Vision I