S²Flow: Towards Fast and Authentic Training-Free High-Resolution Video Generation

Authors

  • Chaoqun Wang School of Artificial Intelligence, South China Normal University
  • Shaobo Min University of Science and Technology of China
  • Xu Yang School of Electronic Engineering, Xidian University

DOI:

https://doi.org/10.1609/aaai.v40i12.37932

Abstract

Rectified flow models have shown strong potential in high-fidelity video generation, yet extending them to high resolution remains challenging due to the high cost of full attention and error accumulation in the ODE-solving process. In this paper, we propose S²Flow, a training-free framework that enables efficient and authentic high-resolution video generation by jointly exploring Flow-guided Sparse attention and Second-order ODE solution. Specifically, S²Flow exploits the semantic and structural information in the low-resolution flow trajectory and transfers it to guide the high-resolution flow in two ways. First, S²Flow dynamically captures the sparse patterns of the spatio-temporal attention maps from low-resolution videos to construct localized 3D windows, enabling efficient window attention in high-resolution inference. This can significantly reduce redundant computation while preserving contextual dependencies. Second, S²Flow adopts a second-order ODE solver based on Taylor expansion, where the high-order derivative is approximated via central difference from the low-resolution flow, facilitating accurate high-resolution denoising. Extensive experiments on the VBench dataset demonstrate that S²Flow outperforms prior methods in both visual quality and inference speed, enabling 4× acceleration on 2560×1536 video generation.
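The abstract does not state the solver's update rule, so the following is only a minimal sketch of a generic second-order Taylor step in which the time derivative of the velocity is estimated by central differences along a cheaper reference trajectory. Everything here is an assumption made for illustration: the scalar toy ODE dx/dt = -x stands in for a learned flow model, a coarse Euler trajectory stands in for the low-resolution flow, and the names `velocity`, `euler`, and `second_order_guided` are hypothetical rather than taken from the paper.

```python
import math


def velocity(x, t):
    # Toy velocity field dx/dt = -x (assumed stand-in for a learned flow model).
    return -x


def euler(x0, h, n):
    # First-order baseline; returns the whole trajectory (n + 1 points).
    xs = [x0]
    for k in range(n):
        xs.append(xs[-1] + h * velocity(xs[-1], k * h))
    return xs


def second_order_guided(x0, h, n, v_ref):
    # Second-order Taylor step: x <- x + h*v + (h^2 / 2) * dv/dt.
    # dv/dt is estimated from reference velocities v_ref[k] sampled at t = k*h,
    # using a central difference (one-sided at the first and last grid points).
    x = x0
    for k in range(n):
        lo, hi = max(k - 1, 0), min(k + 1, len(v_ref) - 1)
        dv_dt = (v_ref[hi] - v_ref[lo]) / ((hi - lo) * h)
        x = x + h * velocity(x, k * h) + 0.5 * h * h * dv_dt
    return x


h, n = 0.1, 10
ref = euler(1.0, h, n)  # cheap reference trajectory (assumed "low-resolution" proxy)
v_ref = [velocity(x, k * h) for k, x in enumerate(ref)]

x_euler = ref[-1]
x_second = second_order_guided(1.0, h, n, v_ref)
exact = math.exp(-1.0)

print("Euler error:       ", abs(x_euler - exact))
print("Second-order error:", abs(x_second - exact))
```

On this toy problem the guided second-order step ends closer to the exact solution than plain Euler at the same step size, which illustrates why a curvature term borrowed from a cheaper trajectory can reduce accumulated error. It says nothing about the paper's actual solver or its reported results.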

Published

2026-03-14

How to Cite

Wang, C., Min, S., & Yang, X. (2026). S²Flow: Towards Fast and Authentic Training-Free High-Resolution Video Generation. Proceedings of the AAAI Conference on Artificial Intelligence, 40(12), 9693–9701. https://doi.org/10.1609/aaai.v40i12.37932

Issue

Vol. 40 No. 12 (2026)

Section

AAAI Technical Track on Computer Vision IX