StyleFM: Frequency Manipulation Empowered by Recursive Attention on Diffusion Models for Arbitrary Style Transfer

Authors

  • Yingnan Ma University of Alberta
  • Zhenye Liu University of Alberta
  • Siying Liu University of Alberta
  • Anup Basu University of Alberta

DOI:

https://doi.org/10.1609/aaai.v40i10.37730

Abstract

Given the remarkable performance of diffusion models in image generation, recent research has explored their adaptation to style transfer. However, current diffusion-based approaches face persistent challenges, such as style distortion and a reliance on textual prompts for content preservation. To address these limitations, we introduce StyleFM, a novel training-free diffusion-based style transfer approach that incorporates optimization strategies in both the frequency and temporal domains. The proposed method offers two core innovations: (1) Tripartite Frequency Manipulation: to tailor frequency manipulation more precisely, StyleFM introduces a tripartite frequency design with a buffer band that accounts for the overlap between content and style representations; in addition, StyleFM designs a frequency superposition editing method to achieve frequency enhancement. (2) Recursive Attention: StyleFM proposes a recursive attention strategy within the diffusion process that facilitates the progressive and consistent injection of style information throughout the temporal process without relying on text guidance. Experiments demonstrate that StyleFM outperforms state-of-the-art methods: it effectively preserves content fidelity while achieving sufficient style embedding.
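The tripartite frequency idea in the abstract can be illustrated with a simple FFT-based band partition. This is a minimal conceptual sketch only: the band radii (`r_low`, `r_high`), the buffer blending weight, and all function names are illustrative assumptions, not parameters from the paper, and StyleFM itself operates within a diffusion process rather than directly on pixel arrays.

```python
import numpy as np

def tripartite_masks(shape, r_low=0.15, r_high=0.35):
    """Split a centered 2-D spectrum into low / buffer / high bands.

    r_low and r_high are illustrative normalized cutoff radii
    (assumed values, not the paper's band boundaries).
    """
    h, w = shape
    yy, xx = np.mgrid[0:h, 0:w]
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    radius = np.sqrt((yy - cy) ** 2 + (xx - cx) ** 2)
    radius /= radius.max()  # normalize to [0, 1]
    low = radius <= r_low                            # content-dominant band
    buffer = (radius > r_low) & (radius <= r_high)   # overlap (buffer) band
    high = radius > r_high                           # style-dominant band
    return low, buffer, high

def blend_frequencies(content, style, alpha_buffer=0.5):
    """Keep content low frequencies, adopt style high frequencies,
    and blend the two inside the buffer band (illustrative weighting)."""
    Fc = np.fft.fftshift(np.fft.fft2(content))
    Fs = np.fft.fftshift(np.fft.fft2(style))
    low, buf, high = tripartite_masks(content.shape)
    F = (Fc * low
         + (alpha_buffer * Fc + (1 - alpha_buffer) * Fs) * buf
         + Fs * high)
    return np.real(np.fft.ifft2(np.fft.ifftshift(F)))
```

The three masks partition the spectrum exactly, so every frequency coefficient is assigned to exactly one band; the buffer band is what gives content and style representations room to overlap rather than forcing a hard low/high cut.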

Published

2026-03-14

How to Cite

Ma, Y., Liu, Z., Liu, S., & Basu, A. (2026). StyleFM: Frequency Manipulation Empowered by Recursive Attention on Diffusion Models for Arbitrary Style Transfer. Proceedings of the AAAI Conference on Artificial Intelligence, 40(10), 7865-7873. https://doi.org/10.1609/aaai.v40i10.37730

Section

AAAI Technical Track on Computer Vision VII