StyleFM: Frequency Manipulation Empowered by Recursive Attention on Diffusion Models for Arbitrary Style Transfer
DOI:
https://doi.org/10.1609/aaai.v40i10.37730

Abstract
Given the remarkable performance of diffusion models in image generation, recent research has explored their adaptation to style transfer. However, current diffusion-based approaches face persistent challenges, such as style distortions and a reliance on textual prompts for content preservation. To address these limitations, we introduce StyleFM, a novel training-free diffusion-based style transfer approach that incorporates optimization strategies in both the frequency and temporal domains. The proposed method provides two core innovations: (1) Tripartite Frequency Manipulation: to tailor frequency manipulation more precisely, StyleFM introduces a tripartite frequency design with a buffer band that accounts for the overlap between content and style representations. In addition, StyleFM designs a frequency superposition editing method to achieve frequency enhancement. (2) Recursive Attention: StyleFM proposes a recursive attention strategy within the diffusion process, which enables the progressive and consistent injection of style information throughout the temporal process without relying on text guidance. Experiments demonstrate that StyleFM outperforms state-of-the-art methods, effectively preserving content fidelity while achieving sufficient style embedding.

Published
2026-03-14
How to Cite
Ma, Y., Liu, Z., Liu, S., & Basu, A. (2026). StyleFM: Frequency Manipulation Empowered by Recursive Attention on Diffusion Models for Arbitrary Style Transfer. Proceedings of the AAAI Conference on Artificial Intelligence, 40(10), 7865-7873. https://doi.org/10.1609/aaai.v40i10.37730
Issue
Section
AAAI Technical Track on Computer Vision VII
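The tripartite frequency design described in the abstract can be illustrated with a minimal sketch: an image spectrum is partitioned into a low band, a buffer band covering the content/style overlap, and a high band, which can then be recombined with per-band gains. This is not the paper's implementation; the radius thresholds `r_low`/`r_high`, the grayscale input, and the uniform per-band weights are all illustrative assumptions.

```python
import numpy as np

def tripartite_split(img, r_low=0.1, r_high=0.3):
    """Split a grayscale image into low, buffer, and high frequency bands.

    r_low and r_high are illustrative radius fractions (not values from
    the paper): frequencies below r_low form the low band, those between
    r_low and r_high form the buffer band modeling the content/style
    overlap, and the remainder forms the high band.
    """
    h, w = img.shape
    F = np.fft.fftshift(np.fft.fft2(img))
    yy, xx = np.mgrid[:h, :w]
    # normalized radial distance of each frequency from the spectrum center
    r = np.hypot((yy - h / 2) / h, (xx - w / 2) / w)
    masks = (r < r_low, (r >= r_low) & (r < r_high), r >= r_high)
    return [np.fft.ifft2(np.fft.ifftshift(F * m)).real for m in masks]

def superpose(bands, weights=(1.0, 1.0, 1.0)):
    """Recombine the bands with per-band gains -- a toy stand-in for
    frequency superposition editing (boosting a band amplifies the
    detail scale it carries)."""
    return sum(w * b for w, b in zip(weights, bands))
```

Because the three masks partition the spectrum, recombining with unit weights recovers the original image, while raising the high-band weight would emphasize fine, style-related detail.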