Training-and-Prompt-Free General Painterly Harmonization via Zero-Shot Disentenglement on Style and Content References

Authors

  • Teng-Fang Hsiao National Yang Ming Chiao Tung University
  • Bo-Kai Ruan National Yang Ming Chiao Tung University
  • Hong-Han Shuai National Yang Ming Chiao Tung University

DOI:

https://doi.org/10.1609/aaai.v39i4.32368

Abstract

Painterly image harmonization aims at seamlessly blending disparate visual elements within a single image. However, previous approaches often struggle due to limitations in training data or reliance on additional prompts, leading to inharmonious and content-disrupted output. To surmount these hurdles, we design a Training-and-prompt-Free General Painterly Harmonization method (TF-GPH). TF-GPH incorporates a novel “Similarity Disentangle Mask”, which disentangles the foreground content and background image by redirecting their attention to corresponding reference images, enhancing the attention mechanism for multi-image inputs. Additionally, we propose a “Similarity Reweighting” mechanism to balance harmonization between stylization and content preservation. This mechanism minimizes content disruption by prioritizing the content-similar features within the given background style reference. Finally, we address the deficiencies in existing benchmarks by proposing novel range-based evaluation metrics and a new benchmark to better reflect real-world applications. Extensive experiments demonstrate the efficacy of our method across benchmarks.
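The abstract describes the two mechanisms only at a high level, so the following is a minimal sketch of the general idea rather than the authors' implementation. It assumes that tokens from the foreground content reference, the background style reference, and the composite image share one attention operation; that a boolean block mask keeps each reference attending only to itself while the composite attends to everything; and that "reweighting" can be stood in for by an additive bias on the composite-to-reference scores. The function names, the token ordering, and the `ref_bias` parameter are all hypothetical and do not come from the paper.

```python
import math


def softmax_masked(scores, allowed):
    """Softmax over one row; positions with allowed[j] == False get weight 0."""
    kept = [s for s, a in zip(scores, allowed) if a]
    m = max(kept)  # subtract the row maximum for numerical stability
    exps = [math.exp(s - m) if a else 0.0 for s, a in zip(scores, allowed)]
    total = sum(exps)
    return [e / total for e in exps]


def build_block_mask(n_fg, n_bg, n_comp):
    """Assumed token order: [foreground ref | background ref | composite].

    Reference rows may attend only to their own block; composite rows may
    attend to every token. This layout is an illustrative assumption.
    """
    n_ref = n_fg + n_bg
    n = n_ref + n_comp
    mask = [[False] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i < n_fg:                # foreground reference row
                mask[i][j] = j < n_fg
            elif i < n_ref:             # background reference row
                mask[i][j] = n_fg <= j < n_ref
            else:                       # composite row
                mask[i][j] = True
    return mask


def shared_attention(q, k, v, mask, n_ref, ref_bias=0.0):
    """Masked scaled dot-product attention over the concatenated tokens.

    ref_bias (hypothetical) is added to the scores between composite queries
    and reference keys, shifting attention mass toward the references.
    """
    d = len(q[0])
    weights, outputs = [], []
    for i, qi in enumerate(q):
        scores = [sum(a * b for a, b in zip(qi, kj)) / math.sqrt(d) for kj in k]
        if i >= n_ref:  # only composite rows are reweighted
            scores = [s + ref_bias if j < n_ref else s
                      for j, s in enumerate(scores)]
        w = softmax_masked(scores, mask[i])
        weights.append(w)
        outputs.append([sum(w[j] * v[j][c] for j in range(len(v)))
                        for c in range(len(v[0]))])
    return outputs, weights
```

In this sketch, raising `ref_bias` moves more of each composite token's attention onto the reference tokens, while the mask guarantees that the two references never mix with each other. How the paper actually computes and applies its mask and reweighting should be taken from the full text.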

Published

2025-04-11

How to Cite

Hsiao, T.-F., Ruan, B.-K., & Shuai, H.-H. (2025). Training-and-Prompt-Free General Painterly Harmonization via Zero-Shot Disentenglement on Style and Content References. Proceedings of the AAAI Conference on Artificial Intelligence, 39(4), 3545-3553. https://doi.org/10.1609/aaai.v39i4.32368

Issue

Vol. 39 No. 4

Section

AAAI Technical Track on Computer Vision III