Spin: Diffusion-based Semantic Image Painting Through Independent Information Injection

Authors

  • Dantong Wu, Shenzhen International Graduate School, Tsinghua University, China
  • Zhiqiang Chen, Institute of Automation, Chinese Academy of Sciences, China
  • Tianjiao Du, Shenzhen International Graduate School, Tsinghua University, China
  • Peipei Ran, Media Technology Lab, Huawei, China
  • Mengchao Bai, Media Technology Lab, Huawei, China
  • Kai Zhang, Shenzhen International Graduate School, Tsinghua University, China

DOI:

https://doi.org/10.1609/aaai.v39i8.32901

Abstract

Diffusion models have become powerful tools for various image editing tasks, including semantic image painting (SIP), which aims to generate content within masked regions conditioned on a reference image or text. SIP methods, especially those conditioned on images, often suffer from three issues that significantly hinder practical application: semantic inconsistency, unnatural transitions, and style inconsistency. To address these challenges, we propose a novel Semantic Image Painting framework with INdependent INformation INjection (Spin). Specifically, we compute a saliency map to separate the reference image into salient and non-salient components. We then filter out the non-salient information during the semantic embedding extraction phase and, during the semantic generation phase, inject the semantic embedding precisely into the masked region rather than the whole image. Furthermore, we impose additional style guidance to promote style consistency between background and foreground. Experimental results demonstrate that Spin achieves superior semantic similarity and image coherence across various styles, including realistic, pencil drawing, cartoon, and oil painting. Additionally, Spin offers diversity and editability, and can be integrated into other models that meet our prerequisites.
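
The two ideas the abstract names, separating a reference by saliency and injecting content only inside a mask, can be illustrated with a toy sketch. This is not the authors' implementation: it works on small 2D grids of floats rather than diffusion features or embeddings, and the function names, the threshold of 0.5, and the hard (binary) masking are all assumptions made for illustration.

```python
from typing import List, Tuple

Grid = List[List[float]]  # toy stand-in for an image or feature map


def split_by_saliency(image: Grid, saliency: Grid,
                      threshold: float = 0.5) -> Tuple[Grid, Grid]:
    """Split `image` into salient and non-salient parts.

    A value is treated as salient when its saliency score is at least
    `threshold` (an assumed cutoff); the other part is zeroed out.
    """
    salient = [[px if s >= threshold else 0.0
                for px, s in zip(img_row, sal_row)]
               for img_row, sal_row in zip(image, saliency)]
    non_salient = [[px if s < threshold else 0.0
                    for px, s in zip(img_row, sal_row)]
                   for img_row, sal_row in zip(image, saliency)]
    return salient, non_salient


def inject_in_mask(background: Grid, content: Grid, mask: Grid) -> Grid:
    """Write `content` only where `mask` is set; keep `background` elsewhere."""
    return [[c if m >= 0.5 else b
             for b, c, m in zip(b_row, c_row, m_row)]
            for b_row, c_row, m_row in zip(background, content, mask)]


# Hypothetical 2x2 example values.
reference = [[0.2, 0.9], [0.8, 0.1]]
saliency = [[0.1, 0.7], [0.9, 0.3]]
salient, non_salient = split_by_saliency(reference, saliency)

background = [[1.0, 1.0], [1.0, 1.0]]
mask = [[0.0, 1.0], [0.0, 1.0]]  # only the right column is editable
result = inject_in_mask(background, salient, mask)
```

In this sketch the background outside the mask is left untouched, which mirrors the abstract's point that injection targets the masked region instead of the whole image.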

Published

2025-04-11

How to Cite

Wu, D., Chen, Z., Du, T., Ran, P., Bai, M., & Zhang, K. (2025). Spin: Diffusion-based Semantic Image Painting Through Independent Information Injection. Proceedings of the AAAI Conference on Artificial Intelligence, 39(8), 8351-8358. https://doi.org/10.1609/aaai.v39i8.32901

Section

AAAI Technical Track on Computer Vision VII