CareCom: Generative Image Composition with Calibrated Reference Features

Authors

  • Jiaxuan Chen, MoE Key Lab of Artificial Intelligence, Shanghai Jiao Tong University
  • Bo Zhang, MoE Key Lab of Artificial Intelligence, Shanghai Jiao Tong University
  • Qingdong He, Youtu Lab, Tencent
  • Jinlong Peng, Youtu Lab, Tencent
  • Li Niu, MoE Key Lab of Artificial Intelligence, Shanghai Jiao Tong University; miguo.ai

DOI:

https://doi.org/10.1609/aaai.v40i4.37278

Abstract

Image composition aims to seamlessly insert a foreground object into a background. Despite substantial progress in generative image composition, existing methods still struggle to preserve foreground details while simultaneously adjusting the foreground pose/view. To address this issue, we extend an existing generative composition model to a multi-reference version, which accepts an arbitrary number of foreground reference images. Furthermore, we propose to calibrate the global and local features of the foreground reference images to make them compatible with the background information. The calibrated reference features supplement the original reference features with useful global and local information of the proper pose/view. Extensive experiments on MVImgNet and MureCom demonstrate that the generative model benefits greatly from the calibrated reference features.

Published

2026-03-14

How to Cite

Chen, J., Zhang, B., He, Q., Peng, J., & Niu, L. (2026). CareCom: Generative Image Composition with Calibrated Reference Features. Proceedings of the AAAI Conference on Artificial Intelligence, 40(4), 2877-2885. https://doi.org/10.1609/aaai.v40i4.37278

Section

AAAI Technical Track on Computer Vision I