LayerEdit: Disentangled Multi-Object Editing via Conflict-Aware Multi-Layer Learning

Authors

  • Fengyi Fu, University of Science and Technology of China, Hefei, China
  • Mengqi Huang, University of Science and Technology of China, Hefei, China
  • Lei Zhang, University of Science and Technology of China, Hefei, China
  • Zhendong Mao, University of Science and Technology of China, Hefei, China; Institute of Artificial Intelligence, Hefei Comprehensive National Science Center, Hefei, China

DOI:

https://doi.org/10.1609/aaai.v40i5.37403

Abstract

Text-driven multi-object image editing, which aims to precisely modify multiple objects within an image based on text descriptions, has recently attracted considerable interest. Existing works primarily follow the localize-editing paradigm, focusing on independent object localization and editing while neglecting critical inter-object interactions. This work points out that the neglected attention entanglements in inter-object conflict regions inherently hinder disentangled multi-object editing, leading to either inter-object editing leakage or intra-object editing constraints. We therefore propose LayerEdit, a novel training-free multi-layer disentangled editing framework that, for the first time, enables conflict-free object-layered editing through precise object-layered decomposition and coherent fusion. Specifically, LayerEdit introduces a “decompose-editing-fusion” framework consisting of: (1) a Conflict-aware Layer Decomposition module, which uses an attention-aware IoU scheme and time-dependent region removing to enhance conflict awareness and suppression during layer decomposition; (2) an Object-layered Editing module, which establishes coordinated intra-layer text guidance and cross-layer geometric mapping to achieve disentangled semantic and structural modifications; and (3) a Transparency-guided Layer Fusion module, which facilitates structure-coherent inter-object layer fusion through precise transparency guidance learning. Extensive experiments verify the superiority of LayerEdit over existing methods, showing unprecedented intra-object controllability and inter-object coherence in complex multi-object scenarios.
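
The abstract does not state how the attention-aware IoU is computed. As a purely illustrative sketch, and not the authors' implementation, one common way to measure overlap between two objects' attention maps is a soft IoU (sum of element-wise minima over sum of element-wise maxima), with a conflict region taken as the pixels where both maps are strongly activated. The function names `soft_iou` and `conflict_mask`, the 0.5 threshold, and the nested-list map format are all assumptions made for this example.

```python
def soft_iou(attn_a, attn_b, eps=1e-8):
    """Soft IoU between two equally sized 2D attention maps (nested lists).

    Hypothetical helper: intersection is the sum of element-wise minima,
    union is the sum of element-wise maxima. `eps` avoids division by zero.
    """
    inter = 0.0
    union = 0.0
    for row_a, row_b in zip(attn_a, attn_b):
        for a, b in zip(row_a, row_b):
            inter += min(a, b)
            union += max(a, b)
    return inter / (union + eps)


def conflict_mask(attn_a, attn_b, threshold=0.5):
    """Boolean mask of pixels where BOTH maps exceed `threshold`.

    The threshold value is an assumption for illustration only.
    """
    return [
        [a > threshold and b > threshold for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(attn_a, attn_b)
    ]


# Toy 2x2 attention maps for two objects (made-up values).
cat_attn = [[1.0, 0.0], [0.0, 0.0]]
dog_attn = [[1.0, 1.0], [0.0, 0.0]]

overlap = soft_iou(cat_attn, dog_attn)
mask = conflict_mask(cat_attn, dog_attn)
```

In this toy case the two maps share one strongly attended pixel, so a high soft IoU would flag the pair as entangled and the mask would mark the shared pixel as a candidate conflict region.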

Published

2026-03-14

How to Cite

Fu, F., Huang, M., Zhang, L., & Mao, Z. (2026). LayerEdit: Disentangled Multi-Object Editing via Conflict-Aware Multi-Layer Learning. Proceedings of the AAAI Conference on Artificial Intelligence, 40(5), 4003–4011. https://doi.org/10.1609/aaai.v40i5.37403

Section

AAAI Technical Track on Computer Vision II