TweezeEdit: Consistent and Efficient Image Editing with Path Regularization

Authors

  • Jianda Mao, The Hong Kong University of Science and Technology
  • Kaibo Wang, The Hong Kong University of Science and Technology
  • Yang Xiang, The Hong Kong University of Science and Technology
  • Kani Chen, The Hong Kong University of Science and Technology

DOI:

https://doi.org/10.1609/aaai.v40i10.37738

Abstract

Recent progress in training-free image editing has enabled existing text-to-image diffusion models to be directly adapted into text-guided image editors without additional training. However, existing methods often over-align with target prompts while inadequately preserving source image semantics. These approaches generate target images explicitly or implicitly from the inversion noise of the source images, termed the inversion anchors. We identify this strategy as suboptimal for semantic preservation and inefficient due to elongated editing paths. We propose TweezeEdit, a tuning- and inversion-free framework for consistent and efficient image editing. Our method addresses these limitations by regularizing the entire denoising path rather than relying solely on the inversion anchors, ensuring source semantic retention and shortening editing paths. Guided by gradient-driven regularization, we efficiently inject target prompt semantics along a direct path using a consistency model. Extensive experiments demonstrate TweezeEdit's superior performance in semantic preservation and target alignment, outperforming existing methods. Remarkably, it requires only 12 steps (1.6 seconds per edit), underscoring its potential for real-time applications. The appendix is available in the extended version.

Published

2026-03-14

How to Cite

Mao, J., Wang, K., Xiang, Y., & Chen, K. (2026). TweezeEdit: Consistent and Efficient Image Editing with Path Regularization. Proceedings of the AAAI Conference on Artificial Intelligence, 40(10), 7936-7944. https://doi.org/10.1609/aaai.v40i10.37738

Section

AAAI Technical Track on Computer Vision VII