RetouchGPT: LLM-based Interactive High-Fidelity Face Retouching via Imperfection Prompting

Authors

  • Wen Xue, School of Computer Science and Engineering, South China University of Technology
  • Chun Ding, School of Computer Science and Engineering, South China University of Technology
  • Ruotao Xu, Institute of Super Robotics (Huangpu)
  • Si Wu, School of Computer Science and Engineering, South China University of Technology; Institute of Super Robotics (Huangpu)
  • Yong Xu, School of Computer Science and Engineering, South China University of Technology
  • Hau-San Wong, Department of Computer Science, City University of Hong Kong

DOI:

https://doi.org/10.1609/aaai.v39i9.32980

Abstract

Face retouching aims to remove facial imperfections from images and videos while preserving face attributes. Existing methods perform non-interactive end-to-end retouching, although the ability to interact with users is in high demand in downstream applications. In this paper, we propose RetouchGPT, a novel framework that leverages Large Language Models (LLMs) to guide the interactive retouching process. To this end, we design an instruction-driven imperfection prediction module that accurately identifies imperfections by integrating textual and visual features. To learn imperfection prompts, we further incorporate an LLM-based embedding module to fuse multi-modal conditioning information. Prompt-based feature modification is performed in each transformer block, so that imperfection features are progressively suppressed and replaced with the features of normal skin. Extensive experiments verify the effectiveness of our design elements and demonstrate that RetouchGPT is a useful tool for interactive face retouching and achieves superior performance over state-of-the-art methods.

Published

2025-04-11

How to Cite

Xue, W., Ding, C., Xu, R., Wu, S., Xu, Y., & Wong, H.-S. (2025). RetouchGPT: LLM-based Interactive High-Fidelity Face Retouching via Imperfection Prompting. Proceedings of the AAAI Conference on Artificial Intelligence, 39(9), 9059–9067. https://doi.org/10.1609/aaai.v39i9.32980

Issue

Section

AAAI Technical Track on Computer Vision VIII