User-Controllable Arbitrary Style Transfer via Entropy Regularization

Authors

  • Jiaxin Cheng, USC Information Sciences Institute
  • Yue Wu, Amazon Alexa Natural Understanding
  • Ayush Jaiswal, Amazon Alexa Natural Understanding
  • Xu Zhang, Amazon Alexa Natural Understanding
  • Pradeep Natarajan, Amazon Alexa Natural Understanding
  • Prem Natarajan, Amazon Alexa Natural Understanding

DOI:

https://doi.org/10.1609/aaai.v37i1.25117

Keywords:

CV: Applications, CV: Computational Photography, Image & Video Synthesis

Abstract

Ensuring a good overall end-user experience is a challenging task in arbitrary style transfer (AST) because style transfer quality is subjective. A good practice is to provide users with multiple AST results rather than a single one. However, existing approaches require either running multiple AST models or running inference with a diversified AST (DAST) solution multiple times, and thus they are either slow or limited in diversity. In this paper, we propose a novel solution that ensures both efficiency and diversity when generating multiple user-controllable AST results by systematically modulating AST behavior at run-time. We begin by reformulating three prominent AST methods as a unified assign-and-mix problem and discover that the entropies of their assignment matrices exhibit a large variance. We then solve the unified problem in an optimal transport framework using the Sinkhorn-Knopp algorithm, with a user-supplied parameter ε that controls this entropy and thus modulates stylization. Empirical results demonstrate the superiority of the proposed solution, with speed and stylization quality comparable to or better than existing AST methods and results significantly more diverse than those of previous DAST works. Code is available at https://github.com/cplusx/eps-Assign-and-Mix.
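
The role of ε can be illustrated with a minimal sketch of entropy-regularized optimal transport solved by Sinkhorn-Knopp iterations. This is not the authors' implementation (see the repository linked above for that); the function names `sinkhorn`, `entropy`, and `mix`, the uniform marginals, the fixed iteration count, and the toy one-dimensional features are all assumptions made for illustration. The sketch only shows the general mechanism: a small ε yields a sharp, low-entropy assignment matrix, while a large ε yields a diffuse, high-entropy one, and the mixed output changes accordingly.

```python
import math


def sinkhorn(cost, eps, n_iters=200):
    """Entropy-regularized transport plan between uniform marginals.

    cost: n x m list of lists (content position i vs. style position j).
    eps:  regularization strength; larger values give a higher-entropy plan.
    Uniform marginals and a fixed iteration count are illustrative assumptions.
    """
    n, m = len(cost), len(cost[0])
    a = [1.0 / n] * n  # mass on each content position
    b = [1.0 / m] * m  # mass on each style position
    # Gibbs kernel K_ij = exp(-C_ij / eps)
    K = [[math.exp(-c / eps) for c in row] for row in cost]
    u, v = [1.0] * n, [1.0] * m
    for _ in range(n_iters):
        # Alternately rescale rows and columns to match the marginals.
        u = [a[i] / sum(K[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [b[j] / sum(K[i][j] * u[i] for i in range(n)) for j in range(m)]
    return [[u[i] * K[i][j] * v[j] for j in range(m)] for i in range(n)]


def entropy(plan):
    """Shannon entropy of the assignment matrix (natural log)."""
    return -sum(p * math.log(p) for row in plan for p in row if p > 0.0)


def mix(plan, style_feats):
    """Replace each content position by its plan-weighted mean of style features."""
    dim = len(style_feats[0])
    mixed = []
    for row in plan:
        total = sum(row)
        mixed.append([
            sum(w * s[k] for w, s in zip(row, style_feats)) / total
            for k in range(dim)
        ])
    return mixed


# Toy example: two content and two style features on a line.
content = [[0.0], [1.0]]
style = [[0.0], [1.0]]
cost = [[(c[0] - s[0]) ** 2 for s in style] for c in content]

sharp = sinkhorn(cost, eps=0.05)  # near one-to-one assignment
soft = sinkhorn(cost, eps=10.0)   # near-uniform assignment

print("entropy (eps=0.05):", entropy(sharp))
print("entropy (eps=10.0):", entropy(soft))
print("mixed   (eps=0.05):", mix(sharp, style))
print("mixed   (eps=10.0):", mix(soft, style))
```

In this toy setting, sweeping `eps` moves the output continuously between copying the nearest style feature and averaging all style features, which is the kind of run-time control the abstract describes, here reduced to its simplest form.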

Published

2023-06-26

How to Cite

Cheng, J., Wu, Y., Jaiswal, A., Zhang, X., Natarajan, P., & Natarajan, P. (2023). User-Controllable Arbitrary Style Transfer via Entropy Regularization. Proceedings of the AAAI Conference on Artificial Intelligence, 37(1), 433-441. https://doi.org/10.1609/aaai.v37i1.25117

Section

AAAI Technical Track on Computer Vision I