Proxy Learning of Visual Concepts of Fine Art Paintings from Styles through Language Models


  • Diana Kim, Rutgers University
  • Ahmed Elgammal, Rutgers University
  • Marian Mazzone, College of Charleston



Keywords: Domain(s) Of Application (APP), Machine Learning (ML), Computer Vision (CV)


We present a machine learning system that quantifies fine art paintings in terms of a set of visual elements and principles of art. This kind of formal analysis is fundamental to understanding art, but developing such a system is challenging: paintings are visually complex, and it is difficult to collect enough training data with direct labels. To overcome these practical limitations, we introduce a novel mechanism, called proxy learning, which learns visual concepts in paintings through their general relation to styles. This framework requires no visual annotation; it uses only style labels and a general relationship between visual concepts and styles. In this paper, we propose a novel proxy model and reformulate four pre-existing methods in the context of proxy learning. Through quantitative and qualitative comparison, we evaluate these methods and compare their effectiveness in quantifying artistic visual concepts, where the general relationship is estimated by language models (GloVe or BERT). Language modeling is a practical and scalable solution that requires no labeling, but it is inevitably imperfect. We demonstrate that the new proxy model is robust to this imperfection, while the other methods are sensitive to it.
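
To make the idea of a "general relationship between visual concepts and style" more concrete, the following is a minimal illustrative sketch, not the authors' proxy model. It assumes the relationship can be approximated by cosine similarity between word vectors for style names and concept words, and that concept scores can be read off by weighting that relationship with style probabilities. The vectors, the word lists, and the function names `relation_matrix` and `concept_scores` are all hypothetical; a real system would load GloVe or BERT embeddings instead of the toy values used here.

```python
import math

# Hypothetical toy word vectors (assumption). A real setup would load
# pre-trained GloVe or BERT embeddings for these words instead.
TOY_VECTORS = {
    "impressionism": [0.9, 0.1, 0.3],
    "cubism": [0.1, 0.9, 0.2],
    "blurry": [0.8, 0.2, 0.4],
    "geometric": [0.2, 0.8, 0.1],
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def relation_matrix(styles, concepts, vectors):
    """Estimate a style-to-concept relation from word vectors (assumed: cosine)."""
    return {s: {c: cosine(vectors[s], vectors[c]) for c in concepts} for s in styles}


def concept_scores(style_probs, relation):
    """Weight each concept's relation to a style by that style's probability.

    This simple linear projection is an illustrative assumption only.
    """
    concepts = list(next(iter(relation.values())).keys())
    return {
        c: sum(p * relation[s][c] for s, p in style_probs.items())
        for c in concepts
    }


styles = ["impressionism", "cubism"]
concepts = ["blurry", "geometric"]
relation = relation_matrix(styles, concepts, TOY_VECTORS)

# A painting that a (hypothetical) style classifier considers mostly impressionist.
scores = concept_scores({"impressionism": 0.8, "cubism": 0.2}, relation)
```

With these toy vectors, the mostly impressionist painting scores higher on "blurry" than on "geometric". No visual annotation is used anywhere: only style information and the language-derived relation enter the computation, and any error in that relation carries straight through to the concept scores.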




How to Cite

Kim, D., Elgammal, A., & Mazzone, M. (2022). Proxy Learning of Visual Concepts of Fine Art Paintings from Styles through Language Models. Proceedings of the AAAI Conference on Artificial Intelligence, 36(4), 4513-4522.



AAAI Technical Track on Domain(s) Of Application