Mañas, O., Krojer, B., & Agrawal, A. (2024). Improving Automatic VQA Evaluation Using Large Language Models. Proceedings of the AAAI Conference on Artificial Intelligence, 38(5), 4171–4179. https://doi.org/10.1609/aaai.v38i5.28212