Hu W, Xu Y, Li Y, Li W, Chen Z, Tu Z. BLIVA: A Simple Multimodal LLM for Better Handling of Text-Rich Visual Questions. AAAI [Internet]. 2024 Mar. 24 [cited 2026 May 9];38(3):2256-64. Available from: https://ojs.aaai.org/index.php/AAAI/article/view/27999