Yu J, Li S, Han M, Yin Y, Song W, Jia C, et al. Activating Visual Context and Commonsense Reasoning Through Masked Prediction in VLMs. AAAI [Internet]. 2026 Mar. 14 [cited 2026 May 11];40(33):27952-60. Available from: https://ojs.aaai.org/index.php/AAAI/article/view/40019