Yu, Jiaao, Shenwei Li, Mingjie Han, Yifei Yin, Wenzheng Song, Chenghao Jia, and Man Lan. 2026. “Activating Visual Context and Commonsense Reasoning Through Masked Prediction in VLMs.” Proceedings of the AAAI Conference on Artificial Intelligence 40 (33): 27952-60. https://doi.org/10.1609/aaai.v40i33.40019.