Chen, Junyi, et al. “EVE: Efficient Vision-Language Pre-Training With Masked Prediction and Modality-Aware MoE.” Proceedings of the AAAI Conference on Artificial Intelligence, vol. 38, no. 2, Mar. 2024, pp. 1110-19, doi:10.1609/aaai.v38i2.27872.