Zhang, Taolin, Sunan He, Tao Dai, Zhi Wang, Bin Chen, and Shu-Tao Xia. 2024. “Vision-Language Pre-Training With Object Contrastive Learning for 3D Scene Understanding.” Proceedings of the AAAI Conference on Artificial Intelligence 38 (7): 7296-7304. https://doi.org/10.1609/aaai.v38i7.28559.