Zhang, D., Li, C., Zhang, R., Xie, S., Xue, W., Xie, X., & Zhang, S. (2024). FM-OV3D: Foundation Model-Based Cross-Modal Knowledge Blending for Open-Vocabulary 3D Detection. Proceedings of the AAAI Conference on Artificial Intelligence, 38(15), 16723–16731. https://doi.org/10.1609/aaai.v38i15.29612