Tian, Y., Ma, T., Xie, L., & Ye, Q. (2025). ChatterBox: Multimodal Referring and Grounding with Chain-of-Questions. Proceedings of the AAAI Conference on Artificial Intelligence, 39(7), 7401–7409. https://doi.org/10.1609/aaai.v39i7.32796