Meng, G., Wang, J., Wang, Q.-W., Ren, X., & Zhao, D. (2026). Imagine with Layout and Sketch: Enhancing Vision-Language Retrieval with Dual-Stream Multi-Modal Query Refinement. Proceedings of the AAAI Conference on Artificial Intelligence, 40(10), 7972-7980. https://doi.org/10.1609/aaai.v40i10.37742