Bi, H. (2026) “F2RVLM: Boosting Fine-grained Fragment Retrieval for Multi-Modal Long-form Dialogue with Vision Language Model”, Proceedings of the AAAI Conference on Artificial Intelligence, 40(17), pp. 14493–14501. doi: 10.1609/aaai.v40i17.38466.