Zhong, C., Hou, Q., Zhou, Z., Zhang, Y., Hao, S., Lu, H., … Bai, X. (2026). OwlCap: Harmonizing Motion-Detail for Video Captioning via HMD-270K and Caption Set Equivalence Reward. Proceedings of the AAAI Conference on Artificial Intelligence, 40(16), 13503–13511. https://doi.org/10.1609/aaai.v40i16.38355