[1] C. Zhong, “OwlCap: Harmonizing Motion-Detail for Video Captioning via HMD-270K and Caption Set Equivalence Reward”, AAAI, vol. 40, no. 16, pp. 13503–13511, Mar. 2026.