[1] J. S. Lee, J. Kim, J. Na, J. Park, and H. J. Kim, “VidChain: Chain-of-Tasks with Metric-based Direct Preference Optimization for Dense Video Captioning,” AAAI, vol. 39, no. 4, pp. 4499–4507, Apr. 2025.