VQ-Insight: Teaching VLMs for AI-Generated Video Quality Understanding via Progressive Visual Reinforcement Learning

Xuanyu Zhang; Weiqi Li; Shijie Zhao; Junlin Li; Li Zhang; Jian Zhang

doi:10.1609/aaai.v40i15.38285

Authors

Xuanyu Zhang, School of Electronic and Computer Engineering, Peking University; ByteDance Inc.
Weiqi Li, School of Electronic and Computer Engineering, Peking University
Shijie Zhao, ByteDance Inc.
Junlin Li, ByteDance Inc.
Li Zhang, ByteDance Inc.
Jian Zhang, School of Electronic and Computer Engineering, Peking University

DOI:

https://doi.org/10.1609/aaai.v40i15.38285

Abstract

Recent advances in AI-generated content (AIGC) have led to the emergence of powerful text-to-video generation models. Despite these successes, evaluating the quality of AI-generated videos remains challenging due to limited generalization, lack of temporal awareness, heavy reliance on large-scale annotated datasets, and the lack of effective interaction with generation models. Most current approaches rely on supervised fine-tuning of vision-language models (VLMs), which often requires large-scale annotated datasets and tends to decouple understanding and generation. To address these shortcomings, we propose VQ-Insight, a novel reasoning-style VLM framework for AIGC video quality assessment. Our approach features: (1) a progressive video quality learning scheme that combines image quality warm-up, general task-specific temporal learning, and joint optimization with the video generation model; and (2) multi-dimension scoring rewards, preference comparison rewards, and temporal modeling rewards designed to enhance both generalization and specialization in video quality evaluation. Extensive experiments demonstrate that VQ-Insight consistently outperforms state-of-the-art baselines in preference comparison, multi-dimension scoring, and natural video scoring, bringing significant improvements for video generation tasks.
