Hierarchical Frequency-Guided Alignment Transformer for Compressed Video Quality Enhancement
DOI:
https://doi.org/10.1609/aaai.v40i10.37784
Abstract
During the video encoding process, the original spatial domain signal is first transformed into the frequency domain, followed by quantization and compression. As a result, the quality degradation in compressed videos primarily stems from distortions in the frequency domain information. However, existing video enhancement methods typically directly fuse information from adjacent frames in the spatial domain, making it difficult for models to effectively compensate for frequency domain distortions, which leads to suboptimal detail restoration. To address this issue, we propose a Hierarchical Frequency-Guided Alignment Transformer. Additionally, by analyzing the characteristics of the frequency domain, we find that different frequency bands exhibit both correlations and a certain degree of independence. Based on this, we introduce a Frequency-Aware Transformer module that employs a combination of independent and mixed processing to optimize information exchange across different frequency domains, effectively mitigating cross-interference from irrelevant information. Experimental results demonstrate that, compared to existing methods, our approach achieves state-of-the-art performance in objective metrics (PSNR/SSIM), perceptual quality (LPIPS), and subjective visual effects, while reducing model complexity.
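The premise that compression distortion originates in the frequency domain can be illustrated with a minimal sketch. It is not the paper's method and not a real codec: it assumes a single 8x8 block, an orthonormal DCT-II as the transform, and one uniform quantization step for all coefficients (actual encoders use per-frequency quantization tables, prediction, and entropy coding). The names `dct_matrix`, `compress_block`, and `step` are hypothetical and chosen only for this illustration.

```python
import numpy as np


def dct_matrix(n: int = 8) -> np.ndarray:
    """Return the orthonormal DCT-II basis matrix of size n x n."""
    k = np.arange(n).reshape(-1, 1)  # frequency index (rows)
    i = np.arange(n).reshape(1, -1)  # sample index (columns)
    m = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    m[0, :] = np.sqrt(1.0 / n)  # DC row uses a different scale factor
    return m


def compress_block(block: np.ndarray, step: float):
    """Transform a square block, quantize its coefficients, and reconstruct it.

    Assumption: one uniform step for every coefficient (a simplification).
    """
    d = dct_matrix(block.shape[0])
    coeffs = d @ block @ d.T                    # spatial -> frequency domain
    quantized = np.round(coeffs / step) * step  # the lossy step
    recon = d.T @ quantized @ d                 # frequency -> spatial domain
    return coeffs, quantized, recon


rng = np.random.default_rng(0)
block = rng.uniform(0, 255, size=(8, 8))  # synthetic pixel block
coeffs, quantized, recon = compress_block(block, step=16.0)

# Per-coefficient distortion introduced by quantization.
freq_error = np.abs(coeffs - quantized)
print("max frequency-domain error:", freq_error.max())
print("spatial-domain squared error:", np.sum((block - recon) ** 2))
```

Because the transform in this sketch is orthonormal, the total squared error seen in the reconstructed pixels equals the total squared error introduced on the coefficients. In this simplified setting, all of the spatial degradation is therefore accounted for by the frequency-domain quantization error, which is the observation the abstract builds on.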
Published
2026-03-14
How to Cite
Peng, L., Li, S., Gao, Y., Ye, M., & Lv, C. (2026). Hierarchical Frequency-Guided Alignment Transformer for Compressed Video Quality Enhancement. Proceedings of the AAAI Conference on Artificial Intelligence, 40(10), 8349-8357. https://doi.org/10.1609/aaai.v40i10.37784
Issue
Section
AAAI Technical Track on Computer Vision VII