Firing Bits Where It Matters: Spiking-Guided Just Recognizable Distortion Modeling for Machine-Centric Video Coding

Wuyuan Xie; Zhenming Li; Yuwu Lu; Di Lin; Yun Song; Miaohui Wang

doi:10.1609/aaai.v40i22.38935

Authors

Wuyuan Xie, College of Computer Science & Software Engineering, Shenzhen University
Zhenming Li, College of Computer Science & Software Engineering, Shenzhen University
Yuwu Lu, School of Artificial Intelligence, South China Normal University
Di Lin, College of Intelligence and Computing, Tianjin University
Yun Song, School of Computer Science and Technology, Changsha University of Science and Technology
Miaohui Wang, Guangdong Key Laboratory of Intelligent Information Processing, Shenzhen University

DOI: https://doi.org/10.1609/aaai.v40i22.38935

Abstract

Just recognizable distortion (JRD) has emerged as a promising paradigm for machine-centric video coding. However, existing JRD-guided coding methods are limited by coarse annotation granularity and high computational cost, which hinder their deployment. In this paper, we first investigate the impact of different JRD annotation strategies on downstream task performance. By incorporating both instance-level and contextual information, we construct a new JRD dataset with fine-grained annotations compatible with object detection and instance segmentation tasks. To enhance quantization parameter (QP) map prediction while maintaining computational efficiency, we propose a novel spiking neural network (SNN)-based framework that decomposes video frames into spatial structures, channel interactions, and temporal patterns. Furthermore, we introduce a spiking attention mechanism to aggregate task-relevant features and employ adaptive scaling vectors to suppress machine-perceived redundancy, enabling targeted bitrate allocation aligned with task-critical content. Extensive experiments on multiple datasets and backbones demonstrate that our approach consistently outperforms state-of-the-art codec-based and JRD-guided methods in maintaining task performance at ultra-low bitrates, while significantly reducing computational overhead.
