Firing Bits Where It Matters: Spiking-Guided Just Recognizable Distortion Modeling for Machine-Centric Video Coding

Authors

  • Wuyuan Xie, College of Computer Science & Software Engineering, Shenzhen University
  • Zhenming Li, College of Computer Science & Software Engineering, Shenzhen University
  • Yuwu Lu, School of Artificial Intelligence, South China Normal University
  • Di Lin, College of Intelligence and Computing, Tianjin University
  • Yun Song, School of Computer Science and Technology, Changsha University of Science and Technology
  • Miaohui Wang, Guangdong Key Laboratory of Intelligent Information Processing, Shenzhen University

DOI:

https://doi.org/10.1609/aaai.v40i22.38935

Abstract

Just recognizable distortion (JRD) has emerged as a promising paradigm for machine-centric video coding. However, existing JRD-guided coding methods are limited by coarse annotation granularity and high computational cost, which hinder their deployment. In this paper, we first investigate the impact of different JRD annotation strategies on downstream task performance. By incorporating both instance-level and contextual information, we construct a new JRD dataset with fine-grained annotations compatible with object detection and instance segmentation tasks. To enhance quantization parameter (QP) map prediction while maintaining computational efficiency, we propose a novel spiking neural network (SNN)-based framework that decomposes video frames into spatial structures, channel interactions, and temporal patterns. Furthermore, we introduce a spiking attention mechanism to aggregate task-relevant features and employ adaptive scaling vectors to suppress machine-perceived redundancy, enabling targeted bitrate allocation aligned with task-critical content. Extensive experiments on multiple datasets and backbones demonstrate that our approach consistently outperforms state-of-the-art codec-based and JRD-guided methods in maintaining task performance at ultra-low bitrates, while significantly reducing computational overhead.

Published

2026-03-14

How to Cite

Xie, W., Li, Z., Lu, Y., Lin, D., Song, Y., & Wang, M. (2026). Firing Bits Where It Matters: Spiking-Guided Just Recognizable Distortion Modeling for Machine-Centric Video Coding. Proceedings of the AAAI Conference on Artificial Intelligence, 40(22), 18674-18682. https://doi.org/10.1609/aaai.v40i22.38935

Issue

Section

AAAI Technical Track on Intelligent Robotics