Rep Deep & Machine Learning: Exemplar-Free Continual Video Action Recognition via Slow-Fast Collaborative Learning

Authors

  • Xueyi Zhang (National University of Defense Technology; Shenzhen Loop Area Institute; The Chinese University of Hong Kong, Shenzhen)
  • Chengwei Zhang (University of the Chinese Academy of Sciences)
  • Zheng Li (National University of Defense Technology)
  • Xiyu Wang (The Chinese University of Hong Kong, Shenzhen)
  • Siqi Cai (Harbin Institute of Technology, Shenzhen; Shenzhen Loop Area Institute)
  • Mingrui Lao (National University of Defense Technology)
  • Yanming Guo (National University of Defense Technology)
  • Huiping Zhuang (South China University of Technology)

DOI:

https://doi.org/10.1609/aaai.v40i42.40924

Abstract

In real-world applications, video action recognition models must continuously learn new action categories while retaining previously acquired knowledge. However, most existing approaches rely on storing historical data for replay, which introduces storage burdens and raises data privacy concerns. To address these challenges, we investigate the problem of Exemplar-Free Continual Video Action Recognition (EF-CVAR) and propose a novel framework named Slow-Fast Collaborative Learning (SFCL). SFCL integrates two complementary learning paradigms: a slow branch based on gradient-driven deep learning, which provides strong adaptability to new tasks, and a fast branch based on analytic learning (e.g., Recursive Least Squares), which efficiently preserves old knowledge without requiring access to past samples. To enable effective collaboration between the two branches, we design the Slow-Fast Dynamic Re-parameterization (SFDR) mechanism for adaptive fusion, and the Knowledge Reflection Mechanism (KRM), which mitigates forgetting and task-recency bias via pseudo-feature generation and dual-level knowledge distillation. Extensive experiments on UCF101, HMDB51, and Something-Something V2 demonstrate that SFCL achieves superior performance compared to existing replay-based methods, despite being exemplar-free. Notably, in long-duration continual learning scenarios, SFCL exhibits remarkable robustness, achieving up to a 30.39% improvement in accuracy over baselines while maintaining a low forgetting rate, highlighting its scalability and effectiveness in real-world video recognition tasks.
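The abstract names Recursive Least Squares as an example of the analytic learning behind the fast branch. The sketch below is a generic recursive ridge-regression classifier, not the authors' implementation. It only illustrates why such an update can absorb new classes without revisiting old samples. It assumes each video has already been reduced to a fixed-length feature vector by some frozen backbone. The class name `RecursiveRidgeClassifier`, the method `fit_task`, and the regularizer `gamma` are hypothetical names chosen for illustration.

```python
import numpy as np


class RecursiveRidgeClassifier:
    """Generic RLS-style linear classifier (illustrative, not the paper's code)."""

    def __init__(self, feat_dim, gamma=1.0):
        # R tracks the inverse of the regularized feature autocorrelation matrix,
        # starting from (gamma * I)^-1 before any data is seen.
        self.R = np.eye(feat_dim) / gamma
        # W maps features to class scores; it starts with no class columns.
        self.W = np.zeros((feat_dim, 0))

    def fit_task(self, X, y, num_classes_total):
        """Absorb one task. X: (n, d) features, y: (n,) integer labels."""
        # Append zero columns for classes that appear for the first time.
        extra = num_classes_total - self.W.shape[1]
        if extra > 0:
            self.W = np.hstack([self.W, np.zeros((self.W.shape[0], extra))])
        Y = np.eye(num_classes_total)[y]  # one-hot targets
        # Woodbury identity: update R using only the current task's features.
        XR = X @ self.R
        gain = np.linalg.solve(np.eye(X.shape[0]) + XR @ X.T, XR)
        self.R = self.R - XR.T @ gain
        # Correct W by the residual of the current task only.
        self.W = self.W + self.R @ X.T @ (Y - X @ self.W)

    def predict(self, X):
        return np.argmax(X @ self.W, axis=1)
```

Under these assumptions, the weights after several calls to `fit_task` match the ridge-regression solution computed jointly on all tasks. In that sense, old knowledge is kept without storing past samples. How SFCL combines this kind of branch with the gradient-trained slow branch (SFDR, KRM) is not covered by this sketch.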

Published

2026-03-14

How to Cite

Zhang, X., Zhang, C., Li, Z., Wang, X., Cai, S., Lao, M., … Zhuang, H. (2026). Rep Deep & Machine Learning: Exemplar-Free Continual Video Action Recognition via Slow-Fast Collaborative Learning. Proceedings of the AAAI Conference on Artificial Intelligence, 40(42), 36075–36083. https://doi.org/10.1609/aaai.v40i42.40924

Section

AAAI Technical Track on Philosophy and Ethics of AI