Generative Model-Based Feature Knowledge Distillation for Action Recognition

Authors

  • Guiqin Wang, Xi'an Jiaotong University
  • Peng Zhao, Xi'an Jiaotong University
  • Yanjiang Shi, Xi'an Jiaotong University
  • Cong Zhao, Xi'an Jiaotong University
  • Shusen Yang, Xi'an Jiaotong University

DOI:

https://doi.org/10.1609/aaai.v38i14.29473

Keywords:

ML: Learning on the Edge & Model Compression, CV: Applications, CV: Other Foundations of Computer Vision, CV: Video Understanding & Activity Analysis, KRR: Applications, ML: Applications

Abstract

Knowledge distillation (KD), a technique widely employed in computer vision, has emerged as a de facto standard for improving the performance of small neural networks. However, prevailing KD-based approaches in video tasks primarily focus on designing loss functions and fusing cross-modal information, overlooking spatial-temporal feature semantics and thus yielding limited gains in model compression. To address this gap, our paper introduces an innovative knowledge distillation framework that employs a generative model to train a lightweight student model. In particular, the framework is organized into two steps: the initial Feature Representation phase trains a generative model-based attention module to represent feature semantics; the subsequent Generative-based Feature Distillation phase comprises both Generative Distillation and Attention Distillation, with the objective of transferring attention-based feature semantics through the generative model. The efficacy of our approach is demonstrated through comprehensive experiments on diverse popular datasets, showing considerable improvements on the video action recognition task. Moreover, the effectiveness of the proposed framework is further validated on the more complex video action detection task. Our code is available at https://github.com/aaai-24/Generative-based-KD.
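To make the feature-level distillation idea concrete, the sketch below shows a generic attention-guided feature distillation loss in NumPy: a spatial-temporal attention map is derived from a feature tensor, and the student is penalized both for mismatching the teacher's features and for mismatching the teacher's attention. This is a minimal illustration of the general technique, not the paper's actual modules; the function names, the channel-norm attention definition, and the `alpha`/`beta` weights are all assumptions for exposition.

```python
import numpy as np

def attention_map(feat):
    """Spatial-temporal attention from a (C, T, H, W) feature tensor.

    A common heuristic (assumed here, not taken from the paper): the L2 norm
    over the channel axis, normalized to sum to 1 over all locations.
    """
    a = np.sqrt((feat ** 2).sum(axis=0))     # (T, H, W) magnitude per location
    return a / (a.sum() + 1e-8)              # normalize to a distribution

def feature_kd_loss(student_feat, teacher_feat, alpha=1.0, beta=1.0):
    """Combine a feature-matching term and an attention-matching term.

    alpha and beta are hypothetical weights balancing the two terms.
    """
    feat_term = ((student_feat - teacher_feat) ** 2).mean()
    attn_term = ((attention_map(student_feat)
                  - attention_map(teacher_feat)) ** 2).sum()
    return alpha * feat_term + beta * attn_term
```

In practice the student features would first be projected to the teacher's channel dimension (e.g. by a 1x1 convolution) before computing `feat_term`; that projection is omitted here by assuming matching shapes.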

Published

2024-03-24

How to Cite

Wang, G., Zhao, P., Shi, Y., Zhao, C., & Yang, S. (2024). Generative Model-Based Feature Knowledge Distillation for Action Recognition. Proceedings of the AAAI Conference on Artificial Intelligence, 38(14), 15474–15482. https://doi.org/10.1609/aaai.v38i14.29473

Issue

Vol. 38 No. 14 (2024)

Section

AAAI Technical Track on Machine Learning V