Robust Knowledge Transfer via Hybrid Forward on the Teacher-Student Model


  • Liangchen Song, University at Buffalo
  • Jialian Wu, University at Buffalo
  • Ming Yang, Horizon Robotics
  • Qian Zhang, Horizon Robotics
  • Yuan Li, Google
  • Junsong Yuan, University at Buffalo


Learning & Optimization for CV, Transfer/Adaptation/Multi-task/Meta/Automated Learning


When adopting deep neural networks for a new vision task, a common practice is to start by fine-tuning off-the-shelf, well-trained network models from the community. Since a new task may require training a different network architecture with new domain data, taking advantage of off-the-shelf models is not trivial and generally requires considerable trial and error and parameter tuning. In this paper, we denote a well-trained model as a teacher network and a model for the new task as a student network. We aim to ease the effort of transferring knowledge from the teacher to the student network in a way that is robust to the gaps between their network architectures, domain data, and task definitions. Specifically, we propose a hybrid forward scheme for training the teacher-student models, alternately updating layer weights of the student model. The key merit of our hybrid forward scheme is the dynamic balance between the knowledge transfer loss and the task-specific loss during training. We demonstrate the effectiveness of our method on a variety of tasks, e.g., model compression, segmentation, and detection, under a variety of knowledge transfer settings.
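To make the two loss terms named in the abstract concrete, the following is a minimal sketch of the conventional teacher-student objective with a fixed weighting. It is not the hybrid forward scheme itself, which the abstract does not specify in detail. The function names, the weight `alpha`, the `temperature` parameter, and the choice of a KL divergence on softened outputs as the transfer loss are all assumptions made for illustration.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def task_loss(student_logits, label):
    """Task-specific loss: cross-entropy of the student against the true label."""
    probs = softmax(student_logits)
    return -math.log(probs[label])


def transfer_loss(student_logits, teacher_logits, temperature=2.0):
    """Knowledge transfer loss (assumed form): KL(teacher || student)
    computed on temperature-softened class distributions."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def combined_loss(student_logits, teacher_logits, label,
                  alpha=0.5, temperature=2.0):
    """Fixed-weight baseline: alpha is a hand-tuned hyperparameter (assumption)."""
    kt = transfer_loss(student_logits, teacher_logits, temperature)
    ts = task_loss(student_logits, label)
    return alpha * kt + (1.0 - alpha) * ts


if __name__ == "__main__":
    student = [1.0, 0.5, -0.5]   # hypothetical student outputs
    teacher = [2.0, 0.1, -1.0]   # hypothetical teacher outputs
    print(combined_loss(student, teacher, label=0))
```

In this baseline the weight `alpha` is set by hand and stays constant, which is one of the parameters that typically needs tuning when the teacher and student differ. The abstract describes a scheme that balances the two terms dynamically during training; this sketch does not attempt to reproduce that.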




How to Cite

Song, L., Wu, J., Yang, M., Zhang, Q., Li, Y., & Yuan, J. (2021). Robust Knowledge Transfer via Hybrid Forward on the Teacher-Student Model. Proceedings of the AAAI Conference on Artificial Intelligence, 35(3), 2558-2566. Retrieved from



AAAI Technical Track on Computer Vision II