Rocket Launching: A Universal and Efficient Framework for Training Well-Performing Light Net

Authors

  • Guorui Zhou, Alibaba Inc
  • Ying Fan, Alibaba Inc
  • Runpeng Cui, Tsinghua University
  • Weijie Bian, Alibaba Inc
  • Xiaoqiang Zhu, Alibaba Inc
  • Kun Gai, Alibaba Inc

DOI:

https://doi.org/10.1609/aaai.v32i1.11601

Keywords:

Model Compressing, Neural Networks, Knowledge Distillation

Abstract

Models deployed on real-time response tasks, such as click-through rate (CTR) prediction models, require both high accuracy and strict response times. Top-performing deep models of great depth and complexity are therefore poorly suited to these applications, which limit inference time. To obtain better-performing neural networks under these time limits, we propose a universal framework that exploits a booster net to help train the lightweight net used for prediction. We dub the whole process rocket launching: the booster net guides the learning of the light net throughout training. We analyze different loss functions that push the light net to behave similarly to the booster net. In addition, we use a technique called gradient block to further improve the performance of both the light net and the booster net. Experiments on benchmark datasets and real-life industrial advertisement data show the effectiveness of the proposed method.
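
The joint objective described in the abstract can be illustrated with a minimal sketch. The abstract does not state the exact loss, so this example assumes one plausible form: a cross-entropy term for each net plus a "hint" term, taken here to be the mean squared difference between the two nets' logits. Gradient block is modeled by letting the hint term contribute to the light net's gradient only, so the booster net is not pulled toward the light net. The names rocket_loss_and_grads and hint_weight are hypothetical and chosen for illustration.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, label):
    """Cross-entropy of softmax(logits) against an integer class label."""
    return -math.log(softmax(logits)[label])


def rocket_loss_and_grads(light_logits, booster_logits, label, hint_weight=1.0):
    """Return (loss, grad wrt light logits, grad wrt booster logits).

    Assumed loss: CE(light) + CE(booster) + hint_weight * mean((light - booster)^2).
    Gradient block (as assumed here): the hint term is treated as a constant
    with respect to the booster logits, so only the light net receives it.
    """
    n = len(light_logits)
    hint = sum((l - b) ** 2 for l, b in zip(light_logits, booster_logits)) / n
    loss = (
        cross_entropy(light_logits, label)
        + cross_entropy(booster_logits, label)
        + hint_weight * hint
    )

    onehot = [1.0 if i == label else 0.0 for i in range(n)]
    p_light = softmax(light_logits)
    p_booster = softmax(booster_logits)

    # Light net: cross-entropy gradient plus the hint-term gradient.
    grad_light = [
        p - y + hint_weight * 2.0 * (l - b) / n
        for p, y, l, b in zip(p_light, onehot, light_logits, booster_logits)
    ]
    # Booster net: cross-entropy gradient only; the hint term is blocked.
    grad_booster = [p - y for p, y in zip(p_booster, onehot)]
    return loss, grad_light, grad_booster
```

In an automatic-differentiation framework, the same blocking is usually expressed by detaching the booster logits inside the hint term rather than by writing gradients by hand; only the light net would then be kept for inference.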

Published

2018-04-29

How to Cite

Zhou, G., Fan, Y., Cui, R., Bian, W., Zhu, X., & Gai, K. (2018). Rocket Launching: A Universal and Efficient Framework for Training Well-Performing Light Net. Proceedings of the AAAI Conference on Artificial Intelligence, 32(1). https://doi.org/10.1609/aaai.v32i1.11601