Action Recognition With Coarse-to-Fine Deep Feature Integration and Asynchronous Fusion


  • Weiyao Lin Shanghai Jiao Tong University
  • Chongyang Zhang Shanghai Jiao Tong University
  • Ke Lu University of Chinese Academy of Sciences
  • Bin Sheng Shanghai Jiao Tong University
  • Jianxin Wu Nanjing University
  • Bingbing Ni Shanghai Jiao Tong University
  • Xin Liu Shenzhen Tencent Computer System Co.
  • Hongkai Xiong Shanghai Jiao Tong University



Action Recognition, Action Granularity, Asynchronous Fusion


Action recognition is an important yet challenging task in computer vision. In this paper, we propose a novel deep learning-based framework for action recognition that improves recognition accuracy by: 1) deriving more precise features for representing actions, and 2) reducing the asynchrony between different information streams. We first introduce a coarse-to-fine network that extracts shared deep features at different action class granularities and progressively integrates them to obtain a more accurate feature representation for input actions. We further introduce an asynchronous fusion network, which fuses information from different streams by asynchronously integrating stream-wise features at different time points and thereby makes better use of the complementary information in those streams. Experimental results on action recognition benchmarks demonstrate that our approach achieves state-of-the-art performance.
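The idea of integrating stream-wise features at different time points can be illustrated with a toy sketch. It is not the network from the paper: the function name `async_fuse`, the `offsets` parameter, the clamping at sequence boundaries, and the plain averaging (a stand-in for a learned fusion layer) are all assumptions made for illustration. The sketch pairs the feature of one stream at time t with the features of a second stream at several nearby time points, rather than only at the same t.

```python
from typing import List, Sequence

Vector = List[float]


def async_fuse(
    appearance: Sequence[Vector],
    motion: Sequence[Vector],
    t: int,
    offsets: Sequence[int] = (0, 1, 2),
) -> Vector:
    """Toy asynchronous fusion (hypothetical helper, not the paper's network).

    Pairs the appearance feature at time t with motion features at
    t + offset for each offset, concatenates each pair, and averages
    the concatenated vectors element-wise.
    """
    last = len(motion) - 1
    pairs = []
    for off in offsets:
        # Clamp the index so offsets near the sequence ends stay valid.
        idx = min(max(t + off, 0), last)
        # List addition concatenates the two feature vectors.
        pairs.append(list(appearance[t]) + list(motion[idx]))
    dim = len(pairs[0])
    # Averaging stands in for a learned fusion layer (assumption).
    return [sum(p[d] for p in pairs) / len(pairs) for d in range(dim)]


# Two streams with one-dimensional features over three time steps.
appearance_stream = [[1.0], [2.0], [3.0]]
motion_stream = [[10.0], [20.0], [30.0]]
fused = async_fuse(appearance_stream, motion_stream, t=0)
```

With `offsets=(0,)` the sketch reduces to ordinary synchronous fusion, in which each stream contributes only its feature at the same time point.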




How to Cite

Lin, W., Zhang, C., Lu, K., Sheng, B., Wu, J., Ni, B., Liu, X., & Xiong, H. (2018). Action Recognition With Coarse-to-Fine Deep Feature Integration and Asynchronous Fusion. Proceedings of the AAAI Conference on Artificial Intelligence, 32(1).