He, Dongliang, Xiang Zhao, Jizhou Huang, Fu Li, Xiao Liu, and Shilei Wen. 2019. “Read, Watch, and Move: Reinforcement Learning for Temporally Grounding Natural Language Descriptions in Videos.” Proceedings of the AAAI Conference on Artificial Intelligence 33 (01): 8393–8400. https://doi.org/10.1609/aaai.v33i01.33018393.