Zhang, X.-Y., Shi, H., Li, C., & Li, P. (2020). Multi-Instance Multi-Label Action Recognition and Localization Based on Spatio-Temporal Pre-Trimming for Untrimmed Videos. Proceedings of the AAAI Conference on Artificial Intelligence, 34(07), 12886-12893. https://doi.org/10.1609/aaai.v34i07.6986