Zhang, Y. (2022) “Contrastive Spatio-Temporal Pretext Learning for Self-Supervised Video Representation”, Proceedings of the AAAI Conference on Artificial Intelligence, 36(3), pp. 3380–3389. doi: 10.1609/aaai.v36i3.20248.