[1] P. Sarkar and A. Etemad, “XKD: Cross-Modal Knowledge Distillation with Domain Alignment for Video Representation Learning,” AAAI, vol. 38, no. 13, pp. 14875–14885, Mar. 2024.