(1) Yu, Y.; Cao, C.; Zhang, Y.; Lv, Q.; Min, L.; Zhang, Y. Building a Multi-Modal Spatiotemporal Expert for Zero-Shot Action Recognition With CLIP. AAAI 2025, 39, 9689–9697.