Yariv, G.; Gat, I.; Benaim, S.; Wolf, L.; Schwartz, I.; Adi, Y. Diverse and Aligned Audio-to-Video Generation via Text-to-Video Model Adaptation. AAAI 2024, 38, 6639–6647.