Holla, M., & Lourentzou, I. (2024). Commonsense for Zero-Shot Natural Language Video Localization. Proceedings of the AAAI Conference on Artificial Intelligence, 38(3), 2166–2174. https://doi.org/10.1609/aaai.v38i3.27989