Geng, S., Gao, P., Chatterjee, M., Hori, C., Le Roux, J., Zhang, Y., Li, H. and Cherian, A. (2021) “Dynamic Graph Representation Learning for Video Dialog via Multi-Modal Shuffled Transformers”, Proceedings of the AAAI Conference on Artificial Intelligence, 35(2), pp. 1415-1423. doi: 10.1609/aaai.v35i2.16231.