Zhang, W. (2020) “Learning Long- and Short-Term User Literal-Preference with Multimodal Hierarchical Transformer Network for Personalized Image Caption”, Proceedings of the AAAI Conference on Artificial Intelligence, 34(05), pp. 9571–9578. doi: 10.1609/aaai.v34i05.6503.