[1] Wang, D. and Xiong, D. 2021. Efficient Object-Level Visual Context Modeling for Multimodal Machine Translation: Masking Irrelevant Objects Helps Grounding. Proceedings of the AAAI Conference on Artificial Intelligence. 35, 4 (May 2021), 2720-2728. DOI:https://doi.org/10.1609/aaai.v35i4.16376.