(1) Chen, T.; Luo, J. Expressing Objects Just Like Words: Recurrent Visual Embedding for Image-Text Matching. AAAI 2020, 34, 10583-10590.