Cross-Lingual Low-Resource Set-to-Description Retrieval for Global E-Commerce


  • Juntao Li, Peking University
  • Chang Liu, Peking University
  • Jian Wang, Alibaba Group
  • Lidong Bing, Alibaba Group
  • Hongsong Li, Alibaba Group
  • Xiaozhong Liu, Indiana University
  • Dongyan Zhao, Peking University
  • Rui Yan, Peking University



With the growth of cross-border e-commerce, there is an urgent demand for intelligent approaches that help e-commerce sellers offer local products to consumers all over the world. In this paper, we explore a new cross-lingual information retrieval task, cross-lingual set-to-description retrieval in cross-border e-commerce, which involves matching product attribute sets in the source language with persuasive product descriptions in the target language. We manually collect a new, high-quality paired dataset, where each pair contains an unordered product attribute set in the source language and an informative product description in the target language. As the dataset construction process is both time-consuming and costly, the new dataset comprises only 13.5k pairs, which constitutes a low-resource setting and can be viewed as a challenging testbed for model development and evaluation in cross-border e-commerce. To tackle this cross-lingual set-to-description retrieval task, we propose a novel cross-lingual matching network (CLMN) enhanced with a context-dependent cross-lingual mapping built upon pre-trained monolingual BERT representations. Experimental results indicate that the proposed CLMN yields impressive results on this challenging task, and that the context-dependent cross-lingual mapping on BERT yields a noticeable improvement over the pre-trained multilingual BERT model.




How to Cite

Li, J., Liu, C., Wang, J., Bing, L., Li, H., Liu, X., Zhao, D., & Yan, R. (2020). Cross-Lingual Low-Resource Set-to-Description Retrieval for Global E-Commerce. Proceedings of the AAAI Conference on Artificial Intelligence, 34(05), 8212-8219.



AAAI Technical Track: Natural Language Processing