Han, Zhizhong, Mingyang Shang, Xiyang Wang, Yu-Shen Liu, and Matthias Zwicker. “Y2Seq2Seq: Cross-Modal Representation Learning for 3D Shape and Text by Joint Reconstruction and Prediction of View and Word Sequences.” Proceedings of the AAAI Conference on Artificial Intelligence 33, no. 01 (July 17, 2019): 126–133. Accessed August 31, 2024. https://ojs.aaai.org/index.php/AAAI/article/view/3777.