(1) Huang, Y.; Tang, J.; Chen, Z.; Zhang, R.; Zhang, X.; Chen, W.; Zhao, Z.; Zhao, Z.; Lv, T.; Hu, Z. Structure-CLIP: Towards Scene Graph Knowledge to Enhance Multi-Modal Structured Representations. AAAI 2024, 38, 2417-2425.