(1) Wu, D.; Jiang, L.; Fang, R.; , B.; Xie, H.; Su, H.; Huang, H.; He, Z.; Song, S.; Li, X. Introducing Visual Scenes and Reasoning: A More Realistic Benchmark for Spoken Language Understanding. AAAI 2026, 40, 33899-33907.