Liu, Le, Yuhao Wang, Bohan Shen, Wei Zeng, Shizhou Zhang, Di Xu, and Peng Wang. 2026. “Do Large Language Models Reason About Uncertainty Like Humans? A Benchmark on Hurricane Forecast Visualization Comprehension.” Proceedings of the AAAI Conference on Artificial Intelligence 40 (21): 17571–79. https://doi.org/10.1609/aaai.v40i21.38812.