Liu L, Wang Y, Shen B, Zeng W, Zhang S, Xu D, et al. Do Large Language Models Reason About Uncertainty Like Humans? A Benchmark on Hurricane Forecast Visualization Comprehension. AAAI [Internet]. 2026 Mar 14 [cited 2026 May 26];40(21):17571-9. Available from: https://ojs.aaai.org/index.php/AAAI/article/view/38812