Liu, L. (2026) “Do Large Language Models Reason About Uncertainty Like Humans? A Benchmark on Hurricane Forecast Visualization Comprehension”, Proceedings of the AAAI Conference on Artificial Intelligence, 40(21), pp. 17571–17579. doi: 10.1609/aaai.v40i21.38812.