Toward Controllable and Trustworthy LLM Reasoning: From Failure Mapping to Cognition-inspired Control and Real-world Impact
DOI:
https://doi.org/10.1609/aaai.v40i47.41366
Abstract
Large Language Models (LLMs) have advanced rapidly and raised the bar for what AI is expected to do. However, this progress has been accompanied by a growing consensus that these models consistently fail at out-of-distribution reasoning, especially on tasks that require abstraction, transfer, or long-horizon planning. While acceptable for most consumer use, these failures prevent AI from being safely deployed in high-stakes settings (e.g., healthcare), where stakeholders cannot trust models that fail uncontrollably and unpredictably. In this talk, I will discuss our work and insights on how to make LLM reasoning controllable and trustworthy by 1) understanding the mechanisms of LLM reasoning and predicting when LLMs will fail; 2) improving model reasoning and generalization based on these insights; and 3) moving toward trustworthy AI applications through these improvements, while identifying new problems to form a healthy positive-feedback loop.
Published
2026-03-14
How to Cite
Zhou, B. (2026). Toward Controllable and Trustworthy LLM Reasoning: From Failure Mapping to Cognition-inspired Control and Real-world Impact. Proceedings of the AAAI Conference on Artificial Intelligence, 40(47), 39849–39850. https://doi.org/10.1609/aaai.v40i47.41366
Section
New Faculty Highlights