Deep Hidden Cognition Facilitates Reliable Chain-of-Thought Reasoning

Authors

  • Zijun Chen, Hefei University of Technology
  • Wenbo Hu, Hefei University of Technology
  • Richang Hong, Hefei University of Technology

DOI:

https://doi.org/10.1609/aaai.v40i44.41061

Abstract

Chain-of-Thought (CoT) reasoning has demonstrated remarkable deep reasoning capabilities in both large language models (LLMs) and multimodal large language models (MLLMs). However, its reliability is often undermined by the accumulation of errors in intermediate steps. This paper proposes a novel approach to calibrating CoT reasoning accuracy by leveraging the model’s internal cognition of truthfulness. Our findings suggest that the model implicitly tracks the evolving veracity of intermediate steps throughout the progressive reasoning process. We train a confidence predictor to quantify the model’s internal cognition of truthfulness at each reasoning step, enabling dynamic selection of the most plausible reasoning path through beam search. Experimental results demonstrate that our method significantly outperforms state-of-the-art baselines (e.g., Self-Consistency and PRM-Guided Search) across mathematical, symbolic, and commonsense reasoning tasks, exhibiting superior accuracy and reliability in both unimodal and multimodal settings. This study offers a path toward improving the reliability of CoT reasoning, with strong potential for wide-ranging applications.
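
The step-level selection described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name confidence_guided_beam_search, the callbacks propose_steps, step_confidence and is_complete, and the choice to rank paths by summed log-confidence are all assumptions made for illustration. In particular, the paper's confidence predictor is trained on the model's internal signals, whereas here a hand-written lookup table stands in for it.

```python
import math
from typing import Callable, List, Sequence, Tuple

Path = Tuple[str, ...]  # a partial reasoning chain, one string per step


def confidence_guided_beam_search(
    propose_steps: Callable[[Path], Sequence[str]],
    step_confidence: Callable[[Path, str], float],
    is_complete: Callable[[Path], bool],
    beam_width: int = 2,
    max_steps: int = 8,
) -> Tuple[Path, float]:
    """Keep the beam_width partial chains with the highest summed
    log-confidence (an assumed aggregation rule, not taken from the paper)."""
    beams: List[Tuple[Path, float]] = [((), 0.0)]
    for _ in range(max_steps):
        candidates: List[Tuple[Path, float]] = []
        for path, score in beams:
            if is_complete(path):
                # Finished chains are carried forward unchanged.
                candidates.append((path, score))
                continue
            for step in propose_steps(path):
                # Clamp to (0, 1] so the logarithm is always defined.
                conf = min(max(step_confidence(path, step), 1e-9), 1.0)
                candidates.append((path + (step,), score + math.log(conf)))
        if not candidates:
            break
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = candidates[:beam_width]
        if all(is_complete(p) for p, _ in beams):
            break
    return max(beams, key=lambda b: b[1])


# Toy problem (hypothetical): evaluate 2 + 3 * 4.
# The table below stands in for a generator proposing candidate next steps.
TOY_STEPS = {
    (): ["3*4=12", "2+3=5"],
    ("3*4=12",): ["2+12=14", "2+12=15"],
    ("2+3=5",): ["5*4=20"],
}
# Stand-in for a trained confidence predictor: made-up per-step scores.
TOY_CONF = {
    "3*4=12": 0.9,
    "2+3=5": 0.3,
    "2+12=14": 0.95,
    "2+12=15": 0.2,
    "5*4=20": 0.8,
}

best_path, best_score = confidence_guided_beam_search(
    propose_steps=lambda path: TOY_STEPS.get(path, []),
    step_confidence=lambda path, step: TOY_CONF[step],
    is_complete=lambda path: len(path) == 2,
    beam_width=2,
)
print(best_path, round(best_score, 3))
```

With these made-up scores, the chain that respects operator precedence outranks the chain that starts from the low-confidence step "2+3=5", even though that chain's final step is scored fairly high on its own. This is the kind of error accumulation the step-level scoring is meant to catch.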

Published

2026-03-14

How to Cite

Chen, Z., Hu, W., & Hong, R. (2026). Deep Hidden Cognition Facilitates Reliable Chain-of-Thought Reasoning. Proceedings of the AAAI Conference on Artificial Intelligence, 40(44), 37295–37304. https://doi.org/10.1609/aaai.v40i44.41061

Issue

Section

AAAI Special Track on AI Alignment