Deep Hidden Cognition Facilitates Reliable Chain-of-Thought Reasoning

Zijun Chen; Wenbo Hu; Richang Hong

doi:10.1609/aaai.v40i44.41061

Authors

Zijun Chen, Hefei University of Technology
Wenbo Hu, Hefei University of Technology
Richang Hong, Hefei University of Technology

Abstract

Chain-of-Thought (CoT) reasoning has demonstrated remarkable deep reasoning capabilities in both large language models (LLMs) and multimodal large language models (MLLMs). However, its reliability is often undermined by the accumulation of errors in intermediate steps. This paper proposes a novel approach to calibrating CoT reasoning accuracy by leveraging the model’s internal cognition of truthfulness. Our findings suggest that the model implicitly tracks the evolving veracity of intermediate steps throughout the dynamic, progressive reasoning process. We train a confidence predictor to quantify the model’s internal cognition of truthfulness at each reasoning step, enabling dynamic selection of the most plausible reasoning path through beam search. Experimental results demonstrate that our method significantly outperforms state-of-the-art baselines (e.g., Self-Consistency and PRM Guided Search) across mathematical, symbolic, and commonsense reasoning tasks, exhibiting superior accuracy and reliability in both unimodal and multimodal settings. This study offers a new path toward improving the reliability of CoT reasoning, with strong potential for wide-ranging applications.
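
The selection procedure described in the abstract can be illustrated with a minimal sketch, which is not the authors' implementation. It assumes three hypothetical callables that the abstract does not specify: propose_steps (candidate next reasoning steps for a partial path), step_confidence (a stand-in for the trained confidence predictor, returning a value in (0, 1]), and is_final (whether a path has reached an answer). Scoring a path by the sum of log confidences is also an assumption made here for illustration; the paper may aggregate step scores differently.

```python
import math
from typing import Callable, List, Sequence, Tuple

Path = Tuple[str, ...]


def confidence_guided_beam_search(
    propose_steps: Callable[[Path], Sequence[str]],   # hypothetical step generator
    step_confidence: Callable[[Path, str], float],    # hypothetical confidence predictor
    is_final: Callable[[Path], bool],                 # hypothetical stopping test
    beam_width: int = 2,
    max_steps: int = 4,
) -> Tuple[Path, float]:
    """Return the highest-scoring path and its score (sum of log confidences)."""
    beams: List[Tuple[Path, float]] = [((), 0.0)]
    for _ in range(max_steps):
        candidates: List[Tuple[Path, float]] = []
        for path, score in beams:
            if is_final(path):
                # Finished paths are carried forward unchanged.
                candidates.append((path, score))
                continue
            for step in propose_steps(path):
                conf = step_confidence(path, step)
                # Assumed aggregation: add the log of the step confidence.
                candidates.append((path + (step,), score + math.log(max(conf, 1e-9))))
        if not candidates:
            break
        # Keep only the beam_width most confident partial paths.
        beams = sorted(candidates, key=lambda c: c[1], reverse=True)[:beam_width]
        if all(is_final(p) for p, _ in beams):
            break
    return beams[0]


# Toy stand-ins: two candidate steps per depth with fixed, made-up confidences.
TOY_STEPS = {0: ["a1", "a2"], 1: ["b1", "b2"]}
TOY_CONF = {"a1": 0.9, "a2": 0.6, "b1": 0.5, "b2": 0.8}

best_path, best_score = confidence_guided_beam_search(
    propose_steps=lambda path: TOY_STEPS.get(len(path), []),
    step_confidence=lambda path, step: TOY_CONF[step],
    is_final=lambda path: len(path) == 2,
)
print(best_path, round(math.exp(best_score), 2))
```

In the toy run, the search keeps the two most confident partial paths at each depth and returns the path whose step confidences have the largest product. A real system would replace the stand-ins with an LLM step sampler and a predictor trained on the model's hidden states.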
