Efficient Thought Space Exploration Through Strategic Intervention

Authors

  • Ziheng Li State Key Laboratory of General Artificial Intelligence, School of Intelligence Science and Technology, Peking University
  • Hengyi Cai Baidu Inc.
  • Xiaochi Wei Baidu Inc.
  • Yuchen Li Baidu Inc.
  • Shuaiqiang Wang Baidu Inc.
  • Zhi-Hong Deng State Key Laboratory of General Artificial Intelligence, School of Intelligence Science and Technology, Peking University
  • Dawei Yin Baidu Inc.

DOI:

https://doi.org/10.1609/aaai.v40i38.40459

Abstract

While large language models (LLMs) demonstrate emerging reasoning capabilities, current inference-time expansion methods incur prohibitive computational costs through exhaustive sampling. By analyzing decoding trajectories, we observe that most next-token predictions align well with the golden output, except for a few critical tokens that lead to deviations. Inspired by this phenomenon, we propose a novel Hint-Practice Reasoning (HPR) framework that operationalizes this insight through two synergistic components: 1) a hinter (a powerful LLM) that provides probabilistic guidance at critical decision points, and 2) a practitioner (an efficient smaller model) that executes the major reasoning steps. The framework's core innovation is Distributional Inconsistency Reduction (DIR), a theoretically grounded metric that dynamically identifies intervention points by quantifying the divergence between the practitioner's reasoning trajectory and the hinter's expected distribution in a tree-structured probabilistic space. Through iterative tree updates guided by DIR, HPR reweights promising reasoning paths while deprioritizing low-probability branches. Experiments across arithmetic and commonsense reasoning benchmarks demonstrate HPR's state-of-the-art efficiency-accuracy tradeoffs: it achieves performance comparable to self-consistency and MCTS baselines while decoding only one fifth of the tokens, and it outperforms existing methods by up to 5.1% absolute accuracy while maintaining similar or lower FLOPs.
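The idea of intervening only where the two models disagree can be illustrated with a minimal sketch. It is not the paper's DIR metric, whose definition is not given in the abstract; it assumes plain KL divergence between next-token distributions as a stand-in, ignores the tree structure, and uses made-up toy distributions. The names `kl_divergence` and `pick_intervention_point` are hypothetical.

```python
import math


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two distributions given as dicts: token -> probability."""
    return sum(
        pv * math.log(pv / max(q.get(tok, 0.0), eps))
        for tok, pv in p.items()
        if pv > 0.0
    )


def pick_intervention_point(hinter_dists, practitioner_dists):
    """Return the position where the two models disagree most, plus all scores.

    Assumption: disagreement is scored with KL(hinter || practitioner),
    a simple proxy and not the DIR metric from the paper.
    """
    scores = [kl_divergence(h, p) for h, p in zip(hinter_dists, practitioner_dists)]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores


# Toy next-token distributions at three decoding positions (invented values).
practitioner = [
    {"2": 0.9, "3": 0.1},
    {"+": 0.8, "*": 0.2},
    {"5": 0.7, "4": 0.3},
]
hinter = [
    {"2": 0.9, "3": 0.1},
    {"+": 0.1, "*": 0.9},  # the hinter strongly prefers a different token here
    {"5": 0.6, "4": 0.4},
]

index, scores = pick_intervention_point(hinter, practitioner)
# The hint is the hinter's most likely token at the chosen position.
hint_token = max(hinter[index], key=hinter[index].get)
print(index, hint_token)
```

In this toy setting the practitioner would keep its own tokens at the positions where the models agree and receive a hint only at the selected position, which is the cost-saving intuition the abstract describes.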

Published

2026-03-14

How to Cite

Li, Z., Cai, H., Wei, X., Li, Y., Wang, S., Deng, Z.-H., & Yin, D. (2026). Efficient Thought Space Exploration Through Strategic Intervention. Proceedings of the AAAI Conference on Artificial Intelligence, 40(38), 31897–31906. https://doi.org/10.1609/aaai.v40i38.40459

Section

AAAI Technical Track on Natural Language Processing III