Step Back to Leap Forward: Self-Backtracking for Symbolic Reasoning and Planning in Language Models
DOI:
https://doi.org/10.1609/aaai.v40i33.39986
Abstract
Although autoregressive language models have demonstrated remarkable performance across various tasks, their effectiveness in symbolic reasoning and decision-making scenarios remains constrained. Recent research indicates that training language models to emulate symbolic search algorithms (e.g., depth-first search or the A* algorithm) can yield strong improvements in their symbolic reasoning and planning capabilities. However, existing methods achieve only superficial imitation of symbolic search trajectories, because their generation processes lack explicit backtracking mechanisms. This limitation prevents models from truly mastering symbolic search, often resulting in rigid and redundant outputs with poor solution quality. To address this issue, we propose a self-backtracking mechanism that, through specialized training, enables LLMs to autonomously determine when to backtrack and to use this capability to scale during inference. By introducing a self-improvement strategy, the model can further refine its search process into optimal solution generation, improving problem-solving efficiency. Empirical evaluations demonstrate that our method boosts LLMs' reasoning on the Countdown task by 40% over optimal-path supervised fine-tuning (SFT) and improves both performance and efficiency on the Maze Navigation task.
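To make the notion of explicit backtracking concrete, the following is a minimal sketch of classical depth-first search on a Countdown-style puzzle. It is not the paper's learned self-backtracking mechanism: here a hand-written rule decides when to step back, whereas the abstract describes a model trained to make that decision itself. The function name `countdown_dfs`, the `stats` counter, and the rule that every input number must be used are assumptions made for illustration.

```python
from itertools import combinations


def countdown_dfs(numbers, target, path=None, stats=None):
    """Depth-first search over Countdown states with explicit backtracking.

    Assumption: a state is solved only when all numbers have been combined
    into a single value equal to the target.
    Returns (list of steps or None, stats dict with a backtrack counter).
    """
    path = [] if path is None else path
    stats = {"backtracks": 0} if stats is None else stats

    if len(numbers) == 1:
        if numbers[0] == target:
            return path, stats
        stats["backtracks"] += 1  # dead end: step back to the parent state
        return None, stats

    # Pick any two remaining numbers and try each allowed operation on them.
    for i, j in combinations(range(len(numbers)), 2):
        hi, lo = max(numbers[i], numbers[j]), min(numbers[i], numbers[j])
        rest = [n for k, n in enumerate(numbers) if k not in (i, j)]
        candidates = [
            (hi + lo, f"{hi}+{lo}"),
            (hi * lo, f"{hi}*{lo}"),
            (hi - lo, f"{hi}-{lo}"),
        ]
        if lo != 0 and hi % lo == 0:  # only exact integer division
            candidates.append((hi // lo, f"{hi}/{lo}"))

        for value, expr in candidates:
            step = f"{expr}={value}"
            result, _ = countdown_dfs(rest + [value], target, path + [step], stats)
            if result is not None:
                return result, stats

    stats["backtracks"] += 1  # every child failed: backtrack from this state
    return None, stats


if __name__ == "__main__":
    steps, stats = countdown_dfs([3, 5, 2], 13)
    print(steps, stats)
```

The `backtracks` counter makes the cost of search visible: a trajectory that reaches the target with fewer backtracks is closer to the optimal path that the abstract contrasts with search-style outputs.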
Published
2026-03-14
How to Cite
Yang, X.-W., Zhu, X.-Y., Zhang, D.-C., Wei, W.-D., Shao, J.-J., Zhou, Z., … Li, Y.-F. (2026). Step Back to Leap Forward: Self-Backtracking for Symbolic Reasoning and Planning in Language Models. Proceedings of the AAAI Conference on Artificial Intelligence, 40(33), 27657–27665. https://doi.org/10.1609/aaai.v40i33.39986
Section
AAAI Technical Track on Machine Learning X