Step Back to Leap Forward: Self-Backtracking for Symbolic Reasoning and Planning in Language Models

Authors

  • Xiao-Wen Yang Nanjing University
  • Xuan-Yi Zhu Nanjing University
  • Ding-Chu Zhang Nanjing University
  • Wen-Da Wei Nanjing University
  • Jie-Jing Shao Nanjing University
  • Zhi Zhou Nanjing University
  • Lan-Zhe Guo Nanjing University
  • Yu-Feng Li Nanjing University

DOI:

https://doi.org/10.1609/aaai.v40i33.39986

Abstract

Although autoregressive language models have demonstrated remarkable performance across various tasks, their effectiveness in symbolic reasoning and decision-making scenarios remains constrained. Recent research indicates that training language models to emulate symbolic search algorithms (e.g., depth-first search or A*) can yield strong improvements in their symbolic reasoning and planning capabilities. However, existing methods achieve only superficial imitation of symbolic search trajectories, as their generation processes lack explicit backtracking mechanisms. This limitation prevents models from truly mastering symbolic search, often resulting in rigid and redundant outputs with poor solution quality. To address this issue, we propose a self-backtracking mechanism that, through specialized training, enables LLMs to autonomously determine when to backtrack and to use this capability to scale during inference. By introducing a self-improvement strategy, the model can further refine its search process into optimal solution generation, improving problem-solving efficiency. Empirical evaluations demonstrate that our method boosts LLMs' reasoning on the Countdown task by 40% over optimal-path supervised fine-tuning (SFT) and improves both performance and efficiency on the Maze Navigation task.
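
For readers unfamiliar with the terms, the following is a minimal sketch of classical depth-first search with explicit backtracking on a Countdown-style puzzle. It is not the authors' method, which trains a language model to decide when to backtrack. It assumes the common formulation of the task (combine all given integers with +, -, *, and exact division to reach a target); the function names solve_countdown and _dfs are hypothetical and chosen only for illustration.

```python
from typing import List, Optional, Tuple

# Each entry pairs a current value with the expression that produced it.
State = List[Tuple[int, str]]


def solve_countdown(numbers: List[int], target: int) -> Optional[str]:
    """Return an expression using every number that evaluates to target, or None."""
    state: State = [(n, str(n)) for n in numbers]
    return _dfs(state, target)


def _dfs(state: State, target: int) -> Optional[str]:
    # Base case: all numbers have been combined into a single value.
    if len(state) == 1:
        value, expr = state[0]
        return expr if value == target else None

    # Pick an ordered pair of values to combine.
    for i in range(len(state)):
        for j in range(len(state)):
            if i == j:
                continue
            (a, ea), (b, eb) = state[i], state[j]
            rest = [state[k] for k in range(len(state)) if k not in (i, j)]

            candidates = [
                (a + b, f"({ea}+{eb})"),
                (a - b, f"({ea}-{eb})"),
                (a * b, f"({ea}*{eb})"),
            ]
            # Assumption: division is allowed only when it is exact.
            if b != 0 and a % b == 0:
                candidates.append((a // b, f"({ea}/{eb})"))

            for value, expr in candidates:
                found = _dfs(rest + [(value, expr)], target)
                if found is not None:
                    return found
                # Dead end: fall through, which backtracks to try the next
                # candidate operation or the next pair.
    return None


if __name__ == "__main__":
    print(solve_countdown([4, 6, 8, 3], 24))
```

In this sketch the decision to backtrack is hard-coded: the search abandons a branch only after exhausting it. The abstract's proposal differs in that the model itself learns when to step back.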

Published

2026-03-14

How to Cite

Yang, X.-W., Zhu, X.-Y., Zhang, D.-C., Wei, W.-D., Shao, J.-J., Zhou, Z., … Li, Y.-F. (2026). Step Back to Leap Forward: Self-Backtracking for Symbolic Reasoning and Planning in Language Models. Proceedings of the AAAI Conference on Artificial Intelligence, 40(33), 27657–27665. https://doi.org/10.1609/aaai.v40i33.39986

Section

AAAI Technical Track on Machine Learning X