[1] G. Hadar, F. Agostinelli, and S. S. Shperberg, “Beyond Single-Step Updates: Reinforcement Learning of Heuristics with Limited-Horizon Search,” AAAI, vol. 40, no. 43, pp. 36955–36963, Mar. 2026.