Proceedings of the International Conference on Automated Planning and Scheduling
https://ojs.aaai.org/index.php/ICAPS/issue/feed
Publications Department (publications21@aaai.org). Feed updated 2026-02-25.

The annual ICAPS conference series was formed in 2003 through the merger of two pre-existing biennial conferences, the International Conference on Artificial Intelligence Planning and Scheduling (AIPS) and the European Conference on Planning (ECP). ICAPS continues the traditional high standards of AIPS and ECP as an archival forum for new research in the field of automated planning and scheduling. The Proceedings of the International Conference on Automated Planning and Scheduling contains the annual, archival published work of the ICAPS conference.

https://ojs.aaai.org/index.php/ICAPS/article/view/36956
Frontmatter
Daniel Harabor, Nir Lipovetzky, Miquel Ramirez, Sebastian Sardina

This volume contains the papers accepted for presentation at ICAPS 2025, the Thirty-Fifth International Conference on Automated Planning and Scheduling, held in Melbourne, Australia, November 9-14, 2025. The annual ICAPS conference series was formed in 2003 through the merger of two pre-existing biennial conferences, the International Conference on Artificial Intelligence Planning and Scheduling (AIPS) and the European Conference on Planning (ECP). ICAPS continues the traditional high standards of AIPS and ECP as an archival forum for new research in the field of automated planning and scheduling.

ICAPS 2025 was co-located with two other events: the International Conference on the Integration of Constraint Programming, Artificial Intelligence, and Operations Research (CPAIOR), and the International Conference on the Principles of Knowledge Representation and Reasoning (KR).
Existing research into methods and representations for Automated Planning and Scheduling has drawn heavily from the research conducted by these communities. We believe that the co-location of these conferences with ICAPS can only boost these beneficial relationships.

The frontmatter contains a Preface and lists both the ICAPS 2025 Organising Committee and the ICAPS 2025 Program Committee.

Copyright (c) 2026 Proceedings of the International Conference on Automated Planning and Scheduling

https://ojs.aaai.org/index.php/ICAPS/article/view/36095
An Improved Lower Bound on the Length of Locally-Improving Policy Sequences in MDPs with Large Action Sets
Pratyush Agarwal, Mulinti Shaik Wajid, Shivaram Kalyanakrishnan

Popular algorithms to solve Markov Decision Problems (MDPs) include policy iteration and the Simplex method (executed on an induced linear program). Each run of these algorithms can be associated with a sequence of "locally-improving" policies for the input MDP. For integers n >= 2, k >= 2, let f(n, k) denote the length of the longest possible sequence of locally-improving policies for any MDP with n states and k actions per state. An alternative view of f(n, k) is as a descriptive structural property of the policy space of MDPs: it is the largest possible "c-height" in an induced "LP-digraph" of any n-state, k-action MDP. How large can f(n, k) be? A trivial upper bound on f(n, k) is the total number of (Markovian, deterministic) policies, which is k^n. A construction from Melekopoglou and Condon (1994) shows that f(n, 2) = 2^n, implying that the trivial upper bound is tight for k = 2. For k >= 3, the tightest lower bound on f(n, k) in the current literature is only Omega(k^{n/2}) (Ashutosh et al., 2020).
In this paper, we propose a family of MDPs to show a lower bound of Omega((floor(k/2))^n) on f(n, k), giving an exponential-in-n tightening for each k >= 6. Our investigation brings out technical challenges that do not arise for k = 2. Our result still leaves open the important question of whether f(n, k) is indeed k^n for n >= 2, k >= 2. We furnish an affirmative answer for the special case of n = 2, k >= 2.

Copyright (c) 2024 Association for the Advancement of Artificial Intelligence

https://ojs.aaai.org/index.php/ICAPS/article/view/36096
Is This Plan Necessarily Redundant? On the Computational Complexity of Unobserved Domain Learning
Pascal Bachor, P. Maurice Dekker, Gregor Behnke

Domain learning is the task of inferring actions' preconditions and effects (domains) from executed sequences of actions (plans), along with varying amounts of information about the corresponding world states. Remarkably, even if the state remains completely unobserved, as in this work, we can infer the existence of certain state features if we assume that the plans we learn from are non-redundant. Moreover, plans might be redundant regardless of the underlying domain. We study the computational complexity of deciding whether there exists a domain in which a given plan is justified, in the sense that either no single action (well-justification) or no set of actions (perfect justification) can be removed without violating correctness of the plan. We allow either arbitrarily large domains or domains with a polynomial bound on the number of state variables.
We show that the problem is in P for well-justified plans and arbitrary domains, NP-complete for well-justified plans and bounded domains, in coNP for perfectly justified plans and arbitrary domains, and in Σ₂ for perfectly justified plans and bounded domains.

https://ojs.aaai.org/index.php/ICAPS/article/view/36097
Initial Condition Retrieving for Hybrid and Numeric Planning Problems
Matteo Cardellini, Francesco Percassi, Marco Maratea, Mauro Vallati

Real-world applications of planning techniques often deal with dynamic and noisy environments, where sensor readings are inaccurate and the world's state can evolve in unexpected ways. This is particularly challenging for hybrid discrete-continuous planning approaches, where processes and events can be strongly affected by even slightly different initial conditions of the world, and planning tasks are notoriously difficult to cope with. In this paper, we introduce the Initial Condition Retrieving (ICR) problem to foster the use of hybrid planning in real-world applications. Given a knowledge model of a planning task and a trace, solving the ICR problem identifies the space of all initial conditions from which the provided plan is guaranteed to reach a goal state. We define three tasks: (i) retrieving any valid initial condition; (ii) fixing only some desired initial values and retrieving a complete initial condition that fills in the unassigned values; or (iii) retrieving the closest achievable initial condition to a fully specified one from which the goal cannot be reached. Experiments on well-known hybrid planning domains demonstrate the efficacy of our approach in solving such tasks.
Moreover, given that our approach can be applied to numeric planning without any change, we extend our analysis to numeric domains, where we obtain positive results.

https://ojs.aaai.org/index.php/ICAPS/article/view/36098
A Formalism for Optimal Search with Dynamic Heuristics
Remo Christen, Florian Pommerening, Clemens Büchner, Malte Helmert

While most heuristics studied in heuristic search depend only on the state, some accumulate information during search and thus also depend on the search history. Multiple existing approaches use such dynamic heuristics in A*-like algorithms and appeal to classic results for A* to show that they return optimal solutions. However, doing so disregards the intricacies of searching with a mutable heuristic. We treat dynamic heuristics formally and propose a framework that defines how the information dynamic heuristics rely on can be modified. We use these transformations in a generic search algorithm and in an instantiation that models A* with dynamic heuristics, allowing us to provide general conditions for optimality. We show that existing approaches fit our framework and apply our results to them. Doing the same for future applications of dynamic heuristics may simplify formal arguments for their optimality.

https://ojs.aaai.org/index.php/ICAPS/article/view/36099
Delete-Free Planning with Object Creation is Undecidable
Augusto B. Corrêa

In planning with object creation, actions might introduce new objects as part of their effect. While this makes the formalism more expressive, it also renders the plan existence problem undecidable.
A natural next step is to ask whether simpler fragments and relaxations remain undecidable when extended with object creation. Probably the most popular fragment is delete-free planning, where actions can only add but never delete atoms. In this work, we show that delete-free planning with object creation is still undecidable. We do so by reducing the problem of deciding whether a given atom is reached by the chase procedure to the plan existence problem. Our result implies that heuristics based on the delete relaxation may not be immediately useful in the object creation setting. We then highlight which restrictions can be applied to make delete-free planning with object creation practical.

https://ojs.aaai.org/index.php/ICAPS/article/view/36100
Hardness of Chosen Length Planning Games and Regular Fixed Methods FOND HTN Planning
P. Maurice Dekker, Gregor Behnke

We introduce a new version of general game-playing in which one of the players chooses the length of the game up front. Consider a classical planning problem and two players who take turns applying actions. Player 1 wins iff the goal is true after a predetermined number of moves has been made. Is there a number r such that player 1 has a winning strategy for the game of length r? We show that this problem is EXPSPACE-complete. Moreover, we show that the problem is equivalent to the plan existence problem for a class of fully observable non-deterministic hierarchical task network planning problems under the solution concept with fixed methods, which was introduced in prior work. This class consists of all regular loop-unrolling problems, where a problem is loop-unrolling if it has at most one compound task name and at most two methods.
As a corollary, we obtain hardness for regular problems, solving an open problem.

https://ojs.aaai.org/index.php/ICAPS/article/view/36101
Pseudo-Boolean Proof Logging for Optimal Classical Planning
Simon Dold, Malte Helmert, Jakob Nordström, Gabriele Röger, Tanja Schindler

We introduce lower-bound certificates for classical planning tasks, which can be used to prove the unsolvability of a task or the optimality of a plan in a way that can be verified by an independent third party. We describe a general framework for generating lower-bound certificates based on pseudo-Boolean constraints, which is agnostic to the planning algorithm used. As a case study, we show how to modify the A* algorithm to produce proofs of optimality with modest overhead, using pattern database heuristics and hmax as concrete examples. The same proof logging approach works for any heuristic whose inferences can be efficiently expressed as reasoning over pseudo-Boolean constraints.

https://ojs.aaai.org/index.php/ICAPS/article/view/36102
Tight Bounds for Lifted HTN Plan Verification and Bounded Plan Existence
Pascal Lauer, Songtuan Lin, Pascal Bercher

Plan verification is a canonical problem in any planning setting, needed to ensure correctness. It is closely linked to the bounded plan existence problem. We analyze the complexity of both problems on lifted representations for Hierarchical Task Network (HTN) planning. On top of the general analysis, we impose constraints on method orderings and on the number of tasks that methods decompose into.
This pinpoints subclasses with lower complexity. Our results confirm the existence of more efficient algorithms when operating on the lifted, rather than the grounded, representation.

https://ojs.aaai.org/index.php/ICAPS/article/view/36103
Continuing the Quest for Polynomial Time Heuristics in PDDL Input Size: Tractable Cases for Lifted hᵃᵈᵈ
Pascal Lauer, Álvaro Torralba, Daniel Höller, Jörg Hoffmann

Recent interest in solving planning tasks where full grounding is infeasible has highlighted the need to compute heuristics at the lifted level. We turn our attention to the evaluation of the hᵃᵈᵈ heuristic, an important cornerstone of many classical planning approaches, including the best-performing lifted planning approach. We show that hᵃᵈᵈ's grounded efficiency does not extend to lifted tasks, where its computation is EXPTIME-complete. This prompts us to identify tractability islands matching practical use cases. We identify two cases where a lifted computation is feasible while grounding may fail. The first restricts the task to acyclic action schemata and bounds predicate arity. For the second, we introduce a novel computation that operates without grounding; assuming the extraction encounters only acyclic conditions and the hᵃᵈᵈ values per subgoal are bounded, it remains tractable even with unbounded predicate and action arity. In an empirical evaluation of the new technique, we observe behavior complementary to the existing lifted forward hᵃᵈᵈ evaluation.
Combining both sets a new state of the art in pure-heuristic performance on the hard-to-ground benchmarks.

https://ojs.aaai.org/index.php/ICAPS/article/view/36104
Howard's Policy Iteration is Subexponential for Deterministic Markov Decision Problems with Rewards of Fixed Bit-size and Arbitrary Discount Factor
Dibyangshu Mukherjee, Shivaram Kalyanakrishnan

Howard's Policy Iteration (HPI) is a classic algorithm for solving Markov Decision Problems (MDPs). HPI uses a "greedy" switching rule to update from any non-optimal policy to a dominating one, iterating until an optimal policy is found. Despite its introduction over 60 years ago, the best-known upper bounds on HPI's running time remain exponential in the number of states, even on the restricted class of MDPs with only deterministic transitions (DMDPs). Meanwhile, the tightest lower bound for HPI on MDPs with a constant number of actions per state is only linear. In this paper, we report a significant improvement: a subexponential upper bound for HPI on DMDPs, which is parameterised by the bit-size of the rewards while being independent of the discount factor.
The same upper bound also applies to DMDPs with only two possible rewards (which may be of arbitrary size).

https://ojs.aaai.org/index.php/ICAPS/article/view/36105
Platform-Aware Mission Planning
Stefan Panjkovic, Alessandro Cimatti, Andrea Micheli, Stefano Tonetta

Planning for autonomous systems typically requires reasoning with models at different levels of abstraction, and the harmonization of two competing sets of objectives: high-level mission goals that refer to an interaction of the system with the external environment, and low-level platform constraints that aim to preserve the integrity and correct interaction of the subsystems. The complicated interplay between these two models makes it very hard to reason about the system as a whole, especially when the objective is to find plans with robustness guarantees, considering the non-deterministic behavior of the lower layers of the system. In this paper, we introduce the problem of Platform-Aware Mission Planning (PAMP), addressing it in the setting of temporal durative actions. The PAMP problem differs from standard temporal planning in its exists-forall nature: the high-level plan dealing with mission goals is required to satisfy safety and executability constraints for all possible non-deterministic executions of the low-level model of the platform and the environment. We propose two approaches for solving PAMP. The first, baseline approach amalgamates the mission and platform levels, while the second is based on an abstraction-refinement loop that leverages the combination of a planner and a verification engine.
We prove the soundness and completeness of the proposed approaches and validate them experimentally, demonstrating the importance of heterogeneous modeling and the superiority of the technique based on abstraction-refinement.

https://ojs.aaai.org/index.php/ICAPS/article/view/36106
On the Notion of Plan Quality for PDDL+
Francesco Percassi, Enrico Scala, Mauro Vallati

PDDL+ is a planning formalism designed to model mixed continuous-discrete problems. Despite its expressiveness, the absence of a well-established framework for evaluating plan quality makes it challenging to use PDDL+ in applications where plan shape and quality are crucial. This paper addresses this issue by introducing a comprehensive set of plan cost functions tailored for discrete-time PDDL+, along with a cost-preserving translation for generating cost-aware PDDL2.1 planning tasks. The plan cost functions provide a theoretical ground for assessing plan quality, whereas the translation shows their practicability by leveraging the connection between PDDL+ and PDDL2.1.

https://ojs.aaai.org/index.php/ICAPS/article/view/36107
How Good is Perfect? On the Incompleteness of A* for Total-Order HTN Planning
Mohammad Yousefi, Mario Schmautz, Patrik Haslum, Pascal Bercher

This paper reveals the inherent limitations of A* in HTN planning by identifying various cycle types induced by the task hierarchy and analyzing their effects on the termination of the algorithm.
We prove that A*, even with the perfect heuristic, and even for the special case of totally ordered problems, which are known to be decidable, is incomplete. An especially interesting result is that graph search (i.e., search with a visited list) using the null heuristic has better termination guarantees than tree search using the perfect heuristic. We provide a polynomial-time test for detecting those cycles that render A* incomplete, and analyze all existing benchmark domains from the most recent international planning competition. Results show that in more than half of all domains, A* tree search would be incomplete even with the perfect heuristic, and in roughly 40% of cases A* graph search might also be incomplete, depending on the provided heuristic function. We also point to a normal form that preserves semantics and guarantees completeness of the resulting models, though implementation and testing remain for future work.

https://ojs.aaai.org/index.php/ICAPS/article/view/36108
Going Topological in Multi-risk Extended Markov Ratio Decision Processes
Alexander Zadorojniy, Orit Davidovich, Takayuki Osogami

Incorporating risk into decision making is natural if one is to address safety concerns or operational limitations. In the context of risk-aware Markov Decision Processes (MDPs), one identifies a notion of risk which is uncertainty-driven (e.g., CVaR). Risk, however, may also be inherent to the MDP setup itself, i.e., to taking certain types of actions. In that case, we would consider a decision policy to be better if it either increases reward or reduces risk (or both). A simple mathematical formulation that expresses such a notion of improvement is the ratio of reward over risk. Though intuitive, this ratio is inherently non-linear, which introduces challenges for optimization.
We provide an algorithm that solves this non-linear problem in the context of multiple risk aspects, extending single-risk Extended Markov Ratio Decision Processes (EMRDPs). We show that the algorithm is strongly polynomial under a monotonicity assumption over actions, satisfied, for example, in financial market applications (e.g., the Quasi-Sharpe Ratio). We tackle the non-linearity by integrating Walkup-Wets' topological view of parametric LPs. This topological framework highlights the non-trivial move from a single risk aspect (EMRDP) to multiple risk aspects, once it is interpreted as moving from triangulations of 1-dimensional polyhedra to those of m-dimensional polyhedra, with all the topological (and combinatorial) complexities this entails.

https://ojs.aaai.org/index.php/ICAPS/article/view/36109
Parallelizing Multi-objective A* Search
Saman Ahmadi, Nathan R. Sturtevant, Andrea Raith, Daniel Harabor, Mahdi Jalili

The Multi-objective Shortest Path (MOSP) problem is a classic network optimization problem that aims to find all Pareto-optimal paths between two points in a graph with multiple edge costs. Recent studies on multi-objective search with A* (MOA*) have demonstrated superior performance in solving difficult MOSP instances. This paper presents a novel search framework that allows efficient parallelization of MOA* with different objective orders. The framework incorporates a unique upper-bounding strategy that helps the search reduce the problem's dimensionality to one in certain cases.
Experimental results demonstrate that the proposed framework can enhance the performance of recent A*-based solutions, with the speed-up proportional to the problem dimension.

https://ojs.aaai.org/index.php/ICAPS/article/view/36110
Cost-Optimal FOND Planning as Bi-Objective Best-First Search
Diego Aineto, Enrico Scala

In this paper, we tackle the problem of finding cost-optimal solutions to Fully-Observable Non-Deterministic (FOND) planning problems. First, we introduce metrics for FOND problems by interpreting solution policies under both their best and worst possible scenarios, leading to a bi-objective optimization problem. We then propose BOAND*, a novel heuristic search algorithm designed to seek Pareto-optimal solutions by navigating the space of possible policies. We conduct an empirical evaluation of the algorithm, alongside a qualitative comparison with cost-optimal solutions that consider only one objective at a time. Our findings validate this approach, paving the way for new methods of reasoning over FOND problems.

https://ojs.aaai.org/index.php/ICAPS/article/view/36111
A Sampling Approach to Planning with Infinite Domain Control Variables
Ángel Aso-Mollar, Diego Aineto, Enrico Scala, Eva Onaindia

Research in planning has sought to broaden the scope of planning problems by incorporating numeric parameters into action descriptions to condition both continuous and discrete change. Focusing on the latter, this work studies the problem of numeric planning with control variables, a reformulation of actions with infinite-domain parameters.
To tackle the challenge of handling an infinite decision space driven by control variables, we incorporate sampling into forward state-space search. The resulting search framework (1) partially expands nodes by sampling their successors, and (2) implements a re-expansion strategy that samples additional successors if a node shows promise in later evaluations. We develop this concept into a new algorithm called Sampling Greedy Best-First Search (S-GBFS). Our empirical evaluation of S-GBFS across various domains shows significant improvements over existing planning approaches.

https://ojs.aaai.org/index.php/ICAPS/article/view/36112
Learning Efficiency Meets Symmetry Breaking
Yingbin Bai, Sylvie Thiébaux, Felipe Trevizan

Learning-based planners leveraging Graph Neural Networks can learn search guidance applicable to large search spaces, yet their potential to address symmetries remains largely unexplored. In this paper, we introduce a graph representation of planning problems that allies learning efficiency with the ability to detect symmetries, along with two pruning methods, action pruning and state pruning, designed to manage symmetries during search.
The integration of these techniques into Fast Downward achieves, for the first time, success over LAMA on the latest IPC learning track dataset.

https://ojs.aaai.org/index.php/ICAPS/article/view/36113
Instance-based Approximation Guarantees for Graph-based Nearest Neighbor Search
Yannick Bosch, Sabine Storandt

Nearest Neighbor Search (NNS) in high-dimensional point sets is an important building block in many application areas, including pattern recognition, machine learning, planning, data mining, and computational geometry. Graph-based approaches that offer approximate NNS (ANNS) are ubiquitously used for these applications, and a variety of suitable graph structures have been proposed for this purpose. However, these approaches do not come with a priori approximation guarantees, often not even in low dimensions. Thus, there may be query points for which the distance to the returned ANN is significantly larger than the distance to the true NN. A common way to assess the quality of graph-based search and to compare different variants is the evaluation of query point samples. However, since the space of potential query points is infinite, it is likely that the samples will give biased results and that critical points will be missed. To systematically evaluate the ANNS quality of a given graph structure, we propose an algorithm that identifies the query point with the worst ratio r between the ANN distance and the true NN distance. This ratio provides a tight instance-based approximation guarantee. Our algorithm relies on a new geometric data structure called the search-path diagram.
In our experiments on established base graphs, we demonstrate that sampling-based evaluation heavily underestimates r, while our method provides a robust quality assessment.

https://ojs.aaai.org/index.php/ICAPS/article/view/36114
On Generating Robust Plans and Linear Execution Strategies in Planning Against Nature
Lukáš Chrpa, Erez Karpas

Planning against nature is a recent concept describing planning and acting in environments in which nature can non-deterministically trigger exogenous events, so the agent has to consider that the state of the environment might change without its consent. The agent therefore has to make sure that it eventually achieves its goal (if possible) despite the acts of nature. In this paper, we leverage the recent concept of robust plans, which assumes that nature might act as an adversary, to design a method for generating linear execution strategies, which assume that nature acts randomly but fairly. In particular, we consider events that have to eventually occur, and facts that, even if deleted by events, will eventually be re-achieved by (other) events (because nature acts fairly). To improve the efficiency of both the robust plan and linear execution strategy generation methods, we provide an approach that allows us to adopt the delete-relaxed heuristics used in classical planning.

https://ojs.aaai.org/index.php/ICAPS/article/view/36115
Alternation-Based Novelty Search
Augusto B. Corrêa, Jendrik Seipp

One key decision for heuristic search algorithms is how to balance exploration and exploitation.
In classical planning, the two strongest approaches to this problem are to alternate between different heuristics and to enhance heuristics with novelty measures. The best-known planner using alternation is LAMA, which cycles between open lists ordered by different heuristics. The strongest novelty-based algorithms use best-first width search (BFWS), which prefers states that contain previously unseen combinations of atoms. Considerable effort has been put into combining these two approaches, but so far, no combination has been able to significantly improve over the individual planners. In this paper, we explore the simple idea of using BFWS as just another open list for LAMA. Our results show that adding even the strongest BFWS version to LAMA is detrimental. However, combining only parts of each approach yields a new state-of-the-art agile planner.

https://ojs.aaai.org/index.php/ICAPS/article/view/36116
On Picking Good Policies: Leveraging Action-Policy Testing in Policy Training
Jan Eisenhut, Daniel Fišer, Isabel Valera, Jörg Hoffmann

Testing is a natural approach to assess the quality of learned action policies π. Prior work introduced policy testing in AI planning as searching for bugs in π, that is, states where π is sub-optimal with respect to a given testing objective. Beyond quality assurance, an obvious application of these methods is policy selection: given several π to choose from, we can use testing to select the "least buggy" one. Here, we integrate testing-based policy selection into the training process. This includes making more informed decisions when selecting the final policy after training, as well as choosing more promising intermediate policies during the training process.
Our experiments with ASNets action policies show that integrating testing allows us to more reliably obtain good-quality policies.2025-09-16T00:00:00+00:00Copyright (c) 2024 Association for the Advancement of Artificial Intelligencehttps://ojs.aaai.org/index.php/ICAPS/article/view/36117Learning Lifted STRIPS Models from Action Traces alone: A Simple, General, and Scalable Solution2025-09-16T15:21:04+00:00Jonas Gösgensjonas.goesgens@ml.rwth-aachen.deNiklas Jansenniklas.jansen1@rwth-aachen.deHector Geffner~Hector_Geffner2@example.comLearning STRIPS action models from action traces alone is a challenging problem as it involves learning the domain predicates as well. In this work, a novel approach is introduced which, like the well-known LOCM systems, is scalable, but like SAT approaches, is sound and complete. Furthermore, the approach is general and imposes no restrictions on the hidden domain or the number or arity of the predicates. The new learning method is based on an efficient, novel test that checks whether the assumption that a predicate is affected by a set of action patterns, namely, actions with specific argument positions, is consistent with the traces. The predicates and action patterns that pass the test provide the basis for the learned domain that is then easily completed with preconditions and static predicates. The new method is studied theoretically and experimentally. For the latter, the method is evaluated on traces and graphs obtained from standard classical domains like the 8-puzzle, which involve hundreds of thousands of states and transitions. The learned representations are then verified on larger instances.2025-09-16T00:00:00+00:00Copyright (c) 2024 Association for the Advancement of Artificial Intelligencehttps://ojs.aaai.org/index.php/ICAPS/article/view/36118Per-Domain Generalizing Policies: On Validation Instances and Scaling Behavior2025-09-16T15:21:05+00:00Timo P. Grostimo_philipp.gros@dfki.deNicola J. 
Müller~Nicola_J._Muller1@example.comDaniel Fišerdanfis@danfis.czIsabel Valera~Isabel_Valera1@example.comVerena Wolf~Verena_Wolf2@example.comJörg Hoffmann~Jorg_Hoffmann2@example.comRecent work has shown that successful per-domain generalizing action policies can be learned. Scaling behavior, from small training instances to large test instances, is the key objective; and the use of validation instances larger than training instances is one key to achieve it. Prior work has used fixed validation sets. Here, we introduce a method generating the validation set dynamically, on the fly, increasing instance size so long as informative and feasible. We also introduce refined methodology for evaluating scaling behavior, generating test instances systematically to guarantee a given confidence in coverage performance for each instance size. In experiments, dynamic validation improves scaling behavior of GNN policies in all 9 domains used.2025-09-16T00:00:00+00:00Copyright (c) 2024 Association for the Advancement of Artificial Intelligencehttps://ojs.aaai.org/index.php/ICAPS/article/view/36119Chasing Progress, Not Perfection: Revisiting Strategies for End-to-End LLM Plan Generation2025-09-16T15:21:06+00:00Sukai Huangsukaih@student.unimelb.edu.auTrevor Cohntrevorcohn@example.comNir Lipovetzky~Nir_Lipovetzky1@example.comThe capability of Large Language Models (LLMs) to plan remains a topic of debate. Some critics argue that strategies to boost LLMs' reasoning skills are ineffective in planning tasks, while others report strong outcomes merely from training models on a planning corpus. This paper revisits these claims by developing an end-to-end LLM-based planner and evaluating a range of reasoning-enhancement strategies --- including fine-tuning, Chain-of-Thought (CoT) prompting, and reinforcement learning (RL) --- across multiple dimensions of plan quality: validity, executability, goal satisfiability, and more. 
Our findings reveal that fine-tuning alone is insufficient, especially on out-of-distribution tasks. Strategies like CoT prompting primarily enhance local coherence, yielding higher executability rates --- a necessary prerequisite for validity --- but provide only incremental gains and struggle to ensure global plan validity. Notably, RL guided by a novel Longest Contiguous Common Subsequence reward significantly enhances both executability and validity, particularly on longer-horizon problems. Overall, our research addresses key misconceptions in the LLM-planning literature and underscores reward-driven RL optimization as a promising direction for advancing robust LLM-based planning by jointly improving executability and validity.2025-09-16T00:00:00+00:00Copyright (c) 2024 Association for the Advancement of Artificial Intelligencehttps://ojs.aaai.org/index.php/ICAPS/article/view/36120Safe Interval Randomized Path Planning For Manipulators2025-09-16T15:21:07+00:00Nuraddin Kerimovkerimov.nm@phystech.eduAleksandr Onegin~Aleksandr_Onegin1@example.comKonstantin Yakovlevyakovlev.ks@gmail.comPlanning safe paths in a 3D workspace for high-DoF robotic systems, such as manipulators, is a challenging problem, especially when the environment is populated with dynamic obstacles that need to be avoided. In this case the time dimension must be taken into account, which further increases the complexity of planning. To mitigate this issue we suggest combining safe-interval path planning (a prominent technique in heuristic search) with randomized planning, specifically, with bidirectional rapidly-exploring random trees (RRT-Connect) -- a fast and efficient algorithm for high-dimensional planning. Leveraging a dedicated technique for fast computation of the safe intervals, we end up with an efficient planner dubbed SI-RRT. 
We compare it with the state of the art and show that SI-RRT consistently outperforms the competitors both in runtime and solution cost.2025-09-16T00:00:00+00:00Copyright (c) 2024 Association for the Advancement of Artificial Intelligencehttps://ojs.aaai.org/index.php/ICAPS/article/view/36121Potential Heuristics: Weakening Consistency Constraints2025-09-16T15:21:09+00:00Pascal Lauerlauer@cs.uni-saarland.deDaniel Fišerdanfis@danfis.czIn classical planning, admissible potential heuristics are computed by solving linear programs (LPs) with constraints expressing consistency and goal-awareness of the heuristic. Potential heuristics can return negative estimates. So, given a potential heuristic h^P, the actual heuristic used in search is another heuristic defined as h^P_0+(s) = max(h^P(s),0) for every reachable state s. In this paper, we reformulate the LP constraints for consistency of h^P so that they ensure consistency of h^P_0+ instead. This leads to more informative heuristics with positive impact on the overall performance in exchange for a more time and memory demanding computation using mixed integer linear programs instead of LPs.2025-09-16T00:00:00+00:00Copyright (c) 2024 Association for the Advancement of Artificial Intelligencehttps://ojs.aaai.org/index.php/ICAPS/article/view/36122Strategies to Improve Goal Selection in Satisficing Oversubscription Planning2025-09-16T15:21:10+00:00Ángel García Olayaagolaya@inf.uc3m.esPatricia J. Riddle~Patricia_J._Riddle1@example.comMichael Barley~Mike_Barley1@example.comOversubscription planning (OSP) tackles the infeasibility of finding a plan that achieves all goals, due to a limited resource, typically a cost-bound. The objective is to discover a plan under this cost-bound that maximizes the utility. A leading-edge technique in satisficing OSP employs relaxed plans to estimate the cost of achieving goals and focuses planning efforts on goals deemed attainable based on this estimation. 
However, this approach faces two main challenges: the time required to calculate all estimations can result in no effective goal selection, and using relaxed plans often underestimates the real cost, leading to sets of oversubscribed goals. To address these challenges, our paper studies two solutions: computing the estimations only when needed and using real plans instead of relaxed plans to calculate cost estimations. Experiments show that real plans offer advantages in terms of initial plan utility, but the gap between both approaches narrows when more time is given for plan refinement.2025-09-16T00:00:00+00:00Copyright (c) 2024 Association for the Advancement of Artificial Intelligencehttps://ojs.aaai.org/index.php/ICAPS/article/view/36123Landmark Generation in HTN Planning Revisited2025-09-16T15:21:11+00:00Victor Scherer Putrichscherer.victor98@gmail.comFelipe Meneguzzi~Felipe_Meneguzzi1@example.comAndré Grahl Pereira~Andre_Grahl_Pereira1@example.comIn Hierarchical Task Network (HTN) planning, landmarks are facts that must hold true, and tasks or methods that must be included in every solution. Existing landmark generation techniques for HTN planning rely on the Delete and Ordering Free (DOF) relaxation and are known to be sound but incomplete, primarily due to the limitations introduced by Task Insertion. This paper presents a new landmark generation method that builds on a previous AND/OR graph-based approach, extending it to capture additional hierarchical dependencies among tasks and methods. We prove that our approach is sound and dominates existing techniques, though it remains incomplete under the DOF relaxation. 
Experimental results on IPC benchmarks for totally ordered problems show that our method identifies significantly more task and method landmarks across most domains, improving coverage with minimal computational overhead.2025-09-16T00:00:00+00:00Copyright (c) 2024 Association for the Advancement of Artificial Intelligencehttps://ojs.aaai.org/index.php/ICAPS/article/view/36124SibylSatOpt: a MaxSAT-based Greedy Optimal Search for TOHTN Planning2025-09-16T15:21:12+00:00Gaspard Quenardgaspard.quenard@univ-grenoble-alpes.frDamien Pellier~Damien_Pellier1@example.comHumbert Fiorino~Humbert_FIORINO1@example.comThis paper introduces SibylSatOpt, a novel approach to finding optimal plans for Totally-Ordered HTN (TOHTN) problems by leveraging greedy search techniques with MaxSAT. Unlike previous SAT-based HTN planners that employed a blind breadth-first search strategy, SibylSatOpt is guided by an admissible heuristic. This heuristic combines a relaxed MaxSAT encoding of the problem with the Task Decomposition Graph (TDG) heuristic. As we demonstrate, the admissibility of the heuristic guarantees that the found solution is optimal. Experimental results on IPC benchmarks show that SibylSatOpt significantly outperforms existing optimal TOHTN planners in both runtime and problem coverage.2025-09-16T00:00:00+00:00Copyright (c) 2024 Association for the Advancement of Artificial Intelligencehttps://ojs.aaai.org/index.php/ICAPS/article/view/36125On Using Lazy Greedy Best-First Search with Subgoaling Relaxation in Numeric Planning Problems2025-09-16T15:21:13+00:00Enrico Scalaenrico.scala@unibs.itLuigi Bonassiluigi.bonassi@unibs.itThis paper studies the use of lazy greedy best-first search for numeric planning problems in combination with relaxation-based heuristics, helpful actions, and up-to-jumping actions. 
In particular, the new search schema that we study, whilst postponing evaluation of the heuristic until expansion time, focuses the search on those states that are reached by helpful and up-to-jumping actions. In addition, we revisit linear abstractions by improving the balance between computation time and information, providing guidance in non-simple numeric planning problems, too. The new search schema compares favorably over the IPC-23 benchmarks with alternative complete heuristic search planners from the literature.2025-09-16T00:00:00+00:00Copyright (c) 2024 Association for the Advancement of Artificial Intelligencehttps://ojs.aaai.org/index.php/ICAPS/article/view/36126Automating the Generation of Prompts for LLM-based Action Choice in PDDL Planning2025-09-16T15:21:15+00:00Katharina Steinkstein@coli.uni-saarland.deDaniel Fišerdanfis@danfis.czJörg Hoffmann~Jorg_Hoffmann2@example.comAlexander Koller~Alexander_Koller2@example.comLarge language models (LLMs) have revolutionized a large variety of NLP tasks. An active debate is to what extent they can do reasoning and planning. Prior work has assessed the latter in the specific context of PDDL planning, based on manually converting three PDDL domains into natural language (NL) prompts. Here we automate this conversion step, showing how to leverage an LLM to automatically generate NL prompts from PDDL input. Our automatically generated NL prompts result in similar LLM-planning performance as the previous manually generated ones. Beyond this, the automation enables us to run much larger experiments, providing for the first time a broad evaluation of LLM planning performance in PDDL. Our NL prompts yield better performance than PDDL prompts and simple template-based NL prompts. 
Compared to symbolic planners, LLM planning lags far behind; but in some domains, our best LLM configuration scales up further than A* using LM-cut.2025-09-16T00:00:00+00:00Copyright (c) 2024 Association for the Advancement of Artificial Intelligencehttps://ojs.aaai.org/index.php/ICAPS/article/view/36127A* for Bounding Shortest Paths in the Graphs of Convex Sets2025-09-16T15:21:16+00:00Kaarthik Sundar~Kaarthik_Sundar1@example.comSivakumar Rathinamsrathinam@tamu.eduWe present a novel algorithm that fuses the existing convex-programming-based approach with heuristic information to find optimality guarantees and near-optimal paths for the Shortest Path Problem in the Graph of Convex Sets (SPP-GCS). Our method, inspired by A*, initiates a best-first-like procedure from a designated subset of vertices and iteratively expands it until further growth is neither possible nor beneficial. Traditionally, obtaining solutions with bounds for an optimization problem involves solving a relaxation, modifying the relaxed solution to a feasible one, and then comparing the two solutions to establish bounds. However, for SPP-GCS, we demonstrate that reversing this process can be more advantageous, especially with Euclidean travel costs. In other words, we initially employ A* to find a feasible solution for SPP-GCS, then solve a convex relaxation restricted to the vertices explored by A* to obtain a relaxed solution, and finally, compare the solutions to derive bounds. 
We present numerical results to highlight the advantages of our algorithm over the existing approach in terms of the sizes of the convex programs solved and computation time.2025-09-16T00:00:00+00:00Copyright (c) 2024 Association for the Advancement of Artificial Intelligencehttps://ojs.aaai.org/index.php/ICAPS/article/view/36128Leveraging Action Relational Structures for Integrated Learning and Planning2025-09-16T15:21:17+00:00Ryan Xiao Wang~Ryan_Xiao_Wang1@example.comFelipe Trevizanfelipe.trevizan@gmail.comRecent advances in planning have explored using learning methods to help planning. However, little attention has been given to adapting search algorithms to work better with learning systems. In this paper, we introduce partial-space search, a new search space for classical planning that leverages the relational structure of actions given by PDDL action schemas -- a structure overlooked by traditional planning approaches. This method allows for a more focused and efficient search and is better suited for machine learning heuristics by providing a more granular view of the search space. To guide partial-space search, we introduce action set heuristics that evaluate sets of actions in a state. We describe how to automatically convert existing heuristics into action set heuristics. We also train action set heuristics from scratch using large training datasets from partial-space search. Our new planner, LazyLifted, exploits our better integrated search and learning heuristics and outperforms the state-of-the-art ML-based heuristic on IPC 2023 learning track (LT) benchmarks. 
We also show the efficiency of LazyLifted on high branching factor tasks and show that it surpasses LAMA in the combined IPC 2023 LT and high branching factor benchmarks.2025-09-16T00:00:00+00:00Copyright (c) 2024 Association for the Advancement of Artificial Intelligencehttps://ojs.aaai.org/index.php/ICAPS/article/view/36129Partially Observable Monte-Carlo Graph Search2025-09-16T15:21:18+00:00Yang Youyang.you@ukaea.ukVincent Thomas~Vincent_Thomas2@example.comAlex Schutz~Alex_Schutz1@example.comRobert Skilton~Robert_Skilton1@example.comNick Hawes~Nick_Hawes1@example.comOlivier Buffet~Olivier_Buffet1@example.comCurrently, large partially observable Markov decision processes (POMDPs) are often solved by sampling-based online methods which interleave planning and execution phases. However, a pre-computed offline policy is more desirable in POMDP applications with time or energy constraints. But previous offline algorithms are not able to scale up to large POMDPs. In this article, we propose a new sampling-based algorithm, the partially observable Monte-Carlo graph search (POMCGS) to solve large POMDPs offline. Different from many online POMDP methods, which progressively develop a tree while performing (Monte-Carlo) simulations, POMCGS folds this search tree on the fly to construct a policy graph, so that computations can be drastically reduced, and users can analyze and validate the policy prior to embedding and executing it. Moreover, POMCGS, together with action progressive widening and observation clustering methods provided in this article, is able to address certain continuous POMDPs. 
Through experiments, we demonstrate that POMCGS can generate policies on the most challenging POMDPs, which cannot be computed by previous offline algorithms, and these policies' values are competitive compared with the state-of-the-art online POMDP algorithms.2025-09-16T00:00:00+00:00Copyright (c) 2024 Association for the Advancement of Artificial Intelligencehttps://ojs.aaai.org/index.php/ICAPS/article/view/36130DynTaskMAS: A Dynamic Task Graph-driven Framework for Asynchronous and Parallel LLM-based Multi-Agent Systems2025-09-16T15:21:20+00:00Junwei Yuyujunweiy@yeah.netYepeng Dingyepengd@acm.orgHiroyuki Sato~Hiroyuki_Sato2@example.comThe emergence of Large Language Models (LLMs) in Multi-Agent Systems (MAS) has opened new possibilities for artificial intelligence, yet current implementations face significant challenges in resource management, task coordination, and system efficiency. While existing frameworks demonstrate the potential of LLM-based agents in collaborative problem-solving, they often lack sophisticated mechanisms for parallel execution and dynamic task management. This paper introduces DynTaskMAS, a novel framework that orchestrates asynchronous and parallel operations in LLM-based MAS through dynamic task graphs. The framework features four key innovations: (1) a Dynamic Task Graph Generator that intelligently decomposes complex tasks while maintaining logical dependencies, (2) an Asynchronous Parallel Execution Engine that optimizes resource utilization through efficient task scheduling, (3) a Semantic-Aware Context Management System that enables efficient information sharing among agents, and (4) an Adaptive Workflow Manager that dynamically optimizes system performance. 
Experimental evaluations demonstrate that DynTaskMAS achieves significant improvements over traditional approaches: a 21-33% reduction in execution time across task complexities (with higher gains for more complex tasks), a 35.4% improvement in resource utilization (from 65% to 88%), and near-linear throughput scaling up to 16 concurrent agents (3.47× improvement for 4× agents). Our framework establishes a foundation for building scalable, high-performance LLM-based multi-agent systems capable of handling complex, dynamic tasks efficiently.2025-09-16T00:00:00+00:00Copyright (c) 2024 Association for the Advancement of Artificial Intelligencehttps://ojs.aaai.org/index.php/ICAPS/article/view/36131HTN Plan Repair Algorithms Compared: Strengths and Weaknesses of Different Methods2025-09-16T15:21:22+00:00Paul Zaidinspzaidins@umd.eduRobert P. Goldman~Robert_P._Goldman1@example.comUgur Kuter~Ugur_Kuter1@example.comDana Nau~Dana_S._Nau1@example.comMark Roberts~Mark_Roberts1@example.comThis paper provides theoretical and empirical comparisons of three recent hierarchical plan repair algorithms: SHOPFIXER, IPYHOPPER, and REWRITE. Our theoretical results show that the three algorithms correspond to three different definitions of the plan repair problem, leading to differences in the algorithms’ search spaces, the repair problems they can solve, and the kinds of repairs they can make. Understanding these distinctions is important when choosing a repair method for any given application. Building on the theoretical results, we evaluate the algorithms empirically in a series of benchmark planning problems. 
Our empirical results provide more detailed insight into the runtime repair performance of these systems and the coverage of the repair problems solved, based on algorithmic properties such as replanning, chronological backtracking, and backjumping over plan trees.2025-09-16T00:00:00+00:00Copyright (c) 2024 Association for the Advancement of Artificial Intelligencehttps://ojs.aaai.org/index.php/ICAPS/article/view/36132Observation Adaptation via Annealed Importance Resampling for Partially Observable Markov Decision Processes2025-09-16T15:21:23+00:00Yunuo Zhangyunuo.zhang@vanderbilt.eduBaiting Luo~Baiting_Luo1@example.comAyan Mukhopadhyay~Ayan_Mukhopadhyay1@example.comAbhishek Dubey~Abhishek_Dubey1@example.comPartially observable Markov decision processes (POMDPs) are a general mathematical model for sequential decision-making in stochastic environments under state uncertainty. POMDPs are often solved online, which enables the algorithm to adapt to new information in real time. Online solvers typically use bootstrap particle filters based on importance resampling for updating the belief distribution. Since directly sampling from the ideal state distribution given the latest observation and previous state is infeasible, particle filters approximate the posterior belief distribution by propagating states and adjusting weights through prediction and resampling steps. However, in practice, the importance resampling technique often leads to particle degeneracy and sample impoverishment when the state transition model poorly aligns with the posterior belief distribution, especially when the received observation is noisy. We propose an approach that constructs a sequence of bridge distributions between the state-transition and optimal distributions through iterative Monte Carlo steps, better accommodating noisy observations in online POMDP solvers. 
Our algorithm demonstrates significantly superior performance compared to state-of-the-art methods when evaluated across multiple challenging POMDP domains.2025-09-16T00:00:00+00:00Copyright (c) 2024 Association for the Advancement of Artificial Intelligencehttps://ojs.aaai.org/index.php/ICAPS/article/view/36133Rack Position Optimization in Large-Scale Heterogeneous Data Centers2025-09-16T15:21:25+00:00Chang-Lin Chenchen3365@purdue.eduJiayu Chen~Jiayu_Chen2@example.comTian Lan~Tian_Lan4@example.comZhaoxia Zhao~Zhaoxia_Zhao1@example.comHongbo Donghongbodong@meta.comVaneet Aggarwalvaneet@purdue.eduAs rapidly growing AI computational demands accelerate the need for new hardware installation and maintenance, this work explores optimal data center resource management by balancing operational efficiency with fault tolerance through strategic rack positioning considering diverse resources and locations. Traditional mixed-integer programming (MIP) approaches often struggle with scalability, while heuristic methods may result in significant sub-optimality. To address these issues, this paper presents a novel two-tier optimization framework using a high-level deep reinforcement learning (DRL) model to guide a low-level gradient-based heuristic for local search. The high-level DRL agent employs Leader Reward for optimal rack type ordering, and the low-level heuristic efficiently maps racks to positions, minimizing movement counts and ensuring fault-tolerant resource distribution. This approach allows scalability to over 100,000 positions and 100 rack types. Our method outperformed the gradient-based heuristic by 7% on average and the MIP solver by over 30% in objective value. It achieved a 100% success rate versus MIP's 97.5% (within a 20-minute limit), completing in just 2 minutes compared to MIP's 1630 minutes (i.e., almost 4 orders of magnitude improvement). 
Unlike the MIP solver, which showed performance variability under time constraints and high penalties, our algorithm consistently delivered stable, efficient results—an essential feature for large-scale data center management.2025-09-16T00:00:00+00:00Copyright (c) 2024 Association for the Advancement of Artificial Intelligencehttps://ojs.aaai.org/index.php/ICAPS/article/view/36134New Exact Methods for Solving Quadratic Traveling Salesman Problem2025-09-16T15:21:27+00:00Yuxiao Chen~Yuxiao_Chen7@example.comAnubhav Singhanubhav.singh@utoronto.caRyo Kuroiwa~Ryo_Kuroiwa1@example.comJ. Christopher Beck~Chris_Beck1@example.comThe Quadratic Traveling Salesman Problem (QTSP) is a generalization of the Traveling Salesman Problem (TSP) with important applications in robotics and bioinformatics. The QTSP objective value depends on pairs of consecutive edges in the tour; hence, it is quadratic and generally hard to optimize. While various exact-solving approaches have been explored, many rely on specialized procedures and struggle to scale on large instances. More recently, carefully crafted metaheuristics have demonstrated better primal bounds and scalability, but they cannot provide any guarantees of solution quality nor prove the optimality of any solution. In this work, we propose new exact models for QTSP. We define direct encodings of QTSP in domain-independent dynamic programming (DIDP), constraint programming (CP), mixed integer quadratic programming (MIQP), and mixed integer linear programming (MILP), and compare them with the best-known exact method, a branch and cut (B&C) algorithm, and the state-of-the-art metaheuristic, a hybrid genetic algorithm (HGA). Our experimental results demonstrate that the DIDP model shows better scalability and finds the best feasible solutions on average among all exact solvers, including the B&C algorithm. HGA finds the best feasible solution among all approaches, with DIDP within 15% of the HGA cost on all experimented instances. 
Also, interestingly, our MILP model with the subtour elimination constraints generally finds better feasible solutions than the B&C algorithm while matching it in proving optimality, suggesting that lazily adding subtour elimination cuts is not particularly helpful in QTSP.2025-09-16T00:00:00+00:00Copyright (c) 2024 Association for the Advancement of Artificial Intelligencehttps://ojs.aaai.org/index.php/ICAPS/article/view/36135LTLf Adaptive Synthesis for Multi-Tier Goals in Nondeterministic Domains2025-09-16T15:21:28+00:00Giuseppe de Giacomo~Giuseppe_De_Giacomo1@example.comGianmarco Parrettiparretti@diag.uniroma1.itShufang Zhu~Shufang_Zhu2@example.comWe study a variant of LTLf synthesis that synthesizes adaptive strategies for achieving a multi-tier goal, consisting of multiple increasingly challenging LTLf objectives in nondeterministic planning domains. Adaptive strategies are strategies that at any point of their execution (i) enforce the satisfaction of as many objectives as possible in the multi-tier goal, and (ii) exploit possible cooperation from the environment to satisfy as many as possible of the remaining ones. This happens dynamically: if the environment cooperates (ii) and an objective becomes enforceable (i), then our strategies will enforce it. We provide a game-theoretic technique to compute adaptive strategies that is sound and complete. Notably, our technique is polynomial, in fact quadratic, in the number of objectives. 
In other words, it handles multi-tier goals with only a minor overhead compared to standard LTLf synthesis.2025-09-16T00:00:00+00:00Copyright (c) 2024 Association for the Advancement of Artificial Intelligencehttps://ojs.aaai.org/index.php/ICAPS/article/view/36136On the Gains from Using Action Observations in Domain Repair2025-09-16T15:21:29+00:00Alba Grageraagragera@pa.uc3m.esRaquel Fuentetaja~Raquel_Fuentetaja1@example.comÁngel García Olayaagolaya@inf.uc3m.esFernando Fernández~Fernando_Fernandez2@example.comDesigning a PDDL planning domain is an error-prone task, which can result in unsolvable planning tasks or unexpected plans. Existing domain repair methods either rely on a complete plan to identify unsatisfied preconditions or operate without any input plan by compiling the flawed planning task into a new planning task with self-repair actions. In contrast, learning approaches often benefit from a range of input observations to infer domain models. In this paper, we extend the self-repair compilation to also accept as input a variable number of action observations. Experimental results show improved domain repair quality and generally strong performance compared to previous domain repair and learning methods.2025-09-16T00:00:00+00:00Copyright (c) 2024 Association for the Advancement of Artificial Intelligencehttps://ojs.aaai.org/index.php/ICAPS/article/view/36137A Flow Based Planning Method for Multi-Agent Progression with Deployable Agents and Communication Constraints2025-09-16T15:21:30+00:00Emile Sibouletemile.siboulet@laas.frRoland Godet~Roland_Godet1@example.comArthur Bit-Monnot~Arthur_Bit-Monnot2@example.comMarc-Emmanuel Coupvent Des Graviers~Marc-Emmanuel_Coupvent_des_Graviers1@example.comChristophe Guettier~Christophe_GUETTIER1@example.comSimon Lacroix~Simon_Lacroix1@example.comThis paper deals with the problem of planning multiple agent movements through a mission area modeled as a graph. 
The agents are subject to standard communication and temporal constraints, and the quantitative objective is the minimization of the team’s traversal makespan. Additional specificities make the problem a particularly complex routing one: some nodes have associated durative, coordinated actions to perform, which can involve either the co-presence of several agents or time dependencies. Also, some agents are deployable and able to move on denser graphs: namely, aerial robots can take off from and land on the ground vehicle at any planned position, and can fly above ground obstacles. We model the problem as a CSP and solve it with a network flow model. Results show the efficacy of the model and resolution scheme, which finds solutions one to two orders of magnitude faster than a numerical temporal hierarchical planning model, with only a few percent loss of optimality.2025-09-16T00:00:00+00:00Copyright (c) 2024 Association for the Advancement of Artificial Intelligencehttps://ojs.aaai.org/index.php/ICAPS/article/view/36138Agent Planning Programs as Non-deterministic Planning under Fairness2025-09-16T15:21:32+00:00Nitin Yadavnitin.yadav@unimelb.edu.auSebastian Sardiña~Sebastian_Sardina1@example.comHector Geffner~Hector_Geffner2@example.comWe propose an approach for solving Agent Planning Programs (APP) based on a reduction to (strong-cyclic) Fully Observable Non-Deterministic (FOND) planning. APPs represent a middle-ground between automated planning and agent-oriented programming, in which the space of possible agent behavior is "programmed" as a network of declarative goals wrt an underlying planning domain. Each transition in an APP represents a local planning problem that may need to be addressed by the agent executing the APP. APPs allow the specification of continuous goal-driven behavior in which the "next" goal is externally chosen, thus going beyond one-shot planning. 
Two methods have been proposed for solving APPs: a principled but inefficient LTL reactive synthesis technique, and a more efficient but arguably ad hoc approach that relies on multiple "local" classical planning calls and meta-level backtracking. We demonstrate how APPs can be solved in a principled manner by developing an elegant reduction to non-deterministic planning under fairness assumptions, and we show its practical value experimentally with existing FOND solvers. We also provide a new solution concept that is simpler and closer to mainstream planning than the existing one.2025-09-16T00:00:00+00:00Copyright (c) 2024 Association for the Advancement of Artificial Intelligencehttps://ojs.aaai.org/index.php/ICAPS/article/view/36139Quality Diversity for Variational Quantum Circuit Optimization2025-09-16T15:21:33+00:00Maximilian Zornmaximilian.zorn@ifi.lmu.deJonas Stein~Jonas_Stein1@example.comMaximilian Balthasar Mansky~Maximilian_Balthasar_Mansky1@example.comPhilipp Altmann~Philipp_Altmann1@example.comMichael Kölle~Michael_Kolle1@example.comClaudia Linnhoff-Popien~Claudia_Linnhoff-Popien1@example.comOptimizing the architecture of variational quantum circuits (VQCs) is crucial for advancing quantum computing (QC) towards practical applications. Current methods range from static ansatz design and evolutionary methods to machine-learned VQC optimization, but are either slow, sample-inefficient, or require infeasible circuit depths to realize advantages. Quality diversity (QD) search methods combine diversity-driven optimization with user-specified features that offer insight into the optimization quality of circuit solution candidates. However, the choice of quality measures, and the representational modeling of the circuits that allows for optimization with current state-of-the-art QD methods such as covariance matrix adaptation (CMA), remains an open problem.
In this work, we introduce a direct matrix-based circuit encoding that can be readily optimized with QD-CMA methods, and we evaluate heuristic circuit quality properties such as expressivity and gate diversity as quality measures. We empirically show that our QD optimization achieves superior speed and solution scores against a set of robust benchmark algorithms from the literature on a selection of NP-hard combinatorial optimization problems.2025-09-16T00:00:00+00:00Copyright (c) 2024 Association for the Advancement of Artificial Intelligencehttps://ojs.aaai.org/index.php/ICAPS/article/view/36140On Planning Through LLMs2025-09-16T15:21:35+00:00Mattia Chiarimattia.chiari@unibs.itLuca Putelli~Luca_Putelli1@example.comNicholas Rossetti~Nicholas_Rossetti1@example.comIvan Serina~Ivan_Serina2@example.comAlfonso Emilio Gerevini~Alfonso_Gerevini1@example.comIn recent years, various studies have been carried out to assess whether Large Language Models (LLMs) possess different reasoning capabilities, including those required in automated planning. Typically, these studies provide the LLM with a planning domain and a problem, specified by an initial state and a goal, and require the model to generate a plan solving the problem. Despite this common configuration, such studies differ significantly in the models used, the information provided to the model, the possible involvement of symbolic planners, and the experimental approaches used for the evaluation. Motivated by the growing interest in LLMs and in the understanding of their reasoning abilities, in this work we offer a concise review of recent studies on using LLMs for planning. We outline the main research trends and discuss their most notable findings.
Furthermore, we identify key challenges and highlight critical aspects to consider when evaluating an LLM in terms of learning to plan and generating solution plans.2025-09-16T00:00:00+00:00Copyright (c) 2024 Association for the Advancement of Artificial Intelligencehttps://ojs.aaai.org/index.php/ICAPS/article/view/36141Revisiting LLMs in Planning from Literature Review: a Semi-Automated Analysis Approach and Evolving Categories Representing Shifting Perspectives2025-09-16T15:21:37+00:00Vishal Pallaganivishalp@mailbox.sc.eduNitin Gupta~Nitin_Gupta3@example.comBharath Chandra Muppasani~Bharath_Chandra_Muppasani1@example.comBiplav Srivastava~Biplav_Srivastava2@example.comTracking the rapidly evolving literature at the intersection of large language models (LLMs) and planning has become increasingly complex due to significant growth in research output and shifting thematic focuses. Building on an earlier survey, which organized 126 papers collected until November 2023 into eight categories, we present a platform that automates the extraction, categorization, and trend analysis of new papers. Our analysis reports on category drift, identifying evolving perspectives on the use of LLMs for planning; it reveals a decline in the percentage of papers for six categories, an increase in two, and the emergence of two new categories. Specifically, we contribute by (1) developing an automated system for categorizing new papers into existing or emergent categories, (2) reporting on category shifts with the addition of 47 new papers until September 2024, and (3) introducing a platform for continuous extraction, categorization, and trend tracking in LLM and planning research.
This platform also features a leaderboard to encourage innovations in automated paper categorization.2025-09-16T00:00:00+00:00Copyright (c) 2024 Association for the Advancement of Artificial Intelligencehttps://ojs.aaai.org/index.php/ICAPS/article/view/36142Knowledge Engineering for Planning and Scheduling in the LLM Era2025-09-16T15:21:39+00:00Mauro Vallatim.vallati@hud.ac.ukRoman Barták~Roman_Bartak1@example.comLukáš Chrpachrpaluk@cvut.czThomas L. McCluskey~Thomas_Leo_McCluskey1@example.comRonald P. A. Petrick~Ron_Petrick1@example.comAutomated planning requires explicit domain knowledge, typically represented in PDDL, to generate effective solutions. The process of formulating, maintaining, and validating this knowledge is the cornerstone of Knowledge Engineering for Planning and Scheduling (KEPS). Although Large Language Models (LLMs) have shown promise for automated planning tasks, and are gaining popularity in the field, their impact on KEPS remains unexplored. In this paper, we investigate the potential of LLMs to streamline and enhance the KEPS field by taking a close look at the processes used to develop explicit symbolic knowledge models in safety-related applications. The paper finds that while LLMs can assist in knowledge acquisition and formulation, human domain expertise and external symbolic validators remain indispensable for ensuring the correctness, operationality, and completeness of planning applications.2025-09-16T00:00:00+00:00Copyright (c) 2024 Association for the Advancement of Artificial Intelligencehttps://ojs.aaai.org/index.php/ICAPS/article/view/36143HDDLGym: A Tool for Studying Multi-Agent Hierarchical Problems Defined in HDDL with OpenAI Gym2025-09-16T15:21:40+00:00Ngoc Lantmla@mit.eduRuaridh Mon-Williams~Ruaridh_Mon-Williams1@example.comJulie A.
Shah~Julie_Shah2@example.comIn recent years, reinforcement learning (RL) methods have been widely tested using tools like OpenAI Gym, though many tasks in these environments could also benefit from hierarchical planning. However, no existing tool enables seamless integration of hierarchical planning with RL. The Hierarchical Domain Definition Language (HDDL), used in classical planning, offers a structured approach well suited for model-based RL. To bridge this gap, we introduce HDDLGym, a Python-based tool that automatically generates OpenAI Gym environments from HDDL domains and problems. HDDLGym serves as a link between RL and hierarchical planning, supporting multi-agent scenarios and enabling collaborative planning among agents. This paper provides an overview of HDDLGym’s design and implementation, highlighting the challenges and design choices involved in integrating HDDL with the Gym interface and in applying RL policies to support hierarchical planning. We also provide detailed instructions and demonstrations for using the HDDLGym framework, including how to work with existing HDDL domains and problems from the International Planning Competitions, exemplified by the Transport domain. Additionally, we offer guidance on creating new HDDL domains for multi-agent scenarios and demonstrate the practical use of HDDLGym in the Overcooked domain.
By leveraging the advantages of HDDL and Gym, HDDLGym aims to be a valuable tool for studying RL in hierarchical planning, particularly in multi-agent contexts.2025-09-16T00:00:00+00:00Copyright (c) 2024 Association for the Advancement of Artificial Intelligencehttps://ojs.aaai.org/index.php/ICAPS/article/view/36144Analyzing Launch Operations Using the Spaceport Throughput Analysis Resource (STAR)2025-09-16T15:21:42+00:00Richard Levinsonrichard.j.levinson@nasa.govVijayakumar Baskaranvijayakumar.baskaran-1@nasa.govJeffrey Brinkjeffrey.s.brink@nasa.govJeremy Frankjeremy.d.frank@nasa.govWe describe the development of the Spaceport Throughput Analysis Resource (STAR), which evaluates Kennedy Space Center spaceport launch throughput. STAR integrates simulation and limited rescheduling, using a constraint programming model to check constraints and reschedule events. The outputs of STAR are launch delays, and the constraint violations leading to those delays, which can be used to inform infrastructure investments that reduce future delays. At STAR's core is a constraint program representing a limited-horizon scheduling problem, used to identify resource constraint violations and revise schedules in the presence of unexpected events that disrupt the launch schedule. We describe the design and implementation of STAR using illustrative examples, and report performance results on use cases showing how increased launch rates stress KSC infrastructure.2025-09-16T00:00:00+00:00Copyright (c) 2024 Association for the Advancement of Artificial Intelligencehttps://ojs.aaai.org/index.php/ICAPS/article/view/36145Posthoc: The Visualisation Platform for Search2025-09-16T15:21:43+00:00Kevin Zhengkevin.zheng@monash.eduDaniel Harabor~Daniel_Harabor1@example.comMichael Wybrow~Michael_Wybrow1@example.comSearch, especially pathfinding search, is a foundational problem-solving technique in Computer Science for sequential decision-making problems.
Such algorithms appear widely in the academic literature and have found broad applicability in personal navigation, robotics, and computer games. Despite their importance, search algorithms can be challenging for practitioners to implement and difficult for learners to understand. In this work, we present POSTHOC, a visualisation and debugging tool that aims to improve this situation. Our approach relies on search traces: textual records of key operations that occur during the search process, e.g., node expansion, successor generation, and other events of interest. We employ search traces to visualise the decision-making process and to construct domain-specific representations for each event. We show how these traces can be used, in a variety of contexts, to inspect, debug, and better understand search algorithms. Finally, we demonstrate POSTHOC in a range of real-world case studies.2025-09-16T00:00:00+00:00Copyright (c) 2024 Association for the Advancement of Artificial Intelligence