Proceedings of the International Conference on Automated Planning and Scheduling

Proceedings of the International Conference on Automated Planning and Scheduling https://ojs.aaai.org/index.php/ICAPS The annual ICAPS conference series was formed in 2003 through the merger of two preexisting biennial conferences, the International Conference on Artificial Intelligence Planning and Scheduling (AIPS) and the European Conference on Planning (ECP). ICAPS continues the traditional high standards of AIPS and ECP as an archival forum for new research in the field of automated planning and scheduling. The Proceedings of the International Conference on Automated Planning and Scheduling contains the annual, archival published work of the ICAPS conference. Association for the Advancement of Artificial Intelligence en-US Proceedings of the International Conference on Automated Planning and Scheduling 2334-0835 Specifying Goals to Deep Neural Networks with Answer Set Programming https://ojs.aaai.org/index.php/ICAPS/article/view/31454 Recently, methods such as DeepCubeA have used deep reinforcement learning to learn domain-specific heuristic functions in a largely domain-independent fashion. However, such methods either assume a predetermined goal or assume that goals will be given as fully-specified states. Therefore, specifying a set of goal states to these learned heuristic functions is often impractical. To address this issue, we introduce a method of training a heuristic function that estimates the distance between a given state and a set of goal states represented as a set of ground atoms in first-order logic. Furthermore, to allow for more expressive goal specification, we introduce techniques for specifying goals as answer set programs and using answer set solvers to discover sets of ground atoms that meet the specified goals. In our experiments with the Rubik's cube, sliding tile puzzles, and Sokoban, we show that we can specify and reach different goals without any need to re-train the heuristic function.
Our code is publicly available at https://github.com/forestagostinelli/SpecGoal. Forest Agostinelli Rojina Panta Vedant Khandelwal Copyright (c) 2024 Association for the Advancement of Artificial Intelligence 2024-05-30 2024-05-30 34 2 10 10.1609/icaps.v34i1.31454 Exact Multi-objective Path Finding with Negative Weights https://ojs.aaai.org/index.php/ICAPS/article/view/31455 The point-to-point Multi-objective Shortest Path (MOSP) problem is a classic yet challenging task that involves finding all Pareto-optimal paths between two points in a graph with multiple edge costs. Recent studies have shown that employing A* search can lead to state-of-the-art performance in solving MOSP instances with non-negative costs. This paper proposes a novel A*-based multi-objective search framework that not only handles graphs with negative costs and even negative cycles but also incorporates multiple speed-up techniques to enhance the efficiency of exhaustive search with A*. Through extensive experiments, our algorithm demonstrates remarkable success in solving difficult MOSP instances, outperforming leading solutions by several factors. Saman Ahmadi Nathan R. Sturtevant Daniel Harabor Mahdi Jalili Copyright (c) 2024 Association for the Advancement of Artificial Intelligence 2024-05-30 2024-05-30 34 11 19 10.1609/icaps.v34i1.31455 On the Computational Complexity of Stackelberg Planning and Meta-Operator Verification https://ojs.aaai.org/index.php/ICAPS/article/view/31456 Stackelberg planning is a recently introduced single-turn two-player adversarial planning model, where two players are acting in a joint classical planning task, the objective of the first player being hampering the second player from achieving its goal. This places the Stackelberg planning problem somewhere between classical planning and general combinatorial two-player games. But, where exactly? All investigations of Stackelberg planning so far focused on practical aspects. 
We close this gap by conducting the first theoretical complexity analysis of Stackelberg planning. We show that in general Stackelberg planning is actually no harder than classical planning. Under a polynomial plan-length restriction, however, Stackelberg planning is a level higher up in the polynomial complexity hierarchy, suggesting that compilations into classical planning come with a worst-case exponential plan-length increase. In attempts to identify tractable fragments, we further study its complexity under various planning task restrictions, showing that Stackelberg planning remains intractable where classical planning is not. We finally inspect the complexity of meta-operator verification, a problem that has been recently connected to Stackelberg planning. Gregor Behnke Marcel Steinmetz Copyright (c) 2024 Association for the Advancement of Artificial Intelligence 2024-05-30 2024-05-30 34 20 24 10.1609/icaps.v34i1.31456 Non-deterministic Planning for Hyperproperty Verification https://ojs.aaai.org/index.php/ICAPS/article/view/31457 Non-deterministic planning aims to find a policy that achieves a given objective in an environment where actions have uncertain effects, and the agent - potentially - only observes parts of the current state. Hyperproperties are properties that relate multiple paths of a system and can, e.g., capture security and information-flow policies. Popular logics for expressing temporal hyperproperties - such as HyperLTL - extend LTL by offering selective quantification over executions of a system. In this paper, we show that planning offers a powerful intermediate language for the automated verification of hyperproperties. Concretely, we present an algorithm that, given a HyperLTL verification problem, constructs a non-deterministic multi-agent planning instance (in the form of a QDec-POMDP) that, when admitting a plan, implies the satisfaction of the verification problem. 
We show that for large fragments of HyperLTL, the resulting planning instance corresponds to a classical, FOND, or POND planning problem. We implement our encoding in a prototype verification tool and report on encouraging experimental results. Raven Beutner Bernd Finkbeiner Copyright (c) 2024 Association for the Advancement of Artificial Intelligence 2024-05-30 2024-05-30 34 25 30 10.1609/icaps.v34i1.31457 On Policy Reuse: An Expressive Language for Representing and Executing General Policies that Call Other Policies https://ojs.aaai.org/index.php/ICAPS/article/view/31458 Recently, a simple but powerful language for expressing and learning general policies and problem decompositions (sketches) has been introduced in terms of rules defined over a set of Boolean and numerical features. In this work, we consider three extensions of this language aimed at making policies and sketches more flexible and reusable: internal memory states, as in finite state controllers; indexical features, whose values are a function of the state and a number of internal registers that can be loaded with objects; and modules that wrap up policies and sketches and allow them to call each other by passing parameters. In addition, unlike general policies that select state transitions rather than ground actions, the new language allows for the selection of such actions. The expressive power of the resulting language for policies and sketches is illustrated through a number of examples. Blai Bonet Dominik Drexler Héctor Geffner Copyright (c) 2024 Association for the Advancement of Artificial Intelligence 2024-05-30 2024-05-30 34 31 39 10.1609/icaps.v34i1.31458 Abstraction Heuristics for Factored Tasks https://ojs.aaai.org/index.php/ICAPS/article/view/31459 One of the strongest approaches for optimal classical planning is A* search with heuristics based on abstractions of the planning task. Abstraction heuristics are well studied in planning formalisms without conditional effects such as SAS+. 
However, conditional effects are crucial to model many planning tasks compactly. In this paper, we focus on *factored* tasks which allow a specific form of conditional effect, where effects on variable x can only depend on the value of x. We generalize projections, domain abstractions, Cartesian abstractions and the counterexample-guided abstraction refinement method to this formalism. While merge-and-shrink already covers factored tasks in theory, we provide an implementation that does so. In our experiments, we compare these abstraction-based heuristics to other heuristics supporting conditional effects, as well as symbolic search. On our new benchmark set of factored tasks, pattern database heuristics solve the most problems, followed by symbolic approaches on par with domain abstractions. The more general Cartesian abstractions fall behind in terms of coverage but usually solve problems the fastest among all tested approaches. The generality of merge-and-shrink abstractions does not seem to be beneficial for these factored tasks. Clemens Büchner Patrick Ferber Jendrik Seipp Malte Helmert Copyright (c) 2024 Association for the Advancement of Artificial Intelligence 2024-05-30 2024-05-30 34 40 49 10.1609/icaps.v34i1.31459 Multi-Agent Temporal Task Solving and Plan Optimization https://ojs.aaai.org/index.php/ICAPS/article/view/31460 Several multi-agent techniques are utilized to reduce the complexity of classical planning tasks; however, their applicability to temporal planning domains is currently an open line of study in the field of Automated Planning. In this paper, we present MA-LAMA, a factored, centralized, unthreaded, satisficing, multi-agent temporal planner that exploits the 'multi-agent nature' of temporal domains to perform plan optimization.
In MA-LAMA, temporal tasks are translated to the constrained snap-actions paradigm, and an automatic agent decomposition, goal assignment, and required cooperation analysis are carried out to build independent search steps, called Search Phases. These Search Phases are then solved by consecutive agent local searches, using classical heuristics and temporal constraints. Experiments show that MA-LAMA is able to solve a wide range of classical and temporal multi-agent domains, performing significantly better in plan quality than other state-of-the-art temporal planners. J. Caballero Testón Maria D. R-Moreno Copyright (c) 2024 Association for the Advancement of Artificial Intelligence 2024-05-30 2024-05-30 34 50 58 10.1609/icaps.v34i1.31460 Taming Discretised PDDL+ through Multiple Discretisations https://ojs.aaai.org/index.php/ICAPS/article/view/31461 The PDDL+ formalism allows the use of planning techniques in applications that require the ability to perform hybrid discrete-continuous reasoning. PDDL+ problems are notoriously challenging to tackle, and to reason upon them a well-established approach is discretisation. Existing systems rely on a single discretisation delta or, at most, two: a simulation delta to model the dynamics of the environment, and a planning delta, that is used to specify when decisions can be taken. However, there exist cases where this rigid schema is not ideal, for instance when agents with very different speeds need to cooperate or interact in a shared environment, and a more flexible approach that can accommodate more deltas is necessary. To address the needs of this class of hybrid planning problems, in this paper we introduce a reformulation approach that allows the encapsulation of different levels of discretisation in PDDL+ models, hence allowing any domain-independent planning engine to reap the benefits. Further, we provide the community with a new set of benchmarks that highlights the limits of fixed discretisation. 
Matteo Cardellini Marco Maratea Francesco Percassi Enrico Scala Mauro Vallati Copyright (c) 2024 Association for the Advancement of Artificial Intelligence 2024-05-30 2024-05-30 34 59 67 10.1609/icaps.v34i1.31461 Return to Tradition: Learning Reliable Heuristics with Classical Machine Learning https://ojs.aaai.org/index.php/ICAPS/article/view/31462 Current approaches for learning for planning have yet to achieve competitive performance against classical planners in several domains, and have poor overall performance. In this work, we construct novel graph representations of lifted planning tasks and use the WL algorithm to generate features from them. These features are used with classical machine learning methods which have up to 2 orders of magnitude fewer parameters and train up to 3 orders of magnitude faster than the state-of-the-art deep learning for planning models. Our novel approach, WL-GOOSE, reliably learns heuristics from scratch and outperforms the hFF heuristic in a fair competition setting. It also outperforms or ties with LAMA on 4 out of 10 domains on coverage and 7 out of 10 domains on plan quality. WL-GOOSE is the first learning for planning model which achieves these feats. Furthermore, we study the connections between our novel WL feature generation method, previous theoretically flavoured learning architectures, and Description Logic Features for planning. Dillon Z. Chen Felipe Trevizan Sylvie Thiébaux Copyright (c) 2024 Association for the Advancement of Artificial Intelligence 2024-05-30 2024-05-30 34 68 76 10.1609/icaps.v34i1.31462 More Flexible Proximity Wildcards Path Planning with Compressed Path Databases https://ojs.aaai.org/index.php/ICAPS/article/view/31463 Grid-based path planning is one of the classic problems in AI, and a popular topic in application areas such as computer games and robotics. Compressed Path Databases (CPDs) are recognized as a state-of-the-art method for grid-based path planning. 
CPDs can find an optimal path extremely fast without state-space search. In recent years, researchers have tended to focus on improving CPDs by reducing CPD size or improving search performance. Among various methods, proximity wildcards are one of the most proven improvements in reducing the size of CPDs. However, their proximity area is heavily restricted by complex terrain, which significantly affects the pathfinding efficiency and causes additional costs. In this paper, we enhance CPDs from the perspective of improving search efficiency and reducing search costs. Our work focuses on using more flexible methods to obtain larger proximity areas, so that more heuristic information can be used to improve search performance. Experiments conducted on the Grid-Based Path Planning Competition (GPPC) benchmarks demonstrate that the two proposed methods can effectively improve search efficiency and reduce search costs by up to 3 orders of magnitude. Remarkably, our methods can further reduce the storage cost, and improve the compression capability of CPDs simultaneously. Xi Chen Yue Zhang Yonggang Zhang Copyright (c) 2024 Association for the Advancement of Artificial Intelligence 2024-05-30 2024-05-30 34 77 85 10.1609/icaps.v34i1.31463 On Verifying Linear Execution Strategies in Planning Against Nature https://ojs.aaai.org/index.php/ICAPS/article/view/31464 While planning and acting in environments in which nature can trigger non-deterministic events, the agent has to consider that the state of the environment might change without its consent. Practically, it means that the agent has to make sure that it eventually achieves its goal (if possible) despite the acts of nature. In this paper, we first formalize the semantics of such problems in Alternating-time Temporal Logic, which allows us to prove some theoretical properties of different types of solutions.
Then, we focus on linear execution strategies, which resemble classical plans in that they follow a fixed sequence of actions. We show that any problem that can be solved by a linear execution strategy can be solved by a particular form of linear execution strategy which assigns wait-for preconditions to each action in the plan that specifies when to execute that action. Then, we propose a sound algorithm that verifies a sequence of actions and assigns wait-for preconditions to them by leveraging abstraction. Lukáš Chrpa Erez Karpas Copyright (c) 2024 Association for the Advancement of Artificial Intelligence 2024-05-30 2024-05-30 34 86 94 10.1609/icaps.v34i1.31464 Planning and Acting While the Clock Ticks https://ojs.aaai.org/index.php/ICAPS/article/view/31465 Standard temporal planning assumes that planning takes place offline, and then execution starts at time 0. Recently, situated temporal planning was introduced, where planning starts at time 0, and execution occurs after planning terminates. Situated temporal planning reflects a more realistic scenario where time passes during planning. However, in situated temporal planning a complete plan must be generated before any action is executed. In some problems with time pressure, timing is too tight to complete planning before the first action must be executed. For example, an autonomous car that has a truck backing towards it should probably move out of the way now, and plan how to get to its destination later. In this paper, we propose a new problem setting: concurrent planning and execution, in which actions can be dispatched (executed) before planning terminates. Unlike previous work on planning and execution, we must handle wall clock deadlines that affect action applicability and goal achievement (as in situated planning) while also supporting dispatching actions before a complete plan has been found. 
We extend previous work on metareasoning for situated temporal planning to develop an algorithm for this new setting. Our empirical evaluation shows that when there is strong time pressure, our approach outperforms situated temporal planning. Andrew Coles Erez Karpas Andrey Lavrinenko Wheeler Ruml Solomon Eyal Shimony Shahaf Shperberg Copyright (c) 2024 Association for the Advancement of Artificial Intelligence 2024-05-30 2024-05-30 34 95 103 10.1609/icaps.v34i1.31465 Planning with Object Creation https://ojs.aaai.org/index.php/ICAPS/article/view/31466 Classical planning problems are defined using some specification language, such as PDDL. The domain expert defines action schemas, objects, the initial state, and the goal. One key aspect of PDDL is that the set of objects cannot be modified during plan execution. While this is fine in many domains, sometimes it makes modeling more complicated. This may impact the performance of planners, and it requires the domain expert to bound the number of required objects beforehand, which can be a challenge. We introduce an extension to the classical planning formalism, where action effects can create and remove objects. This problem is semi-decidable, but it becomes decidable if we can bound the number of objects in any given state, even though the state space is still infinite. On the practical side, we extend the Powerlifted planning system to support this PDDL extension. Our results show that this extension improves the performance of Powerlifted while supporting more natural PDDL models. Augusto B. 
Corrêa Giuseppe De Giacomo Malte Helmert Sasha Rubin Copyright (c) 2024 Association for the Advancement of Artificial Intelligence 2024-05-30 2024-05-30 34 104 113 10.1609/icaps.v34i1.31466 Multi-Objective Electric Vehicle Route and Charging Planning with Contraction Hierarchies https://ojs.aaai.org/index.php/ICAPS/article/view/31467 Electric vehicle (EV) travel planning is a complex task that involves planning the routes and the charging sessions for EVs while optimizing travel duration and cost. We show the applicability of the multi-objective EV travel planning algorithm with practically usable solution times on country-sized road graphs with a large number of charging stations and a realistic EV model. The approach is based on multi-objective A* search enhanced by Contraction hierarchies, optimal dimensionality reduction, and sub-optimal ϵ-relaxation techniques. We performed an extensive empirical evaluation on 182,000 problem instances showing the impact of various algorithm settings on real-world maps of Bavaria and Germany with more than 12,000 charging stations. The results show that the proposed approach is the first one capable of performing such a genuine multi-objective optimization on realistically large country-scale problem instances that can achieve practically usable planning times on the order of seconds with only a minor loss of solution quality. The achieved speed-up varies from ~11× for optimal solutions to more than 250× for sub-optimal solutions compared to vanilla multi-objective A*.
Marek Cuchý Jiří Vokřínek Michal Jakob Copyright (c) 2024 Association for the Advancement of Artificial Intelligence 2024-05-30 2024-05-30 34 114 122 10.1609/icaps.v34i1.31467 Combined Task and Motion Planning via Sketch Decompositions https://ojs.aaai.org/index.php/ICAPS/article/view/31468 The challenge in combined task and motion planning (TAMP) is the effective integration of a search over a combinatorial space, usually carried out by a task planner, and a search over a continuous configuration space, carried out by a motion planner. Using motion planners for testing the feasibility of task plans and filling out the details is not effective because it makes the geometrical constraints play a passive role. This work introduces a new interleaved approach for integrating the two dimensions of TAMP that makes use of sketches, a recent simple but powerful language for expressing the decomposition of problems into subproblems. A sketch has width 1 if it decomposes the problem into subproblems that can be solved greedily in linear time. In the paper, a general sketch is introduced for several classes of TAMP problems which has width 1 under suitable assumptions. While sketch decompositions have been developed for classical planning, they offer two important benefits in the context of TAMP. First, when a task plan is found to be unfeasible due to the geometric constraints, the combinatorial search resumes in a specific subproblem. Second, the sampling of object configurations is not done once, globally, at the start of the search, but locally, at the start of each subproblem. Optimizations of this basic setting are also considered and experimental results over existing and new pick-and-place benchmarks are reported. 
Magí Dalmau Moreno Néstor García Vicenç Gómez Héctor Geffner Copyright (c) 2024 Association for the Advancement of Artificial Intelligence 2024-05-30 2024-05-30 34 123 132 10.1609/icaps.v34i1.31468 Planning Domain Simulation: An Interactive System for Plan Visualisation https://ojs.aaai.org/index.php/ICAPS/article/view/31469 Representing and manipulating domain knowledge is essential for developing systems that can visualise plans. This paper presents a novel plan visualisation system called Planning Domain Simulation (PDSim) that employs knowledge representation and manipulation techniques to support the plan visualisation process. PDSim can use PDDL or the Unified Planning Library's Python representation as the underlying language for modelling planning problems and provides an interface for users to manipulate this representation through interaction with the Unity game engine and a set of planners. The system's features include visualising plan components and their relationships, identifying plan conflicts, and examples applied to real-world problems. The benefits and limitations of PDSim are also discussed, highlighting future research directions in the area. Emanuele De Pellegrin Ronald P. A. Petrick Copyright (c) 2024 Association for the Advancement of Artificial Intelligence 2024-05-30 2024-05-30 34 133 141 10.1609/icaps.v34i1.31469 Learning Quadruped Locomotion Policies Using Logical Rules https://ojs.aaai.org/index.php/ICAPS/article/view/31470 Quadruped animals are capable of exhibiting a diverse range of locomotion gaits. While progress has been made in demonstrating such gaits on robots, current methods rely on motion priors, dynamics models, or other forms of extensive manual effort. People can use natural language to describe dance moves. Could one use a formal language to specify quadruped gaits? To this end, we aim to enable easy gait specification and efficient policy learning.
Leveraging Reward Machines (RMs) for high-level gait specification over foot contacts, our approach is called RM-based Locomotion Learning (RMLL), and supports adjusting gait frequency at execution time. Gait specification is enabled through the use of a few logical rules per gait (e.g., alternate between moving front feet and back feet) and does not require labor-intensive motion priors. Experimental results in simulation highlight the diversity of learned gaits (including two novel gaits), their energy consumption and stability across different terrains, and the superior sample efficiency compared to baselines. We also demonstrate these learned policies with a real quadruped robot. Video and supplementary materials: https://sites.google.com/view/rm-locomotion-learning/home David DeFazio Yohei Hayamizu Shiqi Zhang Copyright (c) 2024 Association for the Advancement of Artificial Intelligence 2024-05-30 2024-05-30 34 142 150 10.1609/icaps.v34i1.31470 Higher-Dimensional Potential Heuristics: Lower Bound Criterion and Connection to Correlation Complexity https://ojs.aaai.org/index.php/ICAPS/article/view/31471 Correlation complexity is a measure of a planning task indicating how hard it is. The work that introduced it provides sufficient criteria to detect a correlation complexity of 2 in a planning task. It also introduced an example of a planning task with correlation complexity 3. In our work, we introduce a criterion to detect an arbitrary correlation complexity and extend the mentioned example to show with the new criterion that planning tasks with arbitrary correlation complexity exist. Simon Dold Malte Helmert Copyright (c) 2024 Association for the Advancement of Artificial Intelligence 2024-05-30 2024-05-30 34 151 161 10.1609/icaps.v34i1.31471 New Fuzzing Biases for Action Policy Testing https://ojs.aaai.org/index.php/ICAPS/article/view/31472 Testing was recently proposed as a method to gain trust in learned action policies in classical planning.
Test cases in this setting are states generated by a fuzzing process that performs random walks from the initial state. A fuzzing bias attempts to bias these random walks towards policy bugs, that is, states where the policy performs sub-optimally. Prior work explored a simple fuzzing bias based on policy-trace cost. Here, we investigate this topic more deeply. We introduce three new fuzzing biases based on analyses of policy-trace shape, estimating whether a trace is close to looping back on itself, whether it contains detours, and whether its goal-distance surface does not smoothly decline. Our experiments with two kinds of neural action policies show that these new biases improve bug-finding capabilities in many cases. Jan Eisenhut Xandra Schuler Daniel Fišer Daniel Höller Maria Christakis Jörg Hoffmann Copyright (c) 2024 Association for the Advancement of Artificial Intelligence 2024-05-30 2024-05-30 34 162 167 10.1609/icaps.v34i1.31472 PDDL+ Models for Deployable yet Effective Traffic Signal Optimisation https://ojs.aaai.org/index.php/ICAPS/article/view/31473 The use of planning techniques in traffic signal optimisation has proven effective in managing unexpected traffic conditions as well as typical traffic patterns. However, significant challenges concerning the deployability of generated signal strategies remain, as existing approaches tend not to consider constraints and features of the actual real-world infrastructure on which they will be implemented. To address this challenge, we introduce a range of PDDL+ models embodying technological requirements as well as insights from domain experts. The proposed models have been extensively tested on historical data using a range of well-known search strategies and heuristics, as well as alternative encodings. Results demonstrate their competitiveness with the state of the art. 
Anas El Kouaiti Francesco Percassi Alessandro Saetti Thomas Leo McCluskey Mauro Vallati Copyright (c) 2024 Association for the Advancement of Artificial Intelligence 2024-05-30 2024-05-30 34 168 177 10.1609/icaps.v34i1.31473 Termination Properties of Transition Rules for Indirect Effects https://ojs.aaai.org/index.php/ICAPS/article/view/31474 Indirect effects of an agent's actions have traditionally been formalized as condition-effect rules that always fire whenever applicable, after each action taken by the agent. In this work, we investigate a core problem of indirect effects, the possibility of arbitrarily or infinitely long sequences of rule firings. Specifically, we investigate the termination of rule firings, as well as their confluence, that is, the uniqueness of the state that is ultimately reached. Both problems turn out to be PSPACE-complete. After this, we devise practically interesting syntactic and structural restrictions that guarantee polynomial-time termination and confluence tests. Finally, in the context of planning languages that support indirect effects, we propose new implementation technologies. Mojtaba Elahi Saurabh Fadnis Jussi Rintanen Copyright (c) 2024 Association for the Advancement of Artificial Intelligence 2024-05-30 2024-05-30 34 178 186 10.1609/icaps.v34i1.31474 A Fast Algorithm for k-Memory Messaging Scheme Design in Dynamic Environments with Uncertainty https://ojs.aaai.org/index.php/ICAPS/article/view/31475 We study the problem of designing the optimal k-memory messaging scheme in a dynamic environment. Specifically, a sender, who can perfectly observe the state of a dynamic environment but cannot take actions, aims to persuade an uninformed, far-sighted receiver to take actions to maximize the long-term utility of the sender, by sending messages. We focus on k-memory messaging schemes, i.e., at each time step, the sender's messaging scheme depends on information from the previous k steps.
After receiving a message, the self-interested receiver derives a posterior belief and takes an action. The immediate rewards of the two players can be misaligned, so the sender needs to ensure persuasiveness when designing the messaging scheme. We first formulate this problem as a bi-linear program. Then we show that there are infinitely many non-trivial persuasive messaging schemes for any problem instance. Moreover, we show that when the sender uses a k-memory messaging scheme, the optimal strategy for the receiver is also a k-memory strategy. We propose a fast heuristic algorithm for this problem and show that it can be extended to the setting where the sender has threat ability. We experimentally evaluate our algorithm, comparing it with the solution obtained by the Gurobi solver, in terms of performance and running time, in both settings. Extensive experimental results show that our algorithm outperforms the Gurobi solution in running time while achieving comparable performance. Zhikang Fan Weiran Shen Copyright (c) 2024 Association for the Advancement of Artificial Intelligence 2024-05-30 2024-05-30 34 187 195 10.1609/icaps.v34i1.31475 SLAMuZero: Plan and Learn to Map for Joint SLAM and Navigation https://ojs.aaai.org/index.php/ICAPS/article/view/31476 MuZero has demonstrated remarkable performance in board and video games where the Monte Carlo tree search (MCTS) method is used to learn and adapt to different game environments. This paper leverages the strength of MuZero to enhance agents' planning capability for joint active simultaneous localization and mapping (SLAM) and navigation tasks, which require an agent to navigate an unknown environment while simultaneously constructing a map and localizing itself. We propose SLAMuZero, a novel approach for joint SLAM and navigation, which employs a search process that uses an explicit encoder-decoder architecture for mapping, followed by a prediction function to evaluate policy and value based on the generated map.
SLAMuZero outperforms the state-of-the-art baseline and significantly reduces training time, underscoring the efficiency of our approach. Additionally, we develop a new open source library for implementing SLAMuZero, which is a flexible and modular toolkit for researchers and practitioners (https://github.com/bwfbowen/SLAMuZero). Bowen Fang Xu Chen Zhengkun Pan Xuan Di Copyright (c) 2024 Association for the Advancement of Artificial Intelligence 2024-05-30 2024-05-30 34 196 200 10.1609/icaps.v34i1.31476 A Real-Time Rescheduling Algorithm for Multi-robot Plan Execution https://ojs.aaai.org/index.php/ICAPS/article/view/31477 One area of research in multi-agent path finding is to determine how replanning can be efficiently achieved in the case of agents being delayed during execution. One option is to reschedule the passing order of agents, i.e., the sequence in which agents visit the same location. In response, we propose Switchable-Edge Search (SES), an A*-style algorithm designed to find optimal passing orders. We prove the optimality of SES and evaluate its efficiency via simulations. The best variant of SES takes less than 1 second for small- and medium-sized problems and runs up to 4 times faster than baselines for large-sized problems. Ying Feng Adittyo Paul Zhe Chen Jiaoyang Li Copyright (c) 2024 Association for the Advancement of Artificial Intelligence 2024-05-30 2024-05-30 34 201 209 10.1609/icaps.v34i1.31477 Towards Feasible Higher-Dimensional Potential Heuristics https://ojs.aaai.org/index.php/ICAPS/article/view/31478 Potential heuristics assign numerical values (potentials) to state features, where each feature is a conjunction of facts. It was previously shown that the informativeness of potential heuristics can be significantly improved by considering complex features, but computing potentials over all pairs of facts is already too costly in practice. 
In this paper, we investigate whether using just a few high-dimensional features instead of all conjunctions up to a dimension n can result in improved heuristics while keeping the computational cost at bay. We focus on (a) establishing a framework for studying this kind of potential heuristics, and (b) whether it is reasonable to expect improvement with just a few conjunctions. For (a), we propose two compilations that encode each conjunction explicitly as a new fact so that we can compute potentials over conjunctions in the original task as one-dimensional potentials in the compilation. Regarding (b), we provide evidence that the informativeness of potential heuristics can be significantly increased with a small set of conjunctions, and that these improvements have a positive impact on the number of solved tasks. Daniel Fišer Marcel Steinmetz Copyright (c) 2024 Association for the Advancement of Artificial Intelligence 2024-05-30 2024-05-30 34 210 220 10.1609/icaps.v34i1.31478 Progressive State Space Disaggregation for Infinite Horizon Dynamic Programming https://ojs.aaai.org/index.php/ICAPS/article/view/31479 High dimensionality of model-based Reinforcement Learning and Markov Decision Processes can be reduced using abstractions of the state and action spaces. Although hierarchical learning and state abstraction methods have been explored over the past decades, explicit methods to build useful abstractions of models are rarely provided. In this work, we provide a new state abstraction method for solving infinite horizon problems in the discounted and total settings. Our approach is to progressively disaggregate abstract regions by iteratively slicing aggregations of states relative to a value function. The distinguishing feature of our method, in contrast to previous approximations of the Bellman operator, is the disaggregation of regions during value function iterations (or policy evaluation steps). 
The objective is to find a more efficient aggregation that reduces the error on each piece of the partition. We provide a proof of convergence for this algorithm without making any assumptions about the structure of the problem. We also show that this process decreases the computational complexity of the Bellman operator iteration and provides useful abstractions. We then plug this state space disaggregation process into classical Dynamic Programming algorithms, namely Approximate Value Iteration, Q-Value Iteration, and Policy Iteration. Finally, we conduct a numerical comparison on randomly generated MDPs as well as classical MDPs. These experiments show that our policy-based algorithm is faster than both the traditional dynamic programming approach and recent aggregative methods that use a fixed number of adaptive partitions. Orso Forghieri Hind Castel Emmanuel Hyon Erwan Le Pennec Copyright (c) 2024 Association for the Advancement of Artificial Intelligence 2024-05-30 2024-05-30 34 221 229 10.1609/icaps.v34i1.31479 JaxPlan and GurobiPlan: Optimization Baselines for Replanning in Discrete and Mixed Discrete-Continuous Probabilistic Domains https://ojs.aaai.org/index.php/ICAPS/article/view/31480 Replanning methods that determinize a stochastic planning problem and replan at each action step have long been known to provide strong baseline (and even competition winning) solutions to discrete probabilistic planning problems. Recent work has explored the extension of replanning methods to the case of mixed discrete-continuous probabilistic domains by leveraging MILP compilations of the RDDL specification language. Other recent advances in probabilistic planning have explored the compilation of structured mixed discrete-continuous RDDL domains into a determinized computation graph that also lends itself to replanning via so-called planning by backpropagation methods. 
However, to date, there has not been any comprehensive comparison of these recent optimization-based replanning methodologies to the state-of-the-art winner of the discrete probabilistic IPC 2011 and 2014 and runner-up in 2018 (PROST) and the winner of the mixed discrete-continuous probabilistic IPC 2023 (DiSProd). In this paper, we describe JaxPlan, which makes several extensive upgrades to planning by backpropagation and its compact tensorized compilation from RDDL to a JAX computation graph that uses discrete relaxations and a sample average approximation. We also provide the first detailed overview of a compilation of the RDDL language specification to Gurobi's Mixed Integer Nonlinear Programming (MINLP) solver that we term GurobiPlan. We provide a comprehensive comparative analysis of JaxPlan and GurobiPlan with competition winning planners on 19 domains and a total of 155 instances to assess their performance across (a) different domains, (b) different instance sizes, and (c) different time budgets. We also release all code to reproduce the results along with the open-source planners we describe in this work. Michael Gimelfarb Ayal Taitler Scott Sanner Copyright (c) 2024 Association for the Advancement of Artificial Intelligence 2024-05-30 2024-05-30 34 230 238 10.1609/icaps.v34i1.31480 Formal Representations of Classical Planning Domains https://ojs.aaai.org/index.php/ICAPS/article/view/31481 Planning domains are an important notion, e.g. when it comes to restricting the input for generalized planning or learning approaches. However, domains as specified in PDDL cannot fully capture the intuitive understanding of a planning domain. We close this semantic gap and propose using PDDL axioms to characterize the (typically infinite) set of legal tasks of a domain. A minor extension makes it possible to express all properties that can be determined in polynomial time. 
We demonstrate the suitability of the approach on established domains from the International Planning Competition. Claudia Grundke Gabriele Röger Malte Helmert Copyright (c) 2024 Association for the Advancement of Artificial Intelligence 2024-05-30 2024-05-30 34 239 248 10.1609/icaps.v34i1.31481 Safe Explicable Planning https://ojs.aaai.org/index.php/ICAPS/article/view/31482 Human expectations arise from their understanding of others and the world. In the context of human-AI interaction, this understanding may not align with reality, leading to the AI agent failing to meet expectations and compromising team performance. Explicable planning, introduced as a method to bridge this gap, aims to reconcile human expectations with the agent's optimal behavior, facilitating interpretable decision-making. However, an unresolved critical issue is ensuring safety in explicable planning, as it could result in explicable behaviors that are unsafe. To address this, we propose Safe Explicable Planning (SEP), which extends the prior work to support the specification of a safety bound. The goal of SEP is to find behaviors that align with human expectations while adhering to the specified safety criterion. Our approach generalizes the consideration of multiple objectives stemming from multiple models rather than a single model, yielding a Pareto set of safe explicable policies. We present both an exact method, guaranteeing finding the Pareto set, and a more efficient greedy method that finds one of the policies in the Pareto set. Additionally, we offer approximate solutions based on state aggregation to improve scalability. We provide formal proofs that validate the desired theoretical properties of these methods. Evaluation through simulations and physical robot experiments confirms the effectiveness of our approach for safe explicable planning. 
Akkamahadevi Hanni Andrew Boateng Yu Zhang Copyright (c) 2024 Association for the Advancement of Artificial Intelligence 2024-05-30 2024-05-30 34 249 257 10.1609/icaps.v34i1.31482 Replanning in Advance for Instant Delay Recovery in Multi-Agent Applications: Rerouting Trains in a Railway Hub https://ojs.aaai.org/index.php/ICAPS/article/view/31483 Train routing is sensitive to delays that occur in the network. When a train is delayed, it is imperative that a new plan be found quickly, or else other trains may need to be stopped to ensure safety, potentially causing cascading delays. In this paper, we consider this class of multi-agent planning problems, which we call Multi-Agent Execution Delay Replanning. We show that these can be solved by reducing the problem to an any-start-time safe interval planning problem. When an agent has an any-start-time plan, it can react to a delay by simply looking up the precomputed plan for the delayed start time. We identify crucial real-world problem characteristics like the agent's speed, size, and safety envelope, and extend the any-start-time planning to account for them. Experimental results on real-world train networks show that any-start-time plans are compact and can be computed in reasonable time while enabling agents to instantly recover a safe plan. Issa K. Hanou Devin Wild Thomas Wheeler Ruml Mathijs de Weerdt Copyright (c) 2024 Association for the Advancement of Artificial Intelligence 2024-05-30 2024-05-30 34 258 266 10.1609/icaps.v34i1.31483 An Analysis of the Decidability and Complexity of Numeric Additive Planning https://ojs.aaai.org/index.php/ICAPS/article/view/31484 In this paper, we first define numeric additive planning (NAP), a planning formulation equivalent to Hoffmann's Restricted Tasks over Integers. Then, we analyze the minimal number of action repetitions required for a solution, since planning turns out to be decidable as long as such numbers can be calculated for all actions. 
We differentiate between two kinds of repetitions and solve for one by integer linear programming and the other by search. Additionally, we characterize the differences between propositional planning and NAP regarding these two kinds. To achieve this, we define so-called multi-valued partial order plans, a novel compact plan representation. Finally, we consider decidable fragments of NAP and their complexity. Hayyan Helal Gerhard Lakemeyer Copyright (c) 2024 Association for the Advancement of Artificial Intelligence 2024-05-30 2024-05-30 34 267 275 10.1609/icaps.v34i1.31484 Versatile Cost Partitioning with Exact Sensitivity Analysis https://ojs.aaai.org/index.php/ICAPS/article/view/31485 Saturated post-hoc optimization is a powerful method for computing admissible heuristics for optimal classical planning. The approach solves a linear program (LP) for each state encountered during the search, which is computationally demanding. In this paper, we theoretically and empirically analyze to what extent we can reuse an LP solution of one state for another. We introduce a novel sensitivity analysis that can exactly characterize the set of states for which a unique LP solution is optimal. Furthermore, we identify two properties of the underlying LPs that affect reusability. Finally, we introduce an algorithm that optimizes LP solutions to generalize well to other states. Our new algorithms significantly reduce the number of necessary LP computations. Paul Höft David Speck Florian Pommerening Jendrik Seipp Copyright (c) 2024 Association for the Advancement of Artificial Intelligence 2024-05-30 2024-05-30 34 276 280 10.1609/icaps.v34i1.31485 Expressiveness of Graph Neural Networks in Planning Domains https://ojs.aaai.org/index.php/ICAPS/article/view/31486 Graph Neural Networks (GNNs) have become the standard method of choice for learning with structured data, demonstrating particular promise in classical planning. 
Their inherent invariance under symmetries of the input graphs endows them with superior generalization capabilities, compared to their symmetry-oblivious counterparts. However, this comes at the cost of limited expressive power. Particularly, GNNs cannot distinguish between graphs that satisfy identical sentences of C2 logic. To leverage GNNs for learning policies in PDDL domains, one needs to encode the contextual representation of the planning states as graphs. The expressiveness of this encoding, coupled with a specific GNN architecture, then hinges on the absence of indistinguishable states necessitating distinct actions. This paper provides a comprehensive theoretical and statistical exploration of such situations in PDDL domains across diverse natural encoding schemes and GNN models. Rostislav Horčík Gustav Šír Copyright (c) 2024 Association for the Advancement of Artificial Intelligence 2024-05-30 2024-05-30 34 281 289 10.1609/icaps.v34i1.31486 Converting Simple Temporal Networks with Uncertainty into Minimal Equivalent Dispatchable Form https://ojs.aaai.org/index.php/ICAPS/article/view/31487 A Simple Temporal Network with Uncertainty (STNU) is a structure for representing and reasoning about time constraints on actions that may have uncertain durations. An STNU is dynamically controllable (DC) if there exists a dynamic strategy for executing the network that guarantees that all of its constraints will be satisfied no matter how the uncertain durations turn out---within their specified bounds. However, such strategies typically require exponential space. Therefore, converting a DC STNU into a so-called dispatchable form for practical applications is essential. The relevant portions of a real-time execution strategy for a dispatchable STNU can be incrementally constructed during execution, requiring only O(n²) space, while also providing maximum flexibility and minimal computation during the execution of the network. 
Although existing algorithms can generate equivalent dispatchable STNUs, they do not guarantee a minimal number of edges in the STNU graph. Since the number of edges directly impacts the computations during execution, this paper presents a novel algorithm for converting any dispatchable STNU into an equivalent dispatchable network having a minimal number of edges. The complexity of the algorithm is O(k n³), where k is the number of actions with uncertain durations, and n is the number of timepoints in the network. The paper also provides an empirical evaluation of the reduction in edges obtained by the new algorithm. Luke Hunsberger Roberto Posenato Copyright (c) 2024 Association for the Advancement of Artificial Intelligence 2024-05-30 2024-05-30 34 290 300 10.1609/icaps.v34i1.31487 Rethinking Mutual Information for Language Conditioned Skill Discovery on Imitation Learning https://ojs.aaai.org/index.php/ICAPS/article/view/31488 Language-conditioned robot behavior plays a vital role in executing complex tasks by associating human commands or instructions with perception and actions. The ability to compose long-horizon tasks based on unconstrained language instructions necessitates the acquisition of a diverse set of general-purpose skills. However, acquiring inherent primitive skills in a coupled and long-horizon environment without external rewards or human supervision presents significant challenges. In this paper, we evaluate the relationship between skills and language instructions from a mathematical perspective, employing two forms of mutual information within the framework of language-conditioned policy learning. To maximize the mutual information between language and skills in an unsupervised manner, we propose an end-to-end imitation learning approach known as Language Conditioned Skill Discovery (LCSD). 
Specifically, we utilize vector quantization to learn discrete latent skills and leverage skill sequences of trajectories to reconstruct high-level semantic instructions. Through extensive experiments on language-conditioned robotic navigation and manipulation tasks, encompassing BabyAI, LORel, and Calvin, we demonstrate the superiority of our method over prior works. Our approach exhibits enhanced generalization capabilities towards unseen tasks, improved skill interpretability, and notably higher rates of task completion success. Zhaoxun Ju Chao Yang Fuchun Sun Hongbo Wang Yu Qiao Copyright (c) 2024 Association for the Advancement of Artificial Intelligence 2024-05-30 2024-05-30 34 301 309 10.1609/icaps.v34i1.31488 Epistemic Exploration for Generalizable Planning and Learning in Non-Stationary Settings https://ojs.aaai.org/index.php/ICAPS/article/view/31489 This paper introduces a new approach for continual planning and model learning in relational, non-stationary stochastic environments. Such capabilities are essential for the deployment of sequential decision-making systems in the uncertain and constantly evolving real world. Working in such practical settings with unknown (and non-stationary) transition systems and changing tasks, the proposed framework models gaps in the agent's current state of knowledge and uses them to conduct focused, investigative explorations. Data collected using these explorations is used for learning generalizable probabilistic models for solving the current task despite continual changes in the environment dynamics. Empirical evaluations on several non-stationary benchmark domains show that this approach significantly outperforms planning and RL baselines in terms of sample complexity. Theoretical results show that the system exhibits desirable convergence properties when stationarity holds. 
Rushang Karia Pulkit Verma Alberto Speranzon Siddharth Srivastava Copyright (c) 2024 Association for the Advancement of Artificial Intelligence 2024-05-30 2024-05-30 34 310 318 10.1609/icaps.v34i1.31489 Unifying and Certifying Top-Quality Planning https://ojs.aaai.org/index.php/ICAPS/article/view/31490 The growing utilization of planning tools in practical scenarios has sparked an interest in generating multiple high-quality plans. Consequently, a range of computational problems under the general umbrella of top-quality planning were introduced over a short time period, each with its own definition. In this work, we show that the existing definitions can be unified into one, based on a dominance relation. The different computational problems, therefore, simply correspond to different dominance relations. Given the unified definition, we can now certify the top-quality of the solutions, leveraging existing certification of unsolvability and optimality. We show that task transformations found in the existing literature can be employed for the efficient certification of various top-quality planning problems and propose a novel transformation to efficiently certify loopless top-quality planning. Michael Katz Junkyu Lee Shirin Sohrabi Copyright (c) 2024 Association for the Advancement of Artificial Intelligence 2024-05-30 2024-05-30 34 319 323 10.1609/icaps.v34i1.31490 Explaining Plan Quality Differences https://ojs.aaai.org/index.php/ICAPS/article/view/31491 We describe a method for explaining the differences between the quality of plans produced for similar planning problems. The method exploits a process of abstracting away details of the planning problems until the difference in solution quality they support has been minimised. We give a general definition of a valid abstraction of a planning problem. We then give the details of the implementation of a number of useful abstractions. 
Finally, we present a breadth-first search algorithm for finding suitable abstractions for explanations, and detail the results of an evaluation of the approach. Benjamin Krarup Amanda Coles Derek Long David E. Smith Copyright (c) 2024 Association for the Advancement of Artificial Intelligence 2024-05-30 2024-05-30 34 324 332 10.1609/icaps.v34i1.31491 Planning with a Learned Policy Basis to Optimally Solve Complex Tasks https://ojs.aaai.org/index.php/ICAPS/article/view/31492 Conventional reinforcement learning (RL) methods can successfully solve a wide range of sequential decision problems. However, learning policies that can generalize predictably across multiple tasks in a setting with non-Markovian reward specifications is a challenging problem. We propose to use successor features to learn a set of local policies that each solves a well-defined subproblem. In a task described by a finite state automaton (FSA) that involves the same set of subproblems, the combination of these local policies can then be used to generate an optimal solution without additional learning. In contrast to other methods that combine local policies via planning, our method asymptotically attains global optimality, even in stochastic environments. David Kuric Guillermo Infante Vicenç Gómez Anders Jonsson Herke van Hoof Copyright (c) 2024 Association for the Advancement of Artificial Intelligence 2024-05-30 2024-05-30 34 333 341 10.1609/icaps.v34i1.31492 Action Model Learning from Noisy Traces: a Probabilistic Approach https://ojs.aaai.org/index.php/ICAPS/article/view/31493 We address the problem of learning planning domains from plan traces that are obtained by observing the environment states through noisy sensors. In such situations, approaches that assume correct traces are not applicable. We tackle the problem by designing a probabilistic graphical model where the preconditions and effects of every planning domain operator, as well as the traces’ observations, are modeled by random variables. 
Probabilistic inference conditioned on the observed traces allows our approach to derive a posterior probability of an atom being a precondition and/or an effect of an operator. Planning domains are obtained either by sampling or by applying the maximum a posteriori criterion. We compare our approach with a frequentist baseline and the currently available state-of-the-art approaches. We measure the performance of each method according to two criteria: reconstruction of the original planning domain and effectiveness in solving new planning problems of the same domain. Our experimental analysis shows that our approach learns action models that are more accurate than those of state-of-the-art approaches and strongly outperforms other approaches in generating models that are effective for solving new problems. Leonardo Lamanna Luciano Serafini Copyright (c) 2024 Association for the Advancement of Artificial Intelligence 2024-05-30 2024-05-30 34 342 350 10.1609/icaps.v34i1.31493 Neural Combinatorial Optimization on Heterogeneous Graphs: An Application to the Picker Routing Problem in Mixed-shelves Warehouses https://ojs.aaai.org/index.php/ICAPS/article/view/31494 In recent years, machine learning (ML) models capable of solving combinatorial optimization (CO) problems have received a surge of attention. While early approaches failed to outperform traditional CO solvers, the gap between handcrafted and learned heuristics has been steadily closing. However, most work in this area has focused on simple CO problems to benchmark new models and algorithms, leaving a gap in the development of methods specifically designed to handle more involved problems. Therefore, this work considers the problem of picker routing in the context of mixed-shelves warehouses, which involves not only a heterogeneous graph representation, but also a combinatorial action space resulting from the integrated selection and routing decisions to be made. 
We propose both a novel encoder to effectively learn representations of the heterogeneous graph and a hierarchical decoding scheme that exploits the combinatorial structure of the action space. The efficacy of the developed methods is demonstrated through a comprehensive comparison with established architectures as well as exact and heuristic solvers. Laurin Luttmann Lin Xie Copyright (c) 2024 Association for the Advancement of Artificial Intelligence 2024-05-30 2024-05-30 34 351 359 10.1609/icaps.v34i1.31494 Investigating Large Neighbourhood Search for Bus Driver Scheduling https://ojs.aaai.org/index.php/ICAPS/article/view/31495 The Bus Driver Scheduling Problem (BDSP) is a combinatorial optimisation problem with high practical relevance. The aim is to assign bus drivers to predetermined routes while minimising a specified objective function that considers operating costs as well as employee satisfaction. Since we must satisfy several rules from a collective agreement and European regulations, the BDSP is highly constrained. Hence, using exact methods to solve large real-life-based instances is computationally too expensive, while heuristic methods still have a considerable gap to the optimum. Our paper presents a Large Neighbourhood Search (LNS) approach to solve the BDSP. We propose several novel destroy operators and an approach using column generation to repair the sub-problem. We analyse the impact of the destroy and repair operators and investigate various possibilities to select them, including adaptivity. The proposed approach improves all the upper bounds for larger instances that exact methods cannot solve, as well as for some mid-sized instances, and outperforms existing heuristic approaches for this problem on all benchmark instances. 
Tommaso Mannelli Mazzoli Lucas Kletzander Pascal Van Hentenryck Nysret Musliu Copyright (c) 2024 Association for the Advancement of Artificial Intelligence 2024-05-30 2024-05-30 34 360 368 10.1609/icaps.v34i1.31495 Weak and Strong Reversibility of Non-deterministic Actions: Universality and Uniformity https://ojs.aaai.org/index.php/ICAPS/article/view/31496 Classical planning looks for a sequence of actions that transform the initial state of the environment into a goal state. Studying whether the effects of an action can be undone by a sequence of other actions, that is, action reversibility, is beneficial, for example, in determining whether an action is safe to apply. This paper deals with action reversibility of non-deterministic actions, i.e., actions whose application might result in different outcomes. Inspired by the established notions of weak and strong plans in non-deterministic (or FOND) planning, we define the notions of weak and strong reversibility for non-deterministic actions. We then focus on the universality and uniformity of action reversibility, that is, whether we can always undo all possible effects of the action by the same means (i.e., policy), or whether some of the effects can never be undone. We show how these classes of problems can be solved via classical or FOND planning and evaluate our approaches on FOND benchmark domains. Jakub Med Lukáš Chrpa Michael Morak Wolfgang Faber Copyright (c) 2024 Association for the Advancement of Artificial Intelligence 2024-05-30 2024-05-30 34 369 377 10.1609/icaps.v34i1.31496 Preference Explanation and Decision Support for Multi-Objective Real-World Test Laboratory Scheduling https://ojs.aaai.org/index.php/ICAPS/article/view/31497 Complex real-world scheduling problems often include multiple conflicting objectives. Decision makers (DMs) can express their preferences over those objectives in different ways, including as sets of weights which are used in a linear combination of objective values. 
However, finding good sets of weights that result in solutions with desirable qualities is challenging and currently involves a lot of trial and error. We propose a general method to explain objectives' values under a given set of weights using Shapley regression values. We demonstrate this approach on the Test Laboratory Scheduling Problem (TLSP), for which we propose a multi-objective solution algorithm and show that suggestions for weight adjustments based on the introduced explanations are successful in guiding decision makers towards solutions that match their expectations. This method is included in the TLSP MO-Explorer, a new decision support system that enables the exploration and analysis of high-dimensional Pareto fronts. Florian Mischek Nysret Musliu Copyright (c) 2024 Association for the Advancement of Artificial Intelligence 2024-05-30 2024-05-30 34 378 386 10.1609/icaps.v34i1.31497 Safe Learning of PDDL Domains with Conditional Effects https://ojs.aaai.org/index.php/ICAPS/article/view/31498 Powerful domain-independent planners have been developed to solve various types of planning problems. These planners often require a model of the acting agent's actions, given in some planning domain description language. Manually designing such an action model is a notoriously challenging task. An alternative is to automatically learn action models from observation. Such an action model is called safe if every plan created with it is consistent with the real, unknown action model. Algorithms for learning such safe action models exist, yet they cannot handle domains with conditional or universal effects, which are common constructs in many planning problems. We prove that learning non-trivial safe action models with conditional effects may require an exponential number of samples. Then, we identify reasonable assumptions under which such learning is tractable and propose Conditional-SAM, the first algorithm capable of doing so. 
We analyze Conditional-SAM theoretically and evaluate it experimentally. Our results show that the action models learned by Conditional-SAM can be used to perfectly solve most of the test set problems in most of the evaluated domains. Argaman Mordoch Enrico Scala Roni Stern Brendan Juba Copyright (c) 2024 Association for the Advancement of Artificial Intelligence 2024-05-30 2024-05-30 34 387 395 10.1609/icaps.v34i1.31498 SKATE : Successive Rank-based Task Assignment for Proactive Online Planning https://ojs.aaai.org/index.php/ICAPS/article/view/31499 The development of online applications for services such as package delivery, crowdsourcing, or taxi dispatching has drawn the attention of the research community to the domain of online multi-agent multi-task allocation. In online service applications, tasks (or requests) to be performed arrive over time and need to be dynamically assigned to agents. Such planning problems are challenging because: (i) little or almost no information about future tasks is available for long-term reasoning; (ii) the number of agents, as well as the number of tasks, can be very high; and (iii) an efficient solution has to be reached in a limited amount of time. In this paper, we propose SKATE, a successive rank-based task assignment algorithm for online multi-agent planning. SKATE can be seen as a meta-heuristic approach which successively assigns a task to the best-ranked agent until all tasks have been assigned. We assessed the complexity of SKATE and showed it is cubic in the number of agents and tasks. To investigate how multi-agent multi-task assignment algorithms perform under a high number of agents and tasks, we compare three multi-task assignment methods in synthetic and real data benchmark environments: Integer Linear Programming (ILP), Genetic Algorithm (GA), and SKATE. In addition, a proactive approach is nested in all methods to determine near-future available agents (resources) using a receding horizon. 
Based on the results obtained, we can argue that the classical ILP offers the best-quality solutions when treating a low number of agents and tasks, i.e., low load, regardless of the receding-horizon size, while it struggles to respect the time constraint for high load. SKATE performs better than the other methods in high-load conditions, and even better when a variable receding horizon is used. Déborah Conforto Nedelmann Jérôme Lacan Caroline P. C. Chanel Copyright (c) 2024 Association for the Advancement of Artificial Intelligence 2024-05-30 2024-05-30 34 396 404 10.1609/icaps.v34i1.31499 Incremental Ordering for Scheduling Problems https://ojs.aaai.org/index.php/ICAPS/article/view/31500 Given an instance of a scheduling problem where we want to start executing jobs as soon as possible, it is advantageous if a scheduling algorithm emits the first parts of its solution early, in particular before the algorithm completes its work. Therefore, in this position paper, we analyze core scheduling problems with regard to their enumeration complexity, i.e., the computation time to the first emitted schedule entry (preprocessing time) and the worst-case time between two consecutive parts of the solution (delay). Specifically, we look at scheduling instances that reduce to ordering problems. We apply a known incremental sorting algorithm for scheduling strategies that are at their core comparison-based sorting algorithms and translate corresponding upper and lower complexity bounds to the scheduling setting. For instances with n jobs and a precedence DAG with maximum degree Δ, we incrementally build a topological ordering with O(n) preprocessing and O(Δ) delay. We prove a matching lower bound and show with an adversary argument that the delay lower bound holds even if the DAG has constant average degree and the ordering is emitted out-of-order in the form of insert operations. 
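To make the preprocessing/delay terminology of the Incremental Ordering abstract concrete, the following is a minimal illustrative sketch, not the authors' algorithm: it is plain Kahn-style topological sorting written as a Python generator. The function name and the dictionary-of-successors input are assumptions made for this example. Its preprocessing touches every edge, so it is O(n + m) rather than the O(n) bound stated in the abstract; the work between two consecutive emitted jobs is bounded by the out-degree of the job just emitted, hence O(Δ).

```python
from collections import deque
from typing import Dict, Hashable, Iterator, List


def incremental_topological_order(
    successors: Dict[Hashable, List[Hashable]]
) -> Iterator[Hashable]:
    """Yield jobs one at a time in an order that respects all precedences.

    `successors[j]` lists the jobs that may only start after job `j`.
    (Hypothetical input format chosen for this sketch.)
    """
    # Preprocessing: count the unfinished predecessors of every job.
    indegree = {job: 0 for job in successors}
    for outs in successors.values():
        for nxt in outs:
            indegree[nxt] = indegree.get(nxt, 0) + 1
    ready = deque(job for job, deg in indegree.items() if deg == 0)

    emitted = 0
    while ready:
        job = ready.popleft()
        yield job  # the caller can start executing this job immediately
        emitted += 1
        # Delay step: only the successors of the emitted job are updated.
        for nxt in successors.get(job, []):
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                ready.append(nxt)

    if emitted != len(indegree):
        raise ValueError("precedence graph contains a cycle")


if __name__ == "__main__":
    precedence = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": []}
    for job in incremental_topological_order(precedence):
        print("start", job)
```

Because the function is a generator, the first job is available as soon as preprocessing finishes, which is the time-to-first-output property the abstract is concerned with.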
We complement our theoretical results with experiments that highlight the improved time-to-first-output and discuss research opportunities for similar incremental approaches for other scheduling problems. Stefan Neubert Katrin Casel Copyright (c) 2024 Association for the Advancement of Artificial Intelligence 2024-05-30 2024-05-30 34 405 413 10.1609/icaps.v34i1.31500 Lookahead Pathology in Monte-Carlo Tree Search https://ojs.aaai.org/index.php/ICAPS/article/view/31501 Monte-Carlo Tree Search (MCTS) is a search paradigm that first found prominence with its success in the domain of computer Go. Early theoretical work established the soundness and convergence bounds for Upper Confidence bounds applied to Trees (UCT), the most popular instantiation of MCTS; however, there remain notable gaps in our understanding of how UCT behaves in practice. In this work, we address one such gap by considering the question of whether UCT can exhibit lookahead pathology in adversarial settings, a paradoxical phenomenon first observed in Minimax search where greater search effort leads to worse decision-making. We introduce a novel family of synthetic games that offer rich modeling possibilities while remaining amenable to mathematical analysis. Our theoretical and experimental results suggest that UCT is indeed susceptible to pathological behavior in a range of games drawn from this family. Khoi P. N. Nguyen Raghuram Ramanujan Copyright (c) 2024 Association for the Advancement of Artificial Intelligence 2024-05-30 2024-05-30 34 414 422 10.1609/icaps.v34i1.31501 Large Language Models as Planning Domain Generators https://ojs.aaai.org/index.php/ICAPS/article/view/31502 Developing domain models is one of the few remaining places that require manual human labor in AI planning. Thus, in order to make planning more accessible, it is desirable to automate the process of domain model generation. 
To this end, we investigate if large language models (LLMs) can be used to generate planning domain models from simple textual descriptions. Specifically, we introduce a framework for automated evaluation of LLM-generated domains by comparing the sets of plans for domain instances. Finally, we perform an empirical analysis of 7 large language models, including coding and chat models across 9 different planning domains, and under three classes of natural language domain descriptions. Our results indicate that LLMs, particularly those with high parameter counts, exhibit a moderate level of proficiency in generating correct planning domains from natural language descriptions. Our code is available at https://github.com/IBM/NL2PDDL. James Oswald Kavitha Srinivas Harsha Kokel Junkyu Lee Michael Katz Shirin Sohrabi Copyright (c) 2024 Association for the Advancement of Artificial Intelligence 2024-05-30 2024-05-30 34 423 431 10.1609/icaps.v34i1.31502 On the Prospects of Incorporating Large Language Models (LLMs) in Automated Planning and Scheduling (APS) https://ojs.aaai.org/index.php/ICAPS/article/view/31503 Automated Planning and Scheduling is among the growing areas in Artificial Intelligence (AI) where mention of LLMs has gained popularity. Based on a comprehensive review of 126 papers, this paper investigates eight categories based on the unique applications of LLMs in addressing various aspects of planning problems: language translation, plan generation, model construction, multi-agent planning, interactive planning, heuristics optimization, tool integration, and brain-inspired planning. For each category, we articulate the issues considered and existing gaps. A critical insight resulting from our review is that the true potential of LLMs unfolds when they are integrated with traditional symbolic planners, pointing towards a promising neuro-symbolic approach. 
This approach effectively combines the generative aspects of LLMs with the precision of classical planning methods. By synthesizing insights from existing literature, we underline the potential of this integration to address complex planning challenges. Our goal is to encourage the ICAPS community to recognize the complementary strengths of LLMs and symbolic planners, advocating for a direction in automated planning that leverages these synergistic capabilities to develop more advanced and intelligent planning systems. We aim to keep the categorization of papers updated on https://ai4society.github.io/LLM-Planning-Viz/, a collaborative resource that allows researchers to contribute and add new literature to the categorization. Vishal Pallagani Bharath Chandra Muppasani Kaushik Roy Francesco Fabiano Andrea Loreggia Keerthiram Murugesan Biplav Srivastava Francesca Rossi Lior Horesh Amit Sheth Copyright (c) 2024 Association for the Advancement of Artificial Intelligence 2024-05-30 2024-05-30 34 432 444 10.1609/icaps.v34i1.31503 Transition Landmarks from Abstraction Cuts https://ojs.aaai.org/index.php/ICAPS/article/view/31504 We introduce transition-counting constraints as a principled tool to formalize constraints that must hold in every solution of a transition system. We then show how to obtain transition landmark constraints from abstraction cuts. Transition landmarks dominate operator landmarks in theory but require solving a linear program that is prohibitively large in practice. We compare different constraints that project away transition-counting variables and then further relax the constraint. For one important special case, we provide a lossless projection. We finally discuss efficient data structures to derive cuts from abstractions and store them in a way that avoids repeated computation in every state. We compare the resulting heuristics both theoretically and on benchmarks from the international planning competition. 
Florian Pommerening Clemens Büchner Thomas Keller Copyright (c) 2024 Association for the Advancement of Artificial Intelligence 2024-05-30 2024-05-30 34 445 454 10.1609/icaps.v34i1.31504 Computing Planning Centroids and Minimum Covering States Using Symbolic Bidirectional Search https://ojs.aaai.org/index.php/ICAPS/article/view/31505 In some scenarios, planning agents might be interested in reaching states that keep certain relationships with respect to a set of goals. Recently, two of these types of states were proposed: centroids, which minimize the average distance to the goals; and minimum covering states, which minimize the maximum distance to the goals. Previous approaches compute these states by searching forward either in the original or a reformulated task. In this paper, we propose several algorithms that use symbolic bidirectional search to efficiently compute centroids and minimum covering states. Experimental results in existing and novel benchmarks show that our algorithms scale much better than previous approaches, establishing a new state-of-the-art technique for this problem. Alberto Pozanco Álvaro Torralba Daniel Borrajo Copyright (c) 2024 Association for the Advancement of Artificial Intelligence 2024-05-30 2024-05-30 34 455 463 10.1609/icaps.v34i1.31505 SayNav: Grounding Large Language Models for Dynamic Planning to Navigation in New Environments https://ojs.aaai.org/index.php/ICAPS/article/view/31506 Semantic reasoning and dynamic planning capabilities are crucial for an autonomous agent to perform complex navigation tasks in unknown environments. Succeeding in these tasks requires a large amount of the common-sense knowledge that humans possess. We present SayNav, a new approach that leverages human knowledge from Large Language Models (LLMs) for efficient generalization to complex navigation tasks in unknown large-scale environments. 
SayNav uses a novel grounding mechanism that incrementally builds a 3D scene graph of the explored environment as input to LLMs, for generating feasible and contextually appropriate high-level plans for navigation. The LLM-generated plan is then executed by a pre-trained low-level planner that treats each planned step as a short-distance point-goal navigation sub-task. SayNav dynamically generates step-by-step instructions during navigation and continuously refines future steps based on newly perceived information. We evaluate SayNav on the multi-object navigation (MultiON) task, which requires the agent to utilize a massive amount of human knowledge to efficiently search for multiple different objects in an unknown environment. We also introduce a benchmark dataset for the MultiON task employing the ProcTHOR framework, which provides large photo-realistic indoor environments with a variety of objects. SayNav achieves state-of-the-art results and even outperforms an oracle-based baseline with strong ground-truth assumptions by more than 8% in terms of success rate, highlighting its ability to generate dynamic plans for successfully locating objects in large-scale new environments. The code, benchmark dataset and demonstration videos are accessible at https://www.sri.com/ics/computer-vision/saynav. Abhinav Rajvanshi Karan Sikka Xiao Lin Bhoram Lee Han-Pang Chiu Alvaro Velasquez Copyright (c) 2024 Association for the Advancement of Artificial Intelligence 2024-05-30 2024-05-30 34 464 474 10.1609/icaps.v34i1.31506 Online Control of Adaptive Large Neighborhood Search Using Deep Reinforcement Learning https://ojs.aaai.org/index.php/ICAPS/article/view/31507 The Adaptive Large Neighborhood Search (ALNS) algorithm has shown considerable success in solving combinatorial optimization problems (COPs). Nonetheless, the performance of ALNS relies on the proper configuration of its selection and acceptance parameters, which is known to be a complex and resource-intensive task. 
To address this, we introduce a Deep Reinforcement Learning (DRL) based approach called DR-ALNS that selects operators, adjusts parameters, and controls the acceptance criterion throughout the search. The proposed method aims to learn, based on the state of the search, to configure ALNS for the next iteration to yield more effective solutions for the given optimization problem. We evaluate the proposed method on an orienteering problem with stochastic weights and time windows, as presented in an IJCAI competition. The results show that our approach outperforms vanilla ALNS, ALNS tuned with Bayesian optimization, and two state-of-the-art DRL approaches that were the winning methods of the competition, achieving this with significantly fewer training observations. Furthermore, we demonstrate several good properties of the proposed DR-ALNS method: it is easily adapted to solve different routing problems, its learned policies perform consistently well across various instance sizes, and these policies can be directly applied to different problem variants. Robbert Reijnen Yingqian Zhang Hoong Chuin Lau Zaharah Bukhsh Copyright (c) 2024 Association for the Advancement of Artificial Intelligence 2024-05-30 2024-05-30 34 475 483 10.1609/icaps.v34i1.31507 Map Connectivity and Empirical Hardness of Grid-based Multi-Agent Pathfinding Problem https://ojs.aaai.org/index.php/ICAPS/article/view/31508 We present an empirical study of the relationship between map connectivity and the empirical hardness of the multi-agent pathfinding (MAPF) problem. By analyzing the second smallest eigenvalue (commonly known as lambda2) of the normalized Laplacian matrix of different maps, our initial study indicates that maps with smaller lambda2 tend to create more challenging instances when agents are generated uniformly randomly. 
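As an illustration of the connectivity measure named in the Map Connectivity abstract, the sketch below computes lambda2, the second-smallest eigenvalue of the normalized Laplacian, for a small 4-connected grid map. It is not the authors' code. The map encoding ('.' for free cells, any other character for obstacles), the function name, and the convention of giving isolated cells a zero row are assumptions made for this example; NumPy is assumed to be installed.

```python
import numpy as np


def lambda2(grid):
    """Second-smallest eigenvalue of the normalized Laplacian of a grid map.

    `grid` is a list of equal-length strings; '.' marks a free cell and any
    other character an obstacle (hypothetical encoding for this sketch).
    Free cells are connected to their 4-neighbours.
    """
    cells = [(r, c) for r, row in enumerate(grid)
             for c, ch in enumerate(row) if ch == "."]
    if len(cells) < 2:
        raise ValueError("need at least two free cells")
    index = {cell: i for i, cell in enumerate(cells)}
    n = len(cells)

    # Build the symmetric adjacency matrix (right and down neighbours suffice).
    adj = np.zeros((n, n))
    for (r, c), i in index.items():
        for dr, dc in ((1, 0), (0, 1)):
            j = index.get((r + dr, c + dc))
            if j is not None:
                adj[i, j] = adj[j, i] = 1.0

    deg = adj.sum(axis=1)
    connected = deg > 0
    inv_sqrt = np.zeros(n)
    inv_sqrt[connected] = 1.0 / np.sqrt(deg[connected])

    # L = I - D^{-1/2} A D^{-1/2}, with zero rows for isolated cells.
    lap = np.diag(connected.astype(float)) - inv_sqrt[:, None] * adj * inv_sqrt[None, :]
    eigenvalues = np.linalg.eigvalsh(lap)  # returned in ascending order
    return float(eigenvalues[1])


if __name__ == "__main__":
    open_room = ["....", "....", "...."]
    corridor = ["....", "@@@.", "...."]
    print(lambda2(open_room), lambda2(corridor))
```

A value of zero indicates a disconnected map; among connected maps, smaller values correspond to narrower bottlenecks between regions.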
Additionally, we introduce a map generator based on Quality Diversity (QD) that is capable of producing maps with specified lambda2 ranges, offering a possible way of generating challenging MAPF instances. Despite the absence of a strict monotonic correlation between lambda2 and the empirical hardness of MAPF, this study serves as a valuable initial investigation for gaining a deeper understanding of what makes a MAPF instance hard to solve. Jingyao Ren Eric Ewing T. K. Satish Kumar Sven Koenig Nora Ayanian Copyright (c) 2024 Association for the Advancement of Artificial Intelligence 2024-05-30 2024-05-30 34 484 488 10.1609/icaps.v34i1.31508 The Story So Far on Narrative Planning https://ojs.aaai.org/index.php/ICAPS/article/view/31509 Narrative planning is the use of automated planning to construct, communicate, and understand stories, a form of information to which human cognition and enaction are predisposed. We review the narrative planning problem in a manner suitable as an introduction to the area, survey different plan-based methodologies and affordances for reasoning about narrative, and discuss open challenges relevant to the broader AI community. Rogelio E. Cardona Rivera Arnav Jhala Julie Porteous R. Michael Young Copyright (c) 2024 Association for the Advancement of Artificial Intelligence 2024-05-30 2024-05-30 34 489 499 10.1609/icaps.v34i1.31509 Learning General Policies for Planning through GPT Models https://ojs.aaai.org/index.php/ICAPS/article/view/31510 Transformer-based architectures, such as T5, BERT and GPT, have demonstrated revolutionary capabilities in Natural Language Processing. Several studies showed that deep learning models using these architectures not only possess remarkable linguistic knowledge, but they also exhibit forms of factual knowledge, common sense, and even programming skills. 
However, the scientific community still debates about their reasoning capabilities, which have been recently tested in the context of automated AI planning; the literature presents mixed results, and the prevailing view is that current transformer-based models may not be adequate for planning. In this paper, we address this challenge differently. We introduce a GPT-based model customised for planning (PLANGPT) to learn a general policy for classical planning by training the model from scratch with a dataset of solved planning instances. Once PLANGPT has been trained for a domain, it can be used to generate a solution plan for an input problem instance in that domain. Our training procedure exploits automated planning knowledge to enhance the performance of the trained model. We build and evaluate our GPT model with several planning domains, and we compare its performance w.r.t. other recent deep learning techniques for generalised planning, demonstrating the effectiveness of the proposed approach. Nicholas Rossetti Massimiliano Tummolo Alfonso Emilio Gerevini Luca Putelli Ivan Serina Mattia Chiari Matteo Olivato Copyright (c) 2024 Association for the Advancement of Artificial Intelligence 2024-05-30 2024-05-30 34 500 508 10.1609/icaps.v34i1.31510 Efficiently Computing Transitions in Cartesian Abstractions https://ojs.aaai.org/index.php/ICAPS/article/view/31511 Counterexample-guided Cartesian abstraction refinement yields strong heuristics for optimal classical planning. The approach iteratively finds a new abstract solution, checks where it fails for the original task and refines the abstraction to avoid the same failure in subsequent iterations. The main bottleneck of this refinement loop is the memory needed for storing all abstract transitions. To address this issue, we introduce an algorithm that efficiently computes abstract transitions on demand. 
This drastically reduces the memory consumption and allows us to solve tasks during the refinement loop and during the search that were previously out of reach. Jendrik Seipp Copyright (c) 2024 Association for the Advancement of Artificial Intelligence 2024-05-30 2024-05-30 34 509 513 10.1609/icaps.v34i1.31511 Imitating Cost-Constrained Behaviors in Reinforcement Learning https://ojs.aaai.org/index.php/ICAPS/article/view/31512 Complex planning and scheduling problems have long been solved using various optimization or heuristic approaches. In recent years, imitation learning that aims to learn from expert demonstrations has been proposed as a viable alternative to solving these problems. Generally speaking, imitation learning is designed to learn either the reward (or preference) model or directly the behavioral policy by observing the behavior of an expert. Existing work in imitation learning and inverse reinforcement learning has focused on imitation primarily in unconstrained settings (e.g., no limit on fuel consumed by the vehicle). However, in many real-world domains, the behavior of an expert is governed not only by reward (or preference) but also by constraints. For instance, decisions on self-driving delivery vehicles are dependent not only on the route preferences/rewards (depending on past demand data) but also on the fuel in the vehicle and the time available. In such problems, imitation learning is challenging as decisions are not only dictated by the reward model but are also dependent on a cost-constrained model. In this paper, we provide multiple methods that match expert distributions in the presence of trajectory cost constraints through (a) Lagrangian-based method; (b) Meta-gradients to find a good trade-off between expected return and minimizing constraint violation; and (c) Cost-violation-based alternating gradient. 
We empirically show that leading imitation learning approaches imitate cost-constrained behaviors poorly and that our meta-gradient-based approach achieves the best performance. Qian Shao Pradeep Varakantham Shih-Fen Cheng Copyright (c) 2024 Association for the Advancement of Artificial Intelligence 2024-05-30 2024-05-30 34 514 522 10.1609/icaps.v34i1.31512 Accelerating Search-Based Planning for Multi-Robot Manipulation by Leveraging Online-Generated Experiences https://ojs.aaai.org/index.php/ICAPS/article/view/31513 An exciting frontier in robotic manipulation is the use of multiple arms at once. However, planning concurrent motions is a challenging task using current methods. The high-dimensional composite state space renders many well-known motion planning algorithms intractable. Recently, Multi-Agent Path Finding (MAPF) algorithms have shown promise in discrete 2D domains, providing rigorous guarantees. However, widely used conflict-based methods in MAPF assume an efficient single-agent motion planner. This poses challenges in adapting them to manipulation cases where this assumption does not hold, due to the high dimensionality of configuration spaces and the computational bottlenecks associated with collision checking. To this end, we propose an approach for accelerating conflict-based search algorithms by leveraging their repetitive and incremental nature, making them tractable for use in complex scenarios involving multi-arm coordination in obstacle-laden environments. We show that our method preserves completeness and bounded sub-optimality guarantees, and demonstrate its practical efficacy through a set of experiments with up to 10 robotic arms. 
Yorai Shaoul Itamar Mishani Maxim Likhachev Jiaoyang Li Copyright (c) 2024 Association for the Advancement of Artificial Intelligence 2024-05-30 2024-05-30 34 523 531 10.1609/icaps.v34i1.31513 Logical Specifications-guided Dynamic Task Sampling for Reinforcement Learning Agents https://ojs.aaai.org/index.php/ICAPS/article/view/31514 Reinforcement Learning (RL) has made significant strides in enabling artificial agents to learn diverse behaviors. However, learning an effective policy often requires a large number of environment interactions. To mitigate sample complexity issues, recent approaches have used high-level task specifications, such as Linear Temporal Logic (LTLf) formulas or Reward Machines (RM), to guide the learning progress of the agent. In this work, we propose a novel approach, called Logical Specifications-guided Dynamic Task Sampling (LSTS), that learns a set of RL policies to guide an agent from an initial state to a goal state based on a high-level task specification, while minimizing the number of environmental interactions. Unlike previous work, LSTS does not assume information about the environment dynamics or the Reward Machine, and dynamically samples promising tasks that lead to successful goal policies. We evaluate LSTS on a gridworld and show that it achieves improved time-to-threshold performance on complex sequential decision-making problems compared to state-of-the-art RM and Automaton-guided RL baselines, such as Q-Learning for Reward Machines and Compositional RL from logical Specifications (DIRL). Moreover, we demonstrate that our method outperforms RM and Automaton-guided RL baselines in terms of sample-efficiency, both in a partially observable robotic task and in a continuous control robotic manipulation task. Yash Shukla Tanushree Burman Abhishek N. 
Kulkarni Robert Wright Alvaro Velasquez Jivko Sinapov Copyright (c) 2024 Association for the Advancement of Artificial Intelligence 2024-05-30 2024-05-30 34 532 540 10.1609/icaps.v34i1.31514 Merging or Computing Saturated Cost Partitionings? A Merge Strategy for the Merge-and-Shrink Framework https://ojs.aaai.org/index.php/ICAPS/article/view/31515 The merge-and-shrink framework is a powerful tool for computing abstraction heuristics for optimal classical planning. Merging is one of its name-giving transformations. It entails computing the product of two factors of a factored transition system. To decide which two factors to merge, the framework uses a merge strategy. While there exist many merge strategies, it is generally unclear what constitutes a strong merge strategy, and a previous analysis shows that there is still lots of room for improvement with existing merge strategies. In this paper, we devise a new scoring function for score-based merge strategies based on answering the question whether merging two factors has any benefits over computing saturated cost partitioning heuristics over the factors instead. Our experimental evaluation shows that our new merge strategy achieves state-of-the-art performance on IPC benchmarks. Silvan Sievers Thomas Keller Gabriele Röger Copyright (c) 2024 Association for the Advancement of Artificial Intelligence 2024-05-30 2024-05-30 34 541 545 10.1609/icaps.v34i1.31515 Decoupled Search for the Masses: A Novel Task Transformation for Classical Planning https://ojs.aaai.org/index.php/ICAPS/article/view/31516 Automated problem reformulation is a common technique in classical planning to identify and exploit problem structures. Decoupled search is an approach that automatically decomposes planning tasks based on their causal structure, often significantly reducing the search effort. However, its broad applicability is limited by the need for specialized algorithms. 
In this paper, we present an approach that embodies decoupled search for non-optimal planning through a novel task transformation. Specifically, given a task and a decomposition, we create a transformed task such that the state space of the transformed task is isomorphic to that of decoupled search on the original task. This eliminates the need for specialized algorithms and allows the use of various planning technologies in the decoupled-search framework. Empirical evaluation shows that our method is competitive with specialized decoupled algorithms and compares favorably to other related problem reformulation techniques. David Speck Daniel Gnad Copyright (c) 2024 Association for the Advancement of Artificial Intelligence 2024-05-30 2024-05-30 34 546 554 10.1609/icaps.v34i1.31516 Explaining the Space of SSP Policies via Policy-Property Dependencies: Complexity, Algorithms, and Relation to Multi-Objective Planning https://ojs.aaai.org/index.php/ICAPS/article/view/31517 Stochastic shortest path (SSP) problems are a common framework for planning under uncertainty. However, the reactive structure of their solution policies is typically not easily comprehensible by an end-user, nor do planners justify the reasons behind their choice of a particular policy over others. To strengthen confidence in the planner's decision-making, recent work in classical planning has introduced a framework for explaining to the user the possible solution space in terms of necessary trade-offs between user-provided plan properties. Here, we extend this framework to SSPs. We introduce a notion of policy properties taking into account action-outcome uncertainty. We analyze formally the computational problem of identifying the exclusion relationships between policy properties, showing that this problem is in fact harder than SSP planning in a complexity-theoretic sense. 
We show that all the relationships can be identified through a series of heuristic searches, which, if ordered in a clever way, yields an anytime algorithm. Further, we introduce an alternative method, which leverages a connection to multi-objective probabilistic planning to move all the computational burden to a preprocessing step. Finally, we explore empirically the feasibility of the proposed explanation methodology on a range of adapted IPPC benchmarks. Marcel Steinmetz Sylvie Thiébaux Daniel Höller Florent Teichteil-Königsbuch Copyright (c) 2024 Association for the Advancement of Artificial Intelligence 2024-05-30 2024-05-30 34 555 564 10.1609/icaps.v34i1.31517 Addressing Myopic Constrained POMDP Planning with Recursive Dual Ascent https://ojs.aaai.org/index.php/ICAPS/article/view/31518 Lagrangian-guided Monte Carlo tree search with global dual ascent has been applied to solve large constrained partially observable Markov decision processes (CPOMDPs) online. In this work, we demonstrate that these global dual parameters can lead to myopic action selection during exploration, ultimately leading to suboptimal decision making. To address this, we introduce history-dependent dual variables that guide local action selection and are optimized with recursive dual ascent. We empirically compare the performance of our approach on a motivating toy example and two large CPOMDPs, demonstrating improved exploration, and ultimately, safer outcomes. Paula Stocco Suhas Chundi Arec Jamgochian Mykel J. Kochenderfer Copyright (c) 2024 Association for the Advancement of Artificial Intelligence 2024-05-30 2024-05-30 34 565 569 10.1609/icaps.v34i1.31518 Robust Multi-Agent Pathfinding with Continuous Time https://ojs.aaai.org/index.php/ICAPS/article/view/31519 Multi-Agent Pathfinding (MAPF) is the problem of finding plans for multiple agents such that every agent moves from its start location to its goal location without collisions. 
If unexpected events delay some agents during plan execution, it may not be possible for the agents to continue following their plans without causing any collision. We define and solve a T-robust MAPF problem that seeks plans that can be followed even if some delays occur, under the generalized MAPFR setting with continuous time notions. The proposed approach is complete and provides provably optimal solutions. We also develop an exact method for collision detection among agents that can be delayed. We experimentally evaluate our proposed approach in terms of efficiency and plan cost. Wen Jun Tan Xueyan Tang Wentong Cai Copyright (c) 2024 Association for the Advancement of Artificial Intelligence 2024-05-30 2024-05-30 34 570 578 10.1609/icaps.v34i1.31519 Multi-Robot Connected Fermat Spiral Coverage https://ojs.aaai.org/index.php/ICAPS/article/view/31520 We introduce Multi-Robot Connected Fermat Spiral (MCFS), a novel algorithmic framework for Multi-Robot Coverage Path Planning (MCPP) that adapts Coverage Fermat Spiral (CFS) from the computer graphics community to multi-robot coordination for the first time. MCFS uniquely enables the orchestration of multiple robots to generate coverage paths that contour around arbitrarily shaped obstacles, a feature notably lacking in traditional methods. Our framework not only enhances area coverage and optimizes task performance, particularly in terms of makespan, for workspaces rich in irregular obstacles but also addresses the challenges of path continuity and curvature critical for non-holonomic robots by generating smooth paths without decomposing the workspace. MCFS solves MCPP by constructing a graph of isolines and transforming MCPP into a combinatorial optimization problem, aiming to minimize the makespan while covering all vertices. 
Our contributions include developing a unified CFS version for scalable and adaptable MCPP, extending it to MCPP with novel optimization techniques for cost reduction and path continuity and smoothness, and demonstrating through extensive experiments that MCFS outperforms existing MCPP methods in makespan, path curvature, coverage ratio, and overlapping ratio. Our research marks a significant step in MCPP, showcasing the fusion of computer graphics and automated planning principles to advance the capabilities of multi-robot systems in complex environments. Our code is publicly available at https://github.com/reso1/MCFS. Jingtao Tang Hang Ma Copyright (c) 2024 Association for the Advancement of Artificial Intelligence 2024-05-30 2024-05-30 34 579 587 10.1609/icaps.v34i1.31520 Optimal Infinite Temporal Planning: Cyclic Plans for Priced Timed Automata https://ojs.aaai.org/index.php/ICAPS/article/view/31521 Many applications require infinite plans (i.e., infinite sequences of actions) in order to carry out some given process indefinitely. In addition, it is desirable to guarantee optimality. In this paper, we address this problem in the setting of doubly-priced timed automata, where we show how to efficiently compute ratio-optimal cycles for optimal infinite plans. For efficient computation, we present symbolic λ-deduction (S-λD), an anytime algorithm that uses a symbolic representation (priced zones) to search the state-space with a compact representation of the time constraints. Our approach guarantees termination while arriving at an optimal solution. Our experimental evaluation shows that S-λD outperforms the alternative of searching in the concrete state space; is very robust with respect to fine-grained temporal constraints; and has a very good anytime behaviour. Rasmus G. Tollund Nicklas S. Johansen Kristian Ø. Nielsen Álvaro Torralba Kim G. 
Larsen Copyright (c) 2024 Association for the Advancement of Artificial Intelligence 2024-05-30 2024-05-30 34 588 596 10.1609/icaps.v34i1.31521 Improving Learnt Local MAPF Policies with Heuristic Search https://ojs.aaai.org/index.php/ICAPS/article/view/31522 Multi-agent path finding (MAPF) is the problem of finding collision-free paths for a team of agents to reach their goal locations. State-of-the-art classical MAPF solvers typically employ heuristic search to find solutions for hundreds of agents but are typically centralized and can struggle to scale when run with short timeouts. Machine learning (ML) approaches that learn policies for each agent are appealing as these could enable decentralized systems and scale well while maintaining good solution quality. Current ML approaches to MAPF have proposed methods that have started to scratch the surface of this potential. However, state-of-the-art ML approaches produce "local" policies that only plan for a single timestep and have poor success rates and scalability. Our main idea is that we can improve an ML local policy by using heuristic search methods on the output probability distribution to resolve deadlocks and enable full-horizon planning. We show several model-agnostic ways to use heuristic search with learnt policies that significantly improve the policies' success rates and scalability. To the best of our knowledge, this is the first time ML-based MAPF approaches have scaled to high-congestion scenarios (e.g., 20% agent density). Rishi Veerapaneni Qian Wang Kevin Ren Arthur Jakobsson Jiaoyang Li Maxim Likhachev Copyright (c) 2024 Association for the Advancement of Artificial Intelligence 2024-05-30 2024-05-30 34 597 606 10.1609/icaps.v34i1.31522 Neural Action Policy Safety Verification: Applicability Filtering https://ojs.aaai.org/index.php/ICAPS/article/view/31523 Neural networks (NN) are an increasingly important representation of action policies pi. 
Applicability filtering is a commonly used practice in this context, restricting the action selection in pi to only applicable actions. Policy predicate abstraction (PPA) has recently been introduced to verify safety of neural pi, through over-approximating the state space subgraph induced by pi. Thus far however, PPA does not permit applicability filtering, which is challenging due to the additional constraints that need to be taken into account. Here we overcome that limitation, through a range of algorithmic enhancements. In our experiments, our enhancements achieve several orders of magnitude speed-up over a baseline implementation, bringing PPA with applicability filtering close to the performance of PPA without such filtering. Marcel Vinzent Jörg Hoffmann Copyright (c) 2024 Association for the Advancement of Artificial Intelligence 2024-05-30 2024-05-30 34 607 612 10.1609/icaps.v34i1.31523 Efficient Approximate Search for Multi-Objective Multi-Agent Path Finding https://ojs.aaai.org/index.php/ICAPS/article/view/31524 The Multi-Objective Multi-Agent Path Finding (MO-MAPF) problem is the problem of computing collision-free paths for a team of agents while minimizing multiple cost metrics. Most existing MO-MAPF algorithms aim to compute the Pareto frontier. However, the Pareto frontier can be time-consuming to compute. Our first main contribution is BB-MO-CBS-pex, an approximate MO-MAPF algorithm that computes an approximate frontier for a user-specific approximation factor. BB-MO-CBS-pex builds upon BB-MO-CBS, a state-of-the-art MO-MAPF algorithm, and leverages A*pex, a state-of-the-art single-agent multi-objective search algorithm, to speed up different parts of BB-MO-CBS. We also provide two speed-up techniques for BB-MO-CBS-pex. Our second main contribution is BB-MO-CBS-k, which builds upon BB-MO-CBS-pex and computes up to k solutions for a user-provided k-value. BB-MO-CBS-k is useful when it is unclear how to determine an appropriate approximation factor. 
Our experimental results show that both BB-MO-CBS-pex and BB-MO-CBS-k solved significantly more instances than BB-MO-CBS for different approximation factors and k-values, respectively. Additionally, we compare BB-MO-CBS-pex with an approximate baseline algorithm derived from BB-MO-CBS and show that BB-MO-CBS-pex achieved speed-ups up to two orders of magnitude. Fangji Wang Han Zhang Sven Koenig Jiaoyang Li Copyright (c) 2024 Association for the Advancement of Artificial Intelligence 2024-05-30 2024-05-30 34 613 622 10.1609/icaps.v34i1.31524 MAPF in 3D Warehouses: Dataset and Analysis https://ojs.aaai.org/index.php/ICAPS/article/view/31525 Recent works have made significant progress in multi-agent path finding (MAPF), with modern methods being able to scale to hundreds of agents, handle unexpected delays, work in groups, etc. The vast majority of these methods have focused on 2D "grid world" domains. However, modern warehouses often utilize multi-agent robotic systems that can move in 3D, enabling dense storage but resulting in a more complex multi-agent planning problem. Motivated by this, we introduce and experimentally analyze the application of MAPF to 3D warehouse management, and release the first (see http://mapf.info/index.php/Main/Benchmarks) open-source 3D MAPF dataset. We benchmark two state-of-the-art MAPF methods, EECBS and MAPF-LNS2, and show how different hyper-parameters affect these methods across various 3D MAPF problems. We also investigate how the warehouse structure itself affects MAPF performance. Based on our experimental analysis, we find that a fast low-level search is critical for 3D MAPF, EECBS's suboptimality significantly changes the effect of certain CBS techniques, and certain warehouse designs can noticeably influence MAPF scalability and speed. 
An additional important observation is that, overall, the tested 2D MAPF techniques scaled well to 3D warehouses, demonstrating how the MAPF community's progress in 2D can generalize to 3D warehouses. Qian Wang Rishi Veerapaneni Yu Wu Jiaoyang Li Maxim Likhachev Copyright (c) 2024 Association for the Advancement of Artificial Intelligence 2024-05-30 2024-05-30 34 623 632 10.1609/icaps.v34i1.31525 Learning Generalised Policies for Numeric Planning https://ojs.aaai.org/index.php/ICAPS/article/view/31526 We extend Action Schema Networks (ASNets) to learn generalised policies for numeric planning, which features quantitative numeric state variables, preconditions and effects. We propose a neural network architecture that can reason about the numeric variables both directly and in context of other variables. We also develop a dynamic exploration algorithm for more efficient training, by better balancing the exploration versus learning tradeoff to account for the greater computational demand of numeric teacher planners. Experimentally, we find that the learned generalised policies are capable of outperforming traditional numeric planners on some domains, and that the dynamic exploration algorithm is on average much faster at learning effective generalised policies than the original ASNets training algorithm. Ryan Xiao Wang Sylvie Thiébaux Copyright (c) 2024 Association for the Advancement of Artificial Intelligence 2024-05-30 2024-05-30 34 633 642 10.1609/icaps.v34i1.31526 Tightest Admissible Shortest Path https://ojs.aaai.org/index.php/ICAPS/article/view/31527 The shortest path problem in graphs is fundamental to AI. Nearly all variants of the problem and relevant algorithms that solve them ignore edge-weight computation time and its common relation to weight uncertainty. This implies that taking these factors into consideration can potentially lead to a performance boost in relevant applications. 
Recently, a generalized framework for weighted directed graphs was suggested, where edge-weight can be computed (estimated) multiple times, at increasing accuracy and run-time expense. We build on this framework to introduce the problem of finding the tightest admissible shortest path (TASP); a path with the tightest suboptimality bound on the optimal cost. This is a generalization of the shortest path problem to bounded uncertainty, where edge-weight uncertainty can be traded for computational cost. We present a complete algorithm for solving TASP, with guarantees on solution quality. Empirical evaluation supports the effectiveness of this approach. Eyal Weiss Ariel Felner Gal A. Kaminka Copyright (c) 2024 Association for the Advancement of Artificial Intelligence 2024-05-30 2024-05-30 34 643 652 10.1609/icaps.v34i1.31527 Neuro-Symbolic Learning of Lifted Action Models from Visual Traces https://ojs.aaai.org/index.php/ICAPS/article/view/31528 Model-based planners rely on action models to describe available actions in terms of their preconditions and effects. Nonetheless, manually encoding such models is challenging, especially in complex domains. Numerous methods have been proposed to learn action models from examples of plan execution traces. However, high-level information, such as state labels within traces, is often unavailable and needs to be inferred indirectly from raw observations. In this paper, we aim to learn lifted action models from visual traces, i.e., sequences of image-action pairs depicting discrete successive trace steps. We present ROSAME, a differentiable neuRO-Symbolic Action Model lEarner that infers action models from traces consisting of probabilistic state predictions and actions. By combining ROSAME with a deep learning computer vision model, we create an end-to-end framework that jointly learns state predictions from images and infers symbolic action models. 
Experimental results demonstrate that our method succeeds in both tasks, using different visual state representations, with the learned action models often matching or even surpassing those created by humans. Kai Xi Stephen Gould Sylvie Thiébaux Copyright (c) 2024 Association for the Advancement of Artificial Intelligence 2024-05-30 2024-05-30 34 653 662 10.1609/icaps.v34i1.31528 Control in Stochastic Environment with Delays: A Model-based Reinforcement Learning Approach https://ojs.aaai.org/index.php/ICAPS/article/view/31529 In this paper, we introduce a new reinforcement learning method for control problems in environments with delayed feedback. Specifically, our method employs stochastic planning, whereas previous methods used deterministic planning. This allows us to embed risk preference in the policy optimization problem. We show that this formulation can recover the optimal policy for problems with deterministic transitions. We contrast our policy with two prior methods from the literature. We apply the methodology to simple tasks to understand its features. Then, we compare the performance of the methods in controlling multiple Atari games. Zhiyuan Yao Ionut Florescu Chihoon Lee Copyright (c) 2024 Association for the Advancement of Artificial Intelligence 2024-05-30 2024-05-30 34 663 670 10.1609/icaps.v34i1.31529 Contrastive Explanations of Centralized Multi-agent Optimization Solutions https://ojs.aaai.org/index.php/ICAPS/article/view/31530 In many real-world scenarios, agents are involved in optimization problems. Since most of these scenarios are over-constrained, optimal solutions do not always satisfy all agents. Some agents might be unhappy and ask questions of the form “Why does solution S not satisfy property P?”. 
We propose CMAOE, a domain-independent approach to obtain contrastive explanations by: (i) generating a new solution S′ where property P is enforced, while also minimizing the differences between S and S′; and (ii) highlighting the differences between the two solutions, with respect to the features of the objective function of the multi-agent system. Such explanations aim to help agents understand why the initial solution is better in the context of the multi-agent system than what they expected. We have carried out a computational evaluation that shows that CMAOE can generate contrastive explanations for large multi-agent optimization problems. We have also performed an extensive user study in four different domains that shows that: (i) after being presented with these explanations, humans’ satisfaction with the original solution increases; and (ii) the contrastive explanations generated by CMAOE are preferred or equally preferred by humans over the ones generated by state-of-the-art approaches. Parisa Zehtabi Alberto Pozanco Ayala Bolch Daniel Borrajo Sarit Kraus Copyright (c) 2024 Association for the Advancement of Artificial Intelligence 2024-05-30 2024-05-30 34 671 679 10.1609/icaps.v34i1.31530 Bounded-Suboptimal Weight-Constrained Shortest-Path Search via Efficient Representation of Paths https://ojs.aaai.org/index.php/ICAPS/article/view/31531 In the Weight-Constrained Shortest-Path (WCSP) problem, given a graph in which each edge is annotated with a cost and a weight, a start state, and a goal state, the task is to compute a minimum-cost path from the start state to the goal state with weight no larger than a given weight limit. While most existing works have focused on solving the WCSP problem optimally, many real-world situations admit a trade-off between efficiency and a suboptimality bound for the path cost. 
In this paper, we propose the bounded-suboptimal WCSP algorithm WC-A*pex, which is built on the state-of-the-art approximate bi-objective search algorithm A*pex. WC-A*pex uses an approximate representation of paths with similar costs and weights to compute a (1+ε)-suboptimal path, for a given ε. During its search, WC-A*pex avoids storing all paths explicitly and thereby reduces the search effort while still retaining its (1 + ε)-suboptimality bound. On benchmark road networks, our experimental results show that WC-A*pex with ε = 0.01 (i.e., with a guaranteed suboptimality of at most 1%) achieves a speed-up of up to an order of magnitude over WC-A*, a state-of-the-art WCSP algorithm, and its bounded-suboptimal variant. Han Zhang Oren Salzman Ariel Felner T. K. Satish Kumar Sven Koenig Copyright (c) 2024 Association for the Advancement of Artificial Intelligence 2024-05-30 2024-05-30 34 680 688 10.1609/icaps.v34i1.31531 A Counter-Example Based Approach to Probabilistic Conformant Planning https://ojs.aaai.org/index.php/ICAPS/article/view/31532 This paper introduces a counter-example based approach for solving probabilistic conformant planning (PCP) problems. Our algorithm incrementally generates candidate plans and identifies counter-examples until it finds a plan for which the probability of success is above the specified threshold. We prove that the algorithm is sound and complete. We further propose a variation of our algorithm that uses hitting sets to accelerate the generation of candidate plans. Experimental results show that our planner is particularly suited for problems with a high probability threshold. 
Xiaodi Zhang Alban Grastien Charles Gretton Copyright (c) 2024 Association for the Advancement of Artificial Intelligence 2024-05-30 2024-05-30 34 689 697 10.1609/icaps.v34i1.31532 Improving the Efficiency and Efficacy of Multi-Agent Reinforcement Learning on Complex Railway Networks with a Local-Critic Approach https://ojs.aaai.org/index.php/ICAPS/article/view/31533 The complex railway network is a challenging real-world multi-agent system usually involving thousands of agents. Current planning methods heavily depend on expert knowledge to formulate solutions for specific cases and therefore generalize poorly to new scenarios, for which multi-agent reinforcement learning (MARL) has drawn significant attention. Despite some successful applications in multi-agent decision-making tasks, MARL is hard to scale to a large number of agents. This paper rethinks the curse of agents in the centralized-training-decentralized-execution (CTDE) paradigm and proposes a local-critic approach to address the issue. By combining the local critic with the PPO algorithm, we design a deep MARL algorithm denoted as local-critic PPO (LCPPO). In experiments, we evaluate the effectiveness of LCPPO on a complex railway network benchmark, Flatland, with various numbers of agents. Notably, LCPPO shows strong generalizability and robustness to changes in the environment. Yuan Zhang Umashankar Deekshith Jianhong Wang Joschka Boedecker Copyright (c) 2024 Association for the Advancement of Artificial Intelligence 2024-05-30 2024-05-30 34 698 706 10.1609/icaps.v34i1.31533 Planning and Execution in Multi-Agent Path Finding: Models and Algorithms https://ojs.aaai.org/index.php/ICAPS/article/view/31534 In applications of Multi-Agent Path Finding (MAPF), it is often the sum of planning and execution times that needs to be minimised (i.e., the Goal Achievement Time). Yet current methods seldom optimise for this objective. 
Optimal algorithms reduce execution time, but may require exponential planning time. Non-optimal algorithms reduce planning time, but at the expense of increased path length. To address these limitations we introduce PIE (Planning and Improving while Executing), a new framework for concurrent planning and execution in MAPF. We show how different instantiations of PIE affect practical performance, including initial planning time, action commitment time and concurrent vs. sequential planning and execution. We then adapt PIE to Lifelong MAPF, a popular application setting where agents are continuously assigned new goals and where additional decisions are required to ensure feasibility. We examine a variety of different approaches to overcome these challenges and we conduct comparative experiments vs. recently proposed alternatives. Results show that PIE substantially outperforms existing methods for One-shot and Lifelong MAPF. Yue Zhang Zhe Chen Daniel Harabor Pierre Le Bodic Peter J. Stuckey Copyright (c) 2024 Association for the Advancement of Artificial Intelligence 2024-05-30 2024-05-30 34 707 715 10.1609/icaps.v34i1.31534 Decentralized, Decomposition-Based Observation Scheduling for a Large-Scale Satellite Constellation https://ojs.aaai.org/index.php/ICAPS/article/view/31535 Deploying multi-satellite constellations for Earth observation requires coordinating potentially hundreds of spacecraft. With increasing on-board capability for autonomy, we can view the constellation as a multi-agent system (MAS) and employ decentralized scheduling solutions. We formulate the problem as a distributed constraint optimization problem (DCOP) and desire scalable inter-agent communication. The problem consists of millions of variables which, coupled with the structure, make existing DCOP algorithms inadequate for this application. 
We develop a scheduling approach that employs a well-coordinated heuristic, referred to as the Geometric Neighborhood Decomposition (GND) heuristic, to decompose the global DCOP into sub-problems so as to enable the application of DCOP algorithms. We present the Neighborhood Stochastic Search (NSS) algorithm, a decentralized algorithm to effectively solve the multi-satellite constellation observation scheduling problem using decomposition. In full, we identify the roadblocks of deploying DCOP solvers to a large-scale, real-world problem, propose a decomposition-based scheduling approach that is effective at tackling large-scale DCOPs, empirically evaluate the approach against other baseline algorithms to demonstrate its effectiveness, and discuss the generality of the approach. Itai Zilberstein Ananya Rao Matthew Salis Steve Chien Copyright (c) 2024 Association for the Advancement of Artificial Intelligence 2024-05-30 2024-05-30 34 716 724 10.1609/icaps.v34i1.31535