Proceedings of the International Conference on Automated Planning and Scheduling
https://ojs.aaai.org/index.php/ICAPS
The annual ICAPS conference series was formed in 2003 through the merger of two preexisting biennial conferences, the International Conference on Artificial Intelligence Planning and Scheduling (AIPS) and the European Conference on Planning (ECP). ICAPS continues the traditional high standards of AIPS and ECP as an archival forum for new research in the field of automated planning and scheduling. The Proceedings of the International Conference on Automated Planning and Scheduling contains the annual, archival published work of the ICAPS conference.
Publisher: Association for the Advancement of Artificial Intelligence
ISSN: 2334-0835

Specifying Goals to Deep Neural Networks with Answer Set Programming
https://ojs.aaai.org/index.php/ICAPS/article/view/31454
Recently, methods such as DeepCubeA have used deep reinforcement learning to learn domain-specific heuristic functions in a largely domain-independent fashion. However, such methods either assume a predetermined goal or assume that goals will be given as fully-specified states. Therefore, specifying a set of goal states to these learned heuristic functions is often impractical. To address this issue, we introduce a method of training a heuristic function that estimates the distance between a given state and a set of goal states represented as a set of ground atoms in first-order logic. Furthermore, to allow for more expressive goal specification, we introduce techniques for specifying goals as answer set programs and using answer set solvers to discover sets of ground atoms that meet the specified goals. In our experiments with the Rubik's cube, sliding tile puzzles, and Sokoban, we show that we can specify and reach different goals without any need to re-train the heuristic function. Our code is publicly available at https://github.com/forestagostinelli/SpecGoal.
Forest Agostinelli, Rojina Panta, Vedant Khandelwal
Copyright (c) 2024 Association for the Advancement of Artificial Intelligence
Published 2024-05-30. Volume 34, pages 2-10. DOI: 10.1609/icaps.v34i1.31454

Exact Multi-objective Path Finding with Negative Weights
https://ojs.aaai.org/index.php/ICAPS/article/view/31455
The point-to-point Multi-objective Shortest Path (MOSP) problem is a classic yet challenging task that involves finding all Pareto-optimal paths between two points in a graph with multiple edge costs. Recent studies have shown that employing A* search can lead to state-of-the-art performance in solving MOSP instances with non-negative costs. This paper proposes a novel A*-based multi-objective search framework that not only handles graphs with negative costs and even negative cycles but also incorporates multiple speed-up techniques to enhance the efficiency of exhaustive search with A*. Through extensive experiments, our algorithm demonstrates remarkable success in solving difficult MOSP instances, outperforming leading solutions by several factors.
Saman Ahmadi, Nathan R. Sturtevant, Daniel Harabor, Mahdi Jalili
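As a reader's aid for the notion of Pareto-optimality used in the abstract above, the following is a minimal sketch of a dominance filter over path cost vectors. It is not the authors' algorithm: the names `dominates` and `pareto_front` and the sample cost vectors are hypothetical, and a real MOSP solver would interleave such checks with search rather than enumerate all paths first.

```python
from typing import List, Tuple

Cost = Tuple[float, ...]  # one entry per objective; entries may be negative


def dominates(a: Cost, b: Cost) -> bool:
    """True if a is no worse than b in every objective and strictly better in one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(costs: List[Cost]) -> List[Cost]:
    """Keep only the cost vectors that no other vector dominates."""
    unique = list(dict.fromkeys(costs))  # drop exact duplicates, keep input order
    return [c for c in unique if not any(dominates(o, c) for o in unique)]


# Hypothetical cost vectors of five start-to-goal paths (two objectives).
path_costs = [(4, 9), (5, 5), (7, 6), (9, -1), (5, 5)]
front = pareto_front(path_costs)
print(front)  # (7, 6) is dropped because (5, 5) dominates it
```

The quadratic filter is chosen for clarity only; it makes no attempt to reflect the speed-up techniques the abstract mentions.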
Copyright (c) 2024 Association for the Advancement of Artificial Intelligence
Published 2024-05-30. Volume 34, pages 11-19. DOI: 10.1609/icaps.v34i1.31455

On the Computational Complexity of Stackelberg Planning and Meta-Operator Verification
https://ojs.aaai.org/index.php/ICAPS/article/view/31456
Stackelberg planning is a recently introduced single-turn two-player adversarial planning model in which two players act in a joint classical planning task and the first player's objective is to hamper the second player from achieving its goal. This places the Stackelberg planning problem somewhere between classical planning and general combinatorial two-player games. But where exactly? All investigations of Stackelberg planning so far focused on practical aspects. We close this gap by conducting the first theoretical complexity analysis of Stackelberg planning. We show that in general Stackelberg planning is actually no harder than classical planning. Under a polynomial plan-length restriction, however, Stackelberg planning is a level higher up in the polynomial complexity hierarchy, suggesting that compilations into classical planning come with a worst-case exponential plan-length increase. In attempts to identify tractable fragments, we further study its complexity under various planning task restrictions, showing that Stackelberg planning remains intractable where classical planning is not. We finally inspect the complexity of meta-operator verification, a problem that has been recently connected to Stackelberg planning.
Gregor Behnke, Marcel Steinmetz
Copyright (c) 2024 Association for the Advancement of Artificial Intelligence
Published 2024-05-30. Volume 34, pages 20-24. DOI: 10.1609/icaps.v34i1.31456

Non-deterministic Planning for Hyperproperty Verification
https://ojs.aaai.org/index.php/ICAPS/article/view/31457
Non-deterministic planning aims to find a policy that achieves a given objective in an environment where actions have uncertain effects and the agent potentially observes only parts of the current state. Hyperproperties are properties that relate multiple paths of a system and can, e.g., capture security and information-flow policies. Popular logics for expressing temporal hyperproperties, such as HyperLTL, extend LTL by offering selective quantification over executions of a system. In this paper, we show that planning offers a powerful intermediate language for the automated verification of hyperproperties. Concretely, we present an algorithm that, given a HyperLTL verification problem, constructs a non-deterministic multi-agent planning instance (in the form of a QDec-POMDP) that, when admitting a plan, implies the satisfaction of the verification problem. We show that for large fragments of HyperLTL, the resulting planning instance corresponds to a classical, FOND, or POND planning problem. We implement our encoding in a prototype verification tool and report on encouraging experimental results.
Raven Beutner, Bernd Finkbeiner
Copyright (c) 2024 Association for the Advancement of Artificial Intelligence
Published 2024-05-30. Volume 34, pages 25-30. DOI: 10.1609/icaps.v34i1.31457

On Policy Reuse: An Expressive Language for Representing and Executing General Policies that Call Other Policies
https://ojs.aaai.org/index.php/ICAPS/article/view/31458
Recently, a simple but powerful language for expressing and learning general policies and problem decompositions (sketches) has been introduced in terms of rules defined over a set of Boolean and numerical features. In this work, we consider three extensions of this language aimed at making policies and sketches more flexible and reusable: internal memory states, as in finite state controllers; indexical features, whose values are a function of the state and a number of internal registers that can be loaded with objects; and modules that wrap up policies and sketches and allow them to call each other by passing parameters. In addition, unlike general policies that select state transitions rather than ground actions, the new language allows for the selection of such actions. The expressive power of the resulting language for policies and sketches is illustrated through a number of examples.
Blai Bonet, Dominik Drexler, Héctor Geffner
Copyright (c) 2024 Association for the Advancement of Artificial Intelligence
Published 2024-05-30. Volume 34, pages 31-39. DOI: 10.1609/icaps.v34i1.31458

Abstraction Heuristics for Factored Tasks
https://ojs.aaai.org/index.php/ICAPS/article/view/31459
One of the strongest approaches for optimal classical planning is A* search with heuristics based on abstractions of the planning task. Abstraction heuristics are well studied in planning formalisms without conditional effects such as SAS+. However, conditional effects are crucial to model many planning tasks compactly. In this paper, we focus on *factored* tasks which allow a specific form of conditional effect, where effects on variable x can only depend on the value of x. We generalize projections, domain abstractions, Cartesian abstractions and the counterexample-guided abstraction refinement method to this formalism. While merge-and-shrink already covers factored tasks in theory, we provide an implementation that does so. In our experiments, we compare these abstraction-based heuristics to other heuristics supporting conditional effects, as well as symbolic search. On our new benchmark set of factored tasks, pattern database heuristics solve the most problems, followed by symbolic approaches on par with domain abstractions. The more general Cartesian abstractions fall behind in terms of coverage but usually solve problems the fastest among all tested approaches. The generality of merge-and-shrink abstractions does not seem to be beneficial for these factored tasks.
Clemens Büchner, Patrick Ferber, Jendrik Seipp, Malte Helmert
Copyright (c) 2024 Association for the Advancement of Artificial Intelligence
Published 2024-05-30. Volume 34, pages 40-49. DOI: 10.1609/icaps.v34i1.31459

Multi-Agent Temporal Task Solving and Plan Optimization
https://ojs.aaai.org/index.php/ICAPS/article/view/31460
Several multi-agent techniques are utilized to reduce the complexity of classical planning tasks; however, their applicability to temporal planning domains is a currently open line of study in the field of Automated Planning. In this paper, we present MA-LAMA, a factored, centralized, unthreaded, satisficing, multi-agent temporal planner that exploits the 'multi-agent nature' of temporal domains to perform plan optimization. In MA-LAMA, temporal tasks are translated to the constrained snap-actions paradigm, and an automatic agent decomposition, goal assignment, and required cooperation analysis are carried out to build independent search steps, called Search Phases. These Search Phases are then solved by consecutive agent local searches, using classical heuristics and temporal constraints. Experiments show that MA-LAMA is able to solve a wide range of classical and temporal multi-agent domains, performing significantly better in plan quality than other state-of-the-art temporal planners.
J. Caballero Testón, Maria D. R-Moreno
Copyright (c) 2024 Association for the Advancement of Artificial Intelligence
Published 2024-05-30. Volume 34, pages 50-58. DOI: 10.1609/icaps.v34i1.31460

Taming Discretised PDDL+ through Multiple Discretisations
https://ojs.aaai.org/index.php/ICAPS/article/view/31461
The PDDL+ formalism allows the use of planning techniques in applications that require the ability to perform hybrid discrete-continuous reasoning. PDDL+ problems are notoriously challenging to tackle, and a well-established approach to reasoning about them is discretisation. Existing systems rely on a single discretisation delta or, at most, two: a simulation delta to model the dynamics of the environment, and a planning delta that is used to specify when decisions can be taken. However, there exist cases where this rigid schema is not ideal, for instance when agents with very different speeds need to cooperate or interact in a shared environment, and a more flexible approach that can accommodate more deltas is necessary. To address the needs of this class of hybrid planning problems, in this paper we introduce a reformulation approach that allows the encapsulation of different levels of discretisation in PDDL+ models, hence allowing any domain-independent planning engine to reap the benefits. Further, we provide the community with a new set of benchmarks that highlights the limits of fixed discretisation.
Matteo Cardellini, Marco Maratea, Francesco Percassi, Enrico Scala, Mauro Vallati
Copyright (c) 2024 Association for the Advancement of Artificial Intelligence
Published 2024-05-30. Volume 34, pages 59-67. DOI: 10.1609/icaps.v34i1.31461

Return to Tradition: Learning Reliable Heuristics with Classical Machine Learning
https://ojs.aaai.org/index.php/ICAPS/article/view/31462
Current approaches for learning for planning have yet to achieve competitive performance against classical planners in several domains, and have poor overall performance. In this work, we construct novel graph representations of lifted planning tasks and use the WL algorithm to generate features from them. These features are used with classical machine learning methods which have up to 2 orders of magnitude fewer parameters and train up to 3 orders of magnitude faster than the state-of-the-art deep learning for planning models. Our novel approach, WL-GOOSE, reliably learns heuristics from scratch and outperforms the hFF heuristic in a fair competition setting. It also outperforms or ties with LAMA on 4 out of 10 domains on coverage and 7 out of 10 domains on plan quality. WL-GOOSE is the first learning for planning model which achieves these feats. Furthermore, we study the connections between our novel WL feature generation method, previous theoretically flavoured learning architectures, and Description Logic Features for planning.
Dillon Z. Chen, Felipe Trevizan, Sylvie Thiébaux
Copyright (c) 2024 Association for the Advancement of Artificial Intelligence
Published 2024-05-30. Volume 34, pages 68-76. DOI: 10.1609/icaps.v34i1.31462

More Flexible Proximity Wildcards Path Planning with Compressed Path Databases
https://ojs.aaai.org/index.php/ICAPS/article/view/31463
Grid-based path planning is one of the classic problems in AI, and a popular topic in application areas such as computer games and robotics. Compressed Path Databases (CPDs) are recognized as a state-of-the-art method for grid-based path planning. They are able to find an optimal path extremely fast without state-space search. In recent years, researchers have tended to focus on improving CPDs by reducing CPD size or improving search performance. Among various methods, proximity wildcards are one of the most proven improvements in reducing the size of CPDs. However, their proximity area is significantly restricted by complex terrain, which reduces pathfinding efficiency and causes additional costs. In this paper, we enhance CPDs from the perspective of improving search efficiency and reducing search costs. Our work focuses on using more flexible methods to obtain larger proximity areas, so that more heuristic information can be used to improve search performance. Experiments conducted on the Grid-Based Path Planning Competition (GPPC) benchmarks demonstrate that the two proposed methods can effectively improve search efficiency and reduce search costs by up to 3 orders of magnitude. Remarkably, our methods can further reduce the storage cost, and improve the compression capability of CPDs simultaneously.
Xi Chen, Yue Zhang, Yonggang Zhang
Copyright (c) 2024 Association for the Advancement of Artificial Intelligence
Published 2024-05-30. Volume 34, pages 77-85. DOI: 10.1609/icaps.v34i1.31463

On Verifying Linear Execution Strategies in Planning Against Nature
https://ojs.aaai.org/index.php/ICAPS/article/view/31464
While planning and acting in environments in which nature can trigger non-deterministic events, the agent has to consider that the state of the environment might change without its consent. Practically, it means that the agent has to make sure that it eventually achieves its goal (if possible) despite the acts of nature. In this paper, we first formalize the semantics of such problems in Alternating-time Temporal Logic, which allows us to prove some theoretical properties of different types of solutions. Then, we focus on linear execution strategies, which resemble classical plans in that they follow a fixed sequence of actions. We show that any problem that can be solved by a linear execution strategy can be solved by a particular form of linear execution strategy which assigns wait-for preconditions to each action in the plan that specifies when to execute that action. Then, we propose a sound algorithm that verifies a sequence of actions and assigns wait-for preconditions to them by leveraging abstraction.
Lukáš Chrpa, Erez Karpas
Copyright (c) 2024 Association for the Advancement of Artificial Intelligence
Published 2024-05-30. Volume 34, pages 86-94. DOI: 10.1609/icaps.v34i1.31464

Planning and Acting While the Clock Ticks
https://ojs.aaai.org/index.php/ICAPS/article/view/31465
Standard temporal planning assumes that planning takes place offline, and then execution starts at time 0. Recently, situated temporal planning was introduced, where planning starts at time 0, and execution occurs after planning terminates. Situated temporal planning reflects a more realistic scenario where time passes during planning. However, in situated temporal planning a complete plan must be generated before any action is executed. In some problems with time pressure, timing is too tight to complete planning before the first action must be executed. For example, an autonomous car that has a truck backing towards it should probably move out of the way now, and plan how to get to its destination later. In this paper, we propose a new problem setting: concurrent planning and execution, in which actions can be dispatched (executed) before planning terminates. Unlike previous work on planning and execution, we must handle wall clock deadlines that affect action applicability and goal achievement (as in situated planning) while also supporting dispatching actions before a complete plan has been found. We extend previous work on metareasoning for situated temporal planning to develop an algorithm for this new setting. Our empirical evaluation shows that when there is strong time pressure, our approach outperforms situated temporal planning.
Andrew Coles, Erez Karpas, Andrey Lavrinenko, Wheeler Ruml, Solomon Eyal Shimony, Shahaf Shperberg
Copyright (c) 2024 Association for the Advancement of Artificial Intelligence
Published 2024-05-30. Volume 34, pages 95-103. DOI: 10.1609/icaps.v34i1.31465

Planning with Object Creation
https://ojs.aaai.org/index.php/ICAPS/article/view/31466
Classical planning problems are defined using some specification language, such as PDDL. The domain expert defines action schemas, objects, the initial state, and the goal. One key aspect of PDDL is that the set of objects cannot be modified during plan execution. While this is fine in many domains, sometimes it makes modeling more complicated. This may impact the performance of planners, and it requires the domain expert to bound the number of required objects beforehand, which can be a challenge. We introduce an extension to the classical planning formalism, where action effects can create and remove objects. This problem is semi-decidable, but it becomes decidable if we can bound the number of objects in any given state, even though the state space is still infinite. On the practical side, we extend the Powerlifted planning system to support this PDDL extension. Our results show that this extension improves the performance of Powerlifted while supporting more natural PDDL models.
Augusto B. Corrêa, Giuseppe De Giacomo, Malte Helmert, Sasha Rubin
Copyright (c) 2024 Association for the Advancement of Artificial Intelligence
Published 2024-05-30. Volume 34, pages 104-113. DOI: 10.1609/icaps.v34i1.31466

Multi-Objective Electric Vehicle Route and Charging Planning with Contraction Hierarchies
https://ojs.aaai.org/index.php/ICAPS/article/view/31467
Electric vehicle (EV) travel planning is a complex task that involves planning the routes and the charging sessions for EVs while optimizing travel duration and cost. We show the applicability of the multi-objective EV travel planning algorithm with practically usable solution times on country-sized road graphs with a large number of charging stations and a realistic EV model. The approach is based on multi-objective A* search enhanced by contraction hierarchies, optimal dimensionality reduction, and sub-optimal ϵ-relaxation techniques. We performed an extensive empirical evaluation on 182,000 problem instances showing the impact of various algorithm settings on real-world maps of Bavaria and Germany with more than 12,000 charging stations. The results show the proposed approach is the first one capable of performing such a genuine multi-objective optimization on realistically large country-scale problem instances that can achieve practically usable planning times on the order of seconds with only a minor loss of solution quality. The achieved speed-up varies from ~11× for optimal solutions to more than 250× for sub-optimal solutions compared to vanilla multi-objective A*.
Marek Cuchý, Jiří Vokřínek, Michal Jakob
Copyright (c) 2024 Association for the Advancement of Artificial Intelligence
Published 2024-05-30. Volume 34, pages 114-122. DOI: 10.1609/icaps.v34i1.31467

Combined Task and Motion Planning via Sketch Decompositions
https://ojs.aaai.org/index.php/ICAPS/article/view/31468
The challenge in combined task and motion planning (TAMP) is the effective integration of a search over a combinatorial space, usually carried out by a task planner, and a search over a continuous configuration space, carried out by a motion planner. Using motion planners for testing the feasibility of task plans and filling out the details is not effective because it makes the geometrical constraints play a passive role. This work introduces a new interleaved approach for integrating the two dimensions of TAMP that makes use of sketches, a recent simple but powerful language for expressing the decomposition of problems into subproblems. A sketch has width 1 if it decomposes the problem into subproblems that can be solved greedily in linear time. In the paper, a general sketch is introduced for several classes of TAMP problems which has width 1 under suitable assumptions. While sketch decompositions have been developed for classical planning, they offer two important benefits in the context of TAMP. First, when a task plan is found to be unfeasible due to the geometric constraints, the combinatorial search resumes in a specific subproblem. Second, the sampling of object configurations is not done once, globally, at the start of the search, but locally, at the start of each subproblem. Optimizations of this basic setting are also considered and experimental results over existing and new pick-and-place benchmarks are reported.
Magí Dalmau Moreno, Néstor García, Vicenç Gómez, Héctor Geffner
Copyright (c) 2024 Association for the Advancement of Artificial Intelligence
Published 2024-05-30. Volume 34, pages 123-132. DOI: 10.1609/icaps.v34i1.31468

Planning Domain Simulation: An Interactive System for Plan Visualisation
https://ojs.aaai.org/index.php/ICAPS/article/view/31469
Representing and manipulating domain knowledge is essential for developing systems that can visualize plans. This paper presents a novel plan visualisation system called Planning Domain Simulation (PDSim) that employs knowledge representation and manipulation techniques to support the plan visualization process. PDSim can use PDDL or the Unified Planning Library's Python representation as the underlying language for modelling planning problems and provides an interface for users to manipulate this representation through interaction with the Unity game engine and a set of planners. The system’s features include visualising plan components and their relationships, identifying plan conflicts, and examples applied to real-world problems. The benefits and limitations of PDSim are also discussed, highlighting future research directions in the area.
Emanuele De Pellegrin, Ronald P. A. Petrick
Copyright (c) 2024 Association for the Advancement of Artificial Intelligence
Published 2024-05-30. Volume 34, pages 133-141. DOI: 10.1609/icaps.v34i1.31469

Learning Quadruped Locomotion Policies Using Logical Rules
https://ojs.aaai.org/index.php/ICAPS/article/view/31470
Quadruped animals are capable of exhibiting a diverse range of locomotion gaits. While progress has been made in demonstrating such gaits on robots, current methods rely on motion priors, dynamics models, or other forms of extensive manual effort. People can use natural language to describe dance moves. Could one use a formal language to specify quadruped gaits? To this end, we aim to enable easy gait specification and efficient policy learning. Leveraging Reward Machines (RMs) for high-level gait specification over foot contacts, our approach is called RM-based Locomotion Learning (RMLL), and supports adjusting gait frequency at execution time. Gait specification is enabled through the use of a few logical rules per gait (e.g., alternate between moving front feet and back feet) and does not require labor-intensive motion priors. Experimental results in simulation highlight the diversity of learned gaits (including two novel gaits), their energy consumption and stability across different terrains, and the superior sample-efficiency when compared to baselines. We also demonstrate these learned policies with a real quadruped robot. Video and supplementary materials: https://sites.google.com/view/rm-locomotion-learning/home
David DeFazio, Yohei Hayamizu, Shiqi Zhang
Copyright (c) 2024 Association for the Advancement of Artificial Intelligence
Published 2024-05-30. Volume 34, pages 142-150. DOI: 10.1609/icaps.v34i1.31470

Higher-Dimensional Potential Heuristics: Lower Bound Criterion and Connection to Correlation Complexity
https://ojs.aaai.org/index.php/ICAPS/article/view/31471
Correlation complexity is a measure of a planning task indicating how hard it is. The work that introduced it provides sufficient criteria to detect a correlation complexity of 2 on a planning task. It also introduced an example of a planning task with correlation complexity 3. In our work, we introduce a criterion to detect an arbitrary correlation complexity and extend the mentioned example to show with the new criterion that planning tasks with arbitrary correlation complexity exist.
Simon Dold, Malte Helmert
Copyright (c) 2024 Association for the Advancement of Artificial Intelligence
Published 2024-05-30. Volume 34, pages 151-161. DOI: 10.1609/icaps.v34i1.31471

New Fuzzing Biases for Action Policy Testing
https://ojs.aaai.org/index.php/ICAPS/article/view/31472
Testing was recently proposed as a method to gain trust in learned action policies in classical planning. Test cases in this setting are states generated by a fuzzing process that performs random walks from the initial state. A fuzzing bias attempts to bias these random walks towards policy bugs, that is, states where the policy performs sub-optimally. Prior work explored a simple fuzzing bias based on policy-trace cost. Here, we investigate this topic more deeply. We introduce three new fuzzing biases based on analyses of policy-trace shape, estimating whether a trace is close to looping back on itself, whether it contains detours, and whether its goal-distance surface does not smoothly decline. Our experiments with two kinds of neural action policies show that these new biases improve bug-finding capabilities in many cases.
Jan Eisenhut, Xandra Schuler, Daniel Fišer, Daniel Höller, Maria Christakis, Jörg Hoffmann
Copyright (c) 2024 Association for the Advancement of Artificial Intelligence
Published 2024-05-30. Volume 34, pages 162-167. DOI: 10.1609/icaps.v34i1.31472

PDDL+ Models for Deployable yet Effective Traffic Signal Optimisation
https://ojs.aaai.org/index.php/ICAPS/article/view/31473
The use of planning techniques in traffic signal optimisation has proven effective in managing unexpected traffic conditions as well as typical traffic patterns. However, significant challenges concerning the deployability of generated signal strategies remain, as existing approaches tend not to consider constraints and features of the actual real-world infrastructure on which they will be implemented. To address this challenge, we introduce a range of PDDL+ models embodying technological requirements as well as insights from domain experts. The proposed models have been extensively tested on historical data using a range of well-known search strategies and heuristics, as well as alternative encodings. Results demonstrate their competitiveness with the state of the art.
Anas El Kouaiti, Francesco Percassi, Alessandro Saetti, Thomas Leo McCluskey, Mauro Vallati
Copyright (c) 2024 Association for the Advancement of Artificial Intelligence
Published 2024-05-30. Volume 34, pages 168-177. DOI: 10.1609/icaps.v34i1.31473

Termination Properties of Transition Rules for Indirect Effects
https://ojs.aaai.org/index.php/ICAPS/article/view/31474
Indirect effects of an agent's actions have traditionally been formalized as condition-effect rules that always fire whenever applicable, after each action taken by the agent. In this work, we investigate a core problem of indirect effects, the possibility of arbitrarily or infinitely long sequences of rule firings. Specifically, we investigate the termination of rule firings, as well as their confluence, that is, the uniqueness of the state that is ultimately reached. Both problems turn out to be PSPACE-complete. After this, we devise practically interesting syntactic and structural restrictions that guarantee polynomial-time termination and confluence tests. Finally, in the context of planning languages that support indirect effects, we propose new implementation technologies.
Mojtaba Elahi, Saurabh Fadnis, Jussi Rintanen
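To make the termination question in the abstract above concrete, here is a minimal sketch that fires condition-effect rules from one given state until no rule changes it, and reports a loop if a state repeats. It is an illustration only, not the paper's method: the `Rule` encoding (positive conditions, add and delete sets), the fixed first-applicable firing order, and the example atoms are all assumptions. It checks a single firing order from a single state, so it says nothing about confluence or about the general PSPACE-complete problems.

```python
from typing import FrozenSet, List, NamedTuple, Optional

State = FrozenSet[str]  # the set of atoms that are currently true


class Rule(NamedTuple):
    condition: FrozenSet[str]  # atoms that must hold for the rule to fire
    add: FrozenSet[str]        # atoms made true by firing
    delete: FrozenSet[str]     # atoms made false by firing


def fire_until_stable(state: State, rules: List[Rule]) -> Optional[State]:
    """Fire the first state-changing rule repeatedly.

    Returns the stable state, or None if this firing order revisits a state
    (i.e. it would fire forever).
    """
    seen = {state}
    while True:
        for rule in rules:
            if rule.condition <= state:
                successor = (state - rule.delete) | rule.add
                if successor != state:
                    break  # found a rule that changes the state
        else:
            return state  # no applicable rule changes anything: stable
        if successor in seen:
            return None  # repeated state: non-terminating for this order
        seen.add(successor)
        state = successor


empty: FrozenSet[str] = frozenset()
chain = [
    Rule(frozenset({"rain"}), frozenset({"wet"}), empty),
    Rule(frozenset({"wet"}), frozenset({"slippery"}), empty),
]
toggle = [
    Rule(frozenset({"on"}), frozenset({"off"}), frozenset({"on"})),
    Rule(frozenset({"off"}), frozenset({"on"}), frozenset({"off"})),
]
stable = fire_until_stable(frozenset({"rain"}), chain)
looping = fire_until_stable(frozenset({"on"}), toggle)
```

Here `chain` settles after two firings, while `toggle` flips between two states and is reported as looping.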
Copyright (c) 2024 Association for the Advancement of Artificial Intelligence
Published 2024-05-30. Volume 34, pages 178-186. DOI: 10.1609/icaps.v34i1.31474

A Fast Algorithm for k-Memory Messaging Scheme Design in Dynamic Environments with Uncertainty
https://ojs.aaai.org/index.php/ICAPS/article/view/31475
We study the problem of designing the optimal k-memory messaging scheme in a dynamic environment. Specifically, a sender, who can perfectly observe the state of a dynamic environment but cannot take actions, aims to persuade an uninformed, far-sighted receiver to take actions to maximize the long-term utility of the sender, by sending messages. We focus on k-memory messaging schemes, i.e., at each time step, the sender's messaging scheme depends on information from the previous k steps. After receiving a message, the self-interested receiver derives a posterior belief and takes an action. The immediate rewards of the two players can be misaligned, so the sender needs to ensure persuasiveness when designing the messaging scheme. We first formulate this problem as a bi-linear program. Then we show that there are infinitely many non-trivial persuasive messaging schemes for any problem instance. Moreover, we show that when the sender uses a k-memory messaging scheme, the optimal strategy for the receiver is also a k-memory strategy. We propose a fast heuristic algorithm for this problem and show that it can be extended to the setting where the sender has threat ability. We experimentally evaluate our algorithm, comparing it with the solution obtained by the Gurobi solver, in terms of performance and running time, in both settings. Extensive experimental results show that our algorithm outperforms the Gurobi solver in running time, yet achieves comparable performance.
Zhikang Fan, Weiran Shen
Copyright (c) 2024 Association for the Advancement of Artificial Intelligence
Published 2024-05-30. Volume 34, pages 187-195. DOI: 10.1609/icaps.v34i1.31475

SLAMuZero: Plan and Learn to Map for Joint SLAM and Navigation
https://ojs.aaai.org/index.php/ICAPS/article/view/31476
MuZero has demonstrated remarkable performance in board and video games where the Monte Carlo tree search (MCTS) method is utilized to learn and adapt to different game environments. This paper leverages the strength of MuZero to enhance agents’ planning capability for joint active simultaneous localization and mapping (SLAM) and navigation tasks, which require an agent to navigate an unknown environment while simultaneously constructing a map and localizing itself. We propose SLAMuZero, a novel approach for joint SLAM and navigation, which employs a search process that uses an explicit encoder-decoder architecture for mapping, followed by a prediction function to evaluate policy and value based on the generated map. SLAMuZero outperforms the state-of-the-art baseline and significantly reduces training time, underscoring the efficiency of our approach. Additionally, we develop a new open source library for implementing SLAMuZero, which is a flexible and modular toolkit for researchers and practitioners (https://github.com/bwfbowen/SLAMuZero).
Bowen Fang, Xu Chen, Zhengkun Pan, Xuan Di
Copyright (c) 2024 Association for the Advancement of Artificial Intelligence
2024-05-302024-05-303419620010.1609/icaps.v34i1.31476A Real-Time Rescheduling Algorithm for Multi-robot Plan Execution
https://ojs.aaai.org/index.php/ICAPS/article/view/31477
One area of research in multi-agent path finding is to determine how replanning can be efficiently achieved in the case of agents being delayed during execution. One option is to reschedule the passing order of agents, i.e., the sequence in which agents visit the same location. In response, we propose Switchable-Edge Search (SES), an A*-style algorithm designed to find optimal passing orders. We prove the optimality of SES and evaluate its efficiency via simulations. The best variant of SES takes less than 1 second for small- and medium-sized problems and runs up to 4 times faster than baselines for large-sized problems.Ying FengAdittyo PaulZhe ChenJiaoyang Li
Copyright (c) 2024 Association for the Advancement of Artificial Intelligence
2024-05-302024-05-303420120910.1609/icaps.v34i1.31477Towards Feasible Higher-Dimensional Potential Heuristics
https://ojs.aaai.org/index.php/ICAPS/article/view/31478
Potential heuristics assign numerical values (potentials) to state features, where each feature is a conjunction of facts. It was previously shown that the informativeness of potential heuristics can be significantly improved by considering complex features, but computing potentials over all pairs of facts is already too costly in practice. In this paper, we investigate whether using just a few high-dimensional features instead of all conjunctions up to a dimension n can result in improved heuristics while keeping the computational cost at bay. We focus on (a) establishing a framework for studying this kind of potential heuristic, and (b) whether it is reasonable to expect improvement with just a few conjunctions. For (a), we propose two compilations that encode each conjunction explicitly as a new fact so that we can compute potentials over conjunctions in the original task as one-dimensional potentials in the compilation. Regarding (b), we provide evidence that the informativeness of potential heuristics can be significantly increased with a small set of conjunctions, and that these improvements have a positive impact on the number of solved tasks.
Daniel Fišer, Marcel Steinmetz
Copyright (c) 2024 Association for the Advancement of Artificial Intelligence
2024-05-302024-05-303421022010.1609/icaps.v34i1.31478Progressive State Space Disaggregation for Infinite Horizon Dynamic Programming
https://ojs.aaai.org/index.php/ICAPS/article/view/31479
The high dimensionality of model-based Reinforcement Learning and Markov Decision Processes can be reduced using abstractions of the state and action spaces. Although hierarchical learning and state abstraction methods have been explored over the past decades, explicit methods to build useful abstractions of models are rarely provided. In this work, we provide a new state abstraction method for solving infinite horizon problems in the discounted and total settings. Our approach is to progressively disaggregate abstract regions by iteratively slicing aggregations of states relative to a value function. The distinguishing feature of our method, in contrast to previous approximations of the Bellman operator, is the disaggregation of regions during value function iterations (or policy evaluation steps). The objective is to find a more efficient aggregation that reduces the error on each piece of the partition. We provide a proof of convergence for this algorithm without making any assumptions about the structure of the problem. We also show that this process decreases the computational complexity of the Bellman operator iteration and provides useful abstractions. We then plug this state space disaggregation process into classical Dynamic Programming algorithms, namely Approximate Value Iteration, Q-Value Iteration and Policy Iteration. Finally, we conduct a numerical comparison on randomly generated MDPs as well as classical MDPs. These experiments show that our policy-based algorithm is faster than both traditional dynamic programming approaches and recent aggregative methods that use a fixed number of adaptive partitions.
Orso Forghieri, Hind Castel, Emmanuel Hyon, Erwan Le Pennec
Copyright (c) 2024 Association for the Advancement of Artificial Intelligence
2024-05-302024-05-303422122910.1609/icaps.v34i1.31479JaxPlan and GurobiPlan: Optimization Baselines for Replanning in Discrete and Mixed Discrete-Continuous Probabilistic Domains
https://ojs.aaai.org/index.php/ICAPS/article/view/31480
Replanning methods that determinize a stochastic planning problem and replan at each action step have long been known to provide strong baseline (and even competition winning) solutions to discrete probabilistic planning problems. Recent work has explored the extension of replanning methods to the case of mixed discrete-continuous probabilistic domains by leveraging MILP compilations of the RDDL specification language. Other recent advances in probabilistic planning have explored the compilation of structured mixed discrete-continuous RDDL domains into a determinized computation graph that also lends itself to replanning via so-called planning by backpropagation methods. However, to date, there has not been any comprehensive comparison of these recent optimization-based replanning methodologies to the state-of-the-art winner of the discrete probabilistic IPC 2011 and 2014 and runner-up in 2018 (PROST) and the winner of the mixed discrete-continuous probabilistic IPC 2023 (DiSProd). In this paper, we describe JaxPlan, which makes several extensive upgrades to planning by backpropagation and its compact tensorized compilation from RDDL to a JAX computation graph that uses discrete relaxations and a sample average approximation. We also provide the first detailed overview of a compilation of the RDDL language specification to Gurobi's Mixed Integer Nonlinear Programming (MINLP) solver that we term GurobiPlan. We provide a comprehensive comparative analysis of JaxPlan and GurobiPlan with competition winning planners on 19 domains and a total of 155 instances to assess their performance across (a) different domains, (b) different instance sizes, and (c) different time budgets. We also release all code to reproduce the results along with the open-source planners we describe in this work.Michael GimelfarbAyal TaitlerScott Sanner
Copyright (c) 2024 Association for the Advancement of Artificial Intelligence
2024-05-302024-05-303423023810.1609/icaps.v34i1.31480Formal Representations of Classical Planning Domains
https://ojs.aaai.org/index.php/ICAPS/article/view/31481
Planning domains are an important notion, e.g. when it comes to restricting the input for generalized planning or learning approaches. However, domains as specified in PDDL cannot fully capture the intuitive understanding of a planning domain. We close this semantic gap and propose using PDDL axioms to characterize the (typically infinite) set of legal tasks of a domain. A minor extension makes it possible to express all properties that can be determined in polynomial time. We demonstrate the suitability of the approach on established domains from the International Planning Competition.Claudia GrundkeGabriele RögerMalte Helmert
Copyright (c) 2024 Association for the Advancement of Artificial Intelligence
2024-05-302024-05-303423924810.1609/icaps.v34i1.31481Safe Explicable Planning
https://ojs.aaai.org/index.php/ICAPS/article/view/31482
Human expectations arise from people's understanding of others and the world. In the context of human-AI interaction, this understanding may not align with reality, leading to the AI agent failing to meet expectations and compromising team performance. Explicable planning, introduced as a method to bridge this gap, aims to reconcile human expectations with the agent's optimal behavior, facilitating interpretable decision-making. However, an unresolved critical issue is ensuring safety in explicable planning, as it could result in explicable behaviors that are unsafe. To address this, we propose Safe Explicable Planning (SEP), which extends the prior work to support the specification of a safety bound. The goal of SEP is to find behaviors that align with human expectations while adhering to the specified safety criterion. Our approach generalizes the consideration of multiple objectives stemming from multiple models rather than a single model, yielding a Pareto set of safe explicable policies. We present both an exact method, which is guaranteed to find the Pareto set, and a more efficient greedy method that finds one of the policies in the Pareto set. Additionally, we offer approximate solutions based on state aggregation to improve scalability. We provide formal proofs that validate the desired theoretical properties of these methods. Evaluation through simulations and physical robot experiments confirms the effectiveness of our approach for safe explicable planning.
Akkamahadevi Hanni, Andrew Boateng, Yu Zhang
Copyright (c) 2024 Association for the Advancement of Artificial Intelligence
2024-05-302024-05-303424925710.1609/icaps.v34i1.31482Replanning in Advance for Instant Delay Recovery in Multi-Agent Applications: Rerouting Trains in a Railway Hub
https://ojs.aaai.org/index.php/ICAPS/article/view/31483
Train routing is sensitive to delays that occur in the network. When a train is delayed, it is imperative that a new plan be found quickly, or else other trains may need to be stopped to ensure safety, potentially causing cascading delays. In this paper, we consider this class of multi-agent planning problems, which we call Multi-Agent Execution Delay Replanning. We show that these can be solved by reducing the problem to an any-start-time safe interval planning problem. When an agent has an any-start-time plan, it can react to a delay by simply looking up the precomputed plan for the delayed start time. We identify crucial real-world problem characteristics like the agent's speed, size, and safety envelope, and extend the any-start-time planning to account for them. Experimental results on real-world train networks show that any-start-time plans are compact and can be computed in reasonable time while enabling agents to instantly recover a safe plan.Issa K. HanouDevin Wild ThomasWheeler RumlMathijs de Weerdt
Copyright (c) 2024 Association for the Advancement of Artificial Intelligence
2024-05-302024-05-303425826610.1609/icaps.v34i1.31483An Analysis of the Decidability and Complexity of Numeric Additive Planning
https://ojs.aaai.org/index.php/ICAPS/article/view/31484
In this paper, we first define numeric additive planning (NAP), a planning formulation equivalent to Hoffmann's Restricted Tasks over Integers. Then, we analyze the minimal number of action repetitions required for a solution, since planning turns out to be decidable as long as such numbers can be calculated for all actions. We differentiate between two kinds of repetitions and solve for one by integer linear programming and the other by search. Additionally, we characterize the differences between propositional planning and NAP regarding these two kinds. To achieve this, we define so-called multi-valued partial order plans, a novel compact plan representation. Finally, we consider decidable fragments of NAP and their complexity.Hayyan HelalGerhard Lakemeyer
Copyright (c) 2024 Association for the Advancement of Artificial Intelligence
2024-05-302024-05-303426727510.1609/icaps.v34i1.31484Versatile Cost Partitioning with Exact Sensitivity Analysis
https://ojs.aaai.org/index.php/ICAPS/article/view/31485
Saturated post-hoc optimization is a powerful method for computing admissible heuristics for optimal classical planning. The approach solves a linear program (LP) for each state encountered during the search, which is computationally demanding. In this paper, we theoretically and empirically analyze to what extent we can reuse an LP solution of one state for another. We introduce a novel sensitivity analysis that can exactly characterize the set of states for which a unique LP solution is optimal. Furthermore, we identify two properties of the underlying LPs that affect reusability. Finally, we introduce an algorithm that optimizes LP solutions to generalize well to other states. Our new algorithms significantly reduce the number of necessary LP computations.
Paul Höft, David Speck, Florian Pommerening, Jendrik Seipp
Copyright (c) 2024 Association for the Advancement of Artificial Intelligence
2024-05-302024-05-303427628010.1609/icaps.v34i1.31485Expressiveness of Graph Neural Networks in Planning Domains
https://ojs.aaai.org/index.php/ICAPS/article/view/31486
Graph Neural Networks (GNNs) have become the standard method of choice for learning with structured data, demonstrating particular promise in classical planning. Their inherent invariance under symmetries of the input graphs endows them with superior generalization capabilities, compared to their symmetry-oblivious counterparts. However, this comes at the cost of limited expressive power. Particularly, GNNs cannot distinguish between graphs that satisfy identical sentences of C2 logic. To leverage GNNs for learning policies in PDDL domains, one needs to encode the contextual representation of the planning states as graphs. The expressiveness of this encoding, coupled with a specific GNN architecture, then hinges on the absence of indistinguishable states necessitating distinct actions. This paper provides a comprehensive theoretical and statistical exploration of such situations in PDDL domains across diverse natural encoding schemes and GNN models.Rostislav HorčíkGustav Šír
Copyright (c) 2024 Association for the Advancement of Artificial Intelligence
2024-05-302024-05-303428128910.1609/icaps.v34i1.31486Converting Simple Temporal Networks with Uncertainty into Minimal Equivalent Dispatchable Form
https://ojs.aaai.org/index.php/ICAPS/article/view/31487
A Simple Temporal Network with Uncertainty (STNU) is a structure for representing and reasoning about time constraints on actions that may have uncertain durations. An STNU is dynamically controllable (DC) if there exists a dynamic strategy for executing the network that guarantees that all of its constraints will be satisfied no matter how the uncertain durations turn out, within their specified bounds. However, such strategies typically require exponential space. Therefore, converting a DC STNU into a so-called dispatchable form is essential for practical applications. The relevant portions of a real-time execution strategy for a dispatchable STNU can be incrementally constructed during execution, requiring only O(n²) space, while also providing maximum flexibility and minimal computation during the execution of the network. Although existing algorithms can generate equivalent dispatchable STNUs, they do not guarantee a minimal number of edges in the STNU graph. Since the number of edges directly impacts the computations during execution, this paper presents a novel algorithm for converting any dispatchable STNU into an equivalent dispatchable network having a minimal number of edges. The complexity of the algorithm is O(k n³), where k is the number of actions with uncertain durations, and n is the number of timepoints in the network. The paper also provides an empirical evaluation of the reduction in the number of edges obtained by the new algorithm.
Luke Hunsberger, Roberto Posenato
Copyright (c) 2024 Association for the Advancement of Artificial Intelligence
2024-05-302024-05-303429030010.1609/icaps.v34i1.31487Rethinking Mutual Information for Language Conditioned Skill Discovery on Imitation Learning
https://ojs.aaai.org/index.php/ICAPS/article/view/31488
Language-conditioned robot behavior plays a vital role in executing complex tasks by associating human commands or instructions with perception and actions. The ability to compose long-horizon tasks based on unconstrained language instructions necessitates the acquisition of a diverse set of general-purpose skills. However, acquiring inherent primitive skills in a coupled and long-horizon environment without external rewards or human supervision presents significant challenges. In this paper, we evaluate the relationship between skills and language instructions from a mathematical perspective, employing two forms of mutual information within the framework of language-conditioned policy learning. To maximize the mutual information between language and skills in an unsupervised manner, we propose an end-to-end imitation learning approach known as Language Conditioned Skill Discovery (LCSD). Specifically, we utilize vector quantization to learn discrete latent skills and leverage skill sequences of trajectories to reconstruct high-level semantic instructions. Through extensive experiments on language-conditioned robotic navigation and manipulation tasks, encompassing BabyAI, LORel, and Calvin, we demonstrate the superiority of our method over prior works. Our approach exhibits enhanced generalization capabilities towards unseen tasks, improved skill interpretability, and notably higher rates of task completion success.
Zhaoxun Ju, Chao Yang, Fuchun Sun, Hongbo Wang, Yu Qiao
Copyright (c) 2024 Association for the Advancement of Artificial Intelligence
2024-05-302024-05-303430130910.1609/icaps.v34i1.31488Epistemic Exploration for Generalizable Planning and Learning in Non-Stationary Settings
https://ojs.aaai.org/index.php/ICAPS/article/view/31489
This paper introduces a new approach for continual planning and model learning in relational, non-stationary stochastic environments. Such capabilities are essential for the deployment of sequential decision-making systems in the uncertain and constantly evolving real world. Working in such practical settings with unknown (and non-stationary) transition systems and changing tasks, the proposed framework models gaps in the agent's current state of knowledge and uses them to conduct focused, investigative explorations. Data collected using these explorations is used for learning generalizable probabilistic models for solving the current task despite continual changes in the environment dynamics. Empirical evaluations on several non-stationary benchmark domains show that this approach significantly outperforms planning and RL baselines in terms of sample complexity. Theoretical results show that the system exhibits desirable convergence properties when stationarity holds.Rushang KariaPulkit VermaAlberto SperanzonSiddharth Srivastava
Copyright (c) 2024 Association for the Advancement of Artificial Intelligence
2024-05-302024-05-303431031810.1609/icaps.v34i1.31489Unifying and Certifying Top-Quality Planning
https://ojs.aaai.org/index.php/ICAPS/article/view/31490
The growing utilization of planning tools in practical scenarios has sparked an interest in generating multiple high-quality plans. Consequently, a range of computational problems under the general umbrella of top-quality planning were introduced over a short time period, each with its own definition. In this work, we show that the existing definitions can be unified into one, based on a dominance relation. The different computational problems, therefore, simply correspond to different dominance relations. Given the unified definition, we can now certify the top-quality of the solutions, leveraging existing certification of unsolvability and optimality. We show that task transformations found in the existing literature can be employed for the efficient certification of various top-quality planning problems and propose a novel transformation to efficiently certify loopless top-quality planning.Michael KatzJunkyu LeeShirin Sohrabi
Copyright (c) 2024 Association for the Advancement of Artificial Intelligence
2024-05-302024-05-303431932310.1609/icaps.v34i1.31490Explaining Plan Quality Differences
https://ojs.aaai.org/index.php/ICAPS/article/view/31491
We describe a method for explaining the differences between the quality of plans produced for similar planning problems. The method exploits a process of abstracting away details of the planning problems until the difference in solution quality they support has been minimised. We give a general definition of a valid abstraction of a planning problem. We then give the details of the implementation of a number of useful abstractions. Finally, we present a breadth-first search algorithm for finding suitable abstractions for explanations; and detail the results of an evaluation of the approach.Benjamin KrarupAmanda ColesDerek LongDavid E. Smith
Copyright (c) 2024 Association for the Advancement of Artificial Intelligence
2024-05-302024-05-303432433210.1609/icaps.v34i1.31491Planning with a Learned Policy Basis to Optimally Solve Complex Tasks
https://ojs.aaai.org/index.php/ICAPS/article/view/31492
Conventional reinforcement learning (RL) methods can successfully solve a wide range of sequential decision problems. However, learning policies that can generalize predictably across multiple tasks in a setting with non-Markovian reward specifications is a challenging problem. We propose to use successor features to learn a set of local policies that each solves a well-defined subproblem. In a task described by a finite state automaton (FSA) that involves the same set of subproblems, the combination of these local policies can then be used to generate an optimal solution without additional learning. In contrast to other methods that combine local policies via planning, our method asymptotically attains global optimality, even in stochastic environments.David KuricGuillermo InfanteVicenç GómezAnders JonssonHerke van Hoof
Copyright (c) 2024 Association for the Advancement of Artificial Intelligence
2024-05-302024-05-303433334110.1609/icaps.v34i1.31492Action Model Learning from Noisy Traces: a Probabilistic Approach
https://ojs.aaai.org/index.php/ICAPS/article/view/31493
We address the problem of learning planning domains from plan traces that are obtained by observing the environment states through noisy sensors. In such situations, approaches that assume correct traces are not applicable. We tackle the problem by designing a probabilistic graphical model in which the preconditions and effects of every planning domain operator, as well as the traces’ observations, are modeled by random variables. Probabilistic inference conditioned on the observed traces allows our approach to derive a posterior probability of an atom being a precondition and/or an effect of an operator. Planning domains are obtained either by sampling or by applying the maximum a posteriori criterion. We compare our approach with a frequentist baseline and the currently available state-of-the-art approaches. We measure the performance of each method according to two criteria: reconstruction of the original planning domain and effectiveness in solving new planning problems of the same domain. Our experimental analysis shows that our approach learns action models that are more accurate than those of state-of-the-art approaches, and strongly outperforms other approaches in generating models that are effective for solving new problems.
Leonardo Lamanna, Luciano Serafini
Copyright (c) 2024 Association for the Advancement of Artificial Intelligence
2024-05-302024-05-303434235010.1609/icaps.v34i1.31493Neural Combinatorial Optimization on Heterogeneous Graphs: An Application to the Picker Routing Problem in Mixed-shelves Warehouses
https://ojs.aaai.org/index.php/ICAPS/article/view/31494
In recent years, machine learning (ML) models capable of solving combinatorial optimization (CO) problems have received a surge of attention. While early approaches failed to outperform traditional CO solvers, the gap between handcrafted and learned heuristics has been steadily closing. However, most work in this area has focused on simple CO problems to benchmark new models and algorithms, leaving a gap in the development of methods specifically designed to handle more involved problems. Therefore, this work considers the problem of picker routing in the context of mixed-shelves warehouses, which involves not only a heterogeneous graph representation, but also a combinatorial action space resulting from the integrated selection and routing decisions to be made. We propose both a novel encoder to effectively learn representations of the heterogeneous graph and a hierarchical decoding scheme that exploits the combinatorial structure of the action space. The efficacy of the developed methods is demonstrated through a comprehensive comparison with established architectures as well as exact and heuristic solvers.Laurin LuttmannLin Xie
Copyright (c) 2024 Association for the Advancement of Artificial Intelligence
2024-05-302024-05-303435135910.1609/icaps.v34i1.31494Investigating Large Neighbourhood Search for Bus Driver Scheduling
https://ojs.aaai.org/index.php/ICAPS/article/view/31495
The Bus Driver Scheduling Problem (BDSP) is a combinatorial optimisation problem with high practical relevance. The aim is to assign bus drivers to predetermined routes while minimising a specified objective function that considers operating costs as well as employee satisfaction. Since we must satisfy several rules from a collective agreement and European regulations, the BDSP is highly constrained. Hence, using exact methods to solve large real-life-based instances is computationally too expensive, while heuristic methods still have a considerable gap to the optimum. Our paper presents a Large Neighbourhood Search (LNS) approach to solve the BDSP. We propose several novel destroy operators and an approach using column generation to repair the sub-problem. We analyse the impact of the destroy and repair operators and investigate various possibilities to select them, including adaptivity. The proposed approach improves all the upper bounds for larger instances that exact methods cannot solve, as well as for some mid-sized instances, and outperforms existing heuristic approaches for this problem on all benchmark instances.Tommaso Mannelli MazzoliLucas KletzanderPascal Van HentenryckNysret Musliu
Copyright (c) 2024 Association for the Advancement of Artificial Intelligence
2024-05-302024-05-303436036810.1609/icaps.v34i1.31495Weak and Strong Reversibility of Non-deterministic Actions: Universality and Uniformity
https://ojs.aaai.org/index.php/ICAPS/article/view/31496
Classical planning looks for a sequence of actions that transform the initial state of the environment into a goal state. Studying whether the effects of an action can be undone by a sequence of other actions, that is, action reversibility, is beneficial, for example, in determining whether an action is safe to apply. This paper deals with action reversibility of non-deterministic actions, i.e., actions whose application might result in different outcomes. Inspired by the established notions of weak and strong plans in non-deterministic (or FOND) planning, we define the notions of weak and strong reversibility for non-deterministic actions. We then focus on the universality and uniformity of action reversibility, that is, whether we can always undo all possible effects of the action by the same means (i.e., policy), or whether some of the effects can never be undone. We show how these classes of problems can be solved via classical or FOND planning and evaluate our approaches on FOND benchmark domains.Jakub MedLukáš ChrpaMichael MorakWolfgang Faber
Copyright (c) 2024 Association for the Advancement of Artificial Intelligence
2024-05-302024-05-303436937710.1609/icaps.v34i1.31496Preference Explanation and Decision Support for Multi-Objective Real-World Test Laboratory Scheduling
https://ojs.aaai.org/index.php/ICAPS/article/view/31497
Complex real-world scheduling problems often include multiple conflicting objectives. Decision makers (DMs) can express their preferences over those objectives in different ways, including as sets of weights which are used in a linear combination of objective values. However, finding good sets of weights that result in solutions with desirable qualities is challenging and currently involves a lot of trial and error. We propose a general method to explain objectives' values under a given set of weights using Shapley regression values. We demonstrate this approach on the Test Laboratory Scheduling Problem (TLSP), for which we propose a multi-objective solution algorithm and show that suggestions for weight adjustments based on the introduced explanations are successful in guiding decision makers towards solutions that match their expectations. This method is included in the TLSP MO-Explorer, a new decision support system that enables the exploration and analysis of high-dimensional Pareto fronts.Florian MischekNysret Musliu
Copyright (c) 2024 Association for the Advancement of Artificial Intelligence
2024-05-302024-05-303437838610.1609/icaps.v34i1.31497Safe Learning of PDDL Domains with Conditional Effects
https://ojs.aaai.org/index.php/ICAPS/article/view/31498
Powerful domain-independent planners have been developed to solve various types of planning problems. These planners often require a model of the acting agent's actions, given in some planning domain description language. Manually designing such an action model is a notoriously challenging task. An alternative is to automatically learn action models from observation. Such an action model is called safe if every plan created with it is consistent with the real, unknown action model. Algorithms for learning such safe action models exist, yet they cannot handle domains with conditional or universal effects, which are common constructs in many planning problems. We prove that learning non-trivial safe action models with conditional effects may require an exponential number of samples. Then, we identify reasonable assumptions under which such learning is tractable and propose Conditional-SAM, the first algorithm capable of doing so. We analyze Conditional-SAM theoretically and evaluate it experimentally. Our results show that the action models learned by Conditional-SAM can be used to perfectly solve most of the test set problems in most of the domains in our experiments.
Argaman Mordoch, Enrico Scala, Roni Stern, Brendan Juba
Copyright (c) 2024 Association for the Advancement of Artificial Intelligence
2024-05-302024-05-303438739510.1609/icaps.v34i1.31498SKATE : Successive Rank-based Task Assignment for Proactive Online Planning
https://ojs.aaai.org/index.php/ICAPS/article/view/31499
The development of online applications for services such as package delivery, crowdsourcing, or taxi dispatching has drawn the research community's attention to the domain of online multi-agent multi-task allocation. In online service applications, tasks (or requests) to be performed arrive over time and need to be dynamically assigned to agents. Such planning problems are challenging because: (i) little or no information about future tasks is available for long-term reasoning; (ii) the number of agents, as well as the number of tasks, can be very high; and (iii) an efficient solution has to be reached in a limited amount of time. In this paper, we propose SKATE, a successive rank-based task assignment algorithm for online multi-agent planning. SKATE can be seen as a meta-heuristic approach which successively assigns a task to the best-ranked agent until all tasks have been assigned. We assess the complexity of SKATE and show that it is cubic in the number of agents and tasks. To investigate how multi-agent multi-task assignment algorithms perform under a high number of agents and tasks, we compare three multi-task assignment methods in synthetic and real data benchmark environments: Integer Linear Programming (ILP), Genetic Algorithm (GA), and SKATE. In addition, a proactive approach is nested in all methods to determine agents (resources) that will be available in the near future using a receding horizon. Based on the results obtained, we argue that the classical ILP offers the best-quality solutions when treating a low number of agents and tasks, i.e., under low load, regardless of the receding-horizon size, while it struggles to respect the time constraint under high load. SKATE performs better than the other methods in high-load conditions, and even better when a variable receding horizon is used.
Déborah Conforto Nedelmann, Jérôme Lacan, Caroline P. C. Chanel
Copyright (c) 2024 Association for the Advancement of Artificial Intelligence
2024-05-30 2024-05-30 34 396 404 10.1609/icaps.v34i1.31499
Incremental Ordering for Scheduling Problems
https://ojs.aaai.org/index.php/ICAPS/article/view/31500
Given an instance of a scheduling problem where we want to start executing jobs as soon as possible, it is advantageous if a scheduling algorithm emits the first parts of its solution early, in particular before the algorithm completes its work. Therefore, in this position paper, we analyze core scheduling problems with regard to their enumeration complexity, i.e., the computation time to the first emitted schedule entry (preprocessing time) and the worst-case time between two consecutive parts of the solution (delay). Specifically, we look at scheduling instances that reduce to ordering problems. We apply a known incremental sorting algorithm for scheduling strategies that are at their core comparison-based sorting algorithms and translate corresponding upper and lower complexity bounds to the scheduling setting. For instances with n jobs and a precedence DAG with maximum degree Δ, we incrementally build a topological ordering with O(n) preprocessing and O(Δ) delay. We prove a matching lower bound and show with an adversary argument that the delay lower bound holds even if the DAG has constant average degree and the ordering is emitted out of order in the form of insert operations. We complement our theoretical results with experiments that highlight the improved time-to-first-output and discuss research opportunities for similar incremental approaches for other scheduling problems.
Stefan Neubert, Katrin Casel
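To make the notions of preprocessing time and delay concrete, the following sketch emits a topological ordering incrementally using the standard Kahn's algorithm written as a Python generator. It is an illustration only, not the algorithm from the paper: its preprocessing here is linear in the number of jobs plus precedence edges, and the function name and input layout are assumptions.

```python
from collections import deque


def incremental_topological_order(n, edges):
    """Yield jobs 0..n-1 one by one in an order that respects the precedences.

    edges: iterable of (u, v) pairs meaning job u must run before job v.
    Preprocessing: build successor lists and in-degrees.
    Delay: after a job is emitted, only its direct successors are touched.
    """
    indegree = [0] * n
    successors = [[] for _ in range(n)]
    for u, v in edges:
        successors[u].append(v)
        indegree[v] += 1

    # Jobs without unfinished predecessors can be emitted immediately.
    ready = deque(job for job in range(n) if indegree[job] == 0)

    while ready:
        job = ready.popleft()
        yield job  # the caller can start executing this job right away
        for nxt in successors[job]:
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                ready.append(nxt)


if __name__ == "__main__":
    schedule = incremental_topological_order(4, [(0, 1), (0, 2), (1, 3), (2, 3)])
    print(next(schedule))   # first schedule entry, available before the rest
    print(list(schedule))   # remaining entries
```

Because the function is a generator, the consumer receives the first schedule entry as soon as preprocessing ends, and the work between two consecutive outputs is bounded by the out-degree of the job just emitted.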
Copyright (c) 2024 Association for the Advancement of Artificial Intelligence
2024-05-30 2024-05-30 34 405 413 10.1609/icaps.v34i1.31500
Lookahead Pathology in Monte-Carlo Tree Search
https://ojs.aaai.org/index.php/ICAPS/article/view/31501
Monte-Carlo Tree Search (MCTS) is a search paradigm that first found prominence with its success in the domain of computer Go. Early theoretical work established the soundness and convergence bounds for Upper Confidence bounds applied to Trees (UCT), the most popular instantiation of MCTS; however, there remain notable gaps in our understanding of how UCT behaves in practice. In this work, we address one such gap by considering the question of whether UCT can exhibit lookahead pathology in adversarial settings, a paradoxical phenomenon first observed in Minimax search where greater search effort leads to worse decision-making. We introduce a novel family of synthetic games that offer rich modeling possibilities while remaining amenable to mathematical analysis. Our theoretical and experimental results suggest that UCT is indeed susceptible to pathological behavior in a range of games drawn from this family.
Khoi P. N. Nguyen, Raghuram Ramanujan
Copyright (c) 2024 Association for the Advancement of Artificial Intelligence
2024-05-30 2024-05-30 34 414 422 10.1609/icaps.v34i1.31501
Large Language Models as Planning Domain Generators
https://ojs.aaai.org/index.php/ICAPS/article/view/31502
Developing domain models is one of the few remaining places that require manual human labor in AI planning. Thus, in order to make planning more accessible, it is desirable to automate the process of domain model generation. To this end, we investigate whether large language models (LLMs) can be used to generate planning domain models from simple textual descriptions. Specifically, we introduce a framework for automated evaluation of LLM-generated domains by comparing the sets of plans for domain instances. Finally, we perform an empirical analysis of 7 large language models, including coding and chat models, across 9 different planning domains and under three classes of natural language domain descriptions. Our results indicate that LLMs, particularly those with high parameter counts, exhibit a moderate level of proficiency in generating correct planning domains from natural language descriptions. Our code is available at https://github.com/IBM/NL2PDDL.
James Oswald, Kavitha Srinivas, Harsha Kokel, Junkyu Lee, Michael Katz, Shirin Sohrabi
Copyright (c) 2024 Association for the Advancement of Artificial Intelligence
2024-05-30 2024-05-30 34 423 431 10.1609/icaps.v34i1.31502
On the Prospects of Incorporating Large Language Models (LLMs) in Automated Planning and Scheduling (APS)
https://ojs.aaai.org/index.php/ICAPS/article/view/31503
Automated Planning and Scheduling is among the growing areas in Artificial Intelligence (AI) where mention of LLMs has gained popularity. Based on a comprehensive review of 126 papers, this paper investigates eight categories based on the unique applications of LLMs in addressing various aspects of planning problems: language translation, plan generation, model construction, multi-agent planning, interactive planning, heuristics optimization, tool integration, and brain-inspired planning. For each category, we articulate the issues considered and existing gaps. A critical insight resulting from our review is that the true potential of LLMs unfolds when they are integrated with traditional symbolic planners, pointing towards a promising neuro-symbolic approach. This approach effectively combines the generative aspects of LLMs with the precision of classical planning methods. By synthesizing insights from existing literature, we underline the potential of this integration to address complex planning challenges. Our goal is to encourage the ICAPS community to recognize the complementary strengths of LLMs and symbolic planners, advocating for a direction in automated planning that leverages these synergistic capabilities to develop more advanced and intelligent planning systems. We aim to keep the categorization of papers updated on https://ai4society.github.io/LLM-Planning-Viz/, a collaborative resource that allows researchers to contribute and add new literature to the categorization.
Vishal Pallagani, Bharath Chandra Muppasani, Kaushik Roy, Francesco Fabiano, Andrea Loreggia, Keerthiram Murugesan, Biplav Srivastava, Francesca Rossi, Lior Horesh, Amit Sheth
Copyright (c) 2024 Association for the Advancement of Artificial Intelligence
2024-05-30 2024-05-30 34 432 444 10.1609/icaps.v34i1.31503
Transition Landmarks from Abstraction Cuts
https://ojs.aaai.org/index.php/ICAPS/article/view/31504
We introduce transition-counting constraints as a principled tool to formalize constraints that must hold in every solution of a transition system. We then show how to obtain transition landmark constraints from abstraction cuts. Transition landmarks dominate operator landmarks in theory but require solving a linear program that is prohibitively large in practice. We compare different constraints that project away transition-counting variables and then further relax the constraint. For one important special case, we provide a lossless projection. We finally discuss efficient data structures to derive cuts from abstractions and store them in a way that avoids repeated computation in every state. We compare the resulting heuristics both theoretically and on benchmarks from the International Planning Competition.
Florian Pommerening, Clemens Büchner, Thomas Keller
Copyright (c) 2024 Association for the Advancement of Artificial Intelligence
2024-05-30 2024-05-30 34 445 454 10.1609/icaps.v34i1.31504
Computing Planning Centroids and Minimum Covering States Using Symbolic Bidirectional Search
https://ojs.aaai.org/index.php/ICAPS/article/view/31505
In some scenarios, planning agents might be interested in reaching states that keep certain relationships with respect to a set of goals. Recently, two of these types of states were proposed: centroids, which minimize the average distance to the goals; and minimum covering states, which minimize the maximum distance to the goals. Previous approaches compute these states by searching forward either in the original or a reformulated task. In this paper, we propose several algorithms that use symbolic bidirectional search to efficiently compute centroids and minimum covering states. Experimental results in existing and novel benchmarks show that our algorithms scale much better than previous approaches, establishing a new state-of-the-art technique for this problem.
Alberto Pozanco, Álvaro Torralba, Daniel Borrajo
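The two definitions above (centroids minimize the average distance to the goals, minimum covering states minimize the maximum distance) can be illustrated on a small explicit state graph. The sketch below uses plain breadth-first search with unit action costs, not the symbolic bidirectional search proposed in the paper; the function names and the dictionary-based graph encoding are assumptions for illustration.

```python
from collections import deque


def bfs_distances(graph, source):
    """Unit-cost shortest distances from source to every reachable state."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        state = queue.popleft()
        for succ in graph[state]:
            if succ not in dist:
                dist[succ] = dist[state] + 1
                queue.append(succ)
    return dist


def centroid_and_min_cover(graph, goals):
    """Return (centroid, minimum covering state) of an explicit state graph.

    graph: dict state -> list of successor states (every state must be a key).
    goals: list of goal states.
    Raises ValueError if no state can reach all goals.
    """
    # Searching backward from each goal in the reversed graph gives the
    # distance from every state *to* that goal.
    reverse = {state: [] for state in graph}
    for state, succs in graph.items():
        for succ in succs:
            reverse[succ].append(state)
    to_goal = [bfs_distances(reverse, goal) for goal in goals]

    # Only states that can reach every goal are candidates.
    candidates = [s for s in graph if all(s in d for d in to_goal)]
    # Minimizing the sum is equivalent to minimizing the average.
    centroid = min(candidates, key=lambda s: sum(d[s] for d in to_goal))
    min_cover = min(candidates, key=lambda s: max(d[s] for d in to_goal))
    return centroid, min_cover


if __name__ == "__main__":
    # A corridor of 7 states, 0 - 1 - ... - 6, with moves in both directions.
    corridor = {i: [j for j in (i - 1, i + 1) if 0 <= j <= 6] for i in range(7)}
    print(centroid_and_min_cover(corridor, goals=[0, 1, 6]))
```

In the corridor example the two notions disagree: state 1 has the smallest total distance to the goals 0, 1, and 6, while state 3 has the smallest worst-case distance.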
Copyright (c) 2024 Association for the Advancement of Artificial Intelligence
2024-05-30 2024-05-30 34 455 463 10.1609/icaps.v34i1.31505
SayNav: Grounding Large Language Models for Dynamic Planning to Navigation in New Environments
https://ojs.aaai.org/index.php/ICAPS/article/view/31506
Semantic reasoning and dynamic planning capabilities are crucial for an autonomous agent to perform complex navigation tasks in unknown environments. Succeeding in these tasks requires a large amount of the common-sense knowledge that humans possess. We present SayNav, a new approach that leverages human knowledge from Large Language Models (LLMs) for efficient generalization to complex navigation tasks in unknown large-scale environments. SayNav uses a novel grounding mechanism that incrementally builds a 3D scene graph of the explored environment as input to LLMs, which then generate feasible and contextually appropriate high-level plans for navigation. The LLM-generated plan is then executed by a pre-trained low-level planner that treats each planned step as a short-distance point-goal navigation sub-task. SayNav dynamically generates step-by-step instructions during navigation and continuously refines future steps based on newly perceived information. We evaluate SayNav on the multi-object navigation (MultiON) task, which requires the agent to utilize a massive amount of human knowledge to efficiently search for multiple different objects in an unknown environment. We also introduce a benchmark dataset for the MultiON task using the ProcTHOR framework, which provides large photo-realistic indoor environments with a variety of objects. SayNav achieves state-of-the-art results and even outperforms an oracle-based baseline with strong ground-truth assumptions by more than 8% in terms of success rate, highlighting its ability to generate dynamic plans for successfully locating objects in large-scale new environments. The code, benchmark dataset and demonstration videos are accessible at https://www.sri.com/ics/computer-vision/saynav.
Abhinav Rajvanshi, Karan Sikka, Xiao Lin, Bhoram Lee, Han-Pang Chiu, Alvaro Velasquez
Copyright (c) 2024 Association for the Advancement of Artificial Intelligence
2024-05-30 2024-05-30 34 464 474 10.1609/icaps.v34i1.31506
Online Control of Adaptive Large Neighborhood Search Using Deep Reinforcement Learning
https://ojs.aaai.org/index.php/ICAPS/article/view/31507
The Adaptive Large Neighborhood Search (ALNS) algorithm has shown considerable success in solving combinatorial optimization problems (COPs). Nonetheless, the performance of ALNS relies on the proper configuration of its selection and acceptance parameters, which is known to be a complex and resource-intensive task. To address this, we introduce a Deep Reinforcement Learning (DRL) based approach called DR-ALNS that selects operators, adjusts parameters, and controls the acceptance criterion throughout the search. The proposed method aims to learn, based on the state of the search, to configure ALNS for the next iteration to yield more effective solutions for the given optimization problem. We evaluate the proposed method on an orienteering problem with stochastic weights and time windows, as presented in an IJCAI competition. The results show that our approach outperforms vanilla ALNS, ALNS tuned with Bayesian optimization, and two state-of-the-art DRL approaches that were the winning methods of the competition, achieving this with significantly fewer training observations. Furthermore, we demonstrate several good properties of the proposed DR-ALNS method: it is easily adapted to solve different routing problems, its learned policies perform consistently well across various instance sizes, and these policies can be directly applied to different problem variants.
Robbert Reijnen, Yingqian Zhang, Hoong Chuin Lau, Zaharah Bukhsh
Copyright (c) 2024 Association for the Advancement of Artificial Intelligence
2024-05-30 2024-05-30 34 475 483 10.1609/icaps.v34i1.31507
Map Connectivity and Empirical Hardness of Grid-based Multi-Agent Pathfinding Problem
https://ojs.aaai.org/index.php/ICAPS/article/view/31508
We present an empirical study of the relationship between map connectivity and the empirical hardness of the multi-agent pathfinding (MAPF) problem. By analyzing the second smallest eigenvalue (commonly known as lambda2) of the normalized Laplacian matrix of different maps, our initial study indicates that maps with smaller lambda2 tend to create more challenging instances when agents are generated uniformly at random. Additionally, we introduce a map generator based on Quality Diversity (QD) that is capable of producing maps with specified lambda2 ranges, offering a possible way of generating challenging MAPF instances. Despite the absence of a strict monotonic correlation between lambda2 and the empirical hardness of MAPF, this study serves as a valuable initial investigation for gaining a deeper understanding of what makes a MAPF instance hard to solve.
Jingyao Ren, Eric Ewing, T. K. Satish Kumar, Sven Koenig, Nora Ayanian
Copyright (c) 2024 Association for the Advancement of Artificial Intelligence
2024-05-30 2024-05-30 34 484 488 10.1609/icaps.v34i1.31508
The Story So Far on Narrative Planning
https://ojs.aaai.org/index.php/ICAPS/article/view/31509
Narrative planning is the use of automated planning to construct, communicate, and understand stories, a form of information to which human cognition and enaction are predisposed. We review the narrative planning problem in a manner suitable as an introduction to the area, survey different plan-based methodologies and affordances for reasoning about narrative, and discuss open challenges relevant to the broader AI community.
Rogelio E. Cardona Rivera, Arnav Jhala, Julie Porteous, R. Michael Young
Copyright (c) 2024 Association for the Advancement of Artificial Intelligence
2024-05-30 2024-05-30 34 489 499 10.1609/icaps.v34i1.31509
Learning General Policies for Planning through GPT Models
https://ojs.aaai.org/index.php/ICAPS/article/view/31510
Transformer-based architectures, such as T5, BERT and GPT, have demonstrated revolutionary capabilities in Natural Language Processing. Several studies showed that deep learning models using these architectures not only possess remarkable linguistic knowledge, but also exhibit forms of factual knowledge, common sense, and even programming skills. However, the scientific community still debates their reasoning capabilities, which have recently been tested in the context of automated AI planning; the literature presents mixed results, and the prevailing view is that current transformer-based models may not be adequate for planning. In this paper, we address this challenge differently. We introduce a GPT-based model customised for planning (PLANGPT) to learn a general policy for classical planning by training the model from scratch with a dataset of solved planning instances. Once PLANGPT has been trained for a domain, it can be used to generate a solution plan for an input problem instance in that domain. Our training procedure exploits automated planning knowledge to enhance the performance of the trained model. We build and evaluate our GPT model with several planning domains, and we compare its performance with that of other recent deep learning techniques for generalised planning, demonstrating the effectiveness of the proposed approach.
Nicholas Rossetti, Massimiliano Tummolo, Alfonso Emilio Gerevini, Luca Putelli, Ivan Serina, Mattia Chiari, Matteo Olivato
Copyright (c) 2024 Association for the Advancement of Artificial Intelligence
2024-05-30 2024-05-30 34 500 508 10.1609/icaps.v34i1.31510
Efficiently Computing Transitions in Cartesian Abstractions
https://ojs.aaai.org/index.php/ICAPS/article/view/31511
Counterexample-guided Cartesian abstraction refinement yields strong heuristics for optimal classical planning. The approach iteratively finds a new abstract solution, checks where it fails for the original task and refines the abstraction to avoid the same failure in subsequent iterations. The main bottleneck of this refinement loop is the memory needed for storing all abstract transitions. To address this issue, we introduce an algorithm that efficiently computes abstract transitions on demand. This drastically reduces the memory consumption and allows us to solve tasks during the refinement loop and during the search that were previously out of reach.
Jendrik Seipp
Copyright (c) 2024 Association for the Advancement of Artificial Intelligence
2024-05-30 2024-05-30 34 509 513 10.1609/icaps.v34i1.31511
Imitating Cost-Constrained Behaviors in Reinforcement Learning
https://ojs.aaai.org/index.php/ICAPS/article/view/31512
Complex planning and scheduling problems have long been solved using various optimization or heuristic approaches. In recent years, imitation learning, which aims to learn from expert demonstrations, has been proposed as a viable alternative for solving these problems. Generally speaking, imitation learning is designed to learn either the reward (or preference) model or directly the behavioral policy by observing the behavior of an expert. Existing work in imitation learning and inverse reinforcement learning has focused on imitation primarily in unconstrained settings (e.g., no limit on fuel consumed by the vehicle). However, in many real-world domains, the behavior of an expert is governed not only by reward (or preference) but also by constraints. For instance, decisions on self-driving delivery vehicles depend not only on the route preferences/rewards (which depend on past demand data) but also on the fuel in the vehicle and the time available. In such problems, imitation learning is challenging as decisions are not only dictated by the reward model but also depend on a cost-constrained model. In this paper, we provide multiple methods that match expert distributions in the presence of trajectory cost constraints through (a) a Lagrangian-based method; (b) meta-gradients to find a good trade-off between expected return and minimizing constraint violation; and (c) a cost-violation-based alternating gradient. We empirically show that leading imitation learning approaches imitate cost-constrained behaviors poorly and that our meta-gradient-based approach achieves the best performance.
Qian Shao, Pradeep Varakantham, Shih-Fen Cheng
Copyright (c) 2024 Association for the Advancement of Artificial Intelligence
2024-05-30 2024-05-30 34 514 522 10.1609/icaps.v34i1.31512
Accelerating Search-Based Planning for Multi-Robot Manipulation by Leveraging Online-Generated Experiences
https://ojs.aaai.org/index.php/ICAPS/article/view/31513
An exciting frontier in robotic manipulation is the use of multiple arms at once. However, planning concurrent motions is a challenging task using current methods. The high-dimensional composite state space renders many well-known motion planning algorithms intractable. Recently, Multi-Agent Path Finding (MAPF) algorithms have shown promise in discrete 2D domains, providing rigorous guarantees. However, widely used conflict-based methods in MAPF assume an efficient single-agent motion planner. This poses challenges in adapting them to manipulation cases where this assumption does not hold, due to the high dimensionality of configuration spaces and the computational bottlenecks associated with collision checking. To this end, we propose an approach for accelerating conflict-based search algorithms by leveraging their repetitive and incremental nature, making them tractable for use in complex scenarios involving multi-arm coordination in obstacle-laden environments. We show that our method preserves completeness and bounded sub-optimality guarantees, and demonstrate its practical efficacy through a set of experiments with up to 10 robotic arms.
Yorai Shaoul, Itamar Mishani, Maxim Likhachev, Jiaoyang Li
Copyright (c) 2024 Association for the Advancement of Artificial Intelligence
2024-05-30 2024-05-30 34 523 531 10.1609/icaps.v34i1.31513
Logical Specifications-guided Dynamic Task Sampling for Reinforcement Learning Agents
https://ojs.aaai.org/index.php/ICAPS/article/view/31514
Reinforcement Learning (RL) has made significant strides in enabling artificial agents to learn diverse behaviors. However, learning an effective policy often requires a large number of environment interactions. To mitigate sample complexity issues, recent approaches have used high-level task specifications, such as Linear Temporal Logic (LTLf) formulas or Reward Machines (RM), to guide the learning progress of the agent. In this work, we propose a novel approach, called Logical Specifications-guided Dynamic Task Sampling (LSTS), that learns a set of RL policies to guide an agent from an initial state to a goal state based on a high-level task specification, while minimizing the number of environmental interactions. Unlike previous work, LSTS does not assume information about the environment dynamics or the Reward Machine, and dynamically samples promising tasks that lead to successful goal policies. We evaluate LSTS on a gridworld and show that it achieves improved time-to-threshold performance on complex sequential decision-making problems compared to state-of-the-art RM and Automaton-guided RL baselines, such as Q-Learning for Reward Machines and Compositional RL from logical Specifications (DIRL). Moreover, we demonstrate that our method outperforms RM and Automaton-guided RL baselines in terms of sample-efficiency, both in a partially observable robotic task and in a continuous control robotic manipulation task.
Yash Shukla, Tanushree Burman, Abhishek N. Kulkarni, Robert Wright, Alvaro Velasquez, Jivko Sinapov
Copyright (c) 2024 Association for the Advancement of Artificial Intelligence
2024-05-30 2024-05-30 34 532 540 10.1609/icaps.v34i1.31514
Merging or Computing Saturated Cost Partitionings? A Merge Strategy for the Merge-and-Shrink Framework
https://ojs.aaai.org/index.php/ICAPS/article/view/31515
The merge-and-shrink framework is a powerful tool for computing abstraction heuristics for optimal classical planning. Merging is one of its name-giving transformations. It entails computing the product of two factors of a factored transition system. To decide which two factors to merge, the framework uses a merge strategy. While there exist many merge strategies, it is generally unclear what constitutes a strong merge strategy, and a previous analysis shows that there is still considerable room for improvement with existing merge strategies. In this paper, we devise a new scoring function for score-based merge strategies based on answering the question of whether merging two factors has any benefits over computing saturated cost partitioning heuristics over the factors instead. Our experimental evaluation shows that our new merge strategy achieves state-of-the-art performance on IPC benchmarks.
Silvan Sievers, Thomas Keller, Gabriele Röger
Copyright (c) 2024 Association for the Advancement of Artificial Intelligence
2024-05-30 2024-05-30 34 541 545 10.1609/icaps.v34i1.31515
Decoupled Search for the Masses: A Novel Task Transformation for Classical Planning
https://ojs.aaai.org/index.php/ICAPS/article/view/31516
Automated problem reformulation is a common technique in classical planning to identify and exploit problem structures. Decoupled search is an approach that automatically decomposes planning tasks based on their causal structure, often significantly reducing the search effort. However, its broad applicability is limited by the need for specialized algorithms. In this paper, we present an approach that embodies decoupled search for non-optimal planning through a novel task transformation. Specifically, given a task and a decomposition, we create a transformed task such that the state space of the transformed task is isomorphic to that of decoupled search on the original task. This eliminates the need for specialized algorithms and allows the use of various planning technologies in the decoupled-search framework. Empirical evaluation shows that our method is competitive with specialized decoupled algorithms and compares favorably to other related problem reformulation techniques.
David Speck, Daniel Gnad
Copyright (c) 2024 Association for the Advancement of Artificial Intelligence
2024-05-30 2024-05-30 34 546 554 10.1609/icaps.v34i1.31516
Explaining the Space of SSP Policies via Policy-Property Dependencies: Complexity, Algorithms, and Relation to Multi-Objective Planning
https://ojs.aaai.org/index.php/ICAPS/article/view/31517
Stochastic shortest path (SSP) problems are a common framework for planning under uncertainty. However, the reactive structure of their solution policies is typically not easily comprehensible by an end-user, nor do planners justify the reasons behind their choice of a particular policy over others. To strengthen confidence in the planner's decision-making, recent work in classical planning has introduced a framework for explaining to the user the possible solution space in terms of necessary trade-offs between user-provided plan properties. Here, we extend this framework to SSPs. We introduce a notion of policy properties taking into account action-outcome uncertainty. We formally analyze the computational problem of identifying the exclusion relationships between policy properties, showing that this problem is in fact harder than SSP planning in a complexity-theoretic sense. We show that all the relationships can be identified through a series of heuristic searches, which, if ordered in a clever way, yield an anytime algorithm. Further, we introduce an alternative method, which leverages a connection to multi-objective probabilistic planning to move all the computational burden to a preprocessing step. Finally, we explore empirically the feasibility of the proposed explanation methodology on a range of adapted IPPC benchmarks.
Marcel Steinmetz, Sylvie Thiébaux, Daniel Höller, Florent Teichteil-Königsbuch
Copyright (c) 2024 Association for the Advancement of Artificial Intelligence
2024-05-30 2024-05-30 34 555 564 10.1609/icaps.v34i1.31517
Addressing Myopic Constrained POMDP Planning with Recursive Dual Ascent
https://ojs.aaai.org/index.php/ICAPS/article/view/31518
Lagrangian-guided Monte Carlo tree search with global dual ascent has been applied to solve large constrained partially observable Markov decision processes (CPOMDPs) online. In this work, we demonstrate that these global dual parameters can lead to myopic action selection during exploration, ultimately leading to suboptimal decision making. To address this, we introduce history-dependent dual variables that guide local action selection and are optimized with recursive dual ascent. We empirically compare the performance of our approach on a motivating toy example and two large CPOMDPs, demonstrating improved exploration, and ultimately, safer outcomes.
Paula Stocco, Suhas Chundi, Arec Jamgochian, Mykel J. Kochenderfer
Copyright (c) 2024 Association for the Advancement of Artificial Intelligence
2024-05-30 2024-05-30 34 565 569 10.1609/icaps.v34i1.31518
Robust Multi-Agent Pathfinding with Continuous Time
https://ojs.aaai.org/index.php/ICAPS/article/view/31519
Multi-Agent Pathfinding (MAPF) is the problem of finding plans for multiple agents such that every agent moves from its start location to its goal location without collisions. If unexpected events delay some agents during plan execution, it may not be possible for the agents to continue following their plans without causing any collision. We define and solve a T-robust MAPF problem that seeks plans that can be followed even if some delays occur, under the generalized MAPFR setting with continuous time notions. The proposed approach is complete and provides provably optimal solutions. We also develop an exact method for collision detection among agents that can be delayed. We experimentally evaluate our proposed approach in terms of efficiency and plan cost.
Wen Jun Tan, Xueyan Tang, Wentong Cai
Copyright (c) 2024 Association for the Advancement of Artificial Intelligence
2024-05-30 2024-05-30 34 570 578 10.1609/icaps.v34i1.31519
Multi-Robot Connected Fermat Spiral Coverage
https://ojs.aaai.org/index.php/ICAPS/article/view/31520
We introduce Multi-Robot Connected Fermat Spiral (MCFS), a novel algorithmic framework for Multi-Robot Coverage Path Planning (MCPP) that adapts Coverage Fermat Spiral (CFS) from the computer graphics community to multi-robot coordination for the first time. MCFS uniquely enables the orchestration of multiple robots to generate coverage paths that contour around arbitrarily shaped obstacles, a feature notably lacking in traditional methods. Our framework not only enhances area coverage and optimizes task performance, particularly in terms of makespan, for workspaces rich in irregular obstacles but also addresses the challenges of path continuity and curvature critical for non-holonomic robots by generating smooth paths without decomposing the workspace. MCFS solves MCPP by constructing a graph of isolines and transforming MCPP into a combinatorial optimization problem, aiming to minimize the makespan while covering all vertices. Our contributions include developing a unified CFS version for scalable and adaptable MCPP, extending it to MCPP with novel optimization techniques for cost reduction and path continuity and smoothness, and demonstrating through extensive experiments that MCFS outperforms existing MCPP methods in makespan, path curvature, coverage ratio, and overlapping ratio. Our research marks a significant step in MCPP, showcasing the fusion of computer graphics and automated planning principles to advance the capabilities of multi-robot systems in complex environments. Our code is publicly available at https://github.com/reso1/MCFS.
Jingtao Tang, Hang Ma
Copyright (c) 2024 Association for the Advancement of Artificial Intelligence
2024-05-30 2024-05-30 34 579 587 10.1609/icaps.v34i1.31520
Optimal Infinite Temporal Planning: Cyclic Plans for Priced Timed Automata
https://ojs.aaai.org/index.php/ICAPS/article/view/31521
Many applications require infinite plans (i.e., an infinite sequence of actions) in order to carry out some given process indefinitely. In addition, it is desirable to guarantee optimality. In this paper, we address this problem in the setting of doubly-priced timed automata, where we show how to efficiently compute ratio-optimal cycles for optimal infinite plans. For efficient computation, we present symbolic λ-deduction (S-λD), an anytime algorithm that uses a symbolic representation (priced zones) to search the state space with a compact representation of the time constraints. Our approach guarantees termination while arriving at an optimal solution. Our experimental evaluation shows that S-λD outperforms the alternative of searching in the concrete state space, is very robust with respect to fine-grained temporal constraints, and has very good anytime behaviour.
Rasmus G. Tollund, Nicklas S. Johansen, Kristian Ø. Nielsen, Álvaro Torralba, Kim G. Larsen
Copyright (c) 2024 Association for the Advancement of Artificial Intelligence
2024-05-30 2024-05-30 34 588 596 10.1609/icaps.v34i1.31521
Improving Learnt Local MAPF Policies with Heuristic Search
https://ojs.aaai.org/index.php/ICAPS/article/view/31522
Multi-agent path finding (MAPF) is the problem of finding collision-free paths for a team of agents to reach their goal locations. State-of-the-art classical MAPF solvers typically employ heuristic search to find solutions for hundreds of agents but are typically centralized and can struggle to scale when run with short timeouts. Machine learning (ML) approaches that learn policies for each agent are appealing as these could enable decentralized systems and scale well while maintaining good solution quality. Current ML approaches to MAPF have proposed methods that have started to scratch the surface of this potential. However, state-of-the-art ML approaches produce "local" policies that only plan for a single timestep and have poor success rates and scalability. Our main idea is that we can improve an ML local policy by using heuristic search methods on the output probability distribution to resolve deadlocks and enable full-horizon planning. We show several model-agnostic ways to use heuristic search with learnt policies that significantly improve the policies' success rates and scalability. To the best of our knowledge, this is the first time ML-based MAPF approaches have scaled to high-congestion scenarios (e.g., 20% agent density).
Rishi Veerapaneni, Qian Wang, Kevin Ren, Arthur Jakobsson, Jiaoyang Li, Maxim Likhachev
Copyright (c) 2024 Association for the Advancement of Artificial Intelligence
2024-05-30 2024-05-30 34 597-606 10.1609/icaps.v34i1.31522
Neural Action Policy Safety Verification: Applicability Filtering
https://ojs.aaai.org/index.php/ICAPS/article/view/31523
Neural networks (NN) are an increasingly important representation of action policies π. Applicability filtering is a commonly used practice in this context, restricting the action selection in π to only applicable actions. Policy predicate abstraction (PPA) has recently been introduced to verify safety of neural π by over-approximating the state space subgraph induced by π. Thus far, however, PPA does not permit applicability filtering, which is challenging due to the additional constraints that need to be taken into account. Here we overcome that limitation through a range of algorithmic enhancements. In our experiments, our enhancements achieve several orders of magnitude speed-up over a baseline implementation, bringing PPA with applicability filtering close to the performance of PPA without such filtering.
Marcel Vinzent, Jörg Hoffmann
Copyright (c) 2024 Association for the Advancement of Artificial Intelligence
2024-05-30 2024-05-30 34 607-612 10.1609/icaps.v34i1.31523
Efficient Approximate Search for Multi-Objective Multi-Agent Path Finding
https://ojs.aaai.org/index.php/ICAPS/article/view/31524
The Multi-Objective Multi-Agent Path Finding (MO-MAPF) problem is the problem of computing collision-free paths for a team of agents while minimizing multiple cost metrics. Most existing MO-MAPF algorithms aim to compute the Pareto frontier. However, the Pareto frontier can be time-consuming to compute. Our first main contribution is BB-MO-CBS-pex, an approximate MO-MAPF algorithm that computes an approximate frontier for a user-specified approximation factor. BB-MO-CBS-pex builds upon BB-MO-CBS, a state-of-the-art MO-MAPF algorithm, and leverages A*pex, a state-of-the-art single-agent multi-objective search algorithm, to speed up different parts of BB-MO-CBS. We also provide two speed-up techniques for BB-MO-CBS-pex. Our second main contribution is BB-MO-CBS-k, which builds upon BB-MO-CBS-pex and computes up to k solutions for a user-provided k-value. BB-MO-CBS-k is useful when it is unclear how to determine an appropriate approximation factor. Our experimental results show that both BB-MO-CBS-pex and BB-MO-CBS-k solved significantly more instances than BB-MO-CBS for different approximation factors and k-values, respectively. Additionally, we compare BB-MO-CBS-pex with an approximate baseline algorithm derived from BB-MO-CBS and show that BB-MO-CBS-pex achieved speed-ups of up to two orders of magnitude.
Fangji Wang, Han Zhang, Sven Koenig, Jiaoyang Li
Copyright (c) 2024 Association for the Advancement of Artificial Intelligence
2024-05-30 2024-05-30 34 613-622 10.1609/icaps.v34i1.31524
MAPF in 3D Warehouses: Dataset and Analysis
https://ojs.aaai.org/index.php/ICAPS/article/view/31525
Recent works have made significant progress in multi-agent path finding (MAPF), with modern methods being able to scale to hundreds of agents, handle unexpected delays, work in groups, etc. The vast majority of these methods have focused on 2D "grid world" domains. However, modern warehouses often utilize multi-agent robotic systems that can move in 3D, enabling dense storage but resulting in a more complex multi-agent planning problem. Motivated by this, we introduce and experimentally analyze the application of MAPF to 3D warehouse management, and release the first (see http://mapf.info/index.php/Main/Benchmarks) open-source 3D MAPF dataset. We benchmark two state-of-the-art MAPF methods, EECBS and MAPF-LNS2, and show how different hyper-parameters affect these methods across various 3D MAPF problems. We also investigate how the warehouse structure itself affects MAPF performance. Based on our experimental analysis, we find that a fast low-level search is critical for 3D MAPF, that EECBS's suboptimality significantly changes the effect of certain CBS techniques, and that certain warehouse designs can noticeably influence MAPF scalability and speed. An additional important observation is that, overall, the tested 2D MAPF techniques scaled well to 3D warehouses, demonstrating how the MAPF community's progress in 2D can generalize to 3D warehouses.
Qian Wang, Rishi Veerapaneni, Yu Wu, Jiaoyang Li, Maxim Likhachev
Copyright (c) 2024 Association for the Advancement of Artificial Intelligence
2024-05-30 2024-05-30 34 623-632 10.1609/icaps.v34i1.31525
Learning Generalised Policies for Numeric Planning
https://ojs.aaai.org/index.php/ICAPS/article/view/31526
We extend Action Schema Networks (ASNets) to learn generalised policies for numeric planning, which features quantitative numeric state variables, preconditions and effects. We propose a neural network architecture that can reason about the numeric variables both directly and in the context of other variables. We also develop a dynamic exploration algorithm for more efficient training, which better balances the exploration versus learning tradeoff to account for the greater computational demand of numeric teacher planners. Experimentally, we find that the learned generalised policies are capable of outperforming traditional numeric planners on some domains, and that the dynamic exploration algorithm is on average much faster at learning effective generalised policies than the original ASNets training algorithm.
Ryan Xiao Wang, Sylvie Thiébaux
Copyright (c) 2024 Association for the Advancement of Artificial Intelligence
2024-05-30 2024-05-30 34 633-642 10.1609/icaps.v34i1.31526
Tightest Admissible Shortest Path
https://ojs.aaai.org/index.php/ICAPS/article/view/31527
The shortest path problem in graphs is fundamental to AI. Nearly all variants of the problem, and the algorithms that solve them, ignore edge-weight computation time and its common relation to weight uncertainty. This implies that taking these factors into consideration can potentially lead to a performance boost in relevant applications. Recently, a generalized framework for weighted directed graphs was suggested, where edge weights can be computed (estimated) multiple times, at increasing accuracy and run-time expense. We build on this framework to introduce the problem of finding the tightest admissible shortest path (TASP): a path with the tightest suboptimality bound on the optimal cost. This is a generalization of the shortest path problem to bounded uncertainty, where edge-weight uncertainty can be traded for computational cost. We present a complete algorithm for solving TASP, with guarantees on solution quality. Empirical evaluation supports the effectiveness of this approach.
Eyal Weiss, Ariel Felner, Gal A. Kaminka
Copyright (c) 2024 Association for the Advancement of Artificial Intelligence
2024-05-30 2024-05-30 34 643-652 10.1609/icaps.v34i1.31527
Neuro-Symbolic Learning of Lifted Action Models from Visual Traces
https://ojs.aaai.org/index.php/ICAPS/article/view/31528
Model-based planners rely on action models to describe available actions in terms of their preconditions and effects. Nonetheless, manually encoding such models is challenging, especially in complex domains. Numerous methods have been proposed to learn action models from examples of plan execution traces. However, high-level information, such as state labels within traces, is often unavailable and needs to be inferred indirectly from raw observations. In this paper, we aim to learn lifted action models from visual traces: sequences of image-action pairs depicting discrete successive trace steps. We present ROSAME, a differentiable neuRO-Symbolic Action Model lEarner that infers action models from traces consisting of probabilistic state predictions and actions. By combining ROSAME with a deep learning computer vision model, we create an end-to-end framework that jointly learns state predictions from images and infers symbolic action models. Experimental results demonstrate that our method succeeds in both tasks, using different visual state representations, with the learned action models often matching or even surpassing those created by humans.
Kai Xi, Stephen Gould, Sylvie Thiébaux
Copyright (c) 2024 Association for the Advancement of Artificial Intelligence
2024-05-30 2024-05-30 34 653-662 10.1609/icaps.v34i1.31528
Control in Stochastic Environment with Delays: A Model-based Reinforcement Learning Approach
https://ojs.aaai.org/index.php/ICAPS/article/view/31529
In this paper, we introduce a new reinforcement learning method for control problems in environments with delayed feedback. Specifically, our method employs stochastic planning, whereas previous methods used deterministic planning. This allows us to embed risk preference in the policy optimization problem. We show that this formulation can recover the optimal policy for problems with deterministic transitions. We contrast our policy with two prior methods from the literature. We apply the methodology to simple tasks to understand its features. Then, we compare the performance of the methods in controlling multiple Atari games.
Zhiyuan Yao, Ionut Florescu, Chihoon Lee
Copyright (c) 2024 Association for the Advancement of Artificial Intelligence
2024-05-30 2024-05-30 34 663-670 10.1609/icaps.v34i1.31529
Contrastive Explanations of Centralized Multi-agent Optimization Solutions
https://ojs.aaai.org/index.php/ICAPS/article/view/31530
In many real-world scenarios, agents are involved in optimization problems. Since most of these scenarios are over-constrained, optimal solutions do not always satisfy all agents. Some agents might be unhappy and ask questions of the form “Why does solution S not satisfy property P?”. We propose CMAOE, a domain-independent approach to obtain contrastive explanations by: (i) generating a new solution S′ where property P is enforced, while also minimizing the differences between S and S′; and (ii) highlighting the differences between the two solutions, with respect to the features of the objective function of the multi-agent system. Such explanations aim to help agents understand why the initial solution is better in the context of the multi-agent system than what they expected. We have carried out a computational evaluation that shows that CMAOE can generate contrastive explanations for large multi-agent optimization problems. We have also performed an extensive user study in four different domains that shows that: (i) after being presented with these explanations, humans’ satisfaction with the original solution increases; and (ii) the contrastive explanations generated by CMAOE are preferred or equally preferred by humans over the ones generated by state-of-the-art approaches.
Parisa Zehtabi, Alberto Pozanco, Ayala Bolch, Daniel Borrajo, Sarit Kraus
Copyright (c) 2024 Association for the Advancement of Artificial Intelligence
2024-05-30 2024-05-30 34 671-679 10.1609/icaps.v34i1.31530
Bounded-Suboptimal Weight-Constrained Shortest-Path Search via Efficient Representation of Paths
https://ojs.aaai.org/index.php/ICAPS/article/view/31531
In the Weight-Constrained Shortest-Path (WCSP) problem, given a graph in which each edge is annotated with a cost and a weight, a start state, and a goal state, the task is to compute a minimum-cost path from the start state to the goal state with weight no larger than a given weight limit. While most existing works have focused on solving the WCSP problem optimally, many real-world situations admit a trade-off between efficiency and a suboptimality bound for the path cost. In this paper, we propose the bounded-suboptimal WCSP algorithm WC-A*pex, which is built on the state-of-the-art approximate bi-objective search algorithm A*pex. WC-A*pex uses an approximate representation of paths with similar costs and weights to compute a (1 + ε)-suboptimal path, for a given ε. During its search, WC-A*pex avoids storing all paths explicitly and thereby reduces the search effort while still retaining its (1 + ε)-suboptimality bound. On benchmark road networks, our experimental results show that WC-A*pex with ε = 0.01 (i.e., with a guaranteed suboptimality of at most 1%) achieves a speed-up of up to an order of magnitude over WC-A*, a state-of-the-art WCSP algorithm, and its bounded-suboptimal variant.
Han Zhang, Oren Salzman, Ariel Felner, T. K. Satish Kumar, Sven Koenig
Copyright (c) 2024 Association for the Advancement of Artificial Intelligence
2024-05-30 2024-05-30 34 680-688 10.1609/icaps.v34i1.31531
A Counter-Example Based Approach to Probabilistic Conformant Planning
https://ojs.aaai.org/index.php/ICAPS/article/view/31532
This paper introduces a counter-example based approach for solving probabilistic conformant planning (PCP) problems. Our algorithm incrementally generates candidate plans and identifies counter-examples until it finds a plan for which the probability of success is above the specified threshold. We prove that the algorithm is sound and complete. We further propose a variation of our algorithm that uses hitting sets to accelerate the generation of candidate plans. Experimental results show that our planner is particularly suited for problems with a high probability threshold.
Xiaodi Zhang, Alban Grastien, Charles Gretton
Copyright (c) 2024 Association for the Advancement of Artificial Intelligence
2024-05-30 2024-05-30 34 689-697 10.1609/icaps.v34i1.31532
Improving the Efficiency and Efficacy of Multi-Agent Reinforcement Learning on Complex Railway Networks with a Local-Critic Approach
https://ojs.aaai.org/index.php/ICAPS/article/view/31533
The complex railway network is a challenging real-world multi-agent system usually involving thousands of agents. Current planning methods depend heavily on expert knowledge to formulate solutions for specific cases and therefore generalize poorly to new scenarios, which has drawn significant attention to multi-agent reinforcement learning (MARL). Despite some successful applications in multi-agent decision-making tasks, MARL is hard to scale to a large number of agents. This paper rethinks the curse of agents in the centralized-training-decentralized-execution (CTDE) paradigm and proposes a local-critic approach to address the issue. By combining the local critic with the PPO algorithm, we design a deep MARL algorithm denoted as local-critic PPO (LCPPO). In experiments, we evaluate the effectiveness of LCPPO on a complex railway network benchmark, Flatland, with various numbers of agents. Notably, LCPPO shows strong generalizability and robustness under changes to the environment.
Yuan Zhang, Umashankar Deekshith, Jianhong Wang, Joschka Boedecker
Copyright (c) 2024 Association for the Advancement of Artificial Intelligence
2024-05-30 2024-05-30 34 698-706 10.1609/icaps.v34i1.31533
Planning and Execution in Multi-Agent Path Finding: Models and Algorithms
https://ojs.aaai.org/index.php/ICAPS/article/view/31534
In applications of Multi-Agent Path Finding (MAPF), it is often the sum of planning and execution times that needs to be minimised (i.e., the Goal Achievement Time). Yet current methods seldom optimise for this objective. Optimal algorithms reduce execution time, but may require exponential planning time. Non-optimal algorithms reduce planning time, but at the expense of increased path length. To address these limitations we introduce PIE (Planning and Improving while Executing), a new framework for concurrent planning and execution in MAPF. We show how different instantiations of PIE affect practical performance, including initial planning time, action commitment time and concurrent vs. sequential planning and execution. We then adapt PIE to Lifelong MAPF, a popular application setting where agents are continuously assigned new goals and where additional decisions are required to ensure feasibility. We examine a variety of different approaches to overcome these challenges and we conduct comparative experiments vs. recently proposed alternatives. Results show that PIE substantially outperforms existing methods for One-shot and Lifelong MAPF.
Yue Zhang, Zhe Chen, Daniel Harabor, Pierre Le Bodic, Peter J. Stuckey
Copyright (c) 2024 Association for the Advancement of Artificial Intelligence
2024-05-30 2024-05-30 34 707-715 10.1609/icaps.v34i1.31534
Decentralized, Decomposition-Based Observation Scheduling for a Large-Scale Satellite Constellation
https://ojs.aaai.org/index.php/ICAPS/article/view/31535
Deploying multi-satellite constellations for Earth observation requires coordinating potentially hundreds of spacecraft. With increasing on-board capability for autonomy, we can view the constellation as a multi-agent system (MAS) and employ decentralized scheduling solutions. We formulate the problem as a distributed constraint optimization problem (DCOP) and seek scalable inter-agent communication. The problem consists of millions of variables which, coupled with the structure, make existing DCOP algorithms inadequate for this application. We develop a scheduling approach that employs a well-coordinated heuristic, referred to as the Geometric Neighborhood Decomposition (GND) heuristic, to decompose the global DCOP into sub-problems so as to enable the application of DCOP algorithms. We present the Neighborhood Stochastic Search (NSS) algorithm, a decentralized algorithm to effectively solve the multi-satellite constellation observation scheduling problem using decomposition. In full, we identify the roadblocks to deploying DCOP solvers on a large-scale, real-world problem, propose a decomposition-based scheduling approach that is effective at tackling large-scale DCOPs, empirically evaluate the approach against other baseline algorithms to demonstrate its effectiveness, and discuss the generality of the approach.
Itai Zilberstein, Ananya Rao, Matthew Salis, Steve Chien
Copyright (c) 2024 Association for the Advancement of Artificial Intelligence
2024-05-30 2024-05-30 34 716-724 10.1609/icaps.v34i1.31535