Automated planning and scheduling
Automated planning and scheduling, sometimes referred to simply as AI planning,[1] is a branch of artificial intelligence that concerns the realization of strategies or action sequences, typically for execution by intelligent agents, autonomous robots and unmanned vehicles. Unlike classical control and classification problems, the solutions are complex and must be discovered and optimized in multidimensional space. Planning is also related to decision theory.
In known environments with available models, planning can be done offline: solutions can be found and evaluated prior to execution. In dynamically unknown environments, the strategy often needs to be revised online, and models and policies must be adapted. Solutions usually resort to the iterative trial-and-error processes common in artificial intelligence, including dynamic programming, reinforcement learning and combinatorial optimization. Languages used to describe planning and scheduling are often called action languages.
Overview
Given a description of the possible initial states of the world, a description of the desired goals, and a description of a set of possible actions, the planning problem is to synthesize a plan that is guaranteed (when applied to any of the initial states) to generate a state which contains the desired goals (such a state is called a goal state).
The difficulty of planning is dependent on the simplifying assumptions employed. Several classes of planning problems can be identified depending on the properties the problems have in several dimensions.
- Are the actions deterministic or non-deterministic? For nondeterministic actions, are the associated probabilities available?
- Are the state variables discrete or continuous? If they are discrete, do they have only a finite number of possible values?
- Can the current state be observed unambiguously? There can be full observability and partial observability.
- How many initial states are there, finite or arbitrarily many?
- Do actions have a duration?
- Can several actions be taken concurrently, or is only one action possible at a time?
- Is the objective of a plan to reach a designated goal state, or to maximize a reward function?
- Is there only one agent or are there several agents? Are the agents cooperative or selfish? Do all of the agents construct their own plans separately, or are the plans constructed centrally for all agents?
The simplest possible planning problem, known as the Classical Planning Problem, is determined by:
- a unique known initial state,
- durationless actions,
- deterministic actions,
- which can be taken only one at a time,
- and a single agent.
Since the initial state is known unambiguously, and all actions are deterministic, the state of the world after any sequence of actions can be accurately predicted, and the question of observability is irrelevant for classical planning.
Further, plans can be defined as sequences of actions, because it is always known in advance which actions will be needed.
With nondeterministic actions or other events outside the control of the agent, the possible executions form a tree, and plans have to determine the appropriate actions for every node of the tree.
Discrete-time Markov decision processes (MDP) are planning problems with:
- durationless actions,
- nondeterministic actions with probabilities,
- full observability,
- maximization of a reward function,
- and a single agent.
When full observability is replaced by partial observability, planning corresponds to a partially observable Markov decision process (POMDP).
If there is more than one agent, we have multi-agent planning, which is closely related to game theory.
Domain independent planning
In AI planning, planners typically take as input a domain model (a description of the set of possible actions that model the domain) together with the specific problem to be solved, specified by the initial state and goal, in contrast to planners in which the domain is fixed and built in. Such planners are called "domain independent" to emphasize that they can solve planning problems from a wide range of domains. Typical examples of domains are block-stacking, logistics, workflow management, and robot task planning; a single domain-independent planner can be used to solve planning problems in all these various domains. A route planner, by contrast, is typical of a domain-specific planner.
Planning domain modelling languages
The most commonly used languages for representing planning domains and specific planning problems, such as STRIPS and PDDL for Classical Planning, are based on state variables. Each possible state of the world is an assignment of values to the state variables, and actions determine how the values of the state variables change when that action is taken. Since a set of state variables induces a state space whose size is exponential in the number of variables, planning, similarly to many other computational problems, suffers from the curse of dimensionality and the combinatorial explosion.
An alternative language for describing planning problems is that of hierarchical task networks, in which a set of tasks is given, and each task can be either realized by a primitive action or decomposed into a set of other tasks. This does not necessarily involve state variables, although in more realistic applications state variables simplify the description of task networks.
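As an illustration of task decomposition, here is a minimal sketch in Python: tasks with methods expand recursively into subtasks, and tasks without methods are treated as primitive actions. The task and method names are invented for the example.

```python
# Minimal sketch of hierarchical task network (HTN) decomposition: a task
# either has a method that decomposes it into subtasks, or it is treated
# as a primitive action. Task and method names are invented.

METHODS = {  # task -> ordered list of subtasks (one method per task here)
    "deliver-package": ["load", "transport", "unload"],
    "transport": ["drive-to-destination"],
}

def decompose(task):
    """Recursively expand a task into a sequence of primitive actions."""
    if task not in METHODS:  # no method known: treat as primitive
        return [task]
    return [action for subtask in METHODS[task] for action in decompose(subtask)]

print(decompose("deliver-package"))
# ['load', 'drive-to-destination', 'unload']
```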
Algorithms for planning
Classical planning
- forward chaining state space search, possibly enhanced with heuristics (see the sketch after this list)
- backward chaining search, possibly enhanced by the use of state constraints (see STRIPS, graphplan)
- partial-order planning
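The first of these, forward chaining search, can be made concrete with a short sketch: a breadth-first search over states represented as sets of facts, with STRIPS-style actions given as precondition/add/delete triples. The two-block domain and action names are illustrative assumptions, not a production planner.

```python
from collections import deque

# Minimal sketch of forward state-space search over STRIPS-style actions.
# States are frozensets of ground facts; each action has preconditions,
# an add list, and a delete list. Breadth-first search returns a shortest
# plan (in number of actions) if one exists.

# (name, preconditions, add effects, delete effects) -- illustrative domain
ACTIONS = [
    ("pickup-A", {"ontable-A", "clear-A", "handempty"},
     {"holding-A"}, {"ontable-A", "clear-A", "handempty"}),
    ("stack-A-on-B", {"holding-A", "clear-B"},
     {"on-A-B", "clear-A", "handempty"}, {"holding-A", "clear-B"}),
]

def forward_search(initial, goal):
    start = frozenset(initial)
    frontier = deque([(start, [])])
    visited = {start}
    while frontier:
        state, plan = frontier.popleft()
        if goal <= state:
            return plan
        for name, pre, add, delete in ACTIONS:
            if pre <= state:  # preconditions hold: the action is applicable
                nxt = frozenset((state - delete) | add)
                if nxt not in visited:
                    visited.add(nxt)
                    frontier.append((nxt, plan + [name]))
    return None  # no plan exists

print(forward_search({"ontable-A", "clear-A", "clear-B", "handempty"},
                     {"on-A-B"}))  # ['pickup-A', 'stack-A-on-B']
```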
Action model learning
Creating domain models is difficult, takes a lot of time, and can easily lead to mistakes. To help with this, several methods have been developed to automatically learn full or partial domain models from given observations.[2][3][4]
- Read more: Action model learning
Reduction to other problems
- reduction to the propositional satisfiability problem (satplan).
- reduction to model checking - both are essentially problems of traversing state spaces, and the classical planning problem corresponds to a subclass of model checking problems.
Temporal planning
Temporal planning can be solved with methods similar to classical planning. The main difference is that, because several temporally overlapping actions with durations may be taken concurrently, the definition of a state has to include information about the current absolute time and how far the execution of each active action has proceeded. Further, in planning with rational or real time, the state space may be infinite, unlike in classical planning or planning with integer time. Temporal planning is closely related to scheduling problems when uncertainty is involved and can also be understood in terms of timed automata. The Simple Temporal Network with Uncertainty (STNU) is a scheduling problem which involves controllable actions, uncertain events and temporal constraints. Dynamic controllability for such problems is a type of scheduling which requires a temporal planning strategy to activate controllable actions reactively as uncertain events are observed, so that all constraints are guaranteed to be satisfied.[5]
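The fixed-bound core of such problems, a Simple Temporal Network, can be checked for consistency by detecting negative cycles in its distance graph. The sketch below uses Floyd-Warshall for that check; the time points and numeric bounds are invented, and full STNU dynamic controllability checking is more involved than this.

```python
# Minimal sketch of a Simple Temporal Network (STN) consistency check.
# A constraint t_j - t_i <= w becomes an edge i -> j with weight w in the
# distance graph; the STN is consistent iff that graph has no negative
# cycle, which Floyd-Warshall detects. Time points and bounds are invented.

INF = float("inf")

def stn_consistent(n, constraints):
    """n time points; constraints is a list of (i, j, w) meaning t_j - t_i <= w."""
    d = [[0 if i == j else INF for j in range(n)] for i in range(n)]
    for i, j, w in constraints:
        d[i][j] = min(d[i][j], w)
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if d[i][k] + d[k][j] < d[i][j]:
                    d[i][j] = d[i][k] + d[k][j]
    return all(d[i][i] >= 0 for i in range(n))  # d[i][i] < 0 marks a negative cycle

# An action spans [t0, t1] with duration between 2 and 4; event t2 must
# occur within 10 of the start and within 6 of the end (illustrative).
constraints = [(0, 1, 4), (1, 0, -2), (0, 2, 10), (1, 2, 6)]
print(stn_consistent(3, constraints))  # True: the network is consistent
```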
Probabilistic planning
Probabilistic planning can be solved with iterative methods such as value iteration and policy iteration, when the state space is sufficiently small. With partial observability, probabilistic planning is similarly solved with iterative methods, but using a representation of the value functions defined for the space of beliefs instead of states.
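As a concrete illustration of value iteration, the following sketch repeatedly applies the Bellman update to a toy two-state MDP and then extracts a greedy policy; the states, transition probabilities, and rewards are made up for the example.

```python
# Value iteration on a toy two-state MDP (illustrative numbers): V converges
# toward the optimal expected discounted reward, then a greedy policy is read off.

GAMMA = 0.9
STATES = ["s0", "s1"]
ACTIONS = ["stay", "go"]

# P[(s, a)] = list of (next_state, probability); R[(s, a)] = immediate reward
P = {("s0", "stay"): [("s0", 1.0)],
     ("s0", "go"):   [("s1", 0.8), ("s0", 0.2)],
     ("s1", "stay"): [("s1", 1.0)],
     ("s1", "go"):   [("s0", 1.0)]}
R = {("s0", "stay"): 0.0, ("s0", "go"): 0.0,
     ("s1", "stay"): 1.0, ("s1", "go"): 0.0}

def q(s, a, V):
    """Expected value of taking action a in state s, then following V."""
    return R[(s, a)] + GAMMA * sum(p * V[t] for t, p in P[(s, a)])

V = {s: 0.0 for s in STATES}
for _ in range(200):  # apply the Bellman update until (approximate) convergence
    V = {s: max(q(s, a, V) for a in ACTIONS) for s in STATES}

policy = {s: max(ACTIONS, key=lambda a: q(s, a, V)) for s in STATES}
print(V, policy)  # policy: "go" from s0, "stay" in s1
```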
Preference-based planning
In preference-based planning, the objective is not only to produce a plan but also to satisfy user-specified preferences. Unlike in the more common reward-based planning, for example corresponding to MDPs, preferences don't necessarily have a precise numerical value.
Conditional planning
Deterministic planning was introduced with the STRIPS planning system, which is a hierarchical planner. Action names are ordered in a sequence, and this is a plan for the robot. Hierarchical planning can be compared with an automatically generated behavior tree.[6] The disadvantage is that a normal behavior tree is not as expressive as a computer program: the notation of a behavior graph contains action commands, but no loops or if-then statements. Conditional planning overcomes this bottleneck and introduces an elaborated notation which is similar to a control flow, known from other programming languages like Pascal. It is very similar to program synthesis, meaning the planner generates source code which can be executed by an interpreter.[7]
An early example of a conditional planner is "Warplan-C", introduced in the mid-1970s.[8] What distinguishes a plan containing if-then statements from a simple action sequence is uncertainty at plan runtime: the idea is that a plan can react to sensor signals which are unknown to the planner. The planner generates two choices in advance. For example, if an object was detected, then action A is executed; if an object is missing, then action B is executed.[9] A major advantage of conditional planning is the ability to handle partial plans.[10] An agent is not forced to plan everything from start to finish, but can divide the problem into chunks. This helps to reduce the state space and allows much more complex problems to be solved.
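That detect-or-miss example can be captured by a small conditional-plan structure. The sketch below, with invented action and observation names, branches on a sensor reading that is unknown to the planner at planning time.

```python
# Sketch of a conditional plan as a tree: leaves are action sequences and
# the internal node branches on a runtime observation. Names are invented.

conditional_plan = {
    "observe": "object_detected",
    "if_true":  ["grasp_object", "deliver"],        # the "action A" branch
    "if_false": ["scan_area", "report_missing"],    # the "action B" branch
}

def execute(plan, sensor):
    """Pick a branch once the observation, unknown at planning time, arrives."""
    branch = plan["if_true"] if sensor(plan["observe"]) else plan["if_false"]
    for action in branch:
        print("executing:", action)

execute(conditional_plan, sensor=lambda obs: True)  # simulated sensor reading
```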
Contingency planning
We speak of "contingent planning" when the environment is observable through sensors, which can be faulty. It is thus a situation where the planning agent acts under incomplete information. For a contingent planning problem, a plan is no longer a sequence of actions but a decision tree because each step of the plan is represented by a set of states rather than a single perfectly observable state, as in the case of classical planning.[11] The selected actions depend on the state of the system. For example, if it rains, the agent chooses to take the umbrella, and if it doesn't, they may choose not to take it.
Michael L. Littman showed in 1998 that with branching actions, the planning problem becomes EXPTIME-complete.[12][13] A particular case of contingent planning is represented by FOND problems, for "fully observable and non-deterministic". If the goal is specified in LTLf (linear temporal logic on finite traces), then the problem is always EXPTIME-complete,[14] and 2EXPTIME-complete if the goal is specified with LDLf.
Conformant planning
Conformant planning is when the agent is uncertain about the state of the system, and it cannot make any observations. The agent then has beliefs about the real world, but cannot verify them with sensing actions, for instance. These problems are solved by techniques similar to those of classical planning,[15][16] but where the state space is exponential in the size of the problem, because of the uncertainty about the current state. A solution for a conformant planning problem is a sequence of actions. Haslum and Jonsson have demonstrated that the problem of conformant planning is EXPSPACE-complete,[17] and 2EXPTIME-complete when the initial situation is uncertain and there is non-determinism in the action outcomes.[13]
Deployment of planning systems
- The Hubble Space Telescope uses a short-term system called SPSS and a long-term planning system called Spike.
See also
- Action description language
- Action model learning
- Actor model
- Applications of artificial intelligence
- Constraint satisfaction problem
- International Conference on Automated Planning and Scheduling
- Reactive planning
- Scheduling (computing)
- Strategy (game theory)
References
- ^ Ghallab, Malik; Nau, Dana S.; Traverso, Paolo (2004), Automated Planning: Theory and Practice, Morgan Kaufmann, ISBN 1-55860-856-7, archived from the original on 2009-08-24, retrieved 2008-08-20
- ^ Callanan, Ethan; De Venezia, Rebecca; Armstrong, Victoria; Paredes, Alison; Chakraborti, Tathagata; Muise, Christian (2022). MACQ: A Holistic View of Model Acquisition Techniques (PDF). ICAPS Workshop on Knowledge Engineering for Planning and Scheduling (KEPS).
- ^ Aineto, Diego; Jiménez Celorrio, Sergio; Onaindia, Eva (2019). "Learning action models with minimal observability". Artificial Intelligence. 275: 104–137. doi:10.1016/j.artint.2019.05.003. hdl:10251/144560.
- ^ Jiménez, Sergio; de la Rosa, Tomás; Fernández, Susana; Fernández, Fernando; Borrajo, Daniel (2012). "A review of machine learning for automated planning". The Knowledge Engineering Review. 27 (4): 433–467. doi:10.1017/S026988891200001X.
- ^ Vidal, Thierry (January 1999). "Handling contingency in temporal constraint networks: from consistency to controllabilities". Journal of Experimental & Theoretical Artificial Intelligence. 11 (1): 23–45. CiteSeerX 10.1.1.107.1065. doi:10.1080/095281399146607.
- ^ Neufeld, Xenija; Mostaghim, Sanaz; Sancho-Pradel, Dario; Brand, Sandy (2017). "Building a Planner: A Survey of Planning Systems Used in Commercial Video Games". IEEE Transactions on Games. IEEE.
- ^ Sanelli, Valerio; Cashmore, Michael; Magazzeni, Daniele; Iocchi, Luca (2017). Short-term human robot interaction through conditional planning and execution. Proc. of International Conference on Automated Planning and Scheduling (ICAPS). Archived from the original on 2019-08-16. Retrieved 2019-08-16.
- ^ Peot, Mark A.; Smith, David E. (1992). Conditional nonlinear planning (PDF). Artificial Intelligence Planning Systems. Elsevier. pp. 189–197.
- ^ Karlsson, Lars (2001). Conditional progressive planning under uncertainty. IJCAI. pp. 431–438.
- ^ Liu, Daphne Hao (2008). A survey of planning in intelligent agents: from externally motivated to internally motivated systems (Technical report). Technical Report TR-2008-936, Department of Computer Science, University of Rochester. Archived from the original on 2023-03-15. Retrieved 2019-08-16.
- ^ Alexandre Albore; Hector Palacios; Hector Geffner (2009). A Translation-Based Approach to Contingent Planning. International Joint Conference of Artificial Intelligence (IJCAI). Pasadena, CA: AAAI. Archived from the original on 2019-07-03. Retrieved 2019-07-03.
- ^ Littman, Michael L. (1997). Probabilistic Propositional Planning: Representations and Complexity. Fourteenth National Conference on Artificial Intelligence. MIT Press. pp. 748–754. Archived from the original on 2019-02-12. Retrieved 2019-02-10.
- ^ a b Jussi Rintanen (2004). Complexity of Planning with Partial Observability (PDF). Int. Conf. Automated Planning and Scheduling. AAAI. Archived (PDF) from the original on 2020-10-31. Retrieved 2019-07-03.
- ^ De Giacomo, Giuseppe; Rubin, Sasha (2018). Automata-Theoretic Foundations of FOND Planning for LTLf and LDLf Goals. IJCAI. Archived from the original on 2018-07-17. Retrieved 2018-07-17.
- ^ Palacios, Hector; Geffner, Hector (2009). "Compiling uncertainty away in conformant planning problems with bounded width". Journal of Artificial Intelligence Research. 35: 623–675. arXiv:1401.3468. doi:10.1613/jair.2708. Archived from the original on 2020-04-27. Retrieved 2019-08-16.
- ^ Albore, Alexandre; Ramírez, Miquel; Geffner, Hector (2011). Effective heuristics and belief tracking for planning with incomplete information. Twenty-First International Conference on Automated Planning and Scheduling (ICAPS). Archived from the original on 2017-07-06. Retrieved 2019-08-16.
- ^ Haslum, Patrik; Jonsson, Peter (2000). Some Results on the Complexity of Planning with Incomplete Information. Recent Advances in AI Planning. Lecture Notes in Computer Science. Vol. 1809. Springer Berlin Heidelberg. pp. 308–318. doi:10.1007/10720246_24. ISBN 9783540446576.
Further reading
- Vlahavas, I. "Planning and Scheduling". EETN. Archived from the original on 2013-12-22.
Automated planning and scheduling
Introduction
Definition and Scope
Automated planning and scheduling is a subfield of artificial intelligence focused on the computational generation of action sequences and resource allocations to transition from an initial state to a desired goal state within a given environment.[5] Automated planning specifically involves synthesizing a course of action by selecting and ordering high-level actions that achieve objectives, often through deliberation over possible outcomes and interactions.[6] In contrast, automated scheduling entails assigning temporal slots and limited resources—such as time, personnel, or materials—to predefined tasks or actions while optimizing for constraints like deadlines or availability.[3] Together, these processes enable intelligent agents or systems to operate autonomously in dynamic settings, distinguishing them from mere reactive control by emphasizing proactive strategy formulation.[7]

The scope of automated planning and scheduling encompasses domain-independent methods applicable to diverse real-world problems, including robotics, logistics, manufacturing, emergency response, and space operations.[5] These techniques model environments abstractly using state representations and action effects, allowing generalization across scenarios without reliance on problem-specific coding, unlike traditional search problems that operate on low-level state manipulations.[6] For instance, in robotics, simple path planning might compute trajectories between points, whereas full mission planning coordinates multiple actions like navigation, object manipulation, and environmental adaptation to complete complex tasks.[5] This broad applicability supports efficiency in resource-constrained systems, from optimizing supply chains in logistics to sequencing operations in manufacturing assembly lines.[7]

Key objectives in automated planning and scheduling include ensuring completeness (guaranteeing a solution if one exists), optimality (selecting the most efficient plan by criteria like cost or duration), and computational efficiency (solving problems within practical time and space limits).[3] These goals address challenges in high-dimensional state spaces, where planning prioritizes action viability and goal attainment, while scheduling focuses on feasible temporal and resource distributions to minimize conflicts or delays.[6] By integrating these elements, the field advances autonomous decision-making, adapting to uncertainties and supporting human-AI collaboration in complex domains.[5]

Historical Development
The field of automated planning and scheduling traces its origins to the early days of artificial intelligence, particularly with the development of the STRIPS (Stanford Research Institute Problem Solver) system in 1971, which introduced a foundational framework for situation calculus-based planning using theorem proving and means-ends analysis to generate sequences of operators transforming an initial world model to a goal state.[8] STRIPS, created by Richard Fikes and Nils Nilsson at SRI International, marked a pivotal shift toward automated reasoning about actions and their effects, laying the groundwork for domain-independent planning by separating action descriptions from specific problem instances.[9]

During the 1980s and 1990s, the field evolved with the rise of domain-independent planners, exemplified by UCPOP (Universal Causal Partial Order Planner) in 1992, a sound and complete partial-order planner for the Action Description Language (ADL) that extended earlier non-linear planning approaches to handle complex dependencies more efficiently.[10] This period saw increased focus on partial-order planning techniques, influenced by key figures such as Austin Tate, whose earlier work on the Nonlin planner in the late 1970s laid the foundations for hierarchical partial-order planning, and whose subsequent work on the O-Plan system in the 1980s advanced hierarchical and resource-aware planning for real-world applications.[11] Standardization efforts culminated in 1998 with the introduction of the Planning Domain Definition Language (PDDL) by Drew McDermott, Malik Ghallab, and colleagues, designed as a neutral syntax for specifying planning problems to enable the first International Planning Competition (IPC), fostering comparability across planners.[12][13]

The 2000s brought further advancements through the biennial IPC, which began in 1998 and drove algorithmic improvements by benchmarking planners on standardized PDDL domains, with the competition's third edition in 2002 highlighting progress in expressive modeling.[14] A major milestone was the 2003 extension to PDDL 2.1 by Maria Fox and Derek Long, which incorporated durative actions and temporal constraints to support metric-temporal planning, enabling the modeling of time-dependent and resource-limited scenarios.[15] Key contributors like Malik Ghallab and Dana Nau further shaped the field; Ghallab co-authored PDDL and emphasized integrated planning-acting systems, while Nau developed the SHOP (Simple Hierarchical Ordered Planner) series in the late 1990s, advancing hierarchical task network (HTN) methods for practical decomposition.

In the 2010s and 2020s, automated planning integrated with machine learning, particularly post-2015, as seen in approaches like ML-Plan (2018), which used hierarchical planning to automate machine learning pipeline configuration, and subsequent works leveraging neural networks for heuristic guidance and learning from failed plans to enhance solver efficiency.[16] Recent advances emphasize hybrid systems combining symbolic planning with learning for real-time applications, as highlighted in the International Conference on Automated Planning and Scheduling (ICAPS), with the 2024 event in Banff focusing on AI-robotics integration through workshops on learning-enhanced planning for autonomous systems, and the 2025 conference in Melbourne continuing this trend with sessions on fused AI planning for robotic decision-making.[17][4]

Fundamentals
Planning versus Scheduling
Automated planning and scheduling are distinct yet complementary processes in artificial intelligence systems designed to achieve complex objectives. Planning primarily involves the generation of a sequence of actions that logically transform an initial world state into a desired goal state, emphasizing the selection and ordering of actions based on their effects and preconditions, while initially abstracting away from temporal and resource details.[18] This core focus on logical transitions allows planners to explore high-level decision spaces, such as determining what actions to take in domains like robotics or logistics, without committing to execution timelines.[6]

In contrast, scheduling centers on the optimization of action execution by assigning specific durations, start times, and resources to the planned sequence, subject to constraints like limited machine availability, worker shifts, or precedence relations.[6] Scheduling treats actions as fixed tasks post-planning and aims to minimize inefficiencies, such as idle times or overloads, often formulating the problem as a resource-constrained optimization.[18] For instance, while planning might outline the steps to assemble a product, scheduling would allocate those steps to available factory resources over a finite horizon to ensure feasibility.[19]

The interplay between planning and scheduling is evident in integrated frameworks like Advanced Planning and Scheduling (APS) systems, particularly in manufacturing, where planning provides a feasible action outline that scheduling refines into a timed, resource-aware executable plan to handle dynamic uncertainties.[19] This separation enables modular development—planning for long-term feasibility and scheduling for short-term optimization—but integrated approaches, such as interleaved planning and scheduling, combine them to address real-world problems requiring both action choice and temporal allocation simultaneously.[6]

Success metrics further highlight the distinction: planning evaluates outcomes by measures like total plan length or cumulative action cost, whereas scheduling prioritizes indicators such as makespan (the time from start to completion of all tasks) or total lateness (deviation from due dates).[20] Domain-independent planning approaches often exploit this dichotomy by first generating abstract plans before applying scheduling refinements.[6]
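As a small illustration of the scheduling metrics just named, the sketch below computes makespan and total lateness for an already-scheduled task list, treating lateness as tardiness clamped at zero; the tasks and numbers are invented.

```python
# Computing the two scheduling metrics named above for an already-scheduled
# task list: makespan (finish time of the last task) and total lateness,
# here clamped at zero, i.e. tardiness. Tasks and numbers are invented.

# (name, start, duration, due_date)
tasks = [("cut", 0, 3, 4), ("weld", 3, 2, 5), ("paint", 5, 4, 8)]

makespan = max(start + dur for _, start, dur, _ in tasks)
total_lateness = sum(max(0, start + dur - due) for _, start, dur, due in tasks)
print(makespan, total_lateness)  # 9 1  ("paint" finishes at 9 but is due at 8)
```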
Domain-Independent Planning
Domain-independent planning constitutes a core paradigm in artificial intelligence where algorithms generate plans for diverse problem domains without embedding domain-specific knowledge directly into the solver. These planners operate on abstract, general models of actions, states, and goals provided as input, enabling them to address a wide array of tasks through standardized representations rather than custom-coded logic. This generality distinguishes the approach from earlier, more rigid systems and underpins much of modern automated planning research.

The key advantages of domain-independent planning lie in its reusability and broad applicability, allowing a single planner to tackle varied scenarios—such as optimizing supply chain logistics or designing strategies in complex games—by merely adjusting the input model. This eliminates the need for reprogramming the core algorithm for each new application, fostering rapid adaptation and reducing the expertise required for deployment across fields like robotics and resource allocation. Moreover, it promotes community-wide standardization, accelerating advancements through shared benchmarks and tools.[21]

Despite these benefits, domain-independent planning grapples with inherent challenges, foremost among them the tension between expressive problem modeling and computational tractability. Classical planning tasks in this paradigm, exemplified by propositional STRIPS problems, are PSPACE-complete, implying that determining the existence of a plan can demand resources exponential in the input size, even for seemingly simple domains. This complexity stems from the exhaustive exploration of potentially enormous state spaces without tailored guidance, often necessitating heuristic approximations to achieve practical performance.[22]

Evaluation of domain-independent planners relies heavily on empirical competitions like the International Planning Competition (IPC), which began in 1998[23] and periodically tests systems on standardized suites of diverse domains to measure coverage, speed, and solution quality. Unlike domain-dependent planners, which leverage handcrafted rules for superior efficiency in targeted scenarios, domain-independent methods prioritize versatility at the cost of raw performance, making them ideal for exploratory or evolving applications but less suited for highly optimized, fixed-domain needs.[24]

Core Concepts
Automated planning and scheduling revolves around the fundamental problem of finding a sequence of actions that transforms an initial state of the world into a desired goal state. The state space consists of all possible configurations of the world, where each state represents a complete description of relevant facts or properties at a given point. The initial state specifies the starting configuration, the goal state defines the target configuration to achieve, and intermediate states are those reached during the execution of actions en route to the goal.[25]

Actions, also known as operators, are the basic building blocks that enable changes in the state space. Each action is defined by preconditions, which are conditions that must hold true in the current state for the action to be applicable; effects, which describe the changes resulting from the action, typically partitioned into add lists (new facts made true) and delete lists (facts made false); and optionally, costs to quantify the expense of executing the action. In classical planning, actions are assumed to have deterministic effects, meaning that applying a valid action always produces a predictable new state without uncertainty.[8]

Transitions between states occur when an applicable action is executed, updating the state according to the action's effects. This process is formally modeled in frameworks such as the STRIPS notation, which uses precondition-effect pairs to represent state changes, or situation calculus, a first-order logic formalism where situations denote world states and actions transform one situation into a successor situation via the "do" function. A plan is a valid sequence of actions (or, in more general cases, a policy mapping states to actions) that, when executed from the initial state, reaches a goal state without violating preconditions along the way.[8][26]

Key assumptions underpin these core concepts in basic planning models. Deterministic effects ensure that outcomes are fully predictable given the action and state. Additionally, the closed world assumption is typically adopted, positing that all relevant facts are explicitly known in the state description—if a fact is not stated, it is false—contrasting with open world assumptions where unknown facts might be true.[27][28]

A representative example is the blocks world domain, where the task involves stacking blocks on a table using a robotic arm. States are described by predicates such as on(blockA, blockB) (block A is on block B), ontable(blockC) (block C is on the table), and clear(blockD) (nothing is on block D). Actions include pickup(blockX), which requires handempty, clear(blockX), and ontable(blockX) as preconditions, adding holding(blockX) and deleting the others; and stack(blockX, blockY), which requires holding(blockX) and clear(blockY), adding on(blockX, blockY) and clear(blockX) while deleting holding(blockX) and clear(blockY). A goal might be to achieve on(blockA, blockB) and on(blockB, blockC), leading to a plan sequence like unstacking interfering blocks, moving them aside, and restacking to form the tower.[29]
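The blocks-world actions just described translate almost directly into code. The sketch below encodes pickup and stack as precondition/add/delete triples and validates a four-step plan against them; note that stack here also restores handempty, as the standard domain does, and the string-based fact encoding is an illustrative choice.

```python
# The blocks-world actions described above, encoded directly: each action
# maps to a (preconditions, add list, delete list) triple over ground facts.

def pickup(x):
    return ({f"ontable({x})", f"clear({x})", "handempty"},   # preconditions
            {f"holding({x})"},                               # add list
            {f"ontable({x})", f"clear({x})", "handempty"})   # delete list

def stack(x, y):
    return ({f"holding({x})", f"clear({y})"},
            {f"on({x},{y})", f"clear({x})", "handempty"},    # handempty restored
            {f"holding({x})", f"clear({y})"})

def run(state, plan):
    """Apply a plan step by step, checking preconditions along the way."""
    for pre, add, delete in plan:
        assert pre <= state, f"precondition violated: {pre - state}"
        state = (state - delete) | add
    return state

init = {"ontable(A)", "ontable(B)", "ontable(C)",
        "clear(A)", "clear(B)", "clear(C)", "handempty"}
goal = {"on(A,B)", "on(B,C)"}
final = run(init, [pickup("B"), stack("B", "C"), pickup("A"), stack("A", "B")])
print(goal <= final)  # True: the plan achieves the goal
```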
Modeling Languages
Planning Domain Definition Language (PDDL)
The Planning Domain Definition Language (PDDL) is a standardized, declarative language designed for specifying planning domains and problems in artificial intelligence planning, enabling domain-independent planners to generate solutions without requiring custom implementations for each task. Introduced in 1998 for the First Artificial Intelligence Planning Systems (AIPS) competition, PDDL separates the description of the world model (domain) from specific instances (problems), promoting reusability and comparability across planners.[12] This structure facilitates the encoding of classical planning tasks using predicate logic, where states are represented as sets of logical propositions, and actions modify these states according to defined rules.[12]

PDDL's core structure consists of two primary files: a domain file defining the general model and a problem file instantiating it for a specific scenario. The domain file, enclosed in (define (domain <name>) ...), includes requirements (e.g., :strips for basic action models), types for object classification (e.g., (:types truck package location)), constants, predicates, and actions. Predicates are atomic formulas like (on ?x - block ?y - block), where ?x and ?y are variables typed as blocks, representing relationships in the state. Actions are specified with parameters, preconditions, and effects; for instance, an action might be (:action move :parameters (?x - block ?y - block ?z - block) :precondition (and (on ?x ?z) (on ?y ?z) (clear ?x)) :effect (and (on ?x ?y) (clear ?z) (not (on ?x ?z)) (not (clear ?y)))), ensuring preconditions hold before applying additive and deletive effects to the state. The problem file, (define (problem <name>) (:domain <domain-name>) ...), declares objects (e.g., (truck1 - truck)), the initial state as a conjunction of grounded predicates, and goals as desired state conditions.[12]
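To make the two-file structure concrete, the sketch below embeds a hand-written, deliberately tiny domain/problem pair (a blocks domain stripped down to the single pickup action) and writes out the files a domain-independent planner would take as input; the file names and domain content are illustrative.

```python
# A minimal domain/problem pair following the PDDL structure described
# above (file names and the domain itself are illustrative).

DOMAIN = """
(define (domain blocks)
  (:requirements :strips :typing)
  (:types block)
  (:predicates (on ?x - block ?y - block) (ontable ?x - block)
               (clear ?x - block) (holding ?x - block) (handempty))
  (:action pickup
    :parameters (?x - block)
    :precondition (and (ontable ?x) (clear ?x) (handempty))
    :effect (and (holding ?x)
                 (not (ontable ?x)) (not (clear ?x)) (not (handempty)))))
"""

PROBLEM = """
(define (problem pick-one)
  (:domain blocks)
  (:objects a b - block)
  (:init (ontable a) (ontable b) (clear a) (clear b) (handempty))
  (:goal (holding a)))
"""

# Write the two files a domain-independent planner expects as input.
for name, text in [("domain.pddl", DOMAIN), ("problem.pddl", PROBLEM)]:
    with open(name, "w") as f:
        f.write(text)
```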
PDDL has evolved through versions, starting with 1.2 in 1998, which supports basic STRIPS-style planning with added features like typing, equality, and conditional effects inspired by Action Description Language (ADL). Subsequent levels include PDDL 2.1 (2003), introducing durative actions and numeric fluents for temporal and resource modeling; PDDL 2.2 (2004), adding derived predicates computed from other state variables; PDDL 3.0 (2005), incorporating preferences for expressing soft constraints; and PDDL 3.1 (2008), further extending with features such as finite domain variables.[30][15][31][32] These versions maintain backward compatibility while expanding expressiveness. PDDL serves as the input format for the International Planning Competition (IPC), where planners are benchmarked on standardized domains, ensuring consistent evaluation.[33] Tools like VAL provide validation by simulating plan execution against PDDL specifications, checking for correctness and detecting errors in models or solutions.[34]
A limitation of PDDL is its lack of native support for learning domain models from data, as it is a static declarative language requiring manual specification; domain acquisition typically relies on external machine learning techniques.[35] For example, the Logistics domain models transportation tasks with types such as truck, package, and location; predicates like (at ?p - package ?l - location) and (in ?p - package ?t - truck); and actions such as loading a package into a truck at a location, with preconditions ensuring the package and truck are co-located, and effects updating their positions. Extensions to PDDL exist for temporal planning, but core versions focus on discrete, non-temporal actions.[36]
Extensions and Alternative Languages
Extensions to the Planning Domain Definition Language (PDDL) have been developed to address limitations in modeling temporal and continuous aspects of planning problems. PDDL 2.1 introduced durative actions, which allow actions to have durations and conditions/effects that occur at the start, over the interval, or at the end of the action, enabling the representation of temporal planning domains such as scheduling tasks with time constraints.[15] This extension supports leveled semantics, with Level 3 incorporating numeric fluents for modeling resources like fuel or energy that change continuously or incrementally during actions.[15]

Building on PDDL 2.1, PDDL+ extends the language to handle continuous time-dependent effects and hybrid systems, incorporating processes that model ongoing changes (e.g., linear or piecewise constant numeric variations) alongside discrete events and durative actions.[37] This makes PDDL+ suitable for domains involving continuous dynamics, such as robotic control or chemical process planning, where state variables evolve over real-valued time. Temporal PDDL variants, often based on PDDL 2.1, facilitate scheduling by specifying temporal constraints on action execution and resource usage. For instance, PDDL 2.2, used in the 4th International Planning Competition, enhanced support for numeric fluents in resource allocation problems, allowing planners to optimize metrics like total cost or makespan in domains with quantitative constraints.[31]

Alternative languages have emerged to support specialized planning paradigms beyond core PDDL capabilities. PPDDL (Probabilistic PDDL) extends PDDL for probabilistic planning by incorporating nondeterministic action effects and probabilistic goals, used in the probabilistic tracks of the 4th and 5th International Planning Competitions.[38] ANML (Action Notation Modeling Language), an XML-based language, emphasizes hierarchical task decomposition and timeline-based representations, providing greater expressiveness for concurrent actions and temporal constraints compared to PDDL's simpler action-centric model.[39] ANML's timeline structure allows specifying multiple concurrent activities on resources, trading off PDDL's modeling simplicity and broad planner support for enhanced concurrency handling in complex domains like spacecraft operations.[39]

Domain-specific frameworks like ROSPlan integrate PDDL with the Robot Operating System (ROS) for robotics applications, enabling seamless planning and execution in real-time environments such as autonomous navigation or manipulation tasks.[40] PDDL remains the dominant modeling language in automated planning, as evidenced by its use in all tracks of the International Planning Competitions since 1998 and widespread adoption in academic and industrial tools. Alternatives like ANML and PPDDL find use in niche areas, such as multi-agent systems requiring hierarchical coordination or uncertainty modeling.
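For a sense of the durative-action syntax these extensions introduce, the fragment below (held in a Python string for consistency with the other sketches) follows the at start / over all / at end structure of PDDL 2.1; the drive action, predicates, and fixed duration are invented rather than taken from a standard benchmark.

```python
# A PDDL 2.1 durative-action fragment, held in a Python string. The drive
# action, predicates, and fixed duration are invented examples.

DURATIVE_ACTION = """
(:durative-action drive
  :parameters (?t - truck ?from ?to - location)
  :duration (= ?duration 10)
  :condition (and (at start (at ?t ?from))
                  (over all (road ?from ?to)))
  :effect (and (at start (not (at ?t ?from)))
               (at end (at ?t ?to))))
"""
print(DURATIVE_ACTION)
```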
Planning Algorithms
Classical Planning
Classical planning represents a foundational paradigm in automated planning, characterized by its deterministic and fully observable nature. In this framework, the world is modeled as a finite set of discrete states, where actions produce instantaneous transitions without duration or concurrency. Key assumptions include full observability, meaning the planner has complete knowledge of the current state; deterministic effects, ensuring each action yields a unique successor state; and a static environment, where no changes occur except those caused by the agent's actions. These restrictions simplify the problem to finding a sequence of actions that transforms an initial state into one satisfying a specified goal, abstracting away real-world complexities like uncertainty or time.

The classical planning problem is formally defined as a tuple $\langle s_0, A, G \rangle$ consisting of an initial state $s_0$, a set of actions $A$, and a goal condition $G$. Each action $a \in A$ is specified by preconditions (logical formulas that must hold for applicability) and effects (additions and deletions to the state). The objective is to find a plan $\pi = \langle a_1, \ldots, a_n \rangle$ such that applying the sequence to $s_0$ results in a state where $G$ holds, typically represented as a conjunction of positive literals. This formulation traces back to early systems like STRIPS, emphasizing propositional logic for state representation to enable efficient reasoning.

Solution approaches in classical planning primarily involve search strategies over state, plan, or graph spaces. Forward search, or progression planning, explores from the initial state by applying applicable actions to generate successor states until the goal is reached, benefiting from early pruning of irrelevant paths. Backward search, or regression planning, starts from the goal and works in reverse, identifying preconditions that must hold to achieve subgoals, which is advantageous when goals are simpler to specify than full state expansions. Partial-order planning operates in plan space, beginning with an incomplete plan skeleton and refining it by adding causal links and resolving threats, allowing for flexible, least-commitment sequences that avoid unnecessary linearization. These methods form the basis for more advanced techniques but are computationally intensive due to the problem's inherent hardness.

The computational complexity of classical planning is well-established: deciding plan existence is PSPACE-complete in general for propositional STRIPS domains, reflecting the exponential state space explosion, while simpler cases without delete effects (e.g., monotonic planning) are NP-complete. This hardness underscores the need for domain-specific optimizations, yet practical solvers leverage these bounds effectively on benchmark problems. The International Planning Competition (IPC), with its classical tracks running since 1998, has served as a key benchmark suite, evaluating planners on diverse domains like logistics and blocks world to track progress in scalability and solution quality. Over successive IPC events, classical track problems have grown in size, with recent editions featuring hundreds of instances to assess coverage and performance metrics such as plan length and runtime.[41]

Heuristic Search Techniques
Heuristic search techniques form a cornerstone of solving classical planning problems by guiding the exploration of the state space toward goal states more efficiently than uninformed methods. In classical planning, where actions have uniform unit costs and the environment is deterministic and fully observable, these techniques rely on heuristic functions that estimate the distance from a given state to the goal. A key approach involves delete relaxation, which simplifies the planning task by ignoring the delete effects of actions—those that remove preconditions or add negative effects—allowing for the computation of lower-bound estimates on the true plan cost. This relaxation enables the construction of relaxed planning graphs (RPGs), layered structures that propagate achievable propositions and actions without deletions, providing a basis for deriving admissible heuristics.[42]

Prominent delete relaxation heuristics include h_max and h_add, both computed efficiently in polynomial time from RPGs. The h_max heuristic estimates the cost to achieve each goal proposition independently by taking the maximum over the costs of unsatisfied goals, assuming no interactions between them; it is admissible because it underestimates the true cost by ignoring positive interactions. In contrast, h_add, the additive heuristic, sums the individual minimum costs to achieve each unsatisfied goal proposition, also ignoring delete effects but overestimating due to not accounting for actions that achieve multiple goals simultaneously. For a state $s$ and goal set $G$, the additive heuristic is formally defined as

$$h_{\mathrm{add}}(s) = \sum_{g \in G} h(g), \qquad h(g) = \begin{cases} 0 & \text{if } g \in s, \\ \min_{a \,:\, g \in \mathrm{add}(a)} \Big( c(a) + \sum_{p \in \mathrm{pre}(a)} h(p) \Big) & \text{otherwise,} \end{cases}$$

where $c(a)$ is the cost of action $a$, typically 1 in unit-cost settings; this formulation overestimates the relaxed plan length since it overcounts costs for shared actions and disregards inter-goal dependencies, making h_add inadmissible. While h_max is often more informative for guiding search due to its focus on the hardest subgoals, h_add provides a looser estimate but can be faster to compute in some contexts. These heuristics were foundational in planners like HSP, demonstrating significant speedups over earlier methods like Graphplan.[42]

Heuristic search algorithms integrate these estimates into best-first strategies to ensure completeness and, under certain conditions, optimality. The A* algorithm, adapted for planning, uses the evaluation function $f(n) = g(n) + h(n)$, where $g(n)$ is the path cost from the initial state to node $n$, and expands nodes in order of increasing $f(n)$; with an admissible heuristic (where $h(n) \leq h^*(n)$, the true optimal cost), A* guarantees an optimal solution upon reaching the goal. To address memory demands in large state spaces, Iterative Deepening A* (IDA*) employs a depth-bounded search with increasing cost thresholds based on $f$-values, effectively trading space for time while preserving optimality if the heuristic is admissible. Consistency, a stronger property where $h(n) \leq c(n, n') + h(n')$ for every successor $n'$ of $n$ and edge cost $c(n, n')$, ensures A* and IDA* avoid re-expansions, further promoting optimality; h_max is consistent in unit-cost classical planning.[43][44]

Practical implementations often combine heuristics with greedy or local search for efficiency.
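The delete-relaxation heuristics above can be computed by a simple fixpoint over relaxed actions whose delete lists are dropped. The sketch below derives h_add and h_max for a tiny invented task with unit action costs; fact and action names are placeholders.

```python
# Sketch of h_add and h_max via a fixpoint over the delete relaxation:
# delete lists are ignored, so fact costs only decrease. Unit action costs;
# the tiny domain is invented for illustration.

ACTIONS = [  # (preconditions, add effects); delete effects dropped
    ({"a"}, {"b"}),
    ({"b"}, {"c"}),
    ({"a", "c"}, {"g"}),
]

def relaxed_costs(state, combine):
    cost = {f: 0.0 for f in state}          # facts already true cost 0
    changed = True
    while changed:
        changed = False
        for pre, add in ACTIONS:
            if all(p in cost for p in pre):  # all preconditions reachable
                c = 1 + combine(cost[p] for p in pre)  # unit action cost
                for f in add:
                    if c < cost.get(f, float("inf")):
                        cost[f] = c
                        changed = True
    return cost

state, goal = {"a"}, {"c", "g"}
h_add = sum(relaxed_costs(state, sum)[g] for g in goal)  # sums shared work
h_max = max(relaxed_costs(state, max)[g] for g in goal)  # hardest subgoal only
print(h_add, h_max)  # 5.0 3.0
```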
Enforced Hill-Climbing (EHC), introduced in the FF planner, performs a greedy best-first search that "enforces" progress by selecting the lowest-heuristic successor at each step, backtracking only if stuck; it uses the FF-heuristic, an extension of h_add and h_max that incorporates relaxed plan extraction to better capture action dependencies, yielding plans much faster than pure A* in satisficing scenarios. The LAMA planner builds on this by employing FF-heuristic-guided EHC for initial plan finding, followed by iterative cost-optimal searches using A* with a landmark-based pseudo-heuristic, achieving strong performance in international planning competitions. These strategies prioritize speed over optimality in the first phase, reserving exhaustive search for refinement.[45][46]

Post-2000 advancements have enhanced heuristic accuracy through abstractions like pattern databases (PDBs) and landmark analysis. PDBs precompute exact costs for abstracted subproblems—subsets of propositions treated independently—storing them in lookup tables; the overall heuristic is the sum or max of PDB values for disjoint patterns covering the goals, providing admissible estimates that scale with memory allocation and outperform simple relaxations in optimal planning. Landmark heuristics, meanwhile, identify propositions that must be achieved in every plan (landmarks) along with ordering constraints, deriving a counting heuristic that penalizes states based on unsatisfied landmarks and their precedence; this approach, integrated into systems like LAMA, improves guidance for cost-sensitive search by capturing necessary plan structure without full abstraction. These methods have significantly boosted solver performance, with landmark techniques enabling anytime planning that refines solutions progressively.[47][48]
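The EHC control loop itself is compact. In the sketch below, successors and h are assumed to be supplied by the caller (h could be the h_add sketch above); FF's helpful-action pruning and its fall-back to complete best-first search are omitted, with the plateau case simply reported as failure.

```python
from collections import deque

# Sketch of enforced hill-climbing: from the current state, run a
# breadth-first search until a state with a strictly better heuristic
# value is found, then commit to it and repeat. States must be hashable
# (e.g. frozensets); successors(s) yields (action, next_state) pairs.

def enforced_hill_climbing(initial, is_goal, successors, h):
    state, plan, best = initial, [], h(initial)
    while not is_goal(state):
        frontier, seen, found = deque([(state, [])]), {state}, None
        while frontier and found is None:
            s, path = frontier.popleft()
            for action, nxt in successors(s):
                if nxt in seen:
                    continue
                seen.add(nxt)
                if h(nxt) < best:            # strict improvement: commit
                    found = (nxt, path + [action])
                    break
                frontier.append((nxt, path + [action]))
        if found is None:
            return None  # stuck on a plateau; FF would fall back to best-first search
        state, escape = found
        best, plan = h(state), plan + list(escape)
    return plan
```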
Temporal Planning
Temporal planning extends classical planning frameworks to incorporate explicit representations of time, allowing actions to have durations and enabling the modeling of temporal constraints between events. This approach is essential for domains where timing is critical, such as robotics and resource allocation, where actions may overlap and must satisfy conditions over intervals rather than instantaneously. In temporal planning, actions are typically modeled as durative, spanning a period from initiation to completion, with effects and preconditions specified at precise points within that duration.

A key innovation in temporal planning is the use of durative actions, which include conditions and effects categorized as occurring "at start" (upon initiation), "over all" (holding invariantly throughout the open interval of the action), or "at end" (upon completion). Temporal constraints further refine these models by specifying relationships such as overall durations (e.g., fixed or variable lengths) or conditions within specific intervals, ensuring synchronization between concurrent actions. These features were formalized in the PDDL 2.1 temporal profile, which integrates durative actions with the Planning Domain Definition Language to support metric time reasoning while maintaining compatibility with classical planning elements. Optimization objectives often focus on metrics like makespan minimization, defined as the total time from plan start to goal achievement, to produce efficient schedules.

To solve temporal planning problems, specialized planners employ techniques such as partial-order planning combined with constraint propagation. For instance, the POPF planner uses forward-chaining partial-order search to generate temporally flexible plans, maintaining consistency through Simple Temporal Networks (STNs), which represent time-point constraints as difference constraints solvable via shortest-path algorithms like Bellman-Ford. Similarly, OPTIC extends this approach to handle preferences and time-dependent costs, also relying on STNs to enforce temporal consistency during heuristic-guided search. These solvers address the heightened complexity of temporal planning, which is EXPSPACE-complete for concurrent actions with metric time due to the exponential state space introduced by overlapping durations and continuous timing.[49][50][51]

A representative application of temporal planning is satellite scheduling, where durative actions model observation tasks constrained by narrow visibility windows over ground targets. In such domains, planners must sequence maneuvers and data collections to maximize coverage while respecting orbital dynamics and temporal windows, often minimizing makespan to optimize mission timelines. For example, systems using PDDL-based temporal planning have been applied to generate nominal operations schedules for Earth-observing satellites, ensuring actions like pointing and imaging align precisely within fleeting opportunity intervals.

Probabilistic Planning
Probabilistic planning addresses decision-making in environments where action outcomes are uncertain, extending classical planning by incorporating probabilistic transitions and rewards to model non-deterministic effects.[52] In contrast to deterministic classical planning, which assumes guaranteed outcomes, probabilistic planning seeks policies that maximize expected utility over possible future states.[53]

The core models are Markov Decision Processes (MDPs) and Partially Observable Markov Decision Processes (POMDPs). An MDP is defined as a tuple $\langle S, A, T, R, \gamma \rangle$, where $S$ is the set of states, $A$ is the set of actions, $T(s' \mid s, a)$ denotes the transition probabilities to next states $s'$ given state $s$ and action $a$, $R(s, a)$ is the reward function, and $\gamma$ is the discount factor.[54] POMDPs extend MDPs by adding partial observability, introducing an observation set $\Omega$ and observation probabilities $O(o \mid s', a)$ that map next states $s'$ and actions $a$ to observations $o$, requiring the agent to maintain a belief state over possible worlds.[55]

Solutions to these models typically involve computing an optimal policy $\pi : S \to A$ (for MDPs) or $\pi : B \to A$ (for POMDPs, where $B$ is the belief space) that maximizes the expected discounted reward. For finite-horizon or discounted infinite-horizon MDPs, dynamic programming methods like value iteration and policy iteration are standard, solving the Bellman optimality equation:

$$V^*(s) = \max_{a \in A} \Big[ R(s, a) + \gamma \sum_{s' \in S} T(s' \mid s, a) \, V^*(s') \Big],$$

where $V^*(s)$ is the optimal value function for state $s$.[52] Value iteration iteratively updates value estimates until convergence, while policy iteration alternates between policy evaluation and improvement. POMDP solutions are computationally harder (PSPACE-complete in general), often using dynamic programming over the belief simplex or approximation techniques due to the continuous belief space.[55]

Key planners for probabilistic domains include the Model-Based Planner (MBP), which generates policies for non-deterministic domains—a special case of probabilistic planning where outcomes are fully nondeterministic—using conformant planning techniques based on model checking to ensure goal achievement with certainty in the worst case.[56] Probabilistic-FF (PFF) extends the heuristic forward-search of the classical FF planner to non-observable probabilistic settings, using weighted model counting for heuristic evaluation to guide search toward high-probability plans.[53] For large state spaces, approximations like Real-Time Dynamic Programming (RTDP) provide efficient online planning by focusing updates along simulated trajectories from the initial state, converging to optimal policies under mild assumptions like acyclic state graphs.[57] RTDP, introduced by Barto et al., improves over full value iteration by leveraging heuristic guidance and real-time adaptation.[57]

Performance in probabilistic planning is evaluated using metrics such as expected utility (discounted cumulative reward) or probability of success (likelihood of reaching a goal state).[54] A representative example is robotic navigation with sensor noise, modeled as a POMDP where the robot's position is partially observable due to noisy readings, and actions like "move forward" have probabilistic outcomes (e.g., slipping on terrain); the optimal policy balances exploration to update beliefs with exploitation to reach the goal, maximizing expected reward.[55]
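Belief maintenance in a POMDP is a Bayes filter over the hidden states. Echoing the sensor-noise navigation example above, the sketch below updates a belief after one action and observation; the transition and observation tables are invented assumptions.

```python
# Sketch of the belief-state update a POMDP agent performs after taking
# action a and receiving observation o (a Bayes filter over hidden states).
# T[(s, a)] -> {s': prob} and O[(s', a)] -> {o: prob} are invented models.

def belief_update(belief, a, o, T, O):
    new_belief = {}
    for s_next in {sp for s in belief for sp in T[(s, a)]}:
        # P(s' | b, a, o) is proportional to O(o | s', a) * sum_s T(s' | s, a) b(s)
        p = O[(s_next, a)].get(o, 0.0) * sum(
            T[(s, a)].get(s_next, 0.0) * pb for s, pb in belief.items())
        if p > 0:
            new_belief[s_next] = p
    norm = sum(new_belief.values())
    return {s: p / norm for s, p in new_belief.items()}

# Noisy-sensor navigation: moving "fwd" may slip; a "beep" suggests the goal.
T = {("left", "fwd"): {"goal": 0.8, "left": 0.2},
     ("goal", "fwd"): {"goal": 1.0}}
O = {("goal", "fwd"): {"beep": 0.9, "none": 0.1},
     ("left", "fwd"): {"beep": 0.2, "none": 0.8}}
print(belief_update({"left": 1.0}, "fwd", "beep", T, O))
# {'goal': 0.947..., 'left': 0.052...}
```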
Advanced Planning Paradigms
Preference-Based Planning
Preference-based planning extends traditional automated planning by incorporating user-specified preferences to generate plans that not only achieve hard goals but also optimize for desirable qualities, such as efficiency or user satisfaction.[58] These preferences are modeled as soft constraints or goals that can be partially satisfied, allowing planners to rank plans based on how well they align with user desires beyond mere feasibility.[59] This approach addresses limitations in classical planning, where all valid plans are treated equally, by enabling the selection of higher-quality outcomes in domains like robotics and logistics.[60]

In modeling preferences, PDDL 3.0 introduces soft goals and constraints, expressed using modal operators like :preferences to define desirable conditions that contribute to plan quality without being mandatory.[58] Preferences can be qualitative, such as lexicographic ordering where higher-priority preferences must be fully satisfied before considering lower ones, or quantitative, involving weighted sums that aggregate satisfaction scores across multiple criteria.[61] Qualitative models prioritize hierarchical user intents, while quantitative approaches allow fine-grained trade-offs through numerical utilities.[60]
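The quantitative reading of preferences can be sketched as plan ranking: each valid plan is scored by its weighted satisfied preferences minus a penalty per violated one, in line with the quality measure described below. The preference names, weights, and candidate plans are invented.

```python
# Sketch of ranking candidate plans by quantitative preferences: score is
# the weighted sum of satisfied preferences minus a penalty per violation.
# Preference names, weights, and plans are invented for illustration.

preferences = {"scenic_route": 2.0, "no_tolls": 1.0, "under_3_hours": 3.0}

plans = {  # which soft preferences each valid plan satisfies
    "direct":  {"under_3_hours", "no_tolls"},
    "coastal": {"scenic_route"},
}

def score(satisfied, prefs, penalty=1.0):
    gained = sum(w for p, w in prefs.items() if p in satisfied)
    lost = penalty * sum(1 for p in prefs if p not in satisfied)
    return gained - lost

best = max(plans, key=lambda name: score(plans[name], preferences))
print(best)  # "direct": 4.0 - 1.0 = 3.0 beats "coastal": 2.0 - 2.0 = 0.0
```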
Algorithms for preference-based planning often extend satisficing planners, such as LPG, by integrating preference semantics during search to favor plans that maximize preference satisfaction.[59] In cases with conflicting criteria, these methods compute Pareto-optimal plans, forming a set of non-dominated solutions where no plan improves one preference without degrading another.[62] Plan quality is typically measured as the sum of satisfied preferences minus penalties for violations, providing a scalar metric for ranking.[58]
Applications include personalized robotics, where preferences ensure safety takes precedence over speed in navigation tasks, adapting plans to individual user needs like accessibility. For instance, in travel planning, a system might generate itineraries preferring scenic routes over direct paths to enhance enjoyment, balancing time constraints with qualitative desires for aesthetics.[63]

