Optimal substructure
In computer science, a problem is said to have optimal substructure if an optimal solution can be constructed from optimal solutions of its subproblems. This property is used to determine the usefulness of greedy algorithms for a problem.[1]
Typically, a greedy algorithm is used to solve a problem with optimal substructure if it can be proven by induction that this is optimal at each step.[1] Otherwise, provided the problem exhibits overlapping subproblems as well, divide-and-conquer methods or dynamic programming may be used. If there are no appropriate greedy algorithms and the problem fails to exhibit overlapping subproblems, often a lengthy but straightforward search of the solution space is the best alternative.
In the application of dynamic programming to mathematical optimization, Richard Bellman's Principle of Optimality is based on the idea that in order to solve a dynamic optimization problem from some starting period t to some ending period T, one implicitly has to solve subproblems starting from later dates s, where t<s<T. This is an example of optimal substructure. The Principle of Optimality is used to derive the Bellman equation, which shows how the value of the problem starting from t is related to the value of the problem starting from s.
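For a deterministic, discrete-time problem, a generic form of this relation (the notation here is illustrative, not taken from a specific source) is:

V(t, x) = max over actions a of [ F(t, x, a) + V(t+1, g(t, x, a)) ],

where x is the current state, F(t, x, a) is the payoff earned in period t, g(t, x, a) is the state reached in period t+1, and V is the value function. The value of the problem starting from t is defined directly in terms of the value of the optimally solved subproblem starting from t+1, which is exactly the optimal-substructure property.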
Example
Consider finding a shortest path for traveling between two cities by car. Such an example is likely to exhibit optimal substructure: if the shortest route from Seattle to Los Angeles passes through Portland and then Sacramento, then the shortest route from Portland to Los Angeles must pass through Sacramento too. In other words, the problem of how to get from Portland to Los Angeles is nested inside the problem of how to get from Seattle to Los Angeles.
As an example of a problem that is unlikely to exhibit optimal substructure, consider the problem of finding the cheapest airline ticket from Buenos Aires to Moscow. Even if that ticket involves stops in Miami and then London, we can't conclude that the cheapest ticket from Miami to Moscow stops in London, because the price at which an airline sells a multi-flight trip is usually not the sum of the prices at which it would sell the individual flights in the trip.
Definition
A slightly more formal definition of optimal substructure can be given. Let a "problem" be a collection of "alternatives", and let each alternative a have an associated cost c(a). The task is to find an alternative that minimizes c(a). Suppose that the alternatives can be partitioned into subsets, i.e., each alternative belongs to exactly one subset. Suppose each subset has its own cost function. The minimum of each of these cost functions can be found, as can the minimum of the global cost function restricted to each of those same subsets. If these minima match for each subset, then it is almost obvious that a global minimum can be picked not out of the full set of alternatives, but out of only the set that consists of the minima of the smaller, local cost functions we have defined. If minimizing the local functions is a problem of "lower order", and (specifically) if, after a finite number of these reductions, the problem becomes trivial, then the problem has an optimal substructure.
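As a degenerate illustration in Python (everything here is hypothetical: the cost function, the partition, and the names), the global minimum can indeed be picked out of the local minima alone when each subset's cost function is simply the restriction of the global one:

```python
# Hypothetical problem: alternatives are the integers 0..11, cost is c(a),
# and the alternatives are partitioned into three subsets by a % 3.
def c(a):
    return (a - 7) ** 2  # global minimum at a = 7

alternatives = range(12)
subsets = [[a for a in alternatives if a % 3 == r] for r in range(3)]

# Local minima: the best alternative within each subset.
local_minima = [min(s, key=c) for s in subsets]

# The global minimum can be picked out of the local minima alone.
assert min(alternatives, key=c) == min(local_minima, key=c)
print(local_minima, min(local_minima, key=c))  # [6, 7, 8] 7
```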
Problems with optimal substructure
- Longest common subsequence problem
- Longest increasing subsequence
- Longest palindromic substring
- All-Pairs Shortest Path
- Any problem that can be solved by dynamic programming.
Problems without optimal substructure
- Longest path problem
- Addition-chain exponentiation
- Least-cost airline fare. Using online flight search, we will frequently find that the cheapest flight from airport A to airport B involves a single connection through airport C, but the cheapest flight from airport A to airport C involves a connection through some other airport D. However, if the problem takes the maximum number of layovers as a parameter, then the problem has optimal substructure. The cheapest flight from A to B that involves at most k layovers is either the direct flight; or the cheapest flight from A to some airport C that involves at most t layovers for some integer t with 0≤t<k, plus the cheapest flight from C to B that involves at most k−1−t layovers.
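This layover-bounded recurrence can be turned into a short memoized search. The following Python sketch is purely illustrative: the price table, airport names, and function are hypothetical, chosen so that the cheapest A-to-B route uses one connection through C while the cheapest A-to-C route connects through D.

```python
from functools import lru_cache
import math

# Hypothetical direct-flight fares: price[(a, b)] is the fare of a single
# direct flight from a to b (missing pairs have no direct flight).
price = {
    ("A", "C"): 50, ("C", "B"): 60,   # A -> C -> B: one layover, fare 110
    ("A", "B"): 200,                   # expensive direct flight
    ("A", "D"): 30, ("D", "C"): 10,   # A -> D -> C is cheaper than A -> C
}
airports = {a for pair in price for a in pair}

@lru_cache(maxsize=None)
def cheapest(a, b, k):
    """Cheapest fare from a to b using at most k layovers."""
    best = price.get((a, b), math.inf)  # the direct flight, if any
    # Split at a first stop c: at most t layovers to c, the layover at c
    # itself, then at most k - 1 - t layovers from c to b.
    for c in airports - {a, b}:
        for t in range(k):
            best = min(best, cheapest(a, c, t) + cheapest(c, b, k - 1 - t))
    return best

print(cheapest("A", "B", 0))  # 200: only the direct flight is allowed
print(cheapest("A", "B", 1))  # 110: via C
print(cheapest("A", "B", 2))  # 100: via D and then C
```

Once the layover budget k is part of the problem description, each subproblem is again a layover-bounded cheapest-fare problem, so the substructure is restored.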
References
1. Cormen, Thomas H.; Leiserson, Charles E.; Rivest, Ronald L.; Stein, Clifford (2009). Introduction to Algorithms (3rd ed.). MIT Press. ISBN 978-0-262-03384-8.
Fundamentals
Definition
Optimal substructure is a fundamental property exhibited by certain optimization problems, where the goal is to find a solution that minimizes or maximizes an objective function among a set of feasible solutions, in contrast to decision problems that only determine whether a feasible solution exists.[4][5] This property enables the recursive decomposition of the problem into smaller subproblems, ensuring that the overall optimal solution can be efficiently constructed without compromising optimality.[2] Formally, a problem demonstrates optimal substructure if, for a given instance P with solution set S and associated subproblems P_1, ..., P_k with solution sets S_1, ..., S_k, the optimal solution to P, denoted s*, satisfies s* = g(s_1*, ..., s_k*), where each s_i* is an optimal solution to P_i and g is an efficient function that merges the subproblem solutions.[6] This requires that the subproblem solutions themselves be optimal, rather than merely feasible, to guarantee the global solution's optimality and prevent propagation of suboptimal choices.[2] In essence, the property allows the problem to be solved by building up from optimally solved components, a cornerstone exploited in paradigms like dynamic programming.[7]

Mathematically, optimal substructure is often expressed through recurrence relations that define the optimal value for a problem instance in terms of optimal values for subinstances. For example, in partitioning problems such as matrix chain multiplication, it takes the form V(i, j) = min over i ≤ k < j of { V(i, k) + V(k+1, j) + cost(i, k, j) } (or using maximization where appropriate), where V(i, j) represents the optimal value for the subproblem spanning positions i through j, and cost(i, k, j) captures the merging expense; this recursive structure derives from the decomposition principle inherent to the property.[8][9]

Historical Background
The concept of optimal substructure emerged in the mid-20th century as a core element of dynamic programming, introduced by Richard Bellman to address complex multistage decision problems in operations research. Bellman, working at the RAND Corporation, developed this framework during the early 1950s to model sequential optimization under uncertainty, emphasizing that an optimal policy for the overall problem must consist of optimal policies for its subproblems, a property now known as the principle of optimality, which underpins optimal substructure.[10] This approach was motivated by practical needs in military logistics and economics, where breaking down large-scale problems into manageable recursive components allowed for efficient computation on early computers.[11]

A key milestone came with Bellman's 1957 book Dynamic Programming, which formalized the principle and provided the theoretical foundation for applying optimal substructure to a wide range of optimization challenges, including inventory control and routing. In the late 1950s and 1960s, the idea extended to graph algorithms, notably through Joseph Kruskal's 1956 work on minimum spanning trees, where the optimal solution for the full graph incorporates optimal subtrees, demonstrating the property's utility beyond pure dynamic programming.[12] These developments highlighted optimal substructure as a unifying characteristic for problems amenable to recursive decomposition.

During the 1970s and 1980s, optimal substructure was integrated into broader algorithm design theory, particularly through analyses of divide-and-conquer paradigms, as explored in foundational texts like Aho, Hopcroft, and Ullman's The Design and Analysis of Computer Algorithms (1974), which connected it to efficient problem-solving strategies. By the 1990s, it gained widespread recognition in greedy algorithm contexts, with textbooks such as Cormen et al.'s Introduction to Algorithms (first edition, 1990) explicitly articulating the property as essential for verifying optimality in selection-based methods.

While optimal substructure draws indirect roots from 18th-century mathematical optimization techniques, such as Joseph-Louis Lagrange's method of multipliers for constrained problems introduced in Mécanique Analytique (1788), its computational emphasis distinguishes it as a distinctly post-1950s innovation tailored to algorithmic efficiency.[13] As of 2025, the concept remains foundational in artificial intelligence and machine learning, particularly in reinforcement learning models that rely on Bellman equations to propagate optimality through substructures, with ongoing applications in areas like robotics and game AI showing no fundamental paradigm shifts.

Examples and Illustrations
Basic Example
A fundamental illustration of optimal substructure is the rod-cutting problem, where a rod of length n must be cut into integer-length pieces to maximize revenue, based on a given price p_i for each piece of length i, with 1 ≤ i ≤ n.[14] The optimal revenue r(n) for a rod of length n satisfies the recurrence relation r(n) = max over 1 ≤ i ≤ n of { p_i + r(n − i) }, with base case r(0) = 0. This formulation reveals the optimal substructure: the best solution for length n is obtained by selecting a first cut of length i that maximizes the price of that piece plus the optimal revenue for the remaining subproblem of length n − i. Subproblems correspond to smaller rod lengths, each of which is solved optimally and combined without overlap or redundancy in the decision process.[14]

To demonstrate, consider a rod of length 4 with prices p_1 = 1, p_2 = 5, p_3 = 8, and p_4 = 9. The optimal revenues for subproblems build incrementally:
- For length 1: r(1) = p_1 = 1.
- For length 2: r(2) = max(p_1 + r(1), p_2 + r(0)) = max(2, 5) = 5 (first cut of length 2).
- For length 3: r(3) = max(p_1 + r(2), p_2 + r(1), p_3 + r(0)) = max(6, 6, 8) = 8 (first cut of length 3).
- For length 4: r(4) = max(p_1 + r(3), p_2 + r(2), p_3 + r(1), p_4 + r(0)) = max(9, 10, 9, 9) = 10 (first cut of length 2, followed by the optimal cut of the remaining length 2).[14]
| Length | Optimal revenue |
|---|---|
| 0 | 0 |
| 1 | 1 |
| 2 | 5 |
| 3 | 8 |
| 4 | 10 |
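The recurrence translates directly into a bottom-up computation that fills this table. The following Python sketch is illustrative (function and variable names are my own, not from the source):

```python
def rod_cutting(prices):
    """Bottom-up rod cutting: prices[i] is the price of a piece of
    length i + 1; returns the optimal revenue r for every length 0..n."""
    n = len(prices)
    r = [0] * (n + 1)  # r[0] = 0 is the base case
    for length in range(1, n + 1):
        # Try every first cut i and reuse the optimal value r[length - i].
        r[length] = max(prices[i - 1] + r[length - i]
                        for i in range(1, length + 1))
    return r

# Prices p_1..p_4 from the example above.
print(rod_cutting([1, 5, 8, 9]))  # [0, 1, 5, 8, 10]
```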
Further Examples
One prominent example of optimal substructure is the matrix chain multiplication problem, where the goal is to determine the most efficient way to parenthesize a sequence of matrices A_1, A_2, ..., A_n to minimize the total cost of multiplications, given that the cost depends on the dimensions of the matrices involved. The optimal solution exhibits optimal substructure because the best parenthesization for the chain from A_i to A_j can be found by considering all possible splits at some k (where i ≤ k < j) and combining the optimal costs for the subchains A_i ... A_k and A_{k+1} ... A_j, plus the cost of multiplying those two results. This leads to the recurrence m[i, j] = min over i ≤ k < j of { m[i, k] + m[k+1, j] + p_{i−1} · p_k · p_j }, with m[i, i] = 0, where p_{i−1} · p_k · p_j is the number of scalar multiplications needed to multiply the subchain results, for the dimension array p in which matrix A_i has dimensions p_{i−1} × p_i. For instance, with matrices of dimensions 10×20, 20×30, and 30×10, the optimal split first multiplies the second and third (cost 6000), then the first with that result (cost 2000, total 8000), demonstrating how subproblem optima merge without recomputation.[14]

Another classic illustration appears in the longest common subsequence (LCS) problem for two strings X = x_1 ... x_m and Y = y_1 ... y_n, where the objective is to find the longest subsequence present in both, preserving order but not necessarily contiguity. Optimal substructure holds here because the LCS of X and Y depends on smaller subproblems: if x_i = y_j, then the LCS length L(i, j) is 1 plus the LCS length of the prefixes x_1 ... x_{i−1} and y_1 ... y_{j−1}; otherwise, it is the maximum of L(i−1, j) and L(i, j−1). The recurrence is L(i, j) = L(i−1, j−1) + 1 if x_i = y_j, and L(i, j) = max(L(i−1, j), L(i, j−1)) otherwise, with base cases L(i, 0) = 0 and L(0, j) = 0. For example, with "ABCBDAB" and "BDCAB", the LCS "BCAB" (length 4) builds incrementally from matches in subproblem prefixes, efficiently combining their optimal lengths.[14]

To demonstrate applicability in graph problems, consider the all-pairs shortest paths computed via the Floyd-Warshall algorithm on a weighted directed graph with n vertices, where distances may include negative weights but no negative cycles. The algorithm leverages optimal substructure by iteratively refining shortest paths allowed to use intermediate vertices from the set {1, ..., k}, starting with direct edges and adding one vertex at a time; the optimal path from i to j using intermediates up to k is the minimum of the best path using intermediates up to k−1 and the path through k formed from two such subproblem paths. This yields the recurrence d^(k)[i, j] = min(d^(k−1)[i, j], d^(k−1)[i, k] + d^(k−1)[k, j]) for all i, j. In a simple graph with vertices A, B, C and edges A→B (2), B→C (3), A→C (6), initializing gives d[A, C] = 6, but after considering B as an intermediate, it updates to min(6, 2 + 3) = 5, merging subpath optima to find the global shortest paths in O(n^3) time.[14]
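As a concrete check of the LCS recurrence above, a bottom-up tabulation in Python (illustrative code, my own naming) recovers the length 4 for the example strings:

```python
def lcs_length(x, y):
    """Bottom-up LCS: L[i][j] is the LCS length of the prefixes
    x[:i] and y[:j]; row and column 0 encode the empty-prefix base cases."""
    m, n = len(x), len(y)
    L = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if x[i - 1] == y[j - 1]:
                L[i][j] = L[i - 1][j - 1] + 1  # extend a common subsequence
            else:
                L[i][j] = max(L[i - 1][j], L[i][j - 1])  # drop one last char
    return L[m][n]

print(lcs_length("ABCBDAB", "BDCAB"))  # 4, matching the LCS "BCAB"
```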
Applications in Algorithms
Dynamic Programming Problems
Dynamic programming is applied to optimization problems that possess optimal substructure, enabling the construction of solutions to larger instances from optimal solutions to their subproblems. This approach systematically solves subproblems and stores their results to build up the overall optimum, as formalized in the principle of optimality.[10] A classic example is the 0/1 knapsack problem, where given n items, each with weight w_j and value v_j, and a knapsack capacity W, the goal is to select a subset maximizing total value without exceeding W, with each item either taken or not. The problem exhibits optimal substructure: the optimal solution for the first j items and capacity x either excludes item j (reducing to the subproblem with j−1 items) or includes it (reducing to the subproblem with j−1 items and capacity x − w_j, plus v_j).[15] The recurrence relation is K(x, j) = max(K(x, j−1), K(x − w_j, j−1) + v_j), with base cases K(x, 0) = 0 and K(0, j) = 0 for all x, j; if x < w_j, the second term is not considered.[15] This is solved using a bottom-up dynamic programming table, a 2D array of size (W+1) by (n+1), filled by iterating over capacities from 1 to W and items from 1 to n:

for x = 0 to W
    K[x][0] = 0
for j = 0 to n
    K[0][j] = 0
for x = 1 to W
    for j = 1 to n
        if w_j > x
            K[x][j] = K[x][j-1]
        else
            K[x][j] = max(K[x][j-1], K[x - w_j][j-1] + v_j)
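A runnable Python version of the same table-filling scheme (a minimal sketch with my own naming; items are indexed from 0 in the input lists) might look like this:

```python
def knapsack(weights, values, W):
    """0/1 knapsack: K[x][j] is the best value using the first j items
    with capacity x, following the recurrence above."""
    n = len(weights)
    K = [[0] * (n + 1) for _ in range(W + 1)]  # base cases are already 0
    for x in range(1, W + 1):
        for j in range(1, n + 1):
            w, v = weights[j - 1], values[j - 1]
            if w > x:
                K[x][j] = K[x][j - 1]  # item j cannot fit
            else:
                K[x][j] = max(K[x][j - 1], K[x - w][j - 1] + v)
    return K[W][n]

# Hypothetical instance: 3 items, capacity 5.
print(knapsack([2, 3, 4], [3, 4, 5], 5))  # 7: take the items of weight 2 and 3
```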
The longest increasing subsequence problem exhibits the same structure: for a sequence v_1, ..., v_n, the length of the longest increasing subsequence ending at position k is one more than the best such length over earlier positions j with v_j < v_k, giving the following bottom-up computation:

for k = 1 to n
    length[k] = 1
    for j = 1 to k-1
        if v_j < v_k and length[j] + 1 > length[k]
            length[k] = length[j] + 1
max_length = max(length[1..n])
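A runnable Python counterpart of this quadratic-time computation (illustrative, my own naming) is:

```python
def lis_length(v):
    """Longest increasing subsequence: length[k] is the best LIS length
    ending at index k; each entry reuses optimal answers for shorter prefixes."""
    n = len(v)
    length = [1] * n  # every element alone is an increasing subsequence
    for k in range(n):
        for j in range(k):
            if v[j] < v[k]:
                length[k] = max(length[k], length[j] + 1)
    return max(length, default=0)

print(lis_length([3, 1, 4, 1, 5, 9, 2, 6]))  # 4, e.g. the subsequence 1, 4, 5, 9
```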