Recent from talks
Nothing was collected or created yet.
Knuth's Algorithm X
View on WikipediaAlgorithm X is an algorithm for solving the exact cover problem. It is a straightforward recursive, nondeterministic, depth-first, backtracking algorithm used by Donald Knuth to demonstrate an efficient implementation called DLX, which uses the dancing links technique.[1][2]
Algorithm
[edit]The exact cover problem is represented in Algorithm X by an incidence matrix A consisting of 0s and 1s. The goal is to select a subset of the rows such that the digit 1 appears in each column exactly once.
Algorithm X works as follows:
- If the matrix A has no columns, the current partial solution is a valid solution; terminate successfully.
- Otherwise choose a column c (deterministically).
- Choose a row r such that Ar, c = 1 (nondeterministically).
- Include row r in the partial solution.
- For each column j such that Ar, j = 1,
- for each row i such that Ai, j = 1,
- delete row i from matrix A.
- delete column j from matrix A.
- for each row i such that Ai, j = 1,
- Repeat this algorithm recursively on the reduced matrix A.
The nondeterministic choice of r means that the algorithm recurses over independent subalgorithms; each subalgorithm inherits the current matrix A, but reduces it with respect to a different row r.
If column c is entirely zero, there are no subalgorithms and the process terminates unsuccessfully.
The subalgorithms form a search tree in a natural way, with the original problem at the root and with level k containing each subalgorithm that corresponds to k chosen rows. Backtracking is the process of traversing the tree in preorder, depth first.
Any systematic rule for choosing column c in this procedure will find all solutions, but some rules work much better than others. To reduce the number of iterations, Knuth suggests that the column-choosing algorithm select a column with the smallest number of 1s in it.
Example
[edit]For example, consider the exact cover problem specified by the universe U = {1, 2, 3, 4, 5, 6, 7} and the collection of sets S = {A, B, C, D, E, F}, where:
- A = {1, 4, 7};
- B = {1, 4};
- C = {4, 5, 7};
- D = {3, 5, 6};
- E = {2, 3, 6, 7}; and
- F = {2, 7}.
This problem is represented by the matrix:
1 2 3 4 5 6 7 A 1 0 0 1 0 0 1 B 1 0 0 1 0 0 0 C 0 0 0 1 1 0 1 D 0 0 1 0 1 1 0 E 0 1 1 0 0 1 1 F 0 1 0 0 0 0 1
Algorithm X with Knuth's suggested heuristic for selecting columns solves this problem as follows:
Level 0
Step 1—The matrix is not empty, so the algorithm proceeds.
Step 2—The lowest number of 1s in any column is two. Column 1 is the first column with two 1s and thus is selected (deterministically):
1 2 3 4 5 6 7 A 1 0 0 1 0 0 1 B 1 0 0 1 0 0 0 C 0 0 0 1 1 0 1 D 0 0 1 0 1 1 0 E 0 1 1 0 0 1 1 F 0 1 0 0 0 0 1
Step 3—Rows A and B each have a 1 in column 1 and thus are selected (nondeterministically).
The algorithm moves to the first branch at level 1…
- Level 1: Select Row A
- Step 4—Row A is included in the partial solution.
- Step 5—Row A has a 1 in columns 1, 4, and 7:
1 2 3 4 5 6 7 A 1 0 0 1 0 0 1 B 1 0 0 1 0 0 0 C 0 0 0 1 1 0 1 D 0 0 1 0 1 1 0 E 0 1 1 0 0 1 1 F 0 1 0 0 0 0 1
- Column 1 has a 1 in rows A and B; column 4 has a 1 in rows A, B, and C; and column 7 has a 1 in rows A, C, E, and F. Thus, rows A, B, C, E, and F are to be removed and columns 1, 4 and 7 are to be removed:
1 2 3 4 5 6 7 A 1 0 0 1 0 0 1 B 1 0 0 1 0 0 0 C 0 0 0 1 1 0 1 D 0 0 1 0 1 1 0 E 0 1 1 0 0 1 1 F 0 1 0 0 0 0 1
- Row D remains and columns 2, 3, 5, and 6 remain:
2 3 5 6 D 0 1 1 1
- Step 1—The matrix is not empty, so the algorithm proceeds.
- Step 2—The lowest number of 1s in any column is zero and column 2 is the first column with zero 1s:
2 3 5 6 D 0 1 1 1
- Thus this branch of the algorithm terminates unsuccessfully.
- The algorithm moves to the next branch at level 1…
- Level 1: Select Row B
- Step 4—Row B is included in the partial solution.
- Row B has a 1 in columns 1 and 4:
1 2 3 4 5 6 7 A 1 0 0 1 0 0 1 B 1 0 0 1 0 0 0 C 0 0 0 1 1 0 1 D 0 0 1 0 1 1 0 E 0 1 1 0 0 1 1 F 0 1 0 0 0 0 1
- Column 1 has a 1 in rows A and B; and column 4 has a 1 in rows A, B, and C. Thus, rows A, B, and C are to be removed and columns 1 and 4 are to be removed:
1 2 3 4 5 6 7 A 1 0 0 1 0 0 1 B 1 0 0 1 0 0 0 C 0 0 0 1 1 0 1 D 0 0 1 0 1 1 0 E 0 1 1 0 0 1 1 F 0 1 0 0 0 0 1
- Rows D, E, and F remain and columns 2, 3, 5, 6, and 7 remain:
2 3 5 6 7 D 0 1 1 1 0 E 1 1 0 1 1 F 1 0 0 0 1
- Step 1—The matrix is not empty, so the algorithm proceeds.
- Step 2—The lowest number of 1s in any column is one. Column 5 is the first column with one 1 and thus is selected (deterministically):
2 3 5 6 7 D 0 1 1 1 0 E 1 1 0 1 1 F 1 0 0 0 1
- Step 3—Row D has a 1 in column 5 and thus is selected (nondeterministically).
- The algorithm moves to the first branch at level 2…
- Level 2: Select Row D
- Step 4—Row D is included in the partial solution.
- Step 5—Row D has a 1 in columns 3, 5, and 6:
2 3 5 6 7 D 0 1 1 1 0 E 1 1 0 1 1 F 1 0 0 0 1
- Column 3 has a 1 in rows D and E; column 5 has a 1 in row D; and column 6 has a 1 in rows D and E. Thus, rows D and E are to be removed and columns 3, 5, and 6 are to be removed:
2 3 5 6 7 D 0 1 1 1 0 E 1 1 0 1 1 F 1 0 0 0 1
- Row F remains and columns 2 and 7 remain:
2 7 F 1 1
- Step 1—The matrix is not empty, so the algorithm proceeds.
- Step 2—The lowest number of 1s in any column is one. Column 2 is the first column with one 1 and thus is selected (deterministically):
2 7 F 1 1
- Row F has a 1 in column 2 and thus is selected (nondeterministically).
- The algorithm moves to the first branch at level 3…
- Level 3: Select Row F
- Step 4—Row F is included in the partial solution.
- Row F has a 1 in columns 2 and 7:
2 7 F 1 1
- Column 2 has a 1 in row F; and column 7 has a 1 in row F. Thus, row F is to be removed and columns 2 and 7 are to be removed:
2 7 F 1 1
- No rows and no columns remain:
- Step 1—The matrix is empty, thus this branch of the algorithm terminates successfully.
- As rows B, D, and F have been selected (step 4), the final solution in this branch is:
1 2 3 4 5 6 7 B 1 0 0 1 0 0 0 D 0 0 1 0 1 1 0 F 0 1 0 0 0 0 1
- In other words, the subcollection {B, D, F} is an exact cover, since every element is contained in exactly one of the sets B = {1, 4}, D = {3, 5, 6}, or F = {2, 7}.
- There are no more selected rows at level 3, thus the algorithm moves to the next branch at level 2…
- There are no more selected rows at level 2, thus the algorithm moves to the next branch at level 1…
- There are no more selected rows at level 1, thus the algorithm moves to the next branch at level 0…
There are no branches at level 0, thus the algorithm terminates.
In summary, the algorithm determines there is only one exact cover: S* = {B, D, F}.
Implementations
[edit]Knuth's main purpose in describing Algorithm X was to demonstrate the utility of dancing links. Knuth showed that Algorithm X can be implemented efficiently on a computer using dancing links in a process Knuth calls "DLX". DLX uses the matrix representation of the exact cover problem, implemented as doubly linked lists of the 1s of the matrix: each 1 element has a link to the next 1 above, below, to the left, and to the right of itself. (Technically, because the lists are circular, this forms a torus). Because exact cover problems tend to be sparse, this representation is usually much more efficient in both size and processing time required. DLX then uses dancing links to quickly select permutations of rows as possible solutions and to efficiently backtrack (undo) mistaken guesses.[1]
See also
[edit]References
[edit]- ^ a b Knuth, Donald (2000). "Dancing links". arXiv:cs/0011047.
- ^ Banerjee, Bikramjit; Kraemer, Landon; Lyle, Jeremy (2010-07-04). "Multi-Agent Plan Recognition: Formalization and Algorithms". Proceedings of the AAAI Conference on Artificial Intelligence. 24 (1): 1059–1064. doi:10.1609/aaai.v24i1.7746. ISSN 2374-3468.
- Knuth, Donald E. (2000), "Dancing links", in Davies, Jim; Roscoe, Bill; Woodcock, Jim (eds.), Millennial Perspectives in Computer Science: Proceedings of the 1999 Oxford-Microsoft Symposium in Honour of Sir Tony Hoare, Palgrave, pp. 187–214, arXiv:cs/0011047, Bibcode:2000cs.......11047K, ISBN 978-0-333-92230-9.
External links
[edit]- Knuth's paper - PDF file (also arXiv:cs/0011047)
- Knuth's Paper describing the Dancing Links optimization - Gzip'd postscript file.
Knuth's Algorithm X
View on GrokipediaIntroduction
Overview
Algorithm X is a recursive, nondeterministic, depth-first backtracking algorithm that finds all solutions to the exact cover problem.[2] The exact cover problem requires selecting a subcollection of disjoint subsets from a given collection such that their union covers the entire universe exactly once, with no overlaps or omissions.[2] This combinatorial optimization challenge is NP-complete, making it computationally intensive for large instances. Donald E. Knuth introduced Algorithm X in his 2000 paper "Dancing Links," framing it as a foundational method for constraint satisfaction problems expressible as exact covers.[2] The algorithm systematically prunes the search space by choosing rows that cover columns and recursively solving reduced subproblems, enabling exhaustive enumeration of solutions.[2] Its significance stems from the practical efficiency achieved through optimized implementations, such as Dancing Links (DLX), in addressing NP-complete tasks like constraint satisfaction problems. Knuth paired the algorithm with Dancing Links, an optimized doubly-linked list structure, to accelerate implementations.[2]Historical Development
Donald E. Knuth developed Algorithm X in the context of his ongoing research on combinatorial algorithms, as part of the fourth volume of The Art of Computer Programming, which addresses generating functions and backtracking techniques for exact cover problems.[3] The algorithm represents a straightforward recursive, depth-first backtracking approach to finding all solutions to the exact cover problem, building on foundational ideas in systematic search methods.[4] The first detailed description of Algorithm X appeared in Knuth's 2000 paper "Dancing Links," published in Millennial Perspectives in Computer Science: Celebrating Contributions of the Past 25 Years, the proceedings of the 1999 Oxford-Microsoft Symposium. This work was preceded by a lecture Knuth delivered at Stanford University on February 22, 2000, where he introduced the concepts to an academic audience.[5] Knuth's contributions emphasized the algorithm's generality and efficiency when paired with optimized data structures, marking a significant advancement in practical combinatorial solving. Additionally, the "dancing links" technique integral to its efficient implementation was first proposed by Hiroshi Hitotsumatsu and Kōhei Noshita in 1979, who applied it to accelerate solutions for the n-queens problem by enabling rapid deletion and restoration of linked list elements during search.[4] Following its publication, Algorithm X evolved through its integration with the Dancing Links (DLX) data structure, which Knuth designed to minimize overhead in covering and uncovering operations, leading to substantial performance gains in enumerating solutions for large instances.[4] This combination gained widespread adoption in the early 2000s for developing efficient software solvers targeting combinatorial puzzles, influencing implementations in areas like polyomino tiling and constraint satisfaction.[6] Knuth further refined and expanded the material in the 2019 fascicle 5C of The Art of Computer Programming, solidifying its place in algorithmic literature.Exact Cover Problem
Definition and Formulation
The exact cover problem is a classic decision problem in combinatorics and computer science. Given a finite universe consisting of elements and a collection of subsets of , the task is to determine whether there exists a subcollection such that every element of appears in exactly one subset belonging to . This ensures a complete partition of using the selected subsets without overlaps or omissions.[2] Formally, let where each for . An exact cover is a subset satisfying two conditions: and for all distinct . The problem requires finding such a (if it exists) or confirming that none does, with the constraints prohibiting partial covers and enforcing disjointness among selected subsets.[7] The problem admits a natural representation as a binary incidence matrix of size , where rows correspond to the subsets in and columns to the elements of ; entry if the -th subset contains the -th element, and otherwise. A solution then corresponds to selecting a set of rows such that the sum of these rows yields the all-ones vector, with exactly one per column and no overlapping s in any column.[2] The exact cover problem is NP-complete. This holds even for the restricted variant known as Exact Cover by 3-Sets, where each subset contains exactly three elements; the NP-completeness of this case is established via polynomial-time reduction from the 3-Dimensional Matching problem.[8]Relation to Other Problems
The [exact cover](/page/Linking to 'exact cover' twice is redundant, so only link once) problem is equivalent to the decision version of the set partitioning problem, which seeks a collection of disjoint subsets from a given family that precisely partitions the universe without overlaps or omissions.[9] Similarly, it aligns with the disjoint set cover problem, where the objective is to select subsets such that every element is included in exactly one subset, ensuring a partition-like coverage.[10] These equivalences highlight the problem's foundational role in combinatorial optimization, where feasibility checks for exact partitions underpin broader partitioning tasks. The perfect matching problem in bipartite graphs can be reformulated as an exact cover instance, with elements representing vertices and subsets representing edges (each covering its two endpoints exactly once), requiring full coverage without redundancy.[11] König's theorem states that the size of the minimum vertex cover equals the size of the maximum matching in bipartite graphs. It also generalizes the exact hitting set problem via duality: selecting subsets to cover elements exactly once mirrors selecting elements to hit given subsets exactly once in the dual set system.[12] In scheduling contexts, bin packing with fixed bin size emerges as a variant of exact cover, particularly when requiring exact utilization of bin capacity without waste; this reduces to partitioning items into subsets whose sizes sum precisely to the bin capacity, akin to an exact cover over item sizes.[13] The exact cover problem further connects to constraint satisfaction problems (CSPs), such as graph coloring, by modeling the exact partitioning of vertices into independent sets (color classes), where subsets represent valid color assignments ensuring no adjacent vertices share a color while covering all vertices precisely.[14] Within operations research, exact cover informs assignment problems by providing a framework for exact feasibility, as seen in the tail assignment problem, which reduces to exact cover to ensure all requirements (e.g., flights or tasks) are assigned without gaps or duplicates.[15] This contrasts with approximate set cover approaches in greedy algorithms, which permit overlaps and partial coverage to achieve efficiency at the cost of precision.[11]Core Algorithm
Procedure Steps
Algorithm X operates as a recursive backtracking procedure to solve the exact cover problem by systematically exploring partial solutions until a complete one is found or all possibilities are exhausted.[1] The algorithm begins by checking if the current subproblem has no remaining columns, indicating that the universe is fully covered with no elements left uncovered; in this case, the partial solution is output as a valid exact cover, and the procedure backtracks to explore further solutions if multiple are sought.[1] If columns remain, the algorithm selects a column from the current matrix, typically using a heuristic such as choosing the column with the smallest number of rows containing a 1 in that column to minimize branching and improve efficiency.[1] This selection is deterministic within the heuristic but crucial for guiding the search. Following selection, the column is covered: this involves removing from consideration and eliminating all rows that have a 1 in , effectively pruning the matrix to reflect that must be covered by one of the rows originally in it.[1] The procedure then iterates over each row that originally contained a 1 in (the candidate rows for covering ). For each such row , it includes in the partial solution and covers all columns where has a 1, which means removing those columns and all rows intersecting them from the matrix.[1] The algorithm recurses on this reduced subproblem to find a cover for the remaining elements. If the recursive call succeeds in finding a solution, it is propagated upward; otherwise, after the recursion returns without success, the columns covered by are uncovered (restored in reverse order), and the next row in is tried.[1] Once all rows in have been exhausted without success, the column is uncovered, restoring the matrix to its state before covering , and the procedure backtracks to the previous choice point.[1] This exhaustive depth-first search ensures that all possible combinations are explored, guaranteeing either all solutions or a determination that none exist.[1] The core structure can be expressed in pseudocode as follows:function search(k):
if R[h] = h: // no columns remain
output the solution O[1..k-1]
return
choose column c // e.g., with minimal size
cover column c
for each row [r](/page/R) in column c:
O[k] ← r
for each column j in row r except c:
cover column j
search(k + 1)
for each column j in row r except c:
uncover column j
uncover column c
function search(k):
if R[h] = h: // no columns remain
output the solution O[1..k-1]
return
choose column c // e.g., with minimal size
cover column c
for each row [r](/page/R) in column c:
O[k] ← r
for each column j in row r except c:
cover column j
search(k + 1)
for each column j in row r except c:
uncover column j
uncover column c
Backtracking Mechanism
The backtracking mechanism in Knuth's Algorithm X operates by systematically exploring a search tree to identify exact covers of a binary matrix, where each level of the tree corresponds to selecting an additional row that covers previously uncovered columns. At each choice point, the algorithm first selects a pivot column deterministically, using the minimum remaining values (MRV) heuristic: it chooses the column with the smallest number of remaining 1's (i.e., the fewest viable rows that cover it), which minimizes the branching factor and reduces the overall search space. This heuristic is applied dynamically based on the current state of the matrix after prior coverings, ensuring that the selection adapts to the evolving subproblem. Once the pivot column is chosen, the algorithm branches nondeterministically over each valid row that contains a 1 in position . For each such row, it temporarily covers all columns affected by (removing those columns and any rows that intersect them from consideration), then recurses to the next level of the search tree on the reduced matrix. This depth-first traversal continues until either all columns are covered (yielding a solution) or no further progress is possible, at which point the algorithm backtracks by uncovering the affected columns and rows to explore alternative branches. Pruning occurs implicitly through early termination: if, after selecting a pivot, the chosen column has zero remaining rows, the current subproblem is impossible, and the algorithm immediately backtracks without further exploration. Similarly, if no columns remain uncovered, a complete exact cover has been found, and the search may halt or continue to enumerate additional solutions. The nondeterministic nature of row selection allows Algorithm X to discover multiple solutions by persisting in the search after identifying the first one, simply by continuing to branch over unchosen rows at each level. Regarding efficiency, the worst-case time complexity is exponential, on the order of where is the number of columns, due to the potential size of the search tree in dense matrices. However, the MRV heuristic and the sparsity typical of exact cover instances make it highly practical; for example, solving a pentomino puzzle requires only about 10,000 nodes in the search tree with the heuristic, compared to over 100,000 without it. The overall runtime is dominated by the cost of matrix updates at each node, but these are amortized efficiently in sparse cases.[1]Dancing Links Implementation
Data Structure Design
The Dancing Links (DLX) implementation of Knuth's Algorithm X represents the exact cover problem as a sparse binary matrix using a network of circular doubly-linked lists, where only the positions containing 1s are explicitly stored as nodes.[2] This structure efficiently encodes the rows and columns of the matrix: each 1 in the matrix corresponds to a data node that participates in both a horizontal row list and a vertical column list, enabling rapid traversal and manipulation during the search process.[2] Nodes in the DLX structure fall into three primary types: column headers, row headers (or sentinels), and data nodes. Column header nodes serve as sentinels for each column, maintaining fields for left (L) and right (R) links to adjacent column headers, up (U) and down (D) links to form a vertical cycle through the data nodes in that column, a column pointer (C) referencing itself, a size counter (S) tracking the number of 1s in the column, and a name (N) for identification.[2] Row headers are optional sentinels that anchor the horizontal cycles for each row, linking to the first and last data nodes in that row via L and R pointers.[2] Data nodes, which represent the 1s, include L and R links for the horizontal row cycle, U and D links for the vertical column cycle, a C pointer to the corresponding column header, and an additional field (A) for application-specific data, such as row identifiers.[2] Initialization of the structure begins with a root header node (h) that horizontally links all column headers into a single circular list via their L and R pointers, excluding any initially covered columns.[2] For each row in the matrix, the positions with 1s are instantiated as data nodes, connected horizontally into a circular list (potentially with a row header as the cycle's sentinel), and vertically linked into the appropriate column's cycle by splicing between the column header's U and D pointers.[2] This setup ensures that every list—whether row or column—is a closed cycle, allowing seamless traversal without boundary checks.[2] Compared to dense array representations, the DLX structure offers significant advantages in time and space for sparse matrices typical of exact cover problems. Operations like covering and uncovering columns, essential for the backtracking search, can be performed in amortized O(1) time per affected node through simple link splicing, rather than scanning entire rows or columns.[2] Moreover, by storing only non-zero entries, the design achieves high memory efficiency, scaling well to large instances where the matrix might have millions of rows but only thousands of 1s per row, as seen in applications like polyomino puzzles.[2]Key Operations
The cover operation in Dancing Links removes a specified column and all associated rows from consideration during the search, effectively pruning the problem instance to focus on remaining possibilities. This is achieved by first excising from the horizontal circular list of column headers, followed by splicing out each row containing an element in from the vertical lists of other columns. Specifically, the algorithm iterates over each row in column (traversing downward via the down-links ), and for each such row, it iterates over the other columns in that row (traversing rightward via the right-links ), removing from its vertical list by updating the up- and down-links and , while also decrementing the size counter for the column of . These manipulations leverage the doubly-linked structure to perform each splice in constant time per node, ensuring efficiency proportional to the number of affected nodes.[2] The uncover operation reverses the effects of cover, restoring the column and rows to their prior state to enable backtracking in the depth-first search. It processes the rows in reverse order—from bottom to top using the up-links —and within each row, the columns from right to left using the left-links , which adheres to a last-in, first-out (LIFO) sequence to correctly reconstruct the links without interference from recursive calls. First, the size counters are incremented for the relevant columns, followed by reinserting each node into its vertical list via updates to and ; finally, the header is reinserted into the horizontal list. This symmetric reversal maintains the integrity of the data structure across recursive levels.[2] The following pseudocode illustrates the cover and uncover procedures, as defined in the original formulation, where , , , , , and denote the left, right, up, down, size, and column functions, respectively: Cover():L[R[c]] ← L[c]
R[L[c]] ← R[c]
for i ← D[c]; i ≠ c; i ← D[i]
for j ← R[i]; j ≠ i; j ← R[j]
U[D[j]] ← U[j]
D[U[j]] ← D[j]
S[C[j]] ← S[C[j]] − 1
L[R[c]] ← L[c]
R[L[c]] ← R[c]
for i ← D[c]; i ≠ c; i ← D[i]
for j ← R[i]; j ≠ i; j ← R[j]
U[D[j]] ← U[j]
D[U[j]] ← D[j]
S[C[j]] ← S[C[j]] − 1
for i ← U[c]; i ≠ c; i ← U[i]
for j ← L[i]; j ≠ i; j ← L[j]
S[C[j]] ← S[C[j]] + 1
U[D[j]] ← j
D[U[j]] ← j
L[R[c]] ← c
R[L[c]] ← c
for i ← U[c]; i ≠ c; i ← U[i]
for j ← L[i]; j ≠ i; j ← L[j]
S[C[j]] ← S[C[j]] + 1
U[D[j]] ← j
D[U[j]] ← j
L[R[c]] ← c
R[L[c]] ← c
