Alpha–beta pruning
| Class | Search algorithm |
|---|---|
| Worst-case performance | O(b^d) |
| Best-case performance | O(√(b^d)) |
Alpha–beta pruning is a search algorithm that seeks to decrease the number of nodes that are evaluated by the minimax algorithm in its search tree. It is an adversarial search algorithm used commonly for machine playing of two-player combinatorial games (Tic-tac-toe, Chess, Connect 4, etc.). It stops evaluating a move when at least one possibility has been found that proves the move to be worse than a previously examined move. Such moves need not be evaluated further. When applied to a standard minimax tree, it returns the same move as minimax would, but prunes away branches that cannot possibly influence the final decision.[1]
History
At the 1956 Dartmouth Workshop, John McCarthy met Alex Bernstein of IBM, who was writing a chess program. McCarthy invented alpha–beta search and recommended it to him, but Bernstein was "unconvinced".[2]
Allen Newell and Herbert A. Simon, who used what John McCarthy calls an "approximation"[3] in 1958, wrote that alpha–beta "appears to have been reinvented a number of times".[4] Arthur Samuel had an early version for a checkers simulation. Richards, Timothy Hart, Michael Levin and/or Daniel Edwards also invented alpha–beta independently in the United States.[5] McCarthy proposed similar ideas during the Dartmouth workshop in 1956 and suggested it to a group of his students including Alan Kotok at MIT in 1961.[6] Alexander Brudno independently conceived the alpha–beta algorithm, publishing his results in 1963.[7] Donald Knuth and Ronald W. Moore refined the algorithm in 1975.[8][9] Judea Pearl proved its optimality in terms of the expected running time for trees with randomly assigned leaf values in two papers.[10][11] The optimality of the randomized version of alpha–beta was shown by Michael Saks and Avi Wigderson in 1986.[12]
Core idea
A game tree can represent many two-player zero-sum games, such as chess, checkers, and reversi. Each node in the tree represents a possible situation in the game. Each terminal node (outcome) of a branch is assigned a numeric score that determines the value of the outcome to the player with the next move.[13]
The algorithm maintains two values, alpha and beta, which respectively represent the minimum score that the maximizing player is assured of and the maximum score that the minimizing player is assured of. Initially, alpha is negative infinity and beta is positive infinity, i.e. both players start with their worst possible score. Whenever the maximum score that the minimizing player (i.e. the "beta" player) is assured of becomes less than the minimum score that the maximizing player (i.e., the "alpha" player) is assured of (i.e. beta < alpha), the maximizing player need not consider further descendants of this node, as they will never be reached in the actual play.
To illustrate this with a real-life example, suppose somebody is playing chess, and it is their turn. Move "A" will improve the player's position. The player continues to look for moves to make sure a better one hasn't been missed. Move "B" is also a good move, but the player then realizes that it will allow the opponent to force checkmate in two moves. Thus, other outcomes from playing move B no longer need to be considered since the opponent can force a win. The maximum score that the opponent could force after move "B" is negative infinity: a loss for the player. This is less than the minimum position that was previously found; move "A" does not result in a forced loss in two moves.
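To put rough numbers on this anecdote, the following minimal Python sketch (scores and reply labels invented for illustration) shows the cutoff at work: one refuting reply to move "B" is enough to stop examining the opponent's other replies.

```python
# Move "A" earlier earned the player a score of +1, so alpha = 1 is the
# bar any other move has to beat.
alpha = 1

replies_to_B = [
    ("reply forcing mate", float("-inf")),  # the opponent's refutation
    ("quiet reply", 0),                     # never examined
    ("losing blunder", 5),                  # never examined
]

value_of_B = float("inf")  # the opponent picks the minimum
for name, score in replies_to_B:
    value_of_B = min(value_of_B, score)
    if value_of_B <= alpha:  # "B" can no longer beat "A": stop here
        print("cutoff after examining:", name)
        break
```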
Improvements over naive minimax
The benefit of alpha–beta pruning lies in the fact that branches of the search tree can be eliminated.[13] This way, the search time can be limited to the 'more promising' subtree, and a deeper search can be performed in the same time. Like its predecessor, it belongs to the branch and bound class of algorithms. The optimization reduces the effective depth to slightly more than half that of simple minimax if the nodes are evaluated in an optimal or near optimal order (best choice for side on move ordered first at each node).
With an (average or constant) branching factor of b, and a search depth of d plies, the maximum number of leaf node positions evaluated (when the move ordering is pessimal) is O(b × b × ... × b) = O(b^d) – the same as a simple minimax search. If the move ordering for the search is optimal (meaning the best moves are always searched first), the number of leaf node positions evaluated is about O(b × 1 × b × 1 × ... × b) for odd depth and O(b × 1 × b × 1 × ... × 1) for even depth, or O(b^(d/2)) = O(√(b^d)). In the latter case, where the ply of a search is even, the effective branching factor is reduced to its square root, or, equivalently, the search can go twice as deep with the same amount of computation.[14] The explanation of b × 1 × b × 1 × ... is that all the first player's moves must be studied to find the best one, but for each, only the second player's best move is needed to refute all but the first (and best) first player move—alpha–beta ensures no other second player moves need be considered. When nodes are considered in a random order (i.e., the algorithm randomizes), asymptotically, the expected number of nodes evaluated in uniform trees with binary leaf-values is Θ(((b − 1 + √(b² + 14b + 1))/4)^d).[12] For the same trees, when the values are assigned to the leaves independently of each other and, say, zero and one are both equally probable, the expected number of nodes evaluated is Θ(b^(d/2)), which is much smaller than the work done by the randomized algorithm mentioned above, and is again optimal for such random trees.[10] When the leaf values are chosen independently of each other but from the [0, 1] interval uniformly at random, the expected number of nodes evaluated increases to Θ((b/log b)^d) in the limit,[11] which is again optimal for this kind of random tree. Note that the actual work for "small" values of b is better approximated using 0.925·b^0.747.[11][10]
A chess program that searches four plies with an average of 36 branches per node evaluates more than one million terminal nodes. An optimal alpha-beta prune would eliminate all but about 2,000 terminal nodes, a reduction of 99.8%.[13]
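These figures can be checked against the best-case leaf count b^⌈d/2⌉ + b^⌊d/2⌋ − 1 from Knuth and Moore's analysis; the short Python check below reproduces the 99.8% reduction (the formula gives 2,591 leaves, the same order of magnitude as the cited "about 2,000").

```python
import math

b, d = 36, 4
minimax_leaves = b ** d                                     # 1,679,616 leaves
alphabeta_best = b ** math.ceil(d / 2) + b ** (d // 2) - 1  # 2,591 leaves

print(f"{minimax_leaves:,} vs {alphabeta_best:,}")
print(f"reduction: {1 - alphabeta_best / minimax_leaves:.1%}")  # 99.8%
```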

Normally during alpha–beta, the subtrees are temporarily dominated by either a first player advantage (when many first player moves are good, and at each search depth the first move checked by the first player is adequate, but all second player responses are required to try to find a refutation), or vice versa. This advantage can switch sides many times during the search if the move ordering is incorrect, each time leading to inefficiency. As the number of positions searched decreases exponentially each move nearer the current position, it is worth spending considerable effort on sorting early moves. An improved sort at any depth will exponentially reduce the total number of positions searched, but sorting all positions at depths near the root node is relatively cheap as there are so few of them. In practice, the move ordering is often determined by the results of earlier, smaller searches, such as through iterative deepening.
Additionally, this algorithm can be trivially modified to return an entire principal variation in addition to the score. Some more aggressive algorithms such as MTD(f) do not easily permit such a modification.
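A sketch of such a modification is shown below; the node interface (children(), is_terminal(), evaluate()) is assumed for illustration. Each node keeps the line of play behind its best score, so the root receives the full principal variation.

```python
def alphabeta_pv(node, depth, alpha, beta, maximizing):
    """Fail-soft alpha-beta that also returns the principal variation."""
    if depth == 0 or node.is_terminal():
        return node.evaluate(), []
    value = float("-inf") if maximizing else float("inf")
    best_pv = []
    for move, child in node.children():
        score, pv = alphabeta_pv(child, depth - 1, alpha, beta, not maximizing)
        if (maximizing and score > value) or (not maximizing and score < value):
            value, best_pv = score, [move] + pv
        if maximizing:
            alpha = max(alpha, value)
        else:
            beta = min(beta, value)
        if alpha >= beta:
            break  # pruned moves can never be part of the PV
    return value, best_pv
```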
Pseudocode
The pseudo-code for depth-limited minimax with alpha–beta pruning is as follows:[15]
function alphabeta(node, depth, α, β, maximizingPlayer) is
    if depth == 0 or node is terminal then
        return the heuristic value of node
    if maximizingPlayer then
        value := −∞
        for each child of node do
            value := max(value, alphabeta(child, depth − 1, α, β, FALSE))
            if value ≥ β then
                break (* β cutoff *)
            α := max(α, value)
        return value
    else
        value := +∞
        for each child of node do
            value := min(value, alphabeta(child, depth − 1, α, β, TRUE))
            if value ≤ α then
                break (* α cutoff *)
            β := min(β, value)
        return value

(* Initial call *)
alphabeta(origin, depth, −∞, +∞, TRUE)
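The pseudocode above transcribes almost line-for-line into Python. The sketch below uses nested lists with numeric leaves as a toy game tree; a real engine would substitute a move generator and an evaluation function.

```python
import math

def alphabeta(node, depth, alpha, beta, maximizing_player):
    if depth == 0 or not isinstance(node, list):  # depth limit or leaf
        return node
    if maximizing_player:
        value = -math.inf
        for child in node:
            value = max(value, alphabeta(child, depth - 1, alpha, beta, False))
            if value >= beta:
                break  # beta cutoff
            alpha = max(alpha, value)
        return value
    else:
        value = math.inf
        for child in node:
            value = min(value, alphabeta(child, depth - 1, alpha, beta, True))
            if value <= alpha:
                break  # alpha cutoff
            beta = min(beta, value)
        return value

# Initial call, mirroring the pseudocode's (* Initial call *) line:
tree = [[5, 6], [7, 4, 5], [3]]  # MAX root, MIN inner nodes, numeric leaves
print(alphabeta(tree, 2, -math.inf, math.inf, True))  # -> 5
```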
Implementations of alpha–beta pruning can often be delineated by whether they are "fail-soft" or "fail-hard". The pseudocode above illustrates the fail-soft variation. With fail-soft alpha–beta, the alphabeta function may return values (v) that lie outside the bounds set by its function call arguments (v < α or v > β). In comparison, fail-hard alpha–beta limits its function return value to the inclusive range of α and β.
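For contrast, a minimal fail-hard sketch on the same toy-tree representation: scores are kept inside the caller's [α, β] window, so on a cutoff the caller learns only that the true value lies at or beyond the returned bound.

```python
import math

def alphabeta_failhard(node, depth, alpha, beta, maximizing_player):
    if depth == 0 or not isinstance(node, list):
        return min(beta, max(alpha, node))  # even leaf scores stay in-window
    for child in node:
        if maximizing_player:
            score = alphabeta_failhard(child, depth - 1, alpha, beta, False)
            if score >= beta:
                return beta   # fail-high: caller learns only "at least beta"
            alpha = max(alpha, score)
        else:
            score = alphabeta_failhard(child, depth - 1, alpha, beta, True)
            if score <= alpha:
                return alpha  # fail-low: caller learns only "at most alpha"
            beta = min(beta, score)
    return alpha if maximizing_player else beta
```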
Heuristic improvements
Further improvement can be achieved without sacrificing accuracy by using ordering heuristics to search earlier parts of the tree that are likely to force alpha–beta cutoffs. For example, in chess, moves that capture pieces may be examined before moves that do not, and moves that have scored highly in earlier passes through the game-tree analysis may be evaluated before others. Another common, and very cheap, heuristic is the killer heuristic, where the last move that caused a beta-cutoff at the same tree level in the tree search is always examined first. This idea can also be generalized into a set of refutation tables.
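A sketch of how a killer-move table might feed move ordering; the is_capture attribute and both helper functions are invented names for illustration.

```python
from collections import defaultdict

killers = defaultdict(lambda: None)  # depth -> move that last caused a cutoff

def ordered_moves(moves, depth):
    """Try captures first, then the killer move for this depth, then the rest."""
    killer = killers[depth]
    return sorted(moves, key=lambda m: (not m.is_capture, m != killer))

def record_beta_cutoff(move, depth):
    """Call this where `value >= beta` triggers the break in the search."""
    killers[depth] = move
```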
Alpha–beta search can be made even faster by considering only a narrow search window (generally determined by guesswork based on experience). This is known as an aspiration window. In the extreme case, the search is performed with alpha and beta equal, a technique known as zero-window search, null-window search, or scout search. This is particularly useful for win/loss searches near the end of a game where the extra depth gained from the narrow window and a simple win/loss evaluation function may lead to a conclusive result. If an aspiration search fails, it is straightforward to detect whether it failed high (the high edge of the window was too low) or low (the low edge of the window was too high). This gives information about what window values might be useful in a re-search of the position.
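A sketch of an aspiration-window driver, assuming some fail-soft alpha–beta routine search(root, depth, alpha, beta) and a margin in centipawn-like units (both assumptions for illustration):

```python
import math

def aspiration_search(root, depth, guess, margin=50):
    """Search inside a window around `guess`; re-search wider on failure."""
    alpha, beta = guess - margin, guess + margin
    while True:
        score = search(root, depth, alpha, beta)  # assumed fail-soft alpha-beta
        if score <= alpha:       # fail low: the window's lower edge was too high
            alpha = -math.inf
        elif score >= beta:      # fail high: the window's upper edge was too low
            beta = math.inf
        else:
            return score         # true score fell inside the window
```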
Over time, other improvements have been suggested, and indeed the Falphabeta (fail-soft alpha–beta) idea of John Fishburn is nearly universal and is already incorporated above in a slightly modified form. Fishburn also suggested a combination of the killer heuristic and zero-window search under the name Lalphabeta ("last move with minimal window alpha–beta search").
Other algorithms
Since the minimax algorithm and its variants are inherently depth-first, a strategy such as iterative deepening is usually used in conjunction with alpha–beta so that a reasonably good move can be returned even if the algorithm is interrupted before it has finished execution. Another advantage of using iterative deepening is that searches at shallower depths give move-ordering hints, as well as shallow alpha and beta estimates, that both can help produce cutoffs for higher depth searches much earlier than would otherwise be possible.
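A sketch of that interaction, assuming a fixed-depth helper alphabeta_score(move, depth) (an invented name): each deepening pass re-sorts the root moves using the scores from the previous, shallower pass, and the loop can be interrupted at any depth with a usable best move.

```python
import time

def iterative_deepening(root_moves, max_seconds):
    deadline = time.monotonic() + max_seconds
    ordering = list(root_moves)   # initial ordering: whatever we were given
    best_move = ordering[0]
    depth = 1
    while time.monotonic() < deadline:   # simplification: checked between depths
        scored = [(alphabeta_score(move, depth), move) for move in ordering]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        ordering = [move for _, move in scored]  # ordering hints for next pass
        best_move = ordering[0]
        depth += 1
    return best_move
```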
Algorithms like SSS*, on the other hand, use the best-first strategy. This can potentially make them more time-efficient, but typically at a heavy cost in space-efficiency.[16]
References
[edit]- ^ Russell & Norvig 2021, p. 152-161.
- ^ McCarthy, John (2006-10-30). "The Dartmouth Workshop--as planned and as it happened". www-formal.stanford.edu. Retrieved 2023-10-29.
- ^ McCarthy, John (27 November 2006). "Human Level AI Is Harder Than It Seemed in 1955". Stanford University. Retrieved 2006-12-20.
- ^ Newell, Allen; Simon, Herbert A. (1 March 1976). "Computer science as empirical inquiry: symbols and search". Communications of the ACM. 19 (3): 113–126. doi:10.1145/360018.360022.
- ^ Edwards, D.J.; Hart, T.P. (4 December 1961). The Alpha–beta Heuristic (Technical report). Massachusetts Institute of Technology. hdl:1721.1/6098. AIM-030.
- ^ Kotok, Alan (2004) [1962]. "A Chess Playing Program". Artificial Intelligence Project. RLE and MIT Computation Center. Memo 41. Retrieved 2006-07-01.
- ^ Marsland, T.A. (May 1987). "Computer Chess Methods" (PDF). In Shapiro, S. (ed.). Encyclopedia of Artificial Intelligence. Wiley. pp. 159–171. ISBN 978-0-471-62974-0. Archived from the original (PDF) on 2008-10-30.
- ^ Knuth, Donald E.; Moore, Ronald W. (1975). "An analysis of alpha-beta pruning". Artificial Intelligence. 6 (4): 293–326. doi:10.1016/0004-3702(75)90019-3. S2CID 7894372.
- ^ Abramson, Bruce (1 June 1989). "Control strategies for two-player games". ACM Computing Surveys. 21 (2): 137–161. doi:10.1145/66443.66444. S2CID 11526154.
- ^ Pearl, Judea (1980). "Asymptotic Properties of Minimax Trees and Game-Searching Procedures". Artificial Intelligence. 14 (2): 113–138. doi:10.1016/0004-3702(80)90037-5.
- ^ Pearl, Judea (1982). "The Solution for the Branching Factor of the Alpha-Beta Pruning Algorithm and Its Optimality". Communications of the ACM. 25 (8): 559–64. doi:10.1145/358589.358616. S2CID 8296219.
- ^ Saks, M.; Wigderson, A. (1986). "Probabilistic Boolean Decision Trees and the Complexity of Evaluating Game Trees". 27th Annual Symposium on Foundations of Computer Science. pp. 29–38. doi:10.1109/SFCS.1986.44. ISBN 0-8186-0740-8. S2CID 6130392.
- ^ Levy, David (January 1986). "Alpha-Beta Soup". MacUser. pp. 98–102. Retrieved 2021-10-19.
- ^ Russell & Norvig 2021, p. 155.
- ^ Russell & Norvig 2021, p. 154.
- ^ Pearl, Judea; Korf, Richard (1987), "Search techniques", Annual Review of Computer Science, 2: 451–467, doi:10.1146/annurev.cs.02.060187.002315: "Like its A* counterpart for single-player games, SSS* is optimal in terms of the average number of nodes examined; but its superior pruning power is more than offset by the substantial storage space and bookkeeping required."
Bibliography
- Russell, Stuart J.; Norvig, Peter (2021). Artificial Intelligence: A Modern Approach (4th ed.). Hoboken: Pearson. ISBN 9780134610993. LCCN 20190474.
- Heineman, George T.; Pollice, Gary; Selkow, Stanley (2008). "7. Path Finding in AI". Algorithms in a Nutshell. Oreilly Media. pp. 217–223. ISBN 978-0-596-51624-6.
- Pearl, Judea (1984). Heuristics: Intelligent Search Strategies for Computer Problem Solving. Addison-Wesley. ISBN 978-0-201-05594-8. OCLC 1035596197.
- Fishburn, John P. (1984). "Appendix A: Some Optimizations of α-β Search". Analysis of Speedup in Distributed Algorithms (revision of 1981 PhD thesis). UMI Research Press. pp. 107–111. ISBN 0-8357-1527-2.
Alpha–beta pruning
Background Concepts
Minimax Algorithm
The minimax algorithm is a recursive decision-making procedure designed for two-player, zero-sum games of perfect information, such as chess or tic-tac-toe, where one player (the maximizer) aims to maximize their payoff while the opponent (the minimizer) aims to minimize it, assuming both players adopt optimal strategies. The algorithm determines the optimal move by exhaustively exploring the game tree to compute the minimax value of each possible position, which represents the best achievable outcome for the maximizer given perfect play from both sides. This approach stems from the minimax theorem, originally proved by John von Neumann in 1928, which establishes that in finite, zero-sum games, the value of the game is equal to the maximum of the minimizer's minimum payoffs and the minimum of the maximizer's maximum payoffs.[3] Claude Shannon formalized the minimax algorithm for computational implementation in his 1950 paper on programming computers to play chess, emphasizing its role in simulating adversarial decision-making.[4]

The recursive structure of minimax alternates between maximization and minimization levels in the game tree. At a maximizer node, the algorithm selects the child node with the highest minimax value, representing the move that guarantees the best possible outcome against the opponent's optimal response. Conversely, at a minimizer node, it selects the child with the lowest minimax value, ensuring the worst possible outcome for the maximizer. This backpropagation of values from leaf nodes upward ensures that each non-terminal node inherits the value of its optimal successor, effectively simulating the entire future of the game under perfect play. Game trees serve as the structural representation for this process, with nodes denoting game states and edges indicating legal moves.[1]

The basic minimax algorithm can be expressed in pseudocode as follows, where the utility function assigns scores to terminal states (e.g., +1 for a win by the maximizer, −1 for a loss, and 0 for a draw), and successors generate all possible next states:

function minimax_decision(state):
    return argmax over actions in actions(state) of minimax_value(result(state, action))

function minimax_value(state):
    if terminal(state):
        return utility(state)
    else if maximizing_player(state):
        return max over successors of minimax_value(successor)
    else:
        return min over successors of minimax_value(successor)
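A runnable Python rendering of this pseudocode on a toy tree of nested lists (tree shape and move names invented for illustration):

```python
def minimax_value(state, maximizing):
    if not isinstance(state, list):   # terminal: a bare utility value
        return state
    values = [minimax_value(s, not maximizing) for s in state]
    return max(values) if maximizing else min(values)

def minimax_decision(actions):
    """actions: (name, resulting subtree) pairs for the maximizer to pick from."""
    return max(actions, key=lambda a: minimax_value(a[1], False))[0]

print(minimax_decision([("A", [3, 12, 8]), ("B", [2, 4, 6]), ("C", [14, 5, 2])]))
# -> "A": the opponent can hold A to 3, but B and C to only 2
```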
Game Trees and Search Depth
In adversarial search for two-player zero-sum games, a game tree is a directed graph where the root node represents the initial or current board position, and each branch from a node corresponds to a legal move available to the player whose turn it is.[1] Successor nodes represent resulting positions, with levels alternating between the maximizing player (seeking to maximize the score) and the minimizing player (seeking to minimize it), capturing the competitive dynamics.[1] Terminal nodes, or leaves, depict endgame states with assigned utility values, typically +1 for a win by the maximizer, −1 for a loss (win by the minimizer), and 0 for a draw.[1]

The branching factor b quantifies the average number of legal moves from any given position, while the depth d measures the number of plies (individual player moves) searched ahead from the root.[6] In complex games like chess, b can reach 35 or more. For basic minimax implementations, practical depths were historically limited to 4–8 plies due to hardware constraints of the time, though modern hardware and optimizations enable searches of 20–40 plies or more as of 2025.[6]

A primary challenge in traversing game trees is the exponential growth in nodes, with the full tree size approximating b^d, rendering complete evaluation computationally prohibitive even at modest depths—for instance, a branching factor of 30 at depth 8 yields over 656 billion nodes.[6] This explosion contributes to the horizon effect, where finite search depth blinds the algorithm to long-term consequences, allowing opponents to delay inevitable threats (such as material loss) just beyond the evaluation horizon, leading to suboptimal decisions.[7]

When the search reaches the specified depth without terminal states, non-terminal leaves require evaluation via static functions that heuristically assess position quality without further expansion.[8] These functions often compute metrics like material imbalance (e.g., assigning point values to pieces in chess, such as 1 for a pawn and 9 for a queen) combined with positional factors like king safety or pawn structure.[8] To address horizon-effect errors in unstable or "noisy" positions—where captures or checks could alter evaluations dramatically—quiescence search selectively extends the tree beyond the principal depth limit until a quiet (tactically stable) position emerges, ensuring more reliable static assessments.[9] This technique prioritizes volatile lines, such as ongoing exchanges, while halting in calm scenarios to maintain efficiency.[9]
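A minimal sketch of quiescence search, written in negamax form for brevity (the two framings are equivalent); the position interface evaluate(), capture_moves(), play() is assumed for illustration:

```python
def quiescence(pos, alpha, beta):
    stand_pat = pos.evaluate()        # score if the side to move stops here
    if stand_pat >= beta:
        return stand_pat              # quiet enough to cut off immediately
    alpha = max(alpha, stand_pat)
    for move in pos.capture_moves():  # extend only tactically noisy moves
        score = -quiescence(pos.play(move), -beta, -alpha)
        if score >= beta:
            return score
        alpha = max(alpha, score)
    return alpha
```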
Core Mechanism
Pruning Principle
Alpha–beta pruning optimizes the minimax algorithm by systematically eliminating subtrees in the game tree that cannot influence the final decision value at the root, thereby reducing the number of nodes evaluated without altering the optimal outcome. The core principle involves tracking a range of possible values bounded by alpha (the highest value guaranteed for the maximizer so far) and beta (the lowest value guaranteed for the minimizer so far); any subtree whose minimax value must lie entirely outside this [alpha, beta] interval is pruned, as it would not be selected by the rational player at the parent node. This approach leverages the adversarial nature of zero-sum games, where players alternate maximizing and minimizing the score, to discard irrelevant explorations early in the depth-first search.[1]

Pruning decisions are made dynamically during the recursive traversal. At a maximizing node, after evaluating a successor to obtain value v, if v ≥ β, exploration of remaining successors ceases (beta cutoff), since the minimizing opponent at the parent would reject any path allowing such a high value. Similarly, at a minimizing node, if a successor yields v ≤ α (alpha cutoff), sibling successors are skipped, as the maximizing player would avoid paths leading to such low values. These conditions arise because the current bounds reflect the best alternatives already discovered elsewhere in the tree, making further searches under the current node futile.[1]

The algorithm preserves the minimax optimality because pruned subtrees cannot contain values that would change the parent's decision: after a beta cutoff, the maximizing node's value is already at least β, so the minimizing parent, which can guarantee β elsewhere, would never select this node; analogously, after an alpha cutoff the minimizing node's value is at most α, so the maximizing parent prefers the alternative worth α that it has already found. As the search backtracks, bounds propagate and tighten, but the root's value remains unchanged since only non-improving options are discarded—a property formally demonstrated through inductive analysis of the recursive calls, confirming equivalence to full minimax.[2]

Regarding efficiency, alpha-beta pruning matches minimax's worst-case complexity of O(b^d), where b is the average branching factor and d is the search depth, occurring with poor ordering that delays cutoffs. With optimal ordering—exploring best moves first—the best case improves to O(b^(d/2)), as pruning approximates a linear search along the principal variation, drastically cutting the nodes evaluated; for instance, in chess-like games with b ≈ 35, this shifts effective exploration from 35^d to roughly 35^(d/2) ≈ 6^d.[2]
Alpha and Beta Values
In alpha-beta pruning, the alpha value represents the best (highest) value that the maximizing player (MAX) can guarantee so far during the search, serving as a lower bound on the final value of the node being evaluated.[2] Conversely, the beta value denotes the best (lowest) value that the minimizing player (MIN) can guarantee so far, acting as an upper bound on the node's value.[2] These dynamic bounds are maintained and propagated through the game tree to track the possible range of outcomes, enabling the algorithm to discard branches that cannot influence the root decision.[2]

At the root of the search tree, alpha is initialized to negative infinity (or the lowest possible score in a bounded evaluation), and beta is set to positive infinity (or the highest possible score), establishing the widest possible interval for the initial search.[2] As the search descends into MAX nodes, after evaluating a child node with value v, alpha is updated to max(α, v), reflecting the improved lower bound for MAX.[2] In MIN nodes, beta is similarly updated to min(β, v) after a child's evaluation, tightening the upper bound for MIN.[2] Pruning occurs if, at any point, α ≥ β, indicating that the current branch cannot yield a value better than what has already been found elsewhere.[2]

The bound propagation follows the recursive structure of minimax, adapted for efficiency. For a MAX node, the value is initialized as v = −∞, followed by iterative evaluation over children, v = max(v, alphabeta(child, α, β)), with checks after each child: if v ≥ β, terminate early and return v (beta cutoff); otherwise, update α = max(α, v) before proceeding.[2] Symmetrically, for a MIN node, the value starts at v = +∞ and is updated as v = min(v, alphabeta(child, α, β)), with early termination if v ≤ α (alpha cutoff), and β = min(β, v) otherwise.[2] These updates ensure that tighter bounds are passed downward, progressively narrowing the search interval and reducing the number of nodes explored deeper in the tree.[2]

As the search progresses, the alpha and beta values often represent fail-low or fail-high scenarios relative to the incoming bounds. A fail-low occurs when a node's returned value is less than or equal to the incoming alpha, confirming that it provides no improvement for MAX and justifying further pruning in sibling branches.[2] A fail-high happens when the value is greater than or equal to the incoming beta, indicating irrelevance for MIN and allowing a cutoff.[2] These scenarios reinforce the bounds' role in eliminating irrelevant subtrees without altering the root's optimal value.[2]
Implementation
Pseudocode
The alpha-beta pruning algorithm is typically implemented as a recursive function that traverses the game tree, updating alpha and beta bounds during the search to enable early termination of unpromising branches. The following pseudocode, adapted from the standard formulation, assumes a fixed-depth search where the function evaluates nodes up to a specified depth or until a terminal state is reached.

function alphabeta(node, depth, α, β, maximizingPlayer) returns value:
    if depth == 0 or isTerminal(node):
        return evaluate(node)  // Heuristic evaluation for leaf nodes
    if maximizingPlayer:
        value := -∞
        for each child in successors(node):
            value := max(value, alphabeta(child, depth - 1, α, β, FALSE))
            α := max(α, value)
            if α >= β:
                break  // β cutoff: prune remaining branches
        return value
    else:
        value := +∞
        for each child in successors(node):
            value := min(value, alphabeta(child, depth - 1, α, β, TRUE))
            β := min(β, value)
            if α >= β:
                break  // α cutoff: prune remaining branches
        return value
The initial call is alphabeta(root, maxDepth, -∞, +∞, TRUE), assuming the root is a maximizing node.
For scenarios requiring variable depth to manage time constraints, an iterative deepening variant wraps the alpha-beta search in a loop that incrementally increases the depth until a time limit is approached, reusing results from shallower searches. A simple snippet for this approach is:
function iterativeDeepeningAlphaBeta(root, maxTime):
    for depth := 1 to ∞:
        if timeRemaining() < threshold:
            break
        value := alphabeta(root, depth, -∞, +∞, TRUE)
        bestMove := trackBestMove(value)  // Assuming move tracking
    return bestMove
Step-by-Step Example
To illustrate the operation of alpha-beta pruning, consider a hypothetical three-ply game tree for a simple two-player zero-sum game, where terminal node values represent scores favoring the maximizing player (higher is better for MAX, lower for MIN). The tree consists of a MAX root node at depth 0, three MIN child nodes at depth 1 (labeled B, C, D), and each MIN node has three MAX grandchildren at depth 2, with terminal leaves at depth 3 providing the scores. For simplicity, each MAX grandchild has a single associated leaf value, yielding the following terminals (explored left-to-right; a code trace follows the list):

- Under B: 3, 12, 8 (so B = min(3, 12, 8) = 3)
- Under C: 2, 4, 6
- Under D: 14, 5, 2
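A short Python trace of alpha-beta over this tree (with each MAX grandchild collapsed to its single leaf value) confirms the root value and shows which leaves are actually examined:

```python
import math

evaluated = []  # record the order in which leaves are scored

def alphabeta(node, alpha, beta, maximizing):
    if not isinstance(node, list):    # a leaf
        evaluated.append(node)
        return node
    value = -math.inf if maximizing else math.inf
    for child in node:
        score = alphabeta(child, alpha, beta, not maximizing)
        value = max(value, score) if maximizing else min(value, score)
        if maximizing:
            alpha = max(alpha, value)
        else:
            beta = min(beta, value)
        if alpha >= beta:
            break                     # prune the remaining siblings
    return value

root = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]          # subtrees B, C, D
print(alphabeta(root, -math.inf, math.inf, True))   # -> 3 (root plays into B)
print(evaluated)  # [3, 12, 8, 2, 14, 5, 2]: C's leaves 4 and 6 are pruned
```

After B establishes α = 3 at the root, C's first leaf (2) already fails low, so its remaining leaves are cut off; D must be searched fully because its early leaves (14, 5) stay above α.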
