Backward induction
from Wikipedia

Backward induction is the process of determining a sequence of optimal choices by reasoning from the endpoint of a problem or situation back to its beginning using individual events or actions.[1] Backward induction involves examining the final point in a series of decisions and identifying the optimal process or action required to arrive at that point. This process continues backward until the best action for every possible point along the sequence is determined. Backward induction was first utilized in 1875 by Arthur Cayley, who discovered the method while attempting to solve the secretary problem.[2]

In dynamic programming, a method of mathematical optimization, backward induction is used for solving the Bellman equation.[3][4] In the related fields of automated planning and scheduling and automated theorem proving, the method is called backward search or backward chaining. In chess, it is called retrograde analysis.

In game theory, a variant of backward induction is used to compute subgame perfect equilibria in sequential games.[5] The difference is that optimization problems involve one decision maker who chooses what to do at each point of time. In contrast, game theory problems involve the interacting decision of several players. In this situation, it may still be possible to apply a generalization of backward induction, since it may be possible to determine what the second-to-last player will do by predicting what the last player will do in each situation, and so on. This variant of backward induction has been used to solve formal games from the beginning of game theory. John von Neumann and Oskar Morgenstern suggested solving zero-sum, two-person formal games through this method in their Theory of Games and Economic Behaviour (1944), the book which established game theory as a field of study.[6][7]

Decision-making example


Optimal-stopping problem


Consider a person evaluating potential employment opportunities for the next ten years, denoted as times t = 1, 2, ..., 10. At each t, they may encounter a choice between two job options: a 'good' job offering a salary of $100 per year or a 'bad' job offering a salary of $44 per year. Each job type has an equal probability of being offered. Upon accepting a job, the individual will maintain that particular job for the entire remainder of the ten-year duration.

This scenario is simplified by assuming that the individual's entire concern is their total expected monetary earnings, without any variable preferences for earnings across different periods. In economic terms, this is a scenario with an implicit interest rate of zero and a constant marginal utility of money.

Whether the person in question should accept a 'bad' job can be decided by reasoning backwards from time t = 10.

  • At t = 10, the total earnings from accepting a 'good' job are $100; the total earnings from accepting a 'bad' job are $44; the total earnings from rejecting the available job are $0. Therefore, if they are still unemployed in the last period, they should accept whatever job they are offered at that time for greater income.
  • At t = 9, the total earnings from accepting a 'good' job are $200, because that job will last for two years. The total earnings from accepting a 'bad' job are $88. The total expected earnings from rejecting a job offer are $0 now plus the value of the next job offer, which will be either $100 with probability 1/2 or $44 with probability 1/2, for an average ('expected') value of $72. Since $88 > $72, the job available at t = 9 should be accepted, whether 'good' or 'bad'.
  • At t = 8, the total earnings from accepting a 'good' job are $300; the total earnings from accepting a 'bad' job are $132. The total expected earnings from rejecting a job offer are $0 now plus the expected earnings from waiting for a job offer at t = 9. As previously concluded, any offer at t = 9 should be accepted, and the expected value of doing so is ($200 + $88)/2 = $144. Since $132 < $144, at t = 8 total expected earnings are higher if the person waits for the next offer rather than accepting a 'bad' job.

By continuing to work backwards, it can be verified that a 'bad' offer should only be accepted if the person is still unemployed at t = 9 or t = 10; a bad offer should be rejected at any time up to and including t = 8. Generalizing this example intuitively, it corresponds to the principle that if one expects to work in a job for a long time, it is worth picking carefully.
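The backward recursion above can be sketched in a few lines of Python (an illustrative script using the salary figures of the example; V[t] denotes the expected total earnings of a person still unemployed at time t):

```python
# Backward induction for the ten-year job-search example:
# a 'good' job pays $100/yr, a 'bad' job pays $44/yr, each offered with prob. 1/2.
T = 10
GOOD, BAD = 100, 44

V = [0.0] * (T + 2)          # V[t]: expected earnings if still unemployed at t
accept_bad_at = []

for t in range(T, 0, -1):    # reason backward from t = 10 to t = 1
    years_left = T - t + 1
    good_val = GOOD * years_left
    bad_val = BAD * years_left
    if bad_val >= V[t + 1]:  # a 'bad' offer beats waiting for the next offer
        accept_bad_at.append(t)
    V[t] = 0.5 * max(good_val, V[t + 1]) + 0.5 * max(bad_val, V[t + 1])

print(sorted(accept_bad_at))  # -> [9, 10]
```

Running the script confirms the conclusion reached by hand: a 'bad' offer is worth accepting only at t = 9 and t = 10.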

A dynamic optimization problem of this kind is called an optimal stopping problem because the issue at hand is when to stop waiting for a better offer. Search theory is a field of microeconomics that applies models of this type to matters such as shopping, job searches, and marriage.

Game theory


In game theory, backward induction is a solution methodology that follows from applying sequential rationality to identify an optimal action for each information set in a given game tree. It develops the implications of rationality via individual information sets in the extensive-form representation of a game.[8]

In order to solve for a subgame perfect equilibrium with backwards induction, the game should be written out in extensive form and then divided into subgames. Starting with the subgame furthest from the initial node, or starting point, the expected payoffs listed for this subgame are weighed, and a rational player will select the option with the higher payoff for themselves. The highest payoff vector is selected and marked. To solve for the subgame perfect equilibrium, one should continually work backwards from subgame to subgame until the starting point is reached. As this process progresses, the initial extensive form game will become shorter and shorter. The marked path of vectors is the subgame perfect equilibrium.[9]
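The procedure described above can be written as a recursion over a game tree. The following sketch is illustrative (the node encoding and example payoffs are assumptions, not from the article): each internal node names the player to move, each leaf holds a payoff vector, and subgames are solved before the nodes that lead to them.

```python
# Minimal backward-induction sketch for a finite perfect-information game.
# A node is either a payoff tuple (leaf) or a pair (player, {action: subtree}).

def solve(node):
    """Return (best_action, payoffs) for the subgame rooted at node."""
    if not isinstance(node[1], dict):      # leaf: payoffs already known
        return None, node
    player, actions = node
    best_action, best_payoffs = None, None
    for action, subtree in actions.items():
        _, payoffs = solve(subtree)        # solve the later subgame first
        if best_payoffs is None or payoffs[player] > best_payoffs[player]:
            best_action, best_payoffs = action, payoffs
    return best_action, best_payoffs

# Example: Player 0 moves first, then Player 1 responds.
game = (0, {"L": (1, {"l": (3, 1), "r": (0, 0)}),
            "R": (1, {"l": (1, 2), "r": (2, 4)})})
action, payoffs = solve(game)
print(action, payoffs)  # -> L (3, 1)
```

Here Player 0 chooses L because, anticipating Player 1's best replies (l after L, r after R), L yields payoff 3 versus 2 from R.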

Multi-stage game


The application of backward induction in game theory can be demonstrated with a simple example. Consider a multi-stage game involving two players planning to go to a movie.

  • Player 1 wants to watch The Terminator, and Player 2 wants to watch Joker.
  • Player 1 will buy a ticket first and tell Player 2 about her choice.
  • Next, Player 2 will buy his ticket.

Once they both observe the choices, the second stage begins. In the second stage, players choose whether to go to the movie or stay home.

  • As in the first stage, Player 1 chooses whether to go to the movie first.
  • After observing Player 1's choice, Player 2 makes his choice.

For this example, payoffs are added across different stages. The game is a perfect information game. The normal-form matrices for these games are:

Stage 1 (rows: Player 1, columns: Player 2; payoffs listed as Player 1, Player 2):

                 Joker    Terminator
    Joker        3, 5     0, 0
    Terminator   1, 1     5, 3

Stage 2 (rows: Player 1, columns: Player 2):

                   Go to Movie   Stay Home
    Go to Movie    6, 6          4, -2
    Stay Home      -2, 4         -2, -2
Extensive form for the Joker-Terminator game

The extensive form of this multi-stage game can be seen to the right. The steps for solving this game with backward induction are as follows:

  1. Analysis starts from the final nodes.
  2. At the final decision nodes, Player 2 faces 8 subgames in which he chooses "go to movie" or "stay home".
    • Player 2 would make 8 possible comparisons in total, choosing the option with the highest payoff in each.
    • For example, considering the first subgame, Player 2's payoff of 11 for "go to movie" is higher than his payoff of 7 for "stay at home." Player 2 would therefore choose "go to movie."
    • The method continues for every subgame.
  3. Once Player 2's optimal decisions have been determined (bolded green lines in the extensive form diagram), analysis starts for Player 1's decisions in her 4 subgames.
    • The process is similar to step 2, comparing Player 1's payoffs in order to anticipate her choices.
    • Subgames that would not be chosen by Player 2 in the previous step are no longer considered because they are ruled out by the assumption of rational play.
    • For example, in the first subgame, the choice "go to movie" offers a payoff of 9 since the decision tree terminates at the reward (9, 11), considering Player 2's previously established choice. Meanwhile, "stay home" offers a payoff of 1 since it ends at (1, 9), so Player 1 would choose "go to movie."
  4. The process repeats for each player until the initial node is reached.
    • For example, Player 2 would choose "Joker" for the first subgame in the next iteration because a payoff of 11 ending in (9, 11) is greater than "Terminator" with a payoff of 6 at (6, 6).
    • Player 1, at the initial node, would select "Terminator" because it offers a higher payoff of 11 at (11, 9) than Joker, which has a reward of 9 at (9, 11).
  5. To identify a subgame perfect equilibrium, one needs to identify a route that selects an optimal subgame at each information set.
    • In this example, Player 1 chooses "Terminator" and Player 2 also chooses "Terminator." Then they both choose "go to movie."
    • The subgame perfect equilibrium leads to a payoff of (11,9).
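The two-stage reasoning can be checked numerically. The sketch below is illustrative (the stage matrices are taken from the tables above; the helper function and its name are assumptions): each stage is solved as a move by Player 1 followed by Player 2's best reply, with the later stage solved first and its value added to the earlier one.

```python
# Stage payoffs as (player1, player2), indexed by [p1_action][p2_action].
stage1 = {"Joker":      {"Joker": (3, 5), "Terminator": (0, 0)},
          "Terminator": {"Joker": (1, 1), "Terminator": (5, 3)}}
stage2 = {"Go":   {"Go": (6, 6), "Stay": (4, -2)},
          "Stay": {"Go": (-2, 4), "Stay": (-2, -2)}}

def solve_stage(payoffs, added=(0, 0)):
    """Player 1 moves, Player 2 observes and replies; each maximizes own total."""
    best = None
    for a1, replies in payoffs.items():
        # Player 2's best reply to a1:
        a2 = max(replies, key=lambda a: replies[a][1])
        total = (replies[a2][0] + added[0], replies[a2][1] + added[1])
        if best is None or total[0] > best[2][0]:
            best = (a1, a2, total)
    return best

# Stage 2 is played after stage 1, so solve it first (backward induction).
_, _, stage2_value = solve_stage(stage2)
a1, a2, outcome = solve_stage(stage1, added=stage2_value)
print(a1, a2, outcome)  # -> Terminator Terminator (11, 9)
```

The script reproduces the equilibrium path described above: both players choose "Terminator", both go to the movie, and the payoff is (11, 9).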

Limitations


Backward induction can be applied to only limited classes of games. The procedure is well-defined for any game of perfect information with no ties of utility. It is also well-defined and meaningful for games of perfect information with ties, although in such cases it may yield more than one subgame perfect equilibrium. The procedure can be applied to some games with nontrivial information sets, but it is not applicable in general; it is best suited to games of perfect information. If players are not aware of the other players' actions and payoffs at each decision node, then backward induction is not so easily applied.[10]

Ultimatum game


A second example demonstrates that even in games that formally allow for backward induction in theory, it may not accurately predict empirical game play in practice. This example of an asymmetric game consists of two players: Player 1 proposes to split a dollar with Player 2, which Player 2 then accepts or rejects. This is called the ultimatum game. Player 1 acts first by splitting the dollar however they see fit. Next, Player 2 either accepts the portion they have been offered by Player 1 or rejects the split. If Player 2 accepts the split, then both Player 1 and Player 2 get the payoffs matching that split. If Player 2 decides to reject Player 1's offer, then both players get nothing. In other words, Player 2 has veto power over Player 1's proposed allocation, but applying the veto eliminates any reward for both players.[11]

Considering the choice and response of Player 2 given any arbitrary proposal by Player 1, formal rationality prescribes that Player 2 would accept any payoff that is greater than or equal to $0. Accordingly, by backward induction Player 1 ought to propose giving Player 2 as little as possible in order to gain the largest portion of the split. Player 1 giving Player 2 the smallest unit of money and keeping the rest for themselves is the unique subgame perfect equilibrium. The ultimatum game does have several other Nash equilibria which are not subgame perfect and therefore do not arise via backward induction.

The ultimatum game is a theoretical illustration of the usefulness of backward induction, but the game's theoretically predicted result does not match empirical observation. Experimental evidence has shown that a proposer, Player 1, very rarely offers $0 and the responder, Player 2, sometimes rejects offers greater than $0. What is deemed acceptable by Player 2 varies with context. The pressure or presence of other players and external implications can mean that the game's formal model cannot necessarily predict what a real person will choose. According to Colin Camerer, an American behavioral economist, Player 2 "rejects offers of less than 20 percent of X about half the time, even though they end up with nothing."[12]

While backward induction assuming formal rationality would predict that a responder would accept any offer greater than zero, responders in reality are not formally rational players and therefore often seem to care more about offer 'fairness' or perhaps other anticipations of indirect or external effects rather than immediate potential monetary gains.

Economics


Entry-decision problem


Consider a dynamic game in which the players are an incumbent firm in an industry and a potential entrant to that industry. As it stands, the incumbent has a monopoly over the industry and does not want to lose some of its market share to the entrant. If the entrant chooses not to enter, the payoff to the incumbent is high (it maintains its monopoly) and the entrant neither loses nor gains (its payoff is zero). If the entrant enters, the incumbent can "fight" or "accommodate" the entrant. It will fight by lowering its price, running the entrant out of business (and incurring exit costs, a negative payoff) and damaging its own profits. If it accommodates the entrant it will lose some of its sales, but a high price will be maintained and it will receive greater profits than by lowering its price (but lower than monopoly profits).

If the incumbent accommodates given the case that the entrant enters, the best response for the entrant is to enter (and gain profit). Hence the strategy profile in which the entrant enters and the incumbent accommodates if the entrant enters is a Nash equilibrium consistent with backward induction. However, if the incumbent is going to fight, the best response for the entrant is to not enter, and if the entrant does not enter, it does not matter what the incumbent chooses to do in the hypothetical case that the entrant does enter. Hence the strategy profile in which the incumbent fights if the entrant enters, but the entrant does not enter is also a Nash equilibrium. However, were the entrant to deviate and enter, the incumbent's best response is to accommodate—the threat of fighting is not credible. This second Nash equilibrium can therefore be eliminated by backward induction.

Finding a Nash equilibrium in each decision-making process (subgame) constitutes a subgame perfect equilibrium. Thus, the strategy profiles that form subgame perfect equilibria exclude non-credible threats used to "scare off" an entrant. If the incumbent threatens to start a price war with an entrant, it is threatening to lower its price from a monopoly price to slightly below the entrant's, which would be impractical, and not credible, if the entrant knows that a price war would not actually happen, since it would result in losses for both parties. Unlike a single-agent optimization, which might include suboptimal or infeasible equilibria, a subgame perfect equilibrium accounts for the actions of the other player, ensuring that no player reaches a subgame mistakenly. In this case, backward induction yields a subgame perfect equilibrium in which the entrant is not persuaded by the incumbent's threat, knowing that fighting is not a best response in any subgame.[13]

Unexpected hanging paradox


The unexpected hanging paradox is a paradox related to backward induction. The prisoner described in the paradox uses backwards induction to reach a false conclusion. The description of the problem assumes it is possible to surprise someone who is performing backward induction. The mathematical theory of backward induction does not make this assumption, so the paradox does not call into question the results of this theory.

Common knowledge of rationality


Backward induction works only if both players are rational, i.e., always select an action that maximizes their payoff. However, rationality is not enough: each player should also believe that all other players are rational. Even this is not enough: each player should believe that all other players know that all other players are rational, and so on, ad infinitum. In other words, rationality should be common knowledge.[14]

Limited backward induction


Limited backward induction is a deviation from fully rational backward induction. It involves enacting the regular process of backward induction without perfect foresight. Theoretically, this occurs when one or more players have limited foresight and cannot perform backward induction through all terminal nodes.[15] Limited backward induction plays a much larger role in longer games as the effects of limited backward induction are more potent in later periods of games.

A four-stage sequential game with a foresight bound

Experiments have shown that in sequential bargaining games, such as the Centipede game, subjects deviate from theoretical predictions and instead engage in limited backward induction. This deviation occurs as a result of bounded rationality, where players can only perfectly see a few stages ahead.[16] This allows for unpredictability in decisions and inefficiency in finding and achieving subgame perfect Nash equilibria.

There are three broad hypotheses for this phenomenon:

  1. The presence of social factors (e.g. fairness)
  2. The presence of non-social factors (e.g. limited backward induction)
  3. Cultural differences

Violations of backward induction are predominantly attributed to the presence of social factors. However, data-driven model predictions for sequential bargaining games (using the cognitive hierarchy model) have highlighted that in some games the presence of limited backward induction can play a dominant role.[17]

Within repeated public goods games, team behavior is affected by limited backward induction: team members' initial contributions are higher than contributions towards the end. Limited backward induction also influences how regularly free-riding occurs within a team's public goods game. Early on, when the effects of limited backward induction are low, free-riding is less frequent, whilst towards the end, when effects are high, free-riding becomes more frequent.[18]

Limited backward induction has also been tested for within a variant of the race game. In the game, players would sequentially choose integers inside a range and sum their choices until a target number is reached. Hitting the target earns that player a prize; the other loses. Partway through a series of games, a small prize was introduced. The majority of players then performed limited backward induction, as they solved for the small prize rather than for the original prize. Only a small fraction of players considered both prizes at the start.[19]

Most tests of backward induction are based on experiments, in which participants are only to a small extent incentivized to perform the task well, if at all. However, violations of backward induction also appear to be common in high-stakes environments. A large-scale analysis of the American television game show The Price Is Right, for example, provides evidence of limited foresight. In every episode, contestants play the Showcase Showdown, a sequential game of perfect information for which the optimal strategy can be found through backward induction. The frequent and systematic deviations from optimal behavior suggest that a sizable proportion of the contestants fail to properly backward induct and myopically consider the next stage of the game only.[20]

from Grokipedia
Backward induction is a fundamental technique in game theory for solving finite extensive-form games with perfect information, where players make sequential decisions. It proceeds by iteratively analyzing the game from its terminal nodes backward to the initial node, determining at each decision point the action that maximizes the player's payoff given the optimal play that follows. This process assumes that all players are rational and that this rationality is common knowledge, even for hypothetical subgames that may not be reached in equilibrium.

Although the method traces back to early analyses like Zermelo's 1913 work on chess, the associated concept of subgame perfect equilibrium was introduced by Reinhard Selten in 1965 to refine earlier equilibrium concepts by eliminating non-credible threats or promises in sequential settings. This equilibrium ensures that strategies are optimal not only in the overall game but also in every subgame. Selten's formulation emphasizes sequential rationality, where players choose best responses at every information set, preventing equilibria supported only by implausible off-path behavior.

Backward induction applies to diverse scenarios, such as the centipede game, where rational anticipation leads players to terminate cooperation early despite the potential for higher joint payoffs, illustrating tensions between theory and observed human behavior. It underpins analyses in industrial organization, including models with dynamic entry, and bargaining games like the ultimatum game, where backward induction predicts that the proposer offers the smallest possible positive amount, which the responder accepts under subgame perfection. While powerful for theoretical prediction, empirical studies show that real-world players often deviate from backward induction due to bounded rationality, level-k thinking, or social preferences, such as rejecting unfair offers in the ultimatum game.

Fundamentals

Definition and Process

Backward induction is a systematic reasoning technique used to solve finite sequential decision problems or games of perfect information by beginning at the terminal states and working backwards to derive optimal actions at each prior decision point. This method, formalized in foundational work on dynamic programming and game theory, ensures that strategies are optimal not only for the overall problem but also for every subproblem encountered along the way.

The process of backward induction proceeds in a structured, iterative manner. First, identify the terminal nodes of the decision tree or game tree, where payoffs or outcomes are fully determined and no further actions are possible. Second, at each decision node immediately preceding a terminal node, select the action that maximizes the payoff for the decision-maker at that node, assuming optimal play thereafter. Third, substitute the value of this resolved subproblem (the maximum payoff) as the effective payoff for the preceding node, effectively pruning suboptimal branches. Fourth, repeat this backward traversal through all prior nodes until reaching the initial decision point, yielding the optimal strategy from the start.

This technique relies on several key prerequisites: the problem must be finite, with a limited number of stages and a clear endpoint; decision-makers must possess complete information about the structure, possible actions, and payoffs at every stage; and all agents are assumed to be rational, maximizing their expected payoff, with this rationality being common knowledge among them. These assumptions enable the method to predict equilibrium outcomes in sequential settings.

In non-game contexts, such as single-agent sequential decision problems, backward induction functions as a general algorithm akin to dynamic programming. It can be outlined in pseudocode as follows:

function backward_induction(node):
    # Base case: terminal node, its payoff is known
    if node.is_terminal():
        return node.payoff
    # Recursive step: solve each successor subproblem first
    max_value = -infinity
    optimal_action = None
    for action in node.actions:
        sub_value = backward_induction(node.successor(action))
        if sub_value > max_value:
            max_value = sub_value
            optimal_action = action
    # Record the optimal value and action at this node
    node.value = max_value
    node.optimal_action = optimal_action
    return max_value

# Initiate from the root of the decision tree
optimal_value = backward_induction(root_node)


This recursive formulation computes the value function and optimal policy by solving subproblems from the end, ensuring computational tractability for finite horizons.

Basic Example

To illustrate backward induction, consider a simple single-player sequential decision problem with three stages, representing a simplified chain of investments where the decision-maker can stop at any stage to secure a payoff or continue, with payoffs increasing but carrying the risk of losing everything (payoff of 0) if the process extends beyond the final stage without stopping. At stage 1, stopping yields a payoff of 1; continuing leads to stage 2. At stage 2, stopping yields 3; continuing leads to stage 3. At stage 3, stopping yields 6; continuing beyond yields 0.

The backward induction process begins at the final stage and works backward to determine the optimal choice at each decision node. At stage 3, the decision-maker compares stopping (payoff 6) to continuing (payoff 0) and chooses to stop, yielding a value of 6 for reaching stage 3. Substituting this backward, at stage 2, the decision-maker compares stopping (payoff 3) to continuing (value 6 from stage 3) and chooses to continue, yielding a value of 6 for reaching stage 2. Finally, at stage 1, the decision-maker compares stopping (payoff 1) to continuing (value 6 from stage 2) and chooses to continue, yielding a value of 6 for the initial decision.

This can be visualized through a decision tree, where nodes represent stages, branches represent choices (stop or continue), and terminal payoffs are labeled:
  • Stage 1 node: branch to stop (payoff 1) or continue to stage 2.
  • Stage 2 node: branch to stop (payoff 3) or continue to stage 3.
  • Stage 3 node: branch to stop (payoff 6) or continue (payoff 0).
Backward reasoning prunes suboptimal branches, revealing the optimal path: continue at stage 1, continue at stage 2, stop at stage 3, for a total payoff of 6. A key insight is that forward reasoning, considering immediate payoffs without anticipating future optimal choices, might suggest stopping early at stage 1 for the secure payoff of 1, potentially misleading the decision-maker; backward induction ensures subgame perfection by guaranteeing optimality in every possible continuation. This approach relates to broader optimal stopping problems in decision theory.
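The three-stage example can be checked with a short script (an illustrative sketch, not part of the original text):

```python
# Stage payoffs for stopping; continuing past the last stage yields 0.
stop_payoff = {1: 1, 2: 3, 3: 6}

value = 0                          # value of continuing past stage 3
plan = {}
for stage in (3, 2, 1):            # work backward from the final stage
    if stop_payoff[stage] >= value:
        plan[stage], value = "stop", stop_payoff[stage]
    else:
        plan[stage] = "continue"   # value of continuing is carried backward

print(plan)   # -> {3: 'stop', 2: 'continue', 1: 'continue'}
print(value)  # -> 6
```

The loop reproduces the hand computation: stop at stage 3, continue at stages 1 and 2, for a value of 6.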

Applications in Decision Theory

Optimal Stopping Problems

Optimal stopping problems are a class of decision problems in which an agent observes a sequence of random variables, such as offers or signals, and must decide at each step whether to stop and accept the current observation or continue to the next, with the goal of maximizing the expected reward. For finite horizons these problems are solved using backward induction: the optimal strategy is computed by recursively evaluating decisions from the final period backward to the initial one, determining reservation values that act as acceptance thresholds at each stage.

The theoretical foundations of optimal stopping emerged in the mid-20th century within probability and statistics, with strong connections to dynamic programming as developed by Richard Bellman in the 1950s; Bellman's framework formalized the structure of sequential decisions, enabling the analysis of stopping rules as special cases of multistage optimization.

Mathematically, consider a finite-horizon problem over $n$ periods, where offers $X_1, X_2, \dots, X_n$ are drawn independently from a known distribution, and recall of previous offers is not allowed. The value function $V_k$ at stage $k$ (with $k = n$ at the last stage and decreasing to 1) is defined as $V_n = X_n$, and for $k = 1, \dots, n-1$, $V_k = \max(X_k, \mathbb{E}[V_{k+1}])$, where the expectation is taken over the distribution of future offers. The optimal policy is to accept the offer at stage $k$ if $X_k \geq \mathbb{E}[V_{k+1}]$, which defines a reservation value $\xi_k = \mathbb{E}[V_{k+1}]$; otherwise, continue to the next stage. This backward recursion yields reservation values that decline as the deadline approaches, reflecting the diminishing value of remaining opportunities.

A classic illustration is the house-selling problem, a value-based variant of optimal stopping where a seller receives sequential offers for a house and seeks to maximize the expected sale price. Assume offers arrive daily for $n$ days from a known distribution (e.g., uniform on [0, 1]), with no recall permitted. Backward induction computes the reservation value $\xi_k$ at day $k$ (counting backward from the end) as $\xi_k = \mathbb{E}[\max(X_{k+1}, \xi_{k+1})]$, starting with $\xi_n = 0$ (accept any final offer, or get 0 if none). For example, with $n = 3$ and offers uniform on [0, 1], the computations yield $\xi_3 = 0$, $\xi_2 = 1/2$, and $\xi_1 = 5/8 = 0.625$, so the seller accepts on day 1 only if the offer exceeds 0.625, on day 2 only if it exceeds 0.5, and always on day 3. This strategy achieves a higher expected payoff than always accepting the first offer or stopping at random, highlighting backward induction's role in balancing immediate gain against future potential.

Another seminal example is the secretary problem, originally posed in the 1950s and solved in the early 1960s, where an employer interviews $n$ candidates sequentially in random order and must hire the best one, deciding irrevocably at each interview. Backward induction for the finite case computes the probability of selecting the best candidate by determining stopping thresholds based on the relative ranks observed so far, though the large-$n$ approximation famously suggests rejecting the first $n/e$ candidates (about 37%) and then accepting the next record-best candidate. The problem's solution, yielding a success probability approaching $1/e \approx 0.368$, was first derived by D. V. Lindley in 1961, building on earlier formulations.
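For the uniform case the reservation-value recursion has a closed form: if $X \sim U[0,1]$, then $\mathbb{E}[\max(X, \xi)] = \xi^2 + (1 - \xi^2)/2 = (1 + \xi^2)/2$. A short sketch (illustrative; the function name is an assumption) computes the thresholds for any horizon:

```python
# Reservation values for the house-selling problem with offers ~ Uniform[0, 1].
# For X ~ U[0, 1]: E[max(X, xi)] = (1 + xi**2) / 2.
def reservation_values(n):
    xi = [0.0] * (n + 1)            # xi[n] = 0: accept anything on the last day
    for k in range(n - 1, 0, -1):   # work backward from day n-1 to day 1
        xi[k] = (1 + xi[k + 1] ** 2) / 2
    return xi[1:]                   # thresholds for days 1..n

print(reservation_values(3))  # -> [0.625, 0.5, 0.0]
```

For n = 3 this reproduces the values derived above: accept on day 1 only above 5/8, on day 2 only above 1/2, and always on day 3.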

Economic Entry Decisions

In economic models of market entry, a potential entrant must decide whether to incur a fixed entry cost and compete in a market dominated by an incumbent firm, while anticipating the incumbent's response to entry, such as accommodating the new competitor or aggressively fighting through increased output or lower pricing to reduce the entrant's profitability. This sequential decision structure creates a strategic interaction where the entrant's choice depends on the expected post-entry behavior of the incumbent.

Backward induction is applied by first solving the post-entry subgame, in which the incumbent selects its output or pricing strategy to maximize its own profit given the entrant's presence, often minimizing the entrant's profit in the process. The entrant then evaluates this anticipated post-entry equilibrium payoff against the fixed entry cost; if the net payoff is negative, entry is deterred. This process ensures that only credible incumbent strategies, those optimal in every subgame, influence the entrant's decision, leading to a subgame perfect equilibrium.

A simple entry game illustrates this: the entrant chooses to stay out (payoff: 0 for entrant, 10 for incumbent) or enter (paying F upon entry); upon entry, the incumbent chooses to accommodate (payoffs: 5 for entrant net of F, 5 for incumbent) or fight (payoffs: -1 - F for entrant, -1 for incumbent). Using backward induction, the incumbent's dominant action in the post-entry subgame is to accommodate, as 5 > -1, yielding an expected entrant payoff of 5 - F. If F > 5, the entrant stays out, achieving the equilibrium (stay out; accommodate if entered); this also reveals that empty threats of fighting lack credibility without commitment mechanisms. The entrant's net payoff in such models is given by $\pi_e = -F + \pi(q_e, q_i(q_e))$, where $F$ is the fixed entry cost, $\pi(\cdot)$ is the entrant's variable profit, $q_e$ is the entrant's output, and $q_i(q_e)$ is the incumbent's best-response output, derived backward from the post-entry Cournot or Stackelberg equilibrium.

These models explain entry deterrence and limit pricing as rational, commitment-based strategies: an incumbent may pre-commit to high capacity investment, shifting its post-entry reaction function to credibly deter entry by making the entrant's $\pi(q_e, q_i(q_e))$ unprofitable. Dixit (1980) extends this framework by incorporating sunk capacity costs, showing that an incumbent can strategically overinvest in capacity to alter its reaction function, enabling limit output levels that block entry while maximizing long-term profits, as in blockaded-monopoly outcomes when fixed costs are high. This highlights how irreversible investments serve as credible threats, distinguishing them from reversible actions that fail under backward induction.
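The simple entry game above can be solved mechanically. A minimal sketch (payoff numbers as in the example; the function name and the choice of F values are assumptions):

```python
# Entry game: the entrant moves first, then the incumbent responds to entry.
def solve_entry_game(F):
    # Post-entry subgame: incumbent picks accommodate (payoff 5) or fight (-1).
    incumbent_action = "accommodate" if 5 > -1 else "fight"
    entrant_if_enter = (5 if incumbent_action == "accommodate" else -1) - F
    # Entrant compares entering with staying out (payoff 0).
    entrant_action = "enter" if entrant_if_enter > 0 else "stay out"
    return entrant_action, incumbent_action

print(solve_entry_game(F=2))  # -> ('enter', 'accommodate')
print(solve_entry_game(F=8))  # -> ('stay out', 'accommodate')
```

Note that the incumbent's plan to accommodate is part of the equilibrium strategy even when entry does not occur, which is precisely why a threat to fight is not credible under backward induction.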

Applications in Game Theory

Multi-Stage Games

In multi-stage games with perfect information, players alternate moves in a finite sequence, represented as an extensive-form tree where each node denotes a decision point and branches indicate possible actions leading to terminal payoffs. Backward induction serves as the primary method for solving these games, determining optimal strategies by reasoning from the end of the tree back to the initial node and thereby identifying the subgame perfect Nash equilibrium (SPNE). This equilibrium refines the standard Nash equilibrium by requiring that strategies form a Nash equilibrium not only in the full game but also in every subgame—a portion of the tree beginning at any node and including all subsequent branches. The process begins at the terminal nodes (leaves) of the game tree, where payoffs are known, and proceeds upward by selecting, at each decision node, the action that maximizes the payoff of the player moving there, assuming rational play thereafter. This backward solving eliminates non-credible threats and promises, since strategies off the equilibrium path must remain optimal in the subgames they define. For instance, in a simple two-player sequential game where Player 1's choice leads to a subgame in which Player 2 responds, backward induction first resolves Player 2's best response in that subgame before determining Player 1's initial choice. The resulting strategy profile survives iterated deletion of dominated strategies across all subgames, ensuring sequential rationality. A prominent example is the centipede game, a finite multi-stage game introduced by Rosenthal in 1981, in which two players alternately decide either to "take" a growing pot (terminating the game with an advantageous payoff for the taker) or to "pass" and let the pot grow further, potentially benefiting both if play continues. Despite the incentive to reach higher joint payoffs by passing, backward induction predicts that the first player takes the pot immediately, as rational anticipation of the opponent's defection at later stages unravels the game from the end.
This outcome highlights the tension between theoretical prediction and intuitive cooperation in sequential settings. The connection to SPNE traces back to Zermelo's theorem, which established that in finite, perfect-information games such as chess—modeled as alternating-move trees—either one of the players can force a win or both can force at least a draw; optimal strategies in such games can be determined via backward induction, formalizing them as those surviving deletion of inferior moves in subgames (though Zermelo's original argument relied on non-repetition of positions rather than backward induction). Selten later generalized this to broader extensive-form games, defining SPNE as a strategy profile from which no player can profitably deviate in any subgame, applicable under assumptions of common knowledge of payoffs, perfect information (no simultaneous moves), and finitely many stages to avoid infinite regress. These conditions ensure the determinacy of outcomes in deterministic trees, as in Zermelo's analysis.
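The procedure described above can be sketched generically. In the following illustrative Python code, the tree representation—`(player, {action: child})` pairs with payoff tuples at the leaves—and the four-move centipede payoffs are assumptions chosen for the example, not a standard encoding:

```python
# Generic backward induction over a finite perfect-information game tree.

def backward_induction(node):
    """Return (payoff_vector, action_path) chosen by backward induction."""
    # Terminal nodes are plain payoff tuples of numbers; internal nodes
    # are (player_index, {action: child}) pairs whose second item is a dict.
    if not isinstance(node[1], dict):
        return node, []
    player, children = node
    best = None
    for action, child in children.items():
        payoffs, path = backward_induction(child)
        # keep the action maximizing the payoff of the player moving here
        if best is None or payoffs[player] > best[0][player]:
            best = (payoffs, [action] + path)
    return best

# A four-move centipede game with a growing pot: at each node the mover
# may "take" (ending the game with an advantage) or "pass".
centipede = (0, {"take": (1, 0),
                 "pass": (1, {"take": (0, 2),
                              "pass": (0, {"take": (3, 1),
                                           "pass": (1, {"take": (2, 4),
                                                        "pass": (4, 3)})})})})
print(backward_induction(centipede))  # → ((1, 0), ['take'])
```

Solving from the leaves upward, every mover prefers to take given that the next mover would take, so the equilibrium path is an immediate "take" at the root—the unraveling the text describes.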

Ultimatum Game Analysis

The ultimatum game is a two-player, one-shot bargaining experiment in which one player, the proposer, offers to split a fixed sum of money X with the other player, the responder. The proposer suggests an amount s (0 ≤ s ≤ X) for the responder, retaining X − s; the responder then decides to accept, receiving s while the proposer gets X − s, or to reject, in which case both players receive nothing. This setup models a simple sequential decision process, typically analyzed under perfect-information assumptions. Applying backward induction to the ultimatum game begins at the responder's decision node, the final subgame. In this subgame, for any offer s > 0, the responder's rational choice is to accept, as s exceeds the alternative payoff of 0; for s = 0, the responder is indifferent, but acceptance is weakly dominant. Anticipating this, the proposer, at the initial node, selects the minimal positive offer ε > 0 (approaching 0 in continuous terms), retaining nearly all of X. The game's extensive form can be represented as a decision tree:
  • Proposer chooses s ∈ [0, X].
    • If s > 0 and the Responder accepts: payoffs (X − s, s).
    • If s > 0 and the Responder rejects: payoffs (0, 0).
    • If s = 0, the Responder accepts or rejects: payoffs (X, 0) or (0, 0).
This yields a subgame perfect equilibrium in which the proposer offers ε and the responder accepts any s ≥ ε. Experimental implementations deviate substantially from this prediction. The original study by Güth et al. (1982) found proposers offering an average of 35–40% of the stake, with responders rejecting offers below 20–30% approximately 20% of the time. A meta-analysis of 37 papers encompassing 75 experiments confirmed these patterns, showing average proposer offers of 40% of X and rejection rates of 16% for low offers, with roughly equal splits common across cultures. These findings underscore the tension between backward induction's rational benchmark and observed behavior in anonymous, one-shot interactions, where fairness norms and inequity aversion push proposers toward equitable divisions and responders toward punishing unfairness, even at personal cost.
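The backward-induction benchmark can be sketched as follows; the discretization into whole units and the optional `min_acceptable` threshold (a crude stand-in for the empirical rejection behavior described above, not part of the textbook model) are assumptions of this illustration:

```python
# Backward induction in a discretized ultimatum game.

def solve_ultimatum(X, min_acceptable=0):
    """Return (offer, (proposer_payoff, responder_payoff))."""
    # Step 1 (last mover): the responder accepts any s >= min_acceptable.
    # min_acceptable = 0 is the purely self-interested benchmark, where
    # accepting weakly dominates rejecting; a positive value mimics the
    # fairness-driven rejections seen in laboratory experiments.
    acceptable = [s for s in range(X + 1) if s >= min_acceptable]
    # Step 2 (first mover): the proposer, anticipating acceptance,
    # offers the smallest acceptable amount and keeps the rest.
    s = min(acceptable)
    return s, (X - s, s)

print(solve_ultimatum(10))                    # rational benchmark: offer 0
print(solve_ultimatum(10, min_acceptable=3))  # with an empirical-style threshold
```

Raising `min_acceptable` shifts the prediction toward the 35–40% offers actually observed, which is one informal way of seeing how a responder's credible willingness to reject changes the proposer's backward-induction calculation.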

Limitations and Paradoxes

The unexpected hanging paradox, also known as the surprise examination paradox, is a self-referential logical puzzle that undermines backward induction when applied to predictions involving knowledge and surprise. In its standard setup, a judge informs a condemned prisoner that the execution will take place at noon on one weekday of the following week—typically Monday through Friday—and that it will occur on a day the prisoner cannot anticipate in advance. The prisoner reasons iteratively: the hanging cannot be on Friday, the final day, because if it has not occurred by Thursday noon, the prisoner would deduce and thus expect it, violating the surprise condition. With Friday eliminated, Thursday becomes the effective last day and is ruled out in the same way; this backward elimination continues through Wednesday, Tuesday, and Monday, leading the prisoner to conclude that no execution is possible under the judge's truthful announcement. However, the prisoner is then hanged unexpectedly, for example on Tuesday, satisfying the conditions and exposing the flawed reasoning. This process directly parallels backward induction, in which outcomes are determined by starting from terminal states and working backward through earlier decision points, but here it generates a contradiction because the announcement's self-reference to the prisoner's own knowledge disrupts the chain. The paradox emerges from the announcement's dependence on the prisoner's epistemic state, creating a loop in which the condition of surprise precludes its own fulfillment on every day. Philosophically, the puzzle probes the interplay of surprise, knowledge, and common knowledge in predictive logic, revealing tensions in assuming perfect rationality and information. Resolutions often attribute the error to the announcement's self-refuting quality: W.V. Quine contended that the initial elimination of the last day fails because it presupposes the statement's unconditional truth, ignoring how the prisoner's deduction alters the epistemic landscape. Other approaches invoke non-monotonic logic, in which beliefs can be revised upon new information without preserving all prior inferences, or highlight the induction's collapse in finite self-referential scenarios, which mimics the infinite regresses that prevent complete elimination. The paradox also links to epistemic-logic dilemmas such as Fitch's paradox of knowability, which similarly challenges the formalization of universal knowability through self-reference. The paradox gained prominence in the 1940s through oral circulation in logic circles, with possible origins in a 1943–1944 Swedish civil-defense puzzle, before broader exposure via academic discussions and Martin Gardner's 1963 column.
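The prisoner's elimination argument is mechanical enough to sketch in code; the function name and the wording of each step below are illustrative assumptions, and the sketch deliberately reproduces the prisoner's flawed reasoning rather than resolving it:

```python
# The prisoner's backward elimination: repeatedly strike the currently
# last remaining day on the grounds that it could no longer be a surprise.

def eliminate_days(days):
    """Return the prisoner's elimination steps, in the order argued."""
    remaining = list(days)
    steps = []
    while remaining:
        last = remaining.pop()  # the currently "last possible" day
        steps.append(f"{last} is eliminated: by then it would be expected")
    return steps

for step in eliminate_days(["Monday", "Tuesday", "Wednesday",
                            "Thursday", "Friday"]):
    print(step)
```

The loop terminates with every day struck out—the prisoner's conclusion that no surprise hanging is possible—which is exactly the point at which the self-referential announcement defeats the induction.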

Common Knowledge of Rationality

Common knowledge of rationality refers to an epistemic condition in strategic interactions where all players are rational, everyone knows that all players are rational, everyone knows that everyone knows this, and so on ad infinitum; this infinite hierarchy also extends to common knowledge of the game's structure and payoffs. The concept, formalized by Robert Aumann, underpins the logical foundations of solution concepts in game theory by ensuring that players' beliefs about others' actions align with rational play across all levels of mutual awareness. In backward induction, common knowledge of rationality plays a crucial role by guaranteeing that players disregard non-credible threats and promises, as rational behavior in subgames is anticipated with certainty due to the shared knowledge of rationality. For instance, in Reinhard Selten's chain-store paradox, an incumbent firm faces sequential entry decisions by potential competitors over a finite number of periods; backward induction, supported by common knowledge of rationality, implies the incumbent always accommodates entrants rather than fighting, as aggressive responses become non-credible in later periods and unravel backward, undermining any reputation-based deterrence. Without this epistemic assumption, players might sustain cooperation through beliefs in possible irrationality, but common knowledge enforces the induction outcome by eliminating off-path behaviors as unreachable. A representative example is the finitely repeated prisoner's dilemma, where players face a sequence of simultaneous-move dilemmas; under common knowledge of rationality, backward induction unravels potential cooperation, leading both players to defect in every period, as the last-period Nash equilibrium (mutual defection) is anticipated and propagates backward. Cristina Bicchieri's analysis highlights how this epistemic condition resolves paradoxes in such games by clarifying that deviations from induction require uncertainty about others' rationality, which common knowledge precludes. Aumann's 1976 framework provides the formal foundation, modeling knowledge through partitions of a state space in which rationality is a property of states, implying that in games of perfect information, common knowledge of rationality yields the backward induction solution as the unique rationalizable outcome.
This extends to broader implications, such as the iterative elimination of non-rationalizable strategies converging to the induction path, linking backward induction to refinements like rationalizability in multi-stage games. Establishing common knowledge of rationality proves challenging in practice, particularly in long finite games, where the infinite hierarchy of mutual knowledge becomes cognitively burdensome, often resulting in "induction failure" as players deviate from predicted paths due to incomplete epistemic alignment. In extended sequential settings, this difficulty is amplified: sustaining the regress over many stages strains the assumption of perfect mutual foresight, leading to observed breakdowns of induction despite theoretical predictions.
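The unraveling argument for the finitely repeated prisoner's dilemma can be sketched as follows. The payoff matrix is the standard textbook one; the reduction of every period to the stage game—valid here because the stage equilibrium is unique, so continuation play cannot depend on current actions—is spelled out in the comments:

```python
# Backward-induction unraveling in a finitely repeated prisoner's dilemma.
# Payoffs are (row player, column player); "C" = cooperate, "D" = defect.

PD = {("C", "C"): (3, 3), ("C", "D"): (0, 5),
      ("D", "C"): (5, 0), ("D", "D"): (1, 1)}

def stage_equilibrium(game):
    """Find the stage-game Nash equilibrium by checking best responses."""
    actions = ["C", "D"]
    for a in actions:
        for b in actions:
            # (a, b) is an equilibrium if neither player gains by deviating.
            if all(game[(a, b)][0] >= game[(x, b)][0] for x in actions) and \
               all(game[(a, b)][1] >= game[(a, y)][1] for y in actions):
                return (a, b)

def repeated_solution(game, periods):
    # Last period: play the stage equilibrium. In every earlier period,
    # continuation play is already pinned down regardless of current
    # actions, so each stage reduces to the one-shot game and the same
    # equilibrium propagates backward through all periods.
    eq = stage_equilibrium(game)
    return [eq] * periods

print(repeated_solution(PD, 5))  # mutual defection in all five periods
```

Under common knowledge of rationality this is the unique prediction; relaxing that assumption (for example, a small probability that the opponent plays tit-for-tat) is exactly what reopens the door to early cooperation.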

Bounded Rationality Constraints

Bounded rationality, introduced by Herbert Simon in the 1950s, posits that decision-makers operate under constraints of limited computational capacity, incomplete information, and finite foresight, leading them to satisfice rather than exhaustively optimize. In the context of backward induction, these limitations undermine the assumption of perfect rationality: agents cannot feasibly compute or anticipate outcomes across extended decision horizons, so they approximate inductive reasoning rather than carry out the full unraveling. Level-k thinking models formalize these constraints by assuming players engage in bounded depths of reasoning, where a level-k player best-responds to beliefs about others' level-(k−1) reasoning, often truncating at shallow depths due to cognitive limits. Rosenthal's centipede game exemplifies this: while backward induction predicts that the first mover takes immediately, experimental observations show players passing for several rounds before stopping—reflecting bounded foresight that prevents full anticipation of distant equilibria—rather than adhering to the predicted path. Theoretical frameworks such as Levinthal's 1997 model of adaptation on rugged landscapes highlight how bounded computation restricts exhaustive search in strategic interactions, mirroring game-theoretic settings where agents navigate complex payoff structures without unlimited processing power. Experimental economics provides supporting evidence, with studies demonstrating that such limits produce deviations from backward induction predictions in multi-stage scenarios. These constraints have key implications for finitely repeated games, where bounded rationality can sustain cooperation by preventing the complete backward-induction unraveling that would otherwise enforce defection throughout.

Advanced Extensions

Subgame Perfect Equilibrium

Subgame perfect equilibrium (SPE), also referred to as subgame perfect Nash equilibrium (SPNE), is a refinement of the Nash equilibrium concept for extensive-form games, requiring that the strategy profile constitute a Nash equilibrium in every subgame of the original game. The solution concept was introduced by Reinhard Selten in 1965 as a way to ensure sequential rationality throughout the game tree, extending the logic of backward induction to general extensive-form settings. In games with perfect information, SPE coincides with the outcomes of backward induction applied to multi-stage structures. In imperfect-information environments such as signaling games, however, computing equilibria involves backward induction from the terminal nodes in which players' actions are evaluated for sequential rationality given their beliefs about unobserved types or histories; beliefs at information sets are updated by Bayes' rule whenever possible, ensuring that strategies are optimal conditional on all information available at each decision point. A classic illustration is the beer-quiche signaling game analyzed by In-Koo Cho and David M. Kreps in 1987, in which a sender of unknown type (strong or weak) chooses between drinking beer or eating quiche for breakfast, observed by a receiver who then decides whether to challenge the sender to a duel. In the canonical parameterization, the sender gains 1 from the preferred breakfast (beer for the strong type, quiche for the weak type) and 2 from avoiding the duel, so the strong type receives 3 (beer, no fight), 2 (quiche, no fight), 1 (beer, fight), and 0 (quiche, fight), while the weak type receives 3 (quiche, no fight), 2 (beer, no fight), 1 (quiche, fight), and 0 (beer, fight); the receiver prefers to fight only a weak sender. Applying backward induction, the receiver's optimal response depends on its beliefs about the sender's type after observing the breakfast choice. In a pooling equilibrium where both types choose quiche, off-equilibrium beliefs (for instance, if beer is observed) can be set so that the receiver interprets beer as a sign of weakness and challenges, deterring deviation by either type.
A candidate separating profile has the strong type choosing beer (signaling strength and escaping the challenge) and the weak type choosing quiche (and being challenged); checking it by backward induction with Bayesian-updated beliefs, however, reveals a profitable deviation in the canonical parameterization, since the weak type gains by mimicking beer to avoid the duel, so only pooling equilibria survive. For computation in more complex cases, trembling-hand perfection serves as a refinement of SPE, requiring that the equilibrium be the limit of Nash equilibria of perturbed games in which players make small mistakes (trembles) with positive probability. Selten developed this concept in 1975 to address non-uniqueness in SPE, particularly in games with implausible off-equilibrium behavior. The key advantage of SPE over naive backward induction is its ability to handle hidden information: pure induction alone may fail without specified beliefs at non-singleton information sets, whereas SPE ensures credibility across all subgames.
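The quiche-pooling check can be sketched numerically; the prior of 0.9 on the strong type and the function names are assumptions of this illustration, while the payoff scheme follows the canonical parameterization (1 for the preferred breakfast, 2 for avoiding the duel, and a receiver who gains only from fighting a weak sender or sparing a strong one):

```python
# Verifying the quiche-pooling equilibrium of the beer-quiche game.

PRIOR_STRONG = 0.9  # assumed prior probability that the sender is strong

def sender_payoff(ttype, breakfast, fight):
    """Sender gains 1 for the preferred breakfast, 2 for no fight."""
    preferred = "beer" if ttype == "strong" else "quiche"
    return (1 if breakfast == preferred else 0) + (0 if fight else 2)

def receiver_fights(p_strong):
    """Receiver fights iff the expected value of fighting is higher."""
    return (1 - p_strong) > p_strong   # fight pays 1 vs weak, 0 vs strong

# Pooling on quiche, with pessimistic off-path beliefs after beer:
belief = {"quiche": PRIOR_STRONG, "beer": 0.0}   # beer is read as "weak"
response = {b: receiver_fights(p) for b, p in belief.items()}

# Check that neither type profits from deviating to beer:
for ttype in ("strong", "weak"):
    on_path = sender_payoff(ttype, "quiche", response["quiche"])
    deviate = sender_payoff(ttype, "beer", response["beer"])
    print(ttype, on_path, deviate, on_path >= deviate)
```

Both types prefer the equilibrium path given these beliefs, so the pooling profile is sequentially rational. Note, though, that the weak type's best conceivable payoff from beer (2) is below its equilibrium payoff (3) while the strong type could gain from beer, which is why the Cho-Kreps intuitive criterion argues beer should be read as strength, undermining this pooling equilibrium.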

Experimental Evidence

Laboratory experiments on backward induction have been conducted since the 1980s, primarily using games such as the centipede game and the ultimatum game, where predictions of full unraveling to the subgame perfect equilibrium are frequently rejected in favor of more cooperative or non-equilibrium play. A comprehensive survey by Camerer (2003) analyzes over 100 experimental studies across varied strategic interactions, revealing consistent patterns of partial unraveling in finite-horizon games, where players exhibit foresight of only a few steps rather than complete backward induction. Key empirical investigations include McKelvey and Palfrey's (1992) experiments with the centipede game, a canonical test of backward induction in finite perfect-information games, which demonstrate that while initial play deviates substantially from the induction prediction, repeated interaction leads to gradual convergence toward the subgame perfect outcome as subjects gain experience. Similarly, the level-k models developed by Nagel (1995) and Stahl and Wilson (1995) fit observed behavior in guessing and p-beauty-contest games better than full-rationality assumptions, positing that players reason iteratively to bounded depths (typically k = 1 or 2) and thereby explaining failures of complete induction without invoking irrationality. Several factors influence adherence to backward induction in experiments: stake sizes, where higher incentives promote deeper reasoning and closer alignment with predictions; repetition, which fosters learning and partial unraveling; and clarity of instructions, which reduces confusion about the game's structure. Field evidence from real-world settings such as auctions and game shows further documents deviations; for instance, analysis of over 40 years of data from a long-running TV game show reveals persistent failures of backward induction even in high-stakes environments, with contestants often ignoring later-stage incentives.
Post-2010 developments incorporate neuroeconomic methods, such as fMRI studies showing distinct activations for shallow versus deep reasoning in sequential games; Coricelli and Nagel (2012) found increased neural activity associated with abstract backward-induction learning, but persistent reliance on experiential heuristics at limited depths. Recent learning models integrate Bayesian belief updating to capture how players revise expectations about opponents' reasoning over time, better explaining gradual convergence in repeated games than static induction. In addition, 2020s simulations using AI agents in multi-agent environments test the robustness of induction and reveal that even advanced algorithms deviate under noise or incomplete information, informing behavioral insights beyond laboratory settings.
