Best response
In game theory, the best response is the strategy (or strategies) which produces the most favorable outcome for a player, taking other players' strategies as given.[1] The concept of a best response is central to John Nash's best-known contribution, the Nash equilibrium, the point at which each player in a game has selected the best response (or one of the best responses) to the other players' strategies.[2]
Correspondence
Reaction correspondences, also known as best response correspondences, are used in the proof of the existence of mixed strategy Nash equilibria.[3][4] Reaction correspondences are not "reaction functions", since a function must have exactly one value per argument, whereas many reaction correspondences are set-valued for some opponent strategy choice (graphically, a vertical line where the player is indifferent). One constructs a correspondence b_i(·) for each player i, from the set of opponent strategy profiles into the set of the player's strategies. So, for any given profile of opponents' strategies σ_{−i}, b_i(σ_{−i}) represents player i's best responses to σ_{−i}.

Response correspondences for all 2 × 2 normal form games can be drawn with a line for each player in a unit square strategy space. Figures 1 to 3 graph the best response correspondences for the stag hunt game. The dotted line in Figure 1 shows the optimal probability that player Y plays 'Stag' (on the y-axis), as a function of the probability that player X plays Stag (on the x-axis). In Figure 2 the dotted line shows the optimal probability that player X plays 'Stag' (on the x-axis), as a function of the probability that player Y plays Stag (on the y-axis). Note that Figure 2 plots the independent and response variables on the opposite axes to those normally used, so that it may be superimposed onto the previous graph. Figure 3 shows the result: the Nash equilibria lie at the points where the two players' best responses agree.
There are three distinctive reaction correspondence shapes, one for each of the three types of symmetric 2 × 2 games: coordination games, discoordination games, and games with dominated strategies (the trivial fourth case in which payoffs are always equal for both moves is not really a game theoretical problem). Any payoff symmetric 2 × 2 game will take one of these three forms.
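As a rough illustration of why a best response is a correspondence rather than a function, the sketch below evaluates one player's best responses in a stag hunt. The payoff numbers and the helper name `best_response_set` are assumptions made for this example only; exact fractions are used so that the indifference point is detected reliably.

```python
from fractions import Fraction

def best_response_set(payoffs, q):
    """Return (lo, hi): every probability of playing action 0 in [lo, hi] is optimal.

    payoffs[a][b] is the player's payoff for own action a against opponent
    action b; q is the probability that the opponent plays action 0.
    """
    u0 = q * payoffs[0][0] + (1 - q) * payoffs[0][1]
    u1 = q * payoffs[1][0] + (1 - q) * payoffs[1][1]
    if u0 > u1:
        return (1, 1)   # action 0 is the unique best response
    if u0 < u1:
        return (0, 0)   # action 1 is the unique best response
    return (0, 1)       # indifferent: every mixture is a best response

# Assumed illustrative stag hunt payoffs: action 0 = Stag, action 1 = Hare.
STAG_HUNT = [[2, 0], [1, 1]]

for q in (Fraction(1, 4), Fraction(1, 2), Fraction(3, 4)):
    print(q, best_response_set(STAG_HUNT, q))
```

At the indifference point the returned interval covers the whole range of mixtures, which corresponds to the vertical segment of the plotted correspondence.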
Coordination games
Games in which players score highest when both players choose the same strategy, such as the stag hunt and battle of the sexes, are called coordination games. These games have reaction correspondences of the same shape as Figure 3, where there is one Nash equilibrium in the bottom left corner, another in the top right, and a mixing Nash somewhere along the diagonal between the other two.
Anti-coordination games
Games such as the game of chicken and hawk-dove game in which players score highest when they choose opposite strategies, i.e., discoordinate, are called anti-coordination games. They have reaction correspondences (Figure 4) that cross in the opposite direction to coordination games, with three Nash equilibria, one in each of the top left and bottom right corners, where one player chooses one strategy, the other player chooses the opposite strategy. The third Nash equilibrium is a mixed strategy which lies along the diagonal from the bottom left to top right corners. If the players do not know which one of them is which, then the mixed Nash is an evolutionarily stable strategy (ESS), as play is confined to the bottom left to top right diagonal line. Otherwise an uncorrelated asymmetry is said to exist, and the corner Nash equilibria are ESSes.
Games with dominated strategies
Games with dominated strategies have reaction correspondences which cross at only one point, which will be in either the bottom left or the top right corner in payoff symmetric 2 × 2 games. For instance, in the single-play prisoner's dilemma, the "Cooperate" move is not optimal for any probability of opponent Cooperation. Figure 5 shows the reaction correspondence for such a game, where the dimensions are "Probability play Cooperate"; the Nash equilibrium is in the lower left corner, where neither player plays Cooperate. If the dimensions were defined as "Probability play Defect", then both players' best response curves would be 1 for all opponent strategy probabilities and the reaction correspondences would cross (and form a Nash equilibrium) at the top right corner.
Other (payoff asymmetric) games
A wider range of reaction correspondence shapes is possible in 2 × 2 games with payoff asymmetries. For each player there are five possible best response shapes, shown in Figure 6. From left to right these are: dominated strategy (always play 2), dominated strategy (always play 1), rising (play strategy 2 if probability that the other player plays 2 is above threshold), falling (play strategy 1 if probability that the other player plays 2 is above threshold), and indifferent (both strategies play equally well under all conditions).
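The five shapes can be told apart from a player's own four payoffs. The following is a minimal sketch, assuming a hypothetical helper `best_response_shape` and illustrative payoff numbers; for simplicity it lumps weakly dominated cases together with strictly dominated ones.

```python
def best_response_shape(payoffs):
    """Classify one player's best response shape in a 2 x 2 game.

    payoffs[a][b] is the payoff for own strategy a against opponent strategy b,
    with index 0 standing for "strategy 1" and index 1 for "strategy 2".
    """
    # Advantage of strategy 2 over strategy 1 against each opponent strategy.
    gain_vs_1 = payoffs[1][0] - payoffs[0][0]
    gain_vs_2 = payoffs[1][1] - payoffs[0][1]
    if gain_vs_1 == 0 and gain_vs_2 == 0:
        return "indifferent"
    if gain_vs_1 >= 0 and gain_vs_2 >= 0:
        return "always play 2"   # simplification: includes weak dominance
    if gain_vs_1 <= 0 and gain_vs_2 <= 0:
        return "always play 1"   # simplification: includes weak dominance
    # Signs differ, so the best response switches at an interior threshold.
    return "rising" if gain_vs_2 > 0 else "falling"

# Assumed example payoffs (not taken from the article):
print(best_response_shape([[1, 1], [0, 2]]))    # strategy 1 = Hare, 2 = Stag
print(best_response_shape([[1, 0], [2, -1]]))   # strategy 1 = Dove, 2 = Hawk
print(best_response_shape([[3, 0], [5, 1]]))    # strategy 1 = Cooperate, 2 = Defect
```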

While there are only four possible types of payoff symmetric 2 × 2 games (of which one is trivial), the five different best response curves per player allow for a larger number of payoff asymmetric game types. Many of these are not truly different from each other. The dimensions may be redefined (exchange names of strategies 1 and 2) to produce symmetrical games which are logically identical.
Matching pennies
One well-known game with payoff asymmetries is the matching pennies game. In this game one player, the row player (graphed on the y-axis), wins if the players coordinate (both choose heads or both choose tails), while the other player, the column player (graphed on the x-axis), wins if the players discoordinate. Player Y's reaction correspondence is that of a coordination game, while that of player X is that of a discoordination game. The only Nash equilibrium is the combination of mixed strategies where both players independently choose heads and tails with probability 0.5 each.

Dynamics
In evolutionary game theory, best response dynamics represents a class of strategy updating rules, where players' strategies in the next round are determined by their best responses to some subset of the population. Some examples include:
- In a large population model, players choose their next action probabilistically based on which strategies are best responses to the population as a whole.
- In a spatial model, players choose (in the next round) the action that is the best response to all of their neighbors.[5]
Importantly, in these models players choose only the action that would give them the highest payoff on the next round. Players do not consider the effect that choosing a strategy on the next round would have on future play in the game. This constraint results in the dynamical rule often being called myopic best response.
In the theory of potential games, best response dynamics refers to a way of finding a Nash equilibrium by computing the best response for every player:
Theorem—In any finite potential game, best response dynamics always converge to a Nash equilibrium.[6]
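A small sketch of the convergence claim, under stated assumptions: the game below is a toy congestion game (a standard example of a finite potential game) invented for this illustration, in which each player's cost equals the number of players sharing the chosen resource. Players update one at a time and move only when this strictly lowers their cost.

```python
def best_response_dynamics(choices, n_resources, max_rounds=100):
    """Sequential best responses in a toy congestion game.

    choices[i] is the resource used by player i. A player's cost is the
    number of players (including themselves) on the chosen resource.
    """
    choices = list(choices)
    for _ in range(max_rounds):
        changed = False
        for i in range(len(choices)):
            # Load on each resource caused by the other players only.
            loads = [choices.count(r) for r in range(n_resources)]
            loads[choices[i]] -= 1
            best = min(range(n_resources), key=lambda r: loads[r])
            if loads[best] < loads[choices[i]]:  # move only on strict improvement
                choices[i] = best
                changed = True
        if not changed:
            return choices  # nobody can improve: a pure Nash equilibrium
    raise RuntimeError("did not converge within max_rounds")

# Four players all start on resource 0; two resources are available.
result = best_response_dynamics([0, 0, 0, 0], n_resources=2)
print(result)
```

Each strict improvement lowers the potential function of the game, which is why the loop cannot cycle and must stop at a profile where no player wants to move.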
Smoothed
Instead of best response correspondences, some models use smoothed best response functions. These functions are similar to the best response correspondence, except that the function does not "jump" from one pure strategy to another. The difference is illustrated in Figure 8, where black represents the best response correspondence and the other colors each represent different smoothed best response functions. In standard best response correspondences, even the slightest benefit to one action will result in the individual playing that action with probability 1. In smoothed best response, as the payoff difference between two actions decreases, the individual's play approaches 50:50.
There are many functions that represent smoothed best response functions. The functions illustrated here are several variations on the following logit function, which gives the probability of playing action 1:
$$P(1) = \frac{e^{E(1)/\gamma}}{e^{E(1)/\gamma} + e^{E(2)/\gamma}}$$
where E(x) represents the expected payoff of action x, and γ is a parameter that determines the degree to which the function deviates from the true best response (a larger γ implies that the player is more likely to make 'mistakes').
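As a hedged sketch, a smoothed best response of the logit type can be computed as below. The logit form and the function name `smoothed_best_response` are assumptions for illustration; `e1` and `e2` stand for E(1) and E(2), and `gamma` for γ.

```python
import math

def smoothed_best_response(e1, e2, gamma):
    """Probability of playing action 1 under an assumed logit rule."""
    # Subtract the larger payoff before exponentiating to avoid overflow.
    m = max(e1, e2)
    w1 = math.exp((e1 - m) / gamma)
    w2 = math.exp((e2 - m) / gamma)
    return w1 / (w1 + w2)

# Small gamma approaches the exact best response; large gamma approaches 50:50.
for gamma in (0.01, 1.0, 100.0):
    print(gamma, smoothed_best_response(2.0, 1.0, gamma))
```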
There are several advantages to using smoothed best response, both theoretical and empirical. First, it is consistent with psychological experiments; when individuals are roughly indifferent between two actions they appear to choose more or less at random. Second, the play of individuals is uniquely determined in all cases, since it is a correspondence that is also a function. Finally, using smoothed best response with some learning rules (as in Fictitious play) can result in players learning to play mixed strategy Nash equilibria.[7]
References
- [1] Fudenberg & Tirole (1991), p. 29; Gibbons (1992), pp. 33–49.
- [2] Nash (1950).
- [3] Fudenberg & Tirole (1991), Section 1.3.B.
- [4] Osborne & Rubinstein (1994), Section 2.2.
- [5] Ellison (1993).
- [6] Nisan et al. (2007), Section 19.3.2.
- [7] Fudenberg & Levine (1998).
Bibliography
- Ellison, G. (1993), "Learning, Local Interaction, and Coordination", Econometrica, 61 (5): 1047–1071, doi:10.2307/2951493, JSTOR 2951493
- Fudenberg, D.; Levine, David K. (1998), The Theory of Learning in Games, Cambridge, Massachusetts: MIT Press
- Fudenberg, Drew; Tirole, Jean (1991), Game Theory, Cambridge, Massachusetts: MIT Press, ISBN 9780262061414
- Gibbons, R. (1992), A Primer in Game Theory, Harvester-Wheatsheaf, S2CID 10248389
- Nash, John F. (1950), "Equilibrium points in n-person games", Proceedings of the National Academy of Sciences of the United States of America, 36 (1): 48–49, Bibcode:1950PNAS...36...48N, doi:10.1073/pnas.36.1.48, PMC 1063129, PMID 16588946
- Nisan, N.; Roughgarden, T.; Tardos, É.; Vazirani, V. V. (2007), Algorithmic Game Theory, New York: Cambridge University Press
- Osborne, M. J.; Rubinstein, Ariel (1994), A Course in Game Theory, Cambridge, Massachusetts: MIT Press
- Young, H. P. (2005), Strategic Learning and Its Limits, Oxford University Press
Best response
Fundamentals
Definition
In game theory, a best response is a strategy selected by a player that maximizes their expected payoff given the strategies chosen by the other players in the game.[6] This concept is central to analyzing strategic interactions in normal-form games, where players simultaneously choose actions without knowledge of others' choices.[1]
Formally, in an $n$-player normal-form game with strategy sets $S_i$ for each player $i$ and payoff functions $u_i$, a pure strategy $s_i^* \in S_i$ is a best response to the strategy profile $s_{-i}$ of the other players if it satisfies
$$s_i^* \in \arg\max_{s_i \in S_i} u_i(s_i, s_{-i}).$$
That is, $u_i(s_i^*, s_{-i}) \ge u_i(s_i, s_{-i})$ for all $s_i \in S_i$.[6][1]
For mixed strategies, where each player randomizes over their pure strategies according to a probability distribution $\sigma_i$ with $\sum_{s_i \in S_i} \sigma_i(s_i) = 1$, a mixed strategy $\sigma_i^*$ is a best response to $\sigma_{-i}$ if
$$\sigma_i^* \in \arg\max_{\sigma_i} u_i(\sigma_i, \sigma_{-i}),$$
where the expected payoff is given by
$$u_i(\sigma_i, \sigma_{-i}) = \sum_{s \in S} \Big( \prod_{j=1}^{n} \sigma_j(s_j) \Big) u_i(s).$$
[1][6]
In normal-form games, best responses are often illustrated using payoff matrices for finite two-player games. Consider a generic two-player game where Player 1 chooses rows (strategies A or B) and Player 2 chooses columns (strategies X or Y), with payoffs (Player 1, Player 2) as follows:

| | X | Y |
|---|---|---|
| A | (3, 2) | (1, 4) |
| B | (4, 1) | (2, 3) |
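To make the definition concrete, the sketch below reads the pure best responses off the payoff table above and collects the profiles where both players best respond to each other. The function names are illustrative assumptions, not part of any library.

```python
# PAYOFFS[(row, col)] = (payoff to Player 1, payoff to Player 2), from the table.
PAYOFFS = {
    ("A", "X"): (3, 2), ("A", "Y"): (1, 4),
    ("B", "X"): (4, 1), ("B", "Y"): (2, 3),
}
ROWS, COLS = ("A", "B"), ("X", "Y")

def row_best_responses(col):
    """Player 1's pure best responses to a fixed column."""
    best = max(PAYOFFS[(r, col)][0] for r in ROWS)
    return {r for r in ROWS if PAYOFFS[(r, col)][0] == best}

def col_best_responses(row):
    """Player 2's pure best responses to a fixed row."""
    best = max(PAYOFFS[(row, c)][1] for c in COLS)
    return {c for c in COLS if PAYOFFS[(row, c)][1] == best}

def pure_nash_equilibria():
    """Profiles in which each strategy is a best response to the other."""
    return [(r, c) for r in ROWS for c in COLS
            if r in row_best_responses(c) and c in col_best_responses(r)]

print(row_best_responses("X"), row_best_responses("Y"))
print(col_best_responses("A"), col_best_responses("B"))
print(pure_nash_equilibria())
```

In this table B is Player 1's best response to either column and Y is Player 2's best response to either row, so the only profile of mutual best responses is (B, Y).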
Properties and Relation to Equilibria
The best response correspondence $B_i(\sigma_{-i})$, which maps opponents' strategies to the set of optimal strategies for player $i$, exhibits key mathematical properties that underpin equilibrium analysis in game theory. Under the assumption that player $i$'s payoff function is continuous and quasi-concave in their own strategy, the best response correspondence is nonempty, convex-valued, and upper hemicontinuous.[9] These properties ensure that the joint best response correspondence over all players maps the compact, convex strategy space into itself in a manner suitable for fixed-point theorems. Specifically, in games with compact convex strategy sets and continuous quasi-concave payoffs, Kakutani's fixed-point theorem guarantees the existence of at least one mixed strategy Nash equilibrium, as the best response correspondence satisfies the theorem's conditions of upper hemicontinuity and convex values.[3]
Uniqueness of the best response for a given profile of opponents' strategies holds when the payoff function is strictly quasi-concave in the player's own strategy, implying a single optimal response rather than a set. This strictness eliminates flat portions in the payoff landscape, ensuring the argmax is a singleton. In contrast, quasi-concavity alone suffices for existence and convexity but permits multiple best responses, so the correspondence can be genuinely set-valued.
A strategy profile $\sigma^*$ constitutes a Nash equilibrium if and only if each player's strategy is a best response to the others', formalized as $\sigma_i^* \in B_i(\sigma_{-i}^*)$ for all players $i$.[3] This fixed-point characterization highlights that Nash equilibria are precisely the intersection points of the best response correspondences across players. When multiple best responses exist for some players because of payoff indifference, games can admit sets of equilibria, including pure strategy ones (where all players play deterministic strategies) and mixed strategy ones (involving randomization). Such multiplicity arises when payoffs are not strictly quasi-concave and motivates refinements like trembling-hand perfect equilibria, which select robust outcomes as limits of approximate equilibria under small perturbations to strategies, ensuring stability against minor errors in play.[10]
Best Response Correspondences
In Coordination Games
Coordination games are a class of strategic interactions in which players receive higher payoffs when their actions align, creating incentives for mutual strategy selection to achieve preferred outcomes. These games typically feature multiple Nash equilibria, where each player's strategy is a best response to the others', reflecting the mutual reinforcement of coordinated choices.
A canonical example is the Stag Hunt game, in which two hunters decide whether to pursue a stag (requiring cooperation) or a hare (pursuable independently). The payoff structure incentivizes matching: mutual stag yields (2, 2), mutual hare (1, 1), stag against hare (0, 1), and hare against stag (1, 0).[11] In this setup, a player's best response is to hunt stag if the opponent's probability of choosing stag exceeds 0.5, hare if it is below 0.5, and any mixture at exactly 0.5. When visualized in the mixed strategy space $[0,1]^2$, where each axis represents a player's probability of selecting stag, the best response correspondences form step-shaped curves: player 1's best response is probability 0 while the opponent's probability of stag is below 0.5, covers the whole interval [0, 1] at exactly 0.5, and is probability 1 above 0.5; player 2's is symmetric. The two curves intersect at the pure equilibria (0, 0) (all hare) and (1, 1) (all stag) and at the mixed equilibrium (0.5, 0.5). The Stag Hunt thus exhibits two pure Nash equilibria (all players choosing stag, or all choosing hare) and one mixed Nash equilibrium where each plays stag with probability 0.5.[11] Among these, the all-stag equilibrium is payoff dominant due to its higher joint payoffs, while the all-hare equilibrium may be risk dominant if the payoff advantage of stag is sufficiently small, as risk dominance prioritizes equilibria resilient to perturbations in beliefs about opponents' play.
Another illustrative coordination game is the Battle of the Sexes, where two players prefer different joint activities but value coordination over mismatch.
With payoffs of (2, 1) for mutual opera, (1, 2) for mutual ballet, and (0, 0) for mismatches, the player who prefers opera best responds with opera if the opponent's probability of choosing opera exceeds 1/3, and with ballet otherwise; symmetrically, the player who prefers ballet best responds with ballet if the opponent's probability of choosing ballet exceeds 1/3 (equivalently, if the opponent chooses opera with probability below 2/3). These asymmetric thresholds yield two pure Nash equilibria (mutual opera, mutual ballet) and one mixed equilibrium in which each player chooses their own preferred activity with probability 2/3 and the other activity with probability 1/3.[12]
In Anti-Coordination Games
Anti-coordination games constitute a class of symmetric two-player strategic interactions in which players receive higher payoffs when they select differing actions, incentivizing strategic divergence rather than alignment. A foundational example is the Hawk-Dove game, originally formulated to model animal conflicts over resources, where "Hawk" represents an aggressive strategy and "Dove" a passive one. In this setup, the payoff matrix rewards mismatched play: a Hawk confronting a Dove secures the full resource value $V$, while a Dove yields to a Hawk without cost (payoff 0); mutual Doves share $V/2$ each; but mutual Hawks engage in costly conflict, netting $(V - C)/2$ each, where $C$ is the injury cost.[13]
The best response correspondence in anti-coordination games reflects this mismatch incentive, mapping an opponent's mixed strategy to the player's optimal counter-strategy. For the Hawk-Dove game, if the opponent plays Hawk with probability $p$, the expected payoff to playing Hawk is $p(V - C)/2 + (1 - p)V$, while playing Dove yields $(1 - p)V/2$. The best response switches from pure Hawk (when $p < V/C$) to pure Dove (when $p > V/C$), forming a decreasing step function. Visualized in the unit square of mixed strategies (with axes for each player's Hawk probability), these correspondences appear as decreasing step-shaped curves that intersect on the diagonal at the symmetric mixed equilibrium where $p = V/C$.[13][14]
Equilibria in anti-coordination games include two pure-strategy asymmetric Nash equilibria, (Hawk, Dove) and (Dove, Hawk), where no player benefits from unilateral deviation, alongside a unique symmetric mixed-strategy Nash equilibrium at the intersection of best responses. In the Hawk-Dove game, this mixed equilibrium has each player adopting Hawk with probability $V/C$, ensuring indifference between strategies.
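The switching point of the Hawk-Dove best response can be checked numerically. The sketch below writes V for the resource value and C for the injury cost (symbols and numbers assumed here for illustration) and uses exact fractions so that the indifference point is hit exactly.

```python
from fractions import Fraction

def hawk_dove_payoffs(p, V, C):
    """Expected payoffs to Hawk and Dove when the opponent plays Hawk with probability p."""
    hawk = p * Fraction(V - C, 2) + (1 - p) * V
    dove = (1 - p) * Fraction(V, 2)
    return hawk, dove

def best_response(p, V, C):
    hawk, dove = hawk_dove_payoffs(p, V, C)
    if hawk > dove:
        return "Hawk"
    if hawk < dove:
        return "Dove"
    return "indifferent"

# Assumed values V = 2, C = 4, so the switching point is V / C = 1/2.
for p in (Fraction(1, 4), Fraction(1, 2), Fraction(3, 4)):
    print(p, best_response(p, V=2, C=4))
```

Below the threshold Hawk is the better reply, above it Dove is, and at the threshold every mixture is optimal, which is what allows the mixed equilibrium to sit there.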
From an evolutionary perspective, the mixed strategy qualifies as an evolutionarily stable strategy (ESS), as a population converging to it resists invasion by mutant pure strategies, provided the cost $C$ exceeds the benefit $V$; pure equilibria, by contrast, are unstable to perturbations favoring the opposite strategy.[13][14]
An illustrative variant is the Chicken game, akin to Hawk-Dove but framed in human brinkmanship scenarios like mutually assured destruction in diplomacy. Here, playing "Straight" (aggressive, Hawk-like) against "Swerve" (yielding, Dove-like) rewards the aggressor with high prestige while the yielder avoids catastrophe; mutual Straight results in mutual loss, and mutual Swerve yields a modest payoff for both. The best response to perceived aggression is de-escalation (Swerve against Straight), but the best response to an opponent expected to swerve is Straight, so yielding invites exploitation. This highlights how anti-coordination structures amplify tension in high-stakes settings with mismatched incentives.[13]
In Games with Dominated Strategies
In games with dominated strategies, the analysis of best responses is streamlined because suboptimal strategies are systematically excluded, leading to predictable player behavior and often to unique outcomes. A strategy $s_i$ for player $i$ is strictly dominated by another strategy $s_i'$ if, for every possible strategy profile $s_{-i}$ of the opponents, the payoff to player $i$ from $s_i'$ exceeds that from $s_i$.[15] Consequently, a strictly dominated strategy cannot constitute a best response to any conceivable beliefs about opponents' actions, as the dominating strategy always yields a superior payoff.[15] This property facilitates iterative elimination of dominated strategies, a process that refines the strategy space until only rationalizable strategies remain, where best responses are confined to the surviving options.[16] In such iterations, a best response at each step is never an eliminated strategy, which progressively narrows choices and often culminates in a singleton set of rationalizable strategies for each player.[16] For instance, in the Prisoner's Dilemma, "Defect" strictly dominates "Cooperate" for both players, as defection provides a higher payoff irrespective of the opponent's decision to cooperate or defect.[17]
The best response correspondence in these games is graphically a constant: a straight line fixed at the dominant strategy across the full range of opponents' possible plays, effectively pruning all dominated alternatives from the feasible set.[15] This reduction ensures that games solvable through iterated strict dominance possess a unique pure-strategy Nash equilibrium, exemplified by the (Defect, Defect) outcome in the Prisoner's Dilemma, where mutual defection is the only intersection of best responses.[18]
In Asymmetric Games
In payoff-asymmetric games, players receive different payoffs for the same strategy profile, resulting in best response correspondences that lack the symmetry found in payoff-symmetric games, where one player's best response to a strategy mirrors the other's. This asymmetry arises because each player's utility maximization depends on their unique payoff structure, leading to non-identical reaction functions even when strategies are comparable. For instance, in games like the Battle of the Sexes, one player may prefer one coordination outcome while the other prefers a different one, causing best responses to favor distinct pure strategies depending on the opponent's choice. The shapes of best response correspondences in these games can vary significantly, including straight lines (as in linear demand Cournot duopolies with differing costs), kinked functions (as in Stackelberg leader-follower models where the follower's response shifts at boundary points), and S-curves (as in smoothed or quantal response approximations to discontinuous reactions). Other possible shapes encompass downward-sloping lines (reflecting strategic substitutes) and upward-sloping lines (indicating strategic complements), yielding up to five distinct forms that influence the number and location of equilibria; for example, intersecting kinked or S-shaped responses can produce multiple Nash equilibria, while straight lines often yield unique intersections. These diverse shapes highlight how payoff differences prevent the mirroring of best responses, complicating equilibrium selection compared to symmetric cases. 
A representative example is the generalized matching pennies game with unequal gains, where the row player receives X > 1 when both select action 1 (e.g., heads for row, heads for column), 1 when both select action 2, and 0 otherwise, while the column player receives 1 when the actions differ and 0 when they match.[19] In this setup, the best responses are step functions: the row player chooses action 1 if the column player's probability of action 2 is less than X/(X+1) (which exceeds 0.5 for X > 1); the column player chooses action 1 if the row player's probability of action 1 is less than 0.5. In the mixed Nash equilibrium, the row player mixes 50-50 between actions, and the column player selects action 2 with probability X/(X+1) > 0.5 (action 1 with probability 1/(1+X) < 0.5). The column player thus shifts weight away from the outcome in which the row player earns the high payoff X. Experimental data show deviations from this equilibrium due to own-payoff effects, with the row player (high X) observed to select action 1 more frequently than 0.5, e.g., around 0.60 for X=9, unlike the symmetric case (X=1) where both play 0.5.[19]
Asymmetry impacts stability by ensuring best responses do not symmetrically oppose or complement each other, potentially creating multiple intersection points that are asymptotically stable under best response dynamics in some directions but unstable in others, unlike the unique cycling in symmetric zero-sum games like standard matching pennies. This non-mirroring property often results in equilibria where one player's strategy exerts greater influence, altering the robustness of outcomes to perturbations.[19]
In Matching Pennies
The Matching Pennies game is a canonical example of a two-player zero-sum game in noncooperative game theory, where each player simultaneously selects either Heads or Tails.[20] If the choices match, Player 1 receives a payoff of +1 and Player 2 receives -1; if they mismatch, Player 1 receives -1 and Player 2 receives +1.[20] The payoff matrix for Player 1 (with Player 2's payoffs as the negative) is as follows:

| Player 1 \ Player 2 | Heads | Tails |
|---|---|---|
| Heads | +1 | -1 |
| Tails | -1 | +1 |
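Because no pure profile is a mutual best response in this matrix, an equilibrium has to make each player indifferent between Heads and Tails. The sketch below solves the two indifference conditions for a general 2 × 2 game; the helper name `mixed_equilibrium_2x2` is an assumption for illustration, and the function presumes that an interior mixed equilibrium exists (non-zero denominators).

```python
from fractions import Fraction

def mixed_equilibrium_2x2(A, B):
    """Interior mixed equilibrium of a 2 x 2 bimatrix game via indifference.

    A[i][j] and B[i][j] are the payoffs to the row and column player when the
    row player picks i and the column player picks j. Returns (p, q), the
    probabilities that row and column, respectively, play their action 0.
    """
    # Row's mix p must leave the column player indifferent between columns:
    # p*B[0][0] + (1-p)*B[1][0] == p*B[0][1] + (1-p)*B[1][1]
    p = Fraction(B[1][1] - B[1][0], B[0][0] - B[1][0] - B[0][1] + B[1][1])
    # Column's mix q must leave the row player indifferent between rows:
    # q*A[0][0] + (1-q)*A[0][1] == q*A[1][0] + (1-q)*A[1][1]
    q = Fraction(A[1][1] - A[0][1], A[0][0] - A[0][1] - A[1][0] + A[1][1])
    return p, q

# Matching pennies from the table above (index 0 = Heads, 1 = Tails).
A = [[1, -1], [-1, 1]]      # Player 1
B = [[-1, 1], [1, -1]]      # Player 2 (zero-sum: negative of A)
print(mixed_equilibrium_2x2(A, B))
```

For the table above both probabilities come out as one half, so each player's randomization removes any advantage the other could gain by favoring one side.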
Best Response Dynamics
Formulation
Best response dynamics describe the evolution of strategies in repeated or evolutionary game-theoretic settings, where players or populations update their strategies by selecting myopic best responses to the current strategies of others.[21] In these dynamics, agents focus solely on maximizing immediate payoffs against the prevailing strategy profile, without anticipating or accounting for future adjustments by opponents.[21]
In continuous-time formulations, the dynamics for a player's strategy $x_i$ in a normal-form game are given by the differential inclusion
$$\dot{x}_i \in BR_i(x_{-i}) - x_i,$$
where $BR_i(x_{-i})$ denotes the set of best responses for player $i$ to the strategies $x_{-i}$ of others, and the dot represents the time derivative.[22] This setup models the instantaneous adjustment toward the best response, often resulting in a discontinuous vector field due to the set-valued nature of $BR_i$. In population games, the aggregate dynamics extend this to the population state $x$, yielding
$$\dot{x} \in B(x) - x,$$
where $F(x)$ is the payoff vector to strategies, and $B(x) = \arg\max_{y \in \Delta} y^\top F(x)$ is the set of payoff-maximizing strategy distributions.[21] Here, the fraction of the population adopting each strategy shifts toward those offering the highest fitness against the average population behavior, reflecting an evolutionary process where higher-payoff strategies proliferate.[21]
Discrete-time versions, such as those derived from fictitious play, approximate these updates iteratively in finite strategy spaces.[21] In a multi-player normal-form game, the process proceeds as follows (pseudocode for synchronous updates, in which every player responds to the previous round's profile):
    Initialize strategy profile x^0 for all players
    For t = 1, 2, ..., T:
        For each player i:
            x_i^t = argmax_{s_i} u_i(s_i, x_{-i}^{t-1})
        // x^t is the profile at time t
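One possible runnable reading of the synchronous update rule is sketched below for pure strategies. The function name, the way payoffs are passed in, and the tie-breaking rule (first maximizer in list order) are all assumptions made for this example.

```python
def synchronous_best_response(payoffs, actions, profile, steps):
    """Iterate pure-strategy best responses with synchronous updates.

    payoffs[i](profile) returns player i's payoff for a full action profile;
    actions[i] lists player i's pure actions. Every player responds to the
    previous round's profile. Ties go to the earlier action in actions[i].
    """
    history = [tuple(profile)]
    for _ in range(steps):
        previous = history[-1]
        new_profile = []
        for i in range(len(previous)):
            def payoff_of(action, i=i):
                trial = list(previous)
                trial[i] = action          # deviate only in player i's slot
                return payoffs[i](tuple(trial))
            new_profile.append(max(actions[i], key=payoff_of))
        history.append(tuple(new_profile))
    return history

# Matching pennies: player 0 wants to match, player 1 wants to mismatch.
def matcher(profile):
    return 1 if profile[0] == profile[1] else -1

def mismatcher(profile):
    return -matcher(profile)

history = synchronous_best_response(
    [matcher, mismatcher], [["H", "T"], ["H", "T"]], ("H", "H"), steps=4
)
print(history)
```

In matching pennies the pure-strategy updates never settle: the profile returns to its starting point after four rounds, which is consistent with the game having no pure-strategy equilibrium.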
