Mixed Strategies and Strictly vs. Non-Strictly Determined Zero-Sum Games
Question: What is the difference between a pure strategy and a mixed strategy? And how is this difference related to the distinction between a strictly determined vs. non-strictly determined zero-sum game?
Answer: A pure strategy is a complete plan of action for playing a game. In the very simple payoff matrix games we examine in POLI 388, a (pure) strategy for the row player is simply a row in the matrix and likewise a pure strategy for the column player is simply a column in the matrix. A mixed strategy is a probability distribution (or lottery) over pure strategies. For example, if a player has two pure strategies s1 and s2, a mixed strategy would be: {play s1 with a probability of .75 and play s2 with a probability of .25}.
Here is a general description of a zero-sum game. Remember that the payoff matrix for a zero-sum game may have just one number in each cell, representing the payoff for the row player P1. The payoff for the column player P2 is simply the negative of the payoff for P1 (and therefore the sum of the two payoffs associated with each cell is always zero). Given the convention that only P1’s payoffs are explicitly shown in the matrix, we can say that P1 is trying to maximize and P2 is trying to minimize payoffs.
Suppose Player P1 (who wants to maximize) looks at the worst thing that can happen when he plays each of his pure strategies (i.e., the minimum possible payoff for each strategy, sometimes called its security level) and then chooses the strategy that gives the maximum of these minimum payoffs (i.e., the strategy with highest security level). This is called his maximin (pure) strategy and its security level is P1’s maximin payoff (with respect to pure strategies). That is, P1 can guarantee himself this payoff, regardless of what P2 does. Now consider things from P2’s point of view. P2 (who wants to minimize) likewise looks at the worst thing that can happen when he plays each of his pure strategies (i.e., the maximum possible payoff for each strategy) and then chooses the strategy that gives the minimum of these maximum payoffs. This is called his minimax (pure) strategy and its security level is P1’s minimax payoff (with respect to pure strategies). So P2 can hold P1 down to this payoff, regardless of what P1 does.
Clearly it must be true in all zero-sum games that
maximin payoff for P1 ≤ minimax payoff for P2.
That is, the payoff that P2 can hold P1 down to (regardless of what P1 does) can (by definition) never be less than the payoff that P1 can guarantee himself (regardless of what P2 does).
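The two security-level calculations described above can be sketched in a few lines of Python. This is only an illustration: the function name pure_security_levels and the 2x2 example matrix are made up for this handout, and the matrix follows the convention that each cell shows P1's payoff.

```python
def pure_security_levels(matrix):
    """Return (maximin_row, maximin_payoff, minimax_col, minimax_payoff).

    matrix[i][j] is the payoff to P1 (the row player) when P1 plays
    row i and P2 plays column j. Rows and columns are numbered from 0.
    """
    # Worst case for P1 in each row: the smallest payoff in that row.
    row_minima = [min(row) for row in matrix]
    # Worst case for P2 in each column: the largest payoff to P1 in it.
    col_maxima = [max(col) for col in zip(*matrix)]

    maximin_payoff = max(row_minima)   # best of P1's worst cases
    minimax_payoff = min(col_maxima)   # best (lowest) of P2's worst cases

    maximin_row = row_minima.index(maximin_payoff)
    minimax_col = col_maxima.index(minimax_payoff)
    return maximin_row, maximin_payoff, minimax_col, minimax_payoff


# Hypothetical example game (payoffs to P1).
example = [[2, -1],
           [0, 1]]
print(pure_security_levels(example))  # (1, 0, 1, 1)
```

In this example P1 can guarantee himself 0 (by playing the second row) and P2 can hold P1 down to 1 (by playing the second column), so the inequality holds with maximin payoff 0 ≤ minimax payoff 1.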
If maximin payoff = minimax payoff, the zero-sum game is strictly determined. P1 and P2 just identify their maximin and minimax (respectively) strategies and play them. Neither player has “wriggle-room” to try to outfox the other by trying to “find out” his opponent’s strategy or to deceive him about his own. Moreover, neither player will ever regret his strategic choice, because the resulting outcome is a (pure-strategy) equilibrium.
However, if maximin payoff < minimax payoff, the zero-sum game is non-strictly determined. If P1 and P2 just identify their maximin and minimax (respectively) strategies and play them, at least one of them will regret his strategic choice, because the resulting outcome is not a (pure-strategy) equilibrium. Each player has “wriggle-room” to try to outfox the other by trying to “find out” his opponent’s strategy or to deceive him about his own.
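As a rough illustration of the distinction, the following Python sketch tests whether a payoff matrix is strictly determined and lists its pure-strategy equilibrium cells (cells that are both the minimum of their row and the maximum of their column). The function names and both example matrices are hypothetical and chosen only for this handout.

```python
def is_strictly_determined(matrix):
    """True if P1's maximin payoff equals the minimax payoff (pure strategies)."""
    maximin = max(min(row) for row in matrix)
    minimax = min(max(col) for col in zip(*matrix))
    return maximin == minimax


def pure_equilibria(matrix):
    """Return the (row, column) cells that are pure-strategy equilibria.

    A cell qualifies if its payoff is the smallest in its row (P2 cannot
    do better by switching columns) and the largest in its column (P1
    cannot do better by switching rows).
    """
    cells = []
    for i, row in enumerate(matrix):
        for j, value in enumerate(row):
            column = [r[j] for r in matrix]
            if value == min(row) and value == max(column):
                cells.append((i, j))
    return cells


# Hypothetical examples (payoffs to P1; rows and columns numbered from 0).
determined = [[3, 1],
              [2, 0]]
undetermined = [[2, -1],
                [0, 1]]

print(is_strictly_determined(determined), pure_equilibria(determined))
# True [(0, 1)]
print(is_strictly_determined(undetermined), pure_equilibria(undetermined))
# False []
```

In the second matrix no cell survives both checks, which is another way of saying that whatever pair of pure strategies is played, at least one player would regret his choice.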
The question then arises whether P1 can devise any other strategies that increase his maximin payoff and/or whether P2 can devise any other strategies that reduce P1’s minimax payoff (so that the gap between them closes). The answer to this question is yes. On the one hand, P1 can increase his maximin payoff by using mixed strategies. That is, some mixed strategy guarantees P1 a higher (expected) payoff than his maximin pure strategy does. (Note that this guarantee is with respect to whatever (pure or mixed) strategy P2 plays, not with respect to the outcome of the lottery of payoffs that the mixed strategy entails.) In like manner, P2 can reduce P1’s minimax payoff by using mixed strategies. That is, some mixed strategy for P2 holds P1 down to a lower (expected) payoff than P2’s minimax pure strategy does.
Indeed, it turns out that, when the strategy sets of both players are expanded to include all possible mixtures of their pure strategies, and their maximin and minimax mixed strategies and the corresponding maximin and minimax payoffs are calculated, we return to the equality maximin payoff for P1 = minimax payoff for P2 that is true of strictly determined games with respect to pure strategies only. Moreover, when P1 and P2 use their maximin and minimax (respectively) mixed strategies, the result is an equilibrium, i.e., given the strategy choice of the other player, neither player could do better by using any other pure or mixed strategy.
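For a 2x2 game that is not strictly determined, the optimal mixtures can be found by making the opponent indifferent between his two pure strategies. The Python sketch below applies the standard closed-form result of that calculation. It is an illustration under stated assumptions: payoffs are integers, the game has no pure-strategy equilibrium, and the function names and example matrix are made up for this handout.

```python
from fractions import Fraction


def solve_2x2_mixed(matrix):
    """Return (p, q, value) for a 2x2 zero-sum game with no pure equilibrium.

    p is the probability P1 puts on his first row, q is the probability
    P2 puts on his first column, and value is the expected payoff to P1
    when both play these mixtures. Assumes integer payoffs.
    """
    (a, b), (c, d) = matrix
    maximin = max(min(a, b), min(c, d))
    minimax = min(max(a, c), max(b, d))
    if maximin == minimax:
        raise ValueError("game is strictly determined; pure strategies suffice")
    denom = a - b - c + d
    p = Fraction(d - c, denom)          # makes P2 indifferent between columns
    q = Fraction(d - b, denom)          # makes P1 indifferent between rows
    value = Fraction(a * d - b * c, denom)
    return p, q, value


def expected_payoff(matrix, p, q):
    """Expected payoff to P1 when row 1 has probability p and column 1 has q."""
    (a, b), (c, d) = matrix
    return p * q * a + p * (1 - q) * b + (1 - p) * q * c + (1 - p) * (1 - q) * d


# Hypothetical example: pure maximin payoff is 0 and pure minimax payoff is 1.
game = [[2, -1],
        [0, 1]]
p, q, value = solve_2x2_mixed(game)
print(p, q, value)                      # 1/4 1/2 1/2
# P1's mixture earns the same expected payoff against either column:
print(expected_payoff(game, p, 1), expected_payoff(game, p, 0))  # 1/2 1/2
```

In this example the mixed-strategy payoff of 1/2 lies strictly between the pure-strategy maximin payoff of 0 and the pure-strategy minimax payoff of 1, so each player improves on his pure-strategy security level and the gap closes.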
If a player has three (or more) pure strategies, his optimal (maximin or minimax) mixed strategy may place zero probability on one (or more) of his pure strategies. In particular, a player’s optimal mixed strategy always puts zero probability on any (sequentially) strictly dominated pure strategy.
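As a small illustration, the following Python sketch finds the rows that are strictly dominated for P1, i.e., rows for which some other row gives P1 a higher payoff in every column. The function name and the 3x2 example are hypothetical, and the sketch checks a single round of strict dominance only (it does not repeat the elimination or consider P2's columns).

```python
def strictly_dominated_rows(matrix):
    """Return the indices of P1's rows that are strictly dominated.

    Row i is strictly dominated if some other row k pays P1 strictly
    more than row i in every column. Rows are numbered from 0.
    """
    dominated = []
    for i, row in enumerate(matrix):
        for k, other in enumerate(matrix):
            if k != i and all(o > r for o, r in zip(other, row)):
                dominated.append(i)
                break  # one dominating row is enough
    return dominated


# Hypothetical example (payoffs to P1): the third row is worse than the
# first row in both columns, so P1 should never put any weight on it.
game = [[2, -1],
        [0, 1],
        [-1, -2]]
print(strictly_dominated_rows(game))  # [2]
```

Once the dominated row is dropped, what remains in this example is a 2x2 game, and P1's optimal mixture is a mixture over the first two rows only.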
Finally, let’s note that it is really impossible to tell whether a player is using a mixed strategy in a “single-shot” game. If P1 (the pitcher) has two pure strategies s1 (fastball) and s2 (curveball) and P1 actually plays s2 (throws a curveball), you can’t tell whether P1 chose the pure strategy s2 or some mixed strategy that puts non-zero probability on s2. But if the game is repeated many times, a mixed strategy such as {play s1 with a probability of .75 and play s2 with a probability of .25} reveals itself as {P1 plays s1 about 75% of the time and plays s2 about 25% of the time}, which can be observed. (Reread Dixit and Nalebuff, Chapter 7 on zero-sum duels between baseball pitchers and batters or between tennis players with this discussion in mind.)
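The point about repetition can be illustrated with a short simulation. The Python sketch below draws P1's pitch at random according to the mixed strategy and reports the observed frequencies. The function name, the seed, and the number of plays are arbitrary choices made for this illustration.

```python
import random


def observed_frequencies(probabilities, n_plays, seed=0):
    """Play a mixed strategy n_plays times and return observed frequencies.

    probabilities maps each pure strategy to the probability the mixed
    strategy puts on it. The seed only makes the run repeatable.
    """
    rng = random.Random(seed)
    strategies = list(probabilities)
    weights = [probabilities[s] for s in strategies]
    draws = rng.choices(strategies, weights=weights, k=n_plays)
    return {s: draws.count(s) / n_plays for s in strategies}


mix = {"s1": 0.75, "s2": 0.25}   # s1 = fastball, s2 = curveball

# A single play shows one pitch only, so one frequency is 1.0 and the
# other is 0.0, exactly what a pure strategy would also produce.
print(observed_frequencies(mix, 1))

# Over many plays the frequencies settle near .75 and .25.
print(observed_frequencies(mix, 10_000))
```

The observed frequencies in the long run are close to, but generally not exactly equal to, the probabilities in the mixed strategy.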