The Prisoner’s Dilemma Resolved

(for Players, Not Prisoners)

Uri Geva
MathVentures, a Division of Ten Ninety

Document Version 0.2 (draft)
Copyright © 2001–04 Uri Geva, Ten Ninety, All Rights Reserved.
This work may be redistributed under the terms of the GNU General Public License
as published by the Free Software Foundation.
This work comes with absolutely no warranty!



 


Abstract

The Prisoner’s Dilemma depends on two necessary conditions. First, the two parties are rational and reasonable and act only after carefully evaluating their options, opting for the best strategy. Second, there is no communication between the two parties. This paper will show that, when the dilemma is played repeatedly, it is impossible to enforce the second condition and, therefore, there is no dilemma.
 

Introduction

The famous Prisoner’s Dilemma is more than a game-theory conundrum. It is a demonstration of the value of communications. It demonstrates unequivocally that, without communications, even rational people who carefully analyze their options reason their way to, and then take, action that hurts their own interests, even when a more favorable alternative is available. And, indeed, there are real-world situations that are analogous to the Prisoner’s Dilemma.

Plenty of deliberation, reflection and study have been devoted to the logical reasoning that leads the parties facing the Prisoner’s Dilemma to opt for one choice or the other. This paper focuses on the communications aspect of this dilemma or, more correctly, on the lack thereof.

It is obvious that if the two parties, known as the prisoners or as the players, are able to communicate, then neither one faces a dilemma. For they can coordinate their action and select the one that is most favorable to both of them.


Background

Let’s start by briefly reviewing the basics and setting up common ground for the discussion.

Consider Bob and Carol, who were arrested on circumstantial evidence that they had committed some crime. Since the district attorney did not have direct evidence to link either one with the crime, she separated both suspects immediately upon their arrest and then made the same offer to each of them; its terms are summarized in the table below.

The Prisoner’s Dilemma, attributed to A.W. Tucker, is a situation in which each of two parties must take a single action out of two alternatives, both parties having identical choices. The outcome depends on the combination of the actions they take. In other words, the outcome is determined only after the actions of both parties are known. However, both parties know all possible outcomes in advance, before they start contemplating which alternative to adopt. The last condition of the dilemma is, in my view and as I shall demonstrate, the most critical restriction: the two parties must not be able to communicate. In fact, in some descriptions of the Prisoner’s Dilemma the two parties are separated even prior to being introduced to the dilemma and its conditions.

Carol and Bob face this dilemma only once. Therefore, each must make the most careful analysis before making up their mind and taking action. Their choices and the outcome resulting from each combination of choices are summarized in the following table:


Bob’s Action \ Carol’s Action | Confess | Don’t Confess
Confess | Bob: 3-year jail term; Carol: 3-year jail term | Bob: ½-year jail term; Carol: 5-year jail term
Don’t Confess | Bob: 5-year jail term; Carol: ½-year jail term | Bob: 2-year jail term; Carol: 2-year jail term

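Carol weighs her options as follows:

If Bob confesses, she serves three years if she confesses and five years if she does not.
If Bob does not confess, she serves half a year if she confesses and two years if she does not.

Either way, confessing earns her the shorter term, so Carol concludes that her best choice is to confess.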
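
Carol's comparison can be sketched as a small dominance check in Python. This is only an illustration: the names are hypothetical, and the table is read on the assumption that in the mixed cells the party who confesses serves the ½-year term while the silent partner serves 5 years (the only reading under which confessing is her best choice).

```python
# Jail terms in years: TERMS[(bob_action, carol_action)] = (bob_years, carol_years).
# Assumption: in the mixed cells the confessor serves 0.5 year, the silent one 5 years.
CONFESS, SILENT = "confess", "don't confess"
TERMS = {
    (CONFESS, CONFESS): (3.0, 3.0),
    (CONFESS, SILENT): (0.5, 5.0),
    (SILENT, CONFESS): (5.0, 0.5),
    (SILENT, SILENT): (2.0, 2.0),
}


def carol_best_reply(bob_action):
    """Return the action that minimizes Carol's jail term for a fixed action by Bob."""
    return min((CONFESS, SILENT), key=lambda carol: TERMS[(bob_action, carol)][1])


# Carol cannot know Bob's choice, so she checks both possibilities.
best_replies = {bob: carol_best_reply(bob) for bob in (CONFESS, SILENT)}
print(best_replies)
```

Because the same action is returned for both of Bob's possible choices, confessing is what game theorists call a dominant strategy for Carol.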

Bob is also a rational person and his reasoning is identical to Carol’s. So he decides to confess.

The result of their decisions is that both Carol and Bob confess and both end up going to jail for three years.

Clearly, both would have preferred to receive only a two-year jail term. But this would have required that both Bob and Carol be assured that their partner would not confess; had they been able to communicate, they could have coordinated the choices they would make and the action they would take.

Carol and Bob faced the Prisoner’s Dilemma only once. In other variations the two parties face the dilemma many times. In these situations the Prisoner’s Dilemma is converted into a game in which both players play against a third party, which sets the rules. This third party can be considered as The Bank. Since the total sum of wins and losses does not equal zero, these are non-zero-sum games.

All of the essential conditions that were imposed on Bob and Carol also apply to the Prisoner’s Dilemma games. The Bank may vary the terms it sets to reward or punish the players, thereby increasing or decreasing the players’ motivation to opt for one choice or the other, but these variations do not alter the fundamental conditions that the players, just like Carol and Bob, face. However, a single factor is different and, as we shall see, it has the potential to alter the game completely. In fact, it removes the dilemma altogether.

In Games and Decisions: Introduction and Critical Survey [DL&HR01], R. Duncan Luce and Howard Raiffa assert that "There appear to be no way around this dilemma." (p. 97.) This is true for the single-iteration dilemma. It is also true for games in which the number of iterations, that is, the number of turns in which the players face the dilemma, is relatively low. However, if the players can play a large number of iterations, I will show that this assertion is wrong. For in multi-iteration games it is impossible for The Bank to enforce the one most critical rule of the dilemma, the rule of isolation and non-communication. Under these conditions The Bank cannot prevent the players from communicating with each other and coordinating their actions.

This realization has significant implications beyond the arenas where games such as the Prisoner’s Dilemma are played. For, as game theorists and economists have pointed out, there are real-world situations that are analogous to the Prisoner’s Dilemma.


One version of the game is shown by this table:
 
Bob’s Play \ Carol’s Play | Action A | Action B
Action A | Bob: Lose $0.50; Carol: Lose $0.50 | Bob: Win $2.00; Carol: Lose $1.00
Action B | Bob: Lose $1.00; Carol: Win $2.00 | Bob: Win $0.10; Carol: Win $0.10

Again both players, being rational and analyzing their choices, figure out independently that Action A is the best choice no matter what their opponent does. So Bob and Carol each lose fifty cents. Now, since each round of the game is identical to the previous one, neither player has a reason to alter his or her strategy. Therefore, both will continue to lose money.

All losses are collected by The Bank, and The Bank pays all of the winnings.
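
To make The Bank's position concrete, the following Python sketch tallies what The Bank collects or pays in each cell of the table. It is an illustration only: the names are hypothetical, amounts are kept in cents, and the table is read on the assumption that in the mixed cells the player who takes Action A wins $2.00 while the player who takes Action B loses $1.00.

```python
# Payoffs in cents: PAYOFF[(bob_action, carol_action)] = (bob, carol).
# Assumption: in the mixed cells the Action A player wins $2.00
# and the Action B player loses $1.00.
PAYOFF = {
    ("A", "A"): (-50, -50),
    ("A", "B"): (200, -100),
    ("B", "A"): (-100, 200),
    ("B", "B"): (10, 10),
}


def bank_net(bob_action, carol_action):
    """Cents The Bank gains (positive) or pays out (negative) in one round."""
    bob, carol = PAYOFF[(bob_action, carol_action)]
    return -(bob + carol)


bank_results = {cell: bank_net(*cell) for cell in PAYOFF}
for cell, cents in bank_results.items():
    print(cell, cents)
```

Under this reading the players' results never sum to zero, and The Bank comes out ahead only when both players take Action A.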

The Kink that Removes the Dilemma

At one point Carol gets tired of losing. She realizes that if they both opt for Action B, they will each win a dime. So on the next round she plays Action B. As a result Bob wins two dollars and she loses one dollar. Seeing that, Bob has no reason to alter his strategy. If Carol continues to play Action B, she will lose money twice as fast as she would by playing Action A. Eventually, seeing that Bob is collecting money faster than she is losing it, Carol decides that it is better to lose slowly and reverts to Action A.

Does this end all considerations of the game?

Is there some strategy that will let Carol exploit the fact that she can play the game as long as she’d like? In other words, can Carol make a long run of sacrifices so that in the end she will recoup all her losses and start winning? Put differently, can Carol get Bob to cooperate with her consistently so that both of them will opt only for Action B?

The answer is yes. Such a strategy does exist. For another fundamental premise of the Prisoner’s Dilemma is that both prisoners and, in the current case, both players, are rational persons who analyze the situation and opt for the best strategy.

Carol decides to change Bob’s mind so he will no longer view each round (iteration) of the game as independent of all of the other rounds (iterations). She does it by varying her choices, opting for Action A or Action B in what may appear to be an arbitrary, random pattern.

But Bob is rational and reasonable and, since he knows that Carol is also rational and reasonable, he concludes that there must be a reason behind what appears to be her madness. So he examines her choices. Soon he discovers a pattern. Carol uses Action A to represent a 0 (zero) and Action B to represent a 1 (one) and, like a computer, she is using these characters to compose a message. Carol employs the game itself as the channel of communication to transmit the message to Bob. (If Bob and Carol are scouts and not computer geeks, she may use Action A to represent a dash and Action B to represent a dot in Morse code.)

Once Bob arrives at this conclusion it does not take him long to decode Carol’s message. After all, since the message is for him, she makes it as simple and as easy to decipher as possible.
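
One way such a message might travel through the game is sketched below in Python. The paper does not specify an encoding, so the 8-bit ASCII scheme, the function names and the sample message are all assumptions made for illustration; only the mapping of Action A to 0 and Action B to 1 comes from the text.

```python
def encode(message):
    """Turn a text message into the sequence of actions Carol plays."""
    # Assumption: each character is sent as 8 bits of its ASCII code.
    bits = "".join(format(byte, "08b") for byte in message.encode("ascii"))
    return ["A" if bit == "0" else "B" for bit in bits]


def decode(actions):
    """Recover the text from the actions Bob observed, one per round."""
    bits = "".join("0" if action == "A" else "1" for action in actions)
    data = bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))
    return data.decode("ascii")


plays = encode("PLAY B")      # hypothetical message
print("".join(plays))
print(decode(plays))
```

A six-character message costs Carol 48 rounds under this scheme, which is one reason a simpler signal, such as a repeating pattern, may serve her better.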
 



Soon after that Bob and Carol exchange messages and, to The Bank’s surprise and chagrin, both Carol and Bob coordinate their play and continuously take only Action B. As expected, since both win, neither has a reason to select Action A anymore, and eventually Carol recoups all her losses and from then on wins.

Carol knows that, due to her sacrifice, she lost more than Bob and, thus, when the game is over his winnings will be greater than hers. But the alternative is to continue to lose for as long as the game lasts. So she discards this option.

Interestingly, if the game can last as long as the players want it to last, The Bank, the entity that sets the rules for the game, cannot prevent Carol from discovering and implementing this strategy. Neither can The Bank prevent Bob from eventually figuring it out and then cooperating with Carol. By changing the amounts of winnings and losses, The Bank can increase or decrease the players’ initial motivation to experiment with Action A or Action B. But in the end, different ratios among the winnings, among the losses, and between winnings and losses can only speed up or slow down Carol’s search for a long-term strategy. They cannot prevent it.

It is also not possible for The Bank to prevent Carol from transmitting to Bob the message encoded by her actions. For an implicit rule of the game is that each player knows the result of each round (iteration) prior to the execution of the next round (iteration). If The Bank decides to let Bob and Carol play 100 rounds (iterations) before it lets each of them know the results of each iteration, then it might as well let them know only the cumulative result of all 100 iterations. For Carol can then encode her message by her cumulative action in each set of 100 iterations. That is, Carol will select Action A for 100 consecutive iterations to represent a 0 and Action B for 100 consecutive iterations to represent a 1. Whatever The Bank decides, Carol can work around it.
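
The workaround can be sketched in Python as follows. This is an illustration under stated assumptions: Bob keeps taking Action A throughout, he loses 50 cents per round against Action A and wins $2.00 per round against Action B (one reading of the table), and the function names and sample bits are hypothetical.

```python
BLOCK = 100  # rounds played before The Bank reveals anything (from the text)

# Bob's result in cents for one round while he keeps taking Action A.
# Assumption: he loses 50 cents against Action A and wins $2.00 against Action B.
BOB_RESULT_VS = {"A": -50, "B": 200}


def carol_block(bit):
    """Carol's moves for one block: all Action A for a 0, all Action B for a 1."""
    return ["B" if bit else "A"] * BLOCK


def bob_cumulative(carol_moves):
    """The only thing Bob is told: his total for the whole block, in cents."""
    return sum(BOB_RESULT_VS[move] for move in carol_moves)


def bob_reads_bit(total_cents):
    """A winning block means Carol took Action B throughout, that is, a 1."""
    return 1 if total_cents > 0 else 0


message_bits = [1, 0, 1, 1, 0]  # hypothetical message
received = [bob_reads_bit(bob_cumulative(carol_block(bit))) for bit in message_bits]
print(received)
```

Withholding the per-round results slows the channel down by a factor of 100, but the cumulative total still tells Bob which bit Carol sent.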

But, as it turns out, Carol does not have to devise a formal encoding syntax for her message to Bob. After all, Bob is as rational and as reasonable as she is. All Carol needs is to be persistent in her attempts to change his mind. She reasons that, although the outcome of each round depends on how both of them played, Bob and she are not playing against each other. Each of them is playing against The Bank. In this game they depend on each other; they are not rivals. A player who wants to win must therefore get the other player to trust and cooperate with her or him.

Carol may take the following course of action. As long as Bob sticks with Action A, she takes Action B for 10 consecutive rounds, then switches to Action A for 5 rounds, then goes back to Action B for 10 rounds, and so on until Bob, wondering what she is up to, decides to try Action B.
 



Once Bob recognizes the 10-5-10-5… pattern of Carol’s actions, he has no reason to try Action B while she is in the midst of the 5 Action A rounds. So he takes Action B when Carol is taking Action B and, lo and behold, they both win. From that point on, Bob has no reason to revert to Action A. But if he does, Carol resumes her 10-5-10-5 pattern of action.

Carol has a strategy just in case Bob makes the mistake of taking Action B when Carol takes Action A. In this case Bob suffers a single massive loss, the largest possible loss in a single round, the sort of sacrifice Carol has been making for some time now. He may not like it and may revert to taking Action A. Yet, regardless of Bob’s subsequent moves, as soon as he takes Action B, Carol takes Action B on the following round (iteration), regardless of the rule of her pattern. If Bob sticks with Action B on the following rounds, there is no problem and they both win. If not, Carol restarts the pattern of 10 Action B rounds followed by 5 Action A rounds.

The purpose of this pattern is to get Bob to realize that as long as they both opt for Action A, they both lose; that he should experiment with Action B while Carol is doing so.

Bob may fear that after 10 consecutive iterations in which both Carol and he took Action B, she will take Action A on the next iteration. If he is not willing to risk it, as Carol has done for so long, and he takes Action A, Carol resumes the 10-5-10 pattern again.
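
Carol's course of action can be simulated with a short Python sketch. It is an illustration under assumptions, not a definitive model: the payoffs follow one reading of the table (in the mixed cells the Action A player wins $2.00 and the Action B player loses $1.00), amounts are in cents, and the Bob shown here, who plays Action A for 17 rounds and then switches to Action B for good, is hypothetical.

```python
# Payoffs in cents: PAYOFF[(bob_action, carol_action)] = (bob, carol).
# Assumption: in the mixed cells the Action A player wins $2.00,
# the Action B player loses $1.00.
PAYOFF = {
    ("A", "A"): (-50, -50),
    ("A", "B"): (200, -100),
    ("B", "A"): (-100, 200),
    ("B", "B"): (10, 10),
}


def carol_moves(bob_moves, run_b=10, run_a=5):
    """Carol's move in each round; she learns Bob's move only after the round."""
    moves, position, previous_bob = [], 0, "A"
    for bob in bob_moves:
        if previous_bob == "B":
            carol = "B"    # answer Bob's Action B with B, whatever the pattern says
            position = 0   # if Bob drops back to A, the pattern starts over
        else:
            carol = "B" if position % (run_b + run_a) < run_b else "A"
            position += 1
        moves.append(carol)
        previous_bob = bob
    return moves


def totals(bob_moves):
    """Cumulative results (bob_cents, carol_cents) over the whole game."""
    rounds = list(zip(bob_moves, carol_moves(bob_moves)))
    bob_total = sum(PAYOFF[r][0] for r in rounds)
    carol_total = sum(PAYOFF[r][1] for r in rounds)
    return bob_total, carol_total


# Hypothetical Bob: 17 rounds of Action A, then Action B for good.
bob = ["A"] * 17 + ["B"] * 145
print("".join(carol_moves(bob)[:20]))
print(totals(bob))
```

With this particular Bob, Carol is $14.50 behind when he first cooperates and breaks even 145 rounds later, by which time Bob is $36.00 ahead; different payoffs or a slower Bob would only change how long the recovery takes.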

The point is that reasonable people can make sense of what may appear to be a random and arbitrary choice of actions. Eventually the parties, Bob and Carol, will reach an agreement and execute well-concerted actions such that both will win continuously.

And this is exactly what happens in the marketplace. Whether two gas stations on opposite corners of the same street intersection are engaged in price wars, or a few oil companies are attempting to increase their revenues, or several airlines are struggling to improve their bottom line, they all engage in an indirect, implicit exchange of messages.

No law or regulation, anti-trust or otherwise, can stop this sort of exchange. For in some cases, the interacting parties themselves are unaware of what they are doing. All regulators can do is, like The Bank, monitor the contents of the transmissions and make sure that no explicit price fixing occurs by means of encoded messages.

In the real world, The Bank is the public and the players are all who crave a piece of people’s money*. In this non-zero-sum economy, there are no limits: no time limit on how long the game can last, no limit on how many rounds the game can have and no limit on how much money the players can make. In the marketplace the players can and do communicate with one another and coordinate their moves. If, under the sterile conditions of the Prisoner’s Dilemma game, the players not only can communicate but cannot be prevented from doing so, how could the public do any better? The best we can do is maintain a reasonable rate of revenue to the players. For the public, The Bank, should preserve its role as the rule setter and not relinquish it to the players.



* Some readers may question the validity of this statement on the ground that it implies that the real-world market is not a zero-sum game. Since consumers have only finite resources, any resource spent cannot be spent elsewhere. Therefore, the market economy should be considered a zero-sum game. However, in the context of this paper, the resources spent on the particular event that is considered to be the Prisoner’s Dilemma game are unlimited. Consider the customers of two gas stations. If the prices at either or both stations are sufficiently low, not only will the regular customers of these stations purchase at the station with the lowest price, but customers who ordinarily purchase gasoline elsewhere will come to buy gas at this station. In other words, the size of the shopping public at this intersection increases, drawing resources from other segments of the market. This "local" public, which with respect to this local price war (the Prisoner’s Dilemma game) is The Bank, has unlimited resources. Therefore, in this case it is not a zero-sum game.

References



Copyright  © 1993-2005 Ten Ninety, All Rights Reserved
Last Update: Nov. 21, 2005