scribblings on the bedlam wall: Game Theory XI: Metagames: The Punishing Prisoner's Dilemma

Now, let us look at another perspective on the Prisoner's Dilemma. Suppose we allow the players to communicate. What happens?

The following material is flagged Green Level. It is intended to reflect material that the author believes to be a matter of consensus among experts in the field. This belief may be incorrect, however; and as the author is not an expert and does not have an expert fact-checking the article, errors may creep in.

Let us suppose that the two players in the Prisoner's Dilemma make an arrangement: if either defects, that player must make a side payment to the other equal to the harm inflicted by the defection, which is A-E. As before, B>A>F>E and 2A>B+E, and the second condition gives A-E>B-A:
2A>B+E
2A-E>B
A-E>B-A
While the offer is available, the game looks like this. The outcome reached when both players follow their dominant strategies is listed beneath each payoff matrix:

Offer, response to offer: Accept

	Cooperate	Defect
Cooperate	(A,A)	(A,B-[A-E])
Defect	(B-[A-E],A)	(F,F)

Outcome: (A,A)

Offer, response to offer: Reject

	Cooperate	Defect
Cooperate	(A,A)	(E,B)
Defect	(B,E)	(F,F)

Outcome: (F,F)

Do not offer (response: Accept or Reject; the payoffs are the same either way)

	Cooperate	Defect
Cooperate	(A,A)	(E,B)
Defect	(B,E)	(F,F)

Outcome: (F,F)

As you can see, metagame logic gives the players a rationale to cooperate: deterrence. Because a side payment is required for defection, each player benefits more from cooperation than from defection. And since (A,A) is better for both players than (F,F), each does better by making and accepting the offer.
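The dominance claim can be checked numerically. The following is a minimal sketch, not part of the original argument: the function names are hypothetical, and any concrete payoffs fed to it (for example A=3, B=5, E=0, F=1, which satisfy B>A>F>E and 2A>B+E) are illustrative assumptions.

```python
COOPERATE, DEFECT = "C", "D"
MOVES = (COOPERATE, DEFECT)


def base_game(A, B, E, F):
    """Plain Prisoner's Dilemma: game[(row, col)] = (row payoff, column payoff)."""
    return {
        (COOPERATE, COOPERATE): (A, A),
        (COOPERATE, DEFECT): (E, B),
        (DEFECT, COOPERATE): (B, E),
        (DEFECT, DEFECT): (F, F),
    }


def side_payment_game(A, B, E, F):
    """Same game, but every defector pays the other player A - E."""
    payment = A - E
    game = {}
    for (row, col), (r, c) in base_game(A, B, E, F).items():
        if row == DEFECT:  # row player pays the column player
            r, c = r - payment, c + payment
        if col == DEFECT:  # column player pays the row player
            r, c = r + payment, c - payment
        game[(row, col)] = (r, c)
    return game


def dominant_move(game):
    """Return the row player's strictly dominant move, or None if there is none.

    The games here are symmetric, so the same answer holds for the column player.
    """
    for mine in MOVES:
        other = DEFECT if mine == COOPERATE else COOPERATE
        if all(game[(mine, col)][0] > game[(other, col)][0] for col in MOVES):
            return mine
    return None
```

With the illustrative payoffs, `dominant_move(base_game(3, 5, 0, 1))` is `"D"` while `dominant_move(side_payment_game(3, 5, 0, 1))` is `"C"`, matching the (F,F) and (A,A) outcomes in the tables.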
Now let us examine what happens with pure punishment: no payment is made to the player who is defected against; instead, the defector simply loses the utility in question. Let the punishment have severity C for defecting against a cooperator and severity D for defecting against a defector, where C>B-A and D>F-E:

Offer, response to offer: Accept

	Cooperate	Defect
Cooperate	(A,A)	(E,B-C)
Defect	(B-C,E)	(F-D,F-D)

Outcome: (A,A)

Offer, response to offer: Reject

	Cooperate	Defect
Cooperate	(A,A)	(E,B)
Defect	(B,E)	(F,F)

Outcome: (F,F)

Do not offer (response: Accept or Reject; the payoffs are the same either way)

	Cooperate	Defect
Cooperate	(A,A)	(E,B)
Defect	(B,E)	(F,F)

Outcome: (F,F)

So, even with no reparations made (or, depending on how one looks at it, no protection against the selfishness of others), each person's interests are served by taking this approach.
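The two thresholds, C>B-A and D>F-E, can be tested directly. This is a rough sketch under assumed numbers; the function names are hypothetical and the payoffs used in the note below are illustrative only.

```python
def punished_payoffs(A, B, E, F, C, D):
    """One player's payoff for each (own move, other's move) under pure punishment.

    C is deducted for defecting against a cooperator,
    D is deducted for defecting against a defector.
    """
    return {
        ("cooperate", "cooperate"): A,
        ("cooperate", "defect"): E,
        ("defect", "cooperate"): B - C,
        ("defect", "defect"): F - D,
    }


def cooperation_is_dominant(A, B, E, F, C, D):
    """True if cooperating is strictly better whatever the other player does."""
    p = punished_payoffs(A, B, E, F, C, D)
    return all(
        p[("cooperate", other)] > p[("defect", other)]
        for other in ("cooperate", "defect")
    )
```

With A=3, B=5, E=0, F=1 the thresholds are C>2 and D>1, so `cooperation_is_dominant(3, 5, 0, 1, C=3, D=2)` holds, while a punishment that only meets a threshold with equality (C=2 or D=1) does not make cooperation strictly dominant.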
Canny readers may note that this is similar to the Hobbesian view of government: that without it, each person would defect (resulting in life being "nasty, brutish, and short" as each person pursues their own gain at the expense of everyone else), but the establishment of a justice system (preferably one as draconian as possible) results in each person cooperating out of self-interest. There are a few problems with this perspective, some of which I will cover here:

1. As a general rule, people who break laws do not do so rationally, or at least do not expect to be caught. If lawbreakers did weigh the punishment rationally, there would be no crime in societies with harsh justice systems and near-universal surveillance.
2. The game only takes place over one turn, or with each turn uninfluenced by previous turns. In the real world, future games are affected by previous ones, and the benefits of cooperation especially are affected by punishment. This will be covered in a future installment of the Topic.
3. The game assumes that justice is perfect: that is, that there is never a false conviction or a false acquittal.

Let us look at the third of these. We will have a probability P(punishment|cooperation), which is the probability of punishment given cooperation (that is, the chance that someone cooperating will be falsely convicted) and a probability P(punishment|defection), which is the probability of punishment given defection (that is, the chance that someone defecting will be convicted).
Remember that C and D are expected utilities. In other words, each is the actual severity of the punishment multiplied by the probability that the punishment will be carried out: S×P(punishment|defection), where S is the true severity of the punishment. More accurately, because a cooperating player also risks punishment, what deters is the difference in expected punishment between defecting and cooperating: S×(P[punishment|defection]-P[punishment|cooperation]). So the true severity of the punishment must be C or D divided by P(punishment|defection)-P(punishment|cooperation). For pure deterrence to work, then, a justice system must hand out stricter punishments the more mistakes it makes, and (assuming there is a strictest possible sentence) there is a theoretical level of inaccuracy beyond which the system simply cannot function.
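To see how quickly the required sentence grows with inaccuracy, here is a small sketch of the division described above. The function and parameter names, including `max_severity` for the strictest possible sentence, are hypothetical and chosen only for illustration.

```python
def required_severity(expected_punishment, p_given_defection, p_given_cooperation):
    """True severity S needed so that S * (P_d - P_c) equals the expected punishment.

    expected_punishment is C or D from the payoff tables.
    Returns None when defecting is no more likely to be punished than
    cooperating, because then no finite sentence deters at all.
    """
    gap = p_given_defection - p_given_cooperation
    if gap <= 0:
        return None
    return expected_punishment / gap


def system_can_function(expected_punishment, p_given_defection,
                        p_given_cooperation, max_severity):
    """True if the needed sentence does not exceed the strictest one available."""
    needed = required_severity(
        expected_punishment, p_given_defection, p_given_cooperation
    )
    return needed is not None and needed <= max_severity
```

For an expected punishment of 2, a perfect system (probabilities 1.0 and 0.0) needs a sentence of severity 2, while one that convicts 75% of defectors and 25% of cooperators needs severity 4. If the two probabilities are equal, no sentence works.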



Wednesday, December 21, 2011
