scribblings on the bedlam wall: Game Theory: Part VI: Cyclic Games: The Iterated Prisoner's Dilemma I: Reciprocal Relationships

First of all, I know you're out there. I can hear you breathing. Especially you in Russia. And you, the one who reads this in Chrome. Please start saying something. Go ahead and comment; go ahead and follow the blog with the thing in the upper right corner. I promise I don't bite unless bitten.

The following material is flagged Green Level. It is intended to reflect material that the author believes to be a matter of consensus among experts in the field. This belief may be incorrect, however; and as the author is not an expert and does not have an expert fact-checking the article, errors may creep in.

So far, we have only looked at games that happen in one turn. But the world does not work that way. We rarely ever meet someone just once. What someone may think of us in the future must be taken into consideration, as must what someone has already done to us.

Let us explore what happens when we run through the Prisoner's Dilemma again and again.

First, when playing an iterated game, it is possible to change one's strategies in response to what one's opponent has done on the last round. For instance, if I have a tendency to make a particular move, you might adjust your moves to cope with that.
But for our understanding of how a game works to make sense, we should define rules for how we will change our strategies. A few possible ways of handling the Prisoner's Dilemma are listed below:

Altruistic: Always cooperate.
Sociopathic: Always defect.
Tit-For-Tat: Start with cooperate, afterward repeat opponent's last move.
Cynical Tit-For-Tat: Start with defect, afterward repeat opponent's last move.
Grim Trigger: Cooperate until opponent defects, always defect afterward.
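
The five rulesets above can be sketched as small Python functions. This is only an illustration: the function names, the "C"/"D" encoding, and the `play` helper are assumptions made for the sketch, not anything from the post itself. Each strategy looks only at the opponent's past moves.

```python
from typing import Callable, List, Tuple

Move = str  # "C" = cooperate, "D" = defect (encoding assumed for this sketch)
Strategy = Callable[[List[Move]], Move]  # opponent's past moves -> our next move


def altruistic(opponent_moves: List[Move]) -> Move:
    return "C"  # always cooperate


def sociopathic(opponent_moves: List[Move]) -> Move:
    return "D"  # always defect


def tit_for_tat(opponent_moves: List[Move]) -> Move:
    # Open with cooperate, then copy the opponent's last move.
    return opponent_moves[-1] if opponent_moves else "C"


def cynical_tit_for_tat(opponent_moves: List[Move]) -> Move:
    # Open with defect, then copy the opponent's last move.
    return opponent_moves[-1] if opponent_moves else "D"


def grim_trigger(opponent_moves: List[Move]) -> Move:
    # Cooperate until the opponent has defected even once.
    return "D" if "D" in opponent_moves else "C"


def play(first: Strategy, second: Strategy, rounds: int) -> List[Tuple[Move, Move]]:
    """Return the (first, second) moves made in each round."""
    first_hist: List[Move] = []
    second_hist: List[Move] = []
    for _ in range(rounds):
        a = first(second_hist)   # each side sees only the other's history
        b = second(first_hist)
        first_hist.append(a)
        second_hist.append(b)
    return list(zip(first_hist, second_hist))
```

For instance, `play(tit_for_tat, cynical_tit_for_tat, 4)` shows the two players trading defections back and forth forever, which is the alternating pattern that turns up in the tables further down.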

So, we have a few ways of doing this. Let's see how they work out, by running each of them against the others and putting together their totals. (In each cell, the column player's move is listed first, then the row player's. The total so far, in parentheses, is the column player's only.)
Also, this time around we will be using a general version of the Prisoner's Dilemma:

	Cooperate	Defect
Cooperate	(A,A)	(E,B)
Defect	(B,E)	(F, F)

where B>A>F>E
and 2A>B+E (this is in place to ensure that two mutual cooperations are better than exchanging between cooperation and defection)
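
As a rough sketch, the general payoff table and its two conditions can be written down in Python. The function names are made up for illustration, and the numbers in the usage note are just one commonly used choice, not values from this post.

```python
from typing import Tuple


def is_valid_dilemma(A: float, B: float, E: float, F: float) -> bool:
    """Check B > A > F > E and 2A > B + E."""
    return B > A > F > E and 2 * A > B + E


def payoffs(row_move: str, col_move: str,
            A: float, B: float, E: float, F: float) -> Tuple[float, float]:
    """Return (row player's payoff, column player's payoff) for one round."""
    table = {
        ("C", "C"): (A, A),  # both cooperate
        ("C", "D"): (E, B),  # row cooperates, column defects
        ("D", "C"): (B, E),  # row defects, column cooperates
        ("D", "D"): (F, F),  # both defect
    }
    return table[(row_move, col_move)]
```

With the assumed values A=3, B=5, E=0, F=1, `is_valid_dilemma` returns True. Raising B to 7 breaks the second condition, because taking turns exploiting each other (7+0) would then beat two rounds of mutual cooperation (3+3).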
First pass:

	Altruistic	Sociopathic	Tit-For-Tat	Cynical Tit-For-Tat	Grim Trigger
Altruistic	C,C (A)	D,C (B)	C,C (A)	D,C (B)	C,C (A)
Sociopathic	C,D (E)	D,D (F)	C,D (E)	D,D (F)	C,D (E)
Tit-For-Tat	C,C (A)	D,C (B)	C,C (A)	D,C (B)	C,C (A)
Cynical Tit-For-Tat	C,D (E)	D,D (F)	C,D (E)	D,D (F)	C,D (E)
Grim Trigger	C,C (A)	D,C (B)	C,C (A)	D,C (B)	C,C (A)
Totals:	3A+2E	3B+2F	3A+2E	3B+2F	3A+2E

Second pass:

	Altruistic	Sociopathic	Tit-For-Tat	Cynical Tit-For-Tat	Grim Trigger
Altruistic	C,C (2A)	D,C (2B)	C,C (2A)	C,C (B+A)	C,C (2A)
Sociopathic	C,D (2E)	D,D (2F)	D,D (E+F)	D,D (2F)	D,D (E+F)
Tit-For-Tat	C,C (2A)	D,D (B+F)	C,C (2A)	C,D (B+E)	C,C (2A)
Cynical Tit-For-Tat	C,C (E+A)	D,D (2F)	D,C (B+E)	D,D (2F)	D,C (B+E)
Grim Trigger	C,C (2A)	D,D (B+F)	C,C (2A)	C,D (B+E)	C,C (2A)
Totals:	7A+3E	4B+6F	6A+B+2E+F	A+3B+2E+4F	6A+B+2E+F

Third pass:

	Altruistic	Sociopathic	Tit-For-Tat	Cynical Tit-For-Tat	Grim Trigger
Altruistic	C,C (3A)	D,C (3B)	C,C (3A)	C,C (B+2A)	C,C (3A)
Sociopathic	C,D (3E)	D,D (3F)	D,D (E+2F)	D,D (3F)	D,D (E+2F)
Tit-For-Tat	C,C (3A)	D,D (B+2F)	C,C (3A)	D,C (2B+E)	C,C (3A)
Cynical Tit-For-Tat	C,C (E+2A)	D,D (3F)	C,D (B+2E)	D,D (3F)	D,D (B+E+F)
Grim Trigger	C,C (3A)	D,D (B+2F)	C,C (3A)	D,D (B+E+F)	C,C (3A)
Totals:	11A+4E	5B+10F	9A+B+3E+2F	2A+4B+2E+7F	9A+B+2E+3F

Fourth pass:

	Altruistic	Sociopathic	Tit-For-Tat	Cynical Tit-For-Tat	Grim Trigger
Altruistic	C,C (4A)	D,C (4B)	C,C (4A)	C,C (B+3A)	C,C (4A)
Sociopathic	C,D (4E)	D,D (4F)	D,D (E+3F)	D,D (4F)	D,D (E+3F)
Tit-For-Tat	C,C (4A)	D,D (B+3F)	C,C (4A)	C,D (2B+2E)	C,C (4A)
Cynical Tit-For-Tat	C,C (E+3A)	D,D (4F)	D,C (2B+2E)	D,D (4F)	D,D (B+E+2F)
Grim Trigger	C,C (4A)	D,D (B+3F)	C,C (4A)	D,D (B+E+2F)	C,C (4A)
Totals:	15A+5E	6B+14F	12A+2B+3E+3F	3A+4B+3E+10F	12A+B+2E+5F

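And since the pattern looks stable, we can write a general state for the Xth pass (the [X/2] entries assume X is even, since Tit-For-Tat and Cynical Tit-For-Tat keep alternating moves, and the [X-2] entries assume X is at least 2):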

	Altruistic	Sociopathic	Tit-For-Tat	Cynical Tit-For-Tat	Grim Trigger
Altruistic	C,C (XA)	D,C (XB)	C,C (XA)	C,C (B+[X-1]A)	C,C (XA)
Sociopathic	C,D (XE)	D,D (XF)	D,D (E+[X-1]F)	D,D (XF)	D,D (E+[X-1]F)
Tit-For-Tat	C,C (XA)	D,D (B+[X-1]F)	C,C (XA)	C,D/D,C ([X/2]B+[X/2]E)	C,C (XA)
Cynical Tit-For-Tat	C,C (E+[X-1]A)	D,D (XF)	D,C/C,D ([X/2]B+[X/2]E)	D,D (XF)	D,D (B+E+[X-2]F)
Grim Trigger	C,C (XA)	D,D (B+[X-1]F)	C,C (XA)	D,D (B+E+[X-2]F)	C,C (XA)
Totals:	[4X-1]A+ [X+1]E	[X+2]B+ [4X-2]F	3XA+ [X/2]B+ [X/2+1]E+ [X-1]F	[X-1]A+ [X/2+2]B+ [X/2+1]E+ [3X-2]F	3XA+ B+ 2E+ [2X-3]F
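
Tallies like these are easy to get wrong by hand, so here is a hedged sketch of how the column totals could be recomputed mechanically. It counts how many times the column player earns each payoff letter, so the result stays symbolic. The names `STRATEGIES`, `LETTER`, `pair_total`, and `column_total` are inventions of this sketch.

```python
from collections import Counter
from typing import Callable, Dict, List

# Each strategy maps the opponent's past moves to its next move.
STRATEGIES: Dict[str, Callable[[List[str]], str]] = {
    "Altruistic": lambda opp: "C",
    "Sociopathic": lambda opp: "D",
    "Tit-For-Tat": lambda opp: opp[-1] if opp else "C",
    "Cynical Tit-For-Tat": lambda opp: opp[-1] if opp else "D",
    "Grim Trigger": lambda opp: "D" if "D" in opp else "C",
}

# Payoff letter earned by a player who makes the first move of the pair
# against an opponent who makes the second.
LETTER = {("C", "C"): "A", ("C", "D"): "E", ("D", "C"): "B", ("D", "D"): "F"}


def pair_total(col_name: str, row_name: str, rounds: int) -> Counter:
    """Count the payoff letters the column player collects against one row player."""
    col, row = STRATEGIES[col_name], STRATEGIES[row_name]
    col_hist: List[str] = []
    row_hist: List[str] = []
    total: Counter = Counter()
    for _ in range(rounds):
        c, r = col(row_hist), row(col_hist)  # both choose before seeing this round
        col_hist.append(c)
        row_hist.append(r)
        total[LETTER[(c, r)]] += 1
    return total


def column_total(col_name: str, rounds: int) -> Counter:
    """Sum the column player's letters over all five row players."""
    total: Counter = Counter()
    for row_name in STRATEGIES:
        total.update(pair_total(col_name, row_name, rounds))
    return total
```

Calling `column_total("Tit-For-Tat", 4)` counts 12 A, 2 B, 3 E, and 3 F, which is the fourth-pass Tit-For-Tat total of 12A+2B+3E+3F.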

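So, let's compare these totals. Let's start by comparing the Tit-For-Tat ruleset to the Sociopathic one, taking X to be even. Adding up the Sociopathic column of the general table gives XB+XF+(B+[X-1]F)+XF+(B+[X-1]F) = (X+2)B+(4X-2)F. We want to know when the Tit-For-Tat total comes out ahead:

3XA+(X/2)B+(X/2+1)E+(X-1)F ? (X+2)B+(4X-2)F
3XA+XB/2+XE/2+E+XF-F ? XB+2B+4XF-2F
3XA-XB/2+XE/2-3XF ? 2B-E-F (collect the X terms on the left, the rest on the right)
X(3[A-F]-[B-E]/2) ? (B-E)+(B-F)

The right-hand side is a fixed positive number, since B is the largest payoff. So, in other words, as long as 3(A-F)>(B-E)/2 (in this field of five strategies, at least), the left-hand side keeps growing with X, and eventually pure greed will lose out to the Tit-For-Tat strategy, namely once X>([B-E]+[B-F])/(3[A-F]-[B-E]/2). Note that B>A>F>E and 2A>B+E do not guarantee this by themselves: if mutual defection pays almost as well as mutual cooperation, the Sociopath stays ahead.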
If you like, you can check Tit-For-Tat against other strategies. You can also try putting together something that beats the strategies I gave you.
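
One hedged way to run such checks is to simulate the five-strategy round-robin with concrete numbers and look for the first pass at which one ruleset's column total passes another's. Everything named here (`STRATEGIES`, `column_score`, `first_lead`) is hypothetical, and the payoff values in the usage note are illustrative assumptions rather than values from this post.

```python
from typing import Callable, Dict, List, Optional

# Each strategy maps the opponent's past moves to its next move.
STRATEGIES: Dict[str, Callable[[List[str]], str]] = {
    "Altruistic": lambda opp: "C",
    "Sociopathic": lambda opp: "D",
    "Tit-For-Tat": lambda opp: opp[-1] if opp else "C",
    "Cynical Tit-For-Tat": lambda opp: opp[-1] if opp else "D",
    "Grim Trigger": lambda opp: "D" if "D" in opp else "C",
}


def column_score(name: str, rounds: int,
                 A: float, B: float, E: float, F: float) -> float:
    """Total payoff `name` earns over `rounds` passes against all five rulesets."""
    payoff = {("C", "C"): A, ("C", "D"): E, ("D", "C"): B, ("D", "D"): F}
    me = STRATEGIES[name]
    score = 0.0
    for other in STRATEGIES.values():
        my_hist: List[str] = []
        other_hist: List[str] = []
        for _ in range(rounds):
            m, o = me(other_hist), other(my_hist)
            my_hist.append(m)
            other_hist.append(o)
            score += payoff[(m, o)]
    return score


def first_lead(challenger: str, incumbent: str,
               A: float, B: float, E: float, F: float,
               max_rounds: int = 1000) -> Optional[int]:
    """Smallest pass count at which challenger's total exceeds incumbent's, if any."""
    for x in range(1, max_rounds + 1):
        if column_score(challenger, x, A, B, E, F) > column_score(incumbent, x, A, B, E, F):
            return x
    return None  # no lead found within max_rounds
```

With the assumed payoffs A=3, B=5, E=0, F=1, `first_lead("Tit-For-Tat", "Sociopathic", 3, 5, 0, 1)` returns 4: Tit-For-Tat trails through the third pass and pulls ahead on the fourth. Swapping in a ruleset of your own only takes one more entry in `STRATEGIES`, though that changes the field everyone is scored against.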
Next time, a problem with this solution.

2 comments:

Robert Warren Gilmore, December 19, 2011 at 4:17 AM
I'm afraid to comment because you may have covered my question in a later post that I haven't read yet.

Ah, what the hell. My question: You mentioned in a previous post that a game has a value, basically based on the rational choices that each player would make, taking into account expected value. Does value take into account the sort of predictive decision making that you talked about? And does prediction of the other player always cause an infinite chain of decision changing, or are there non-trivial zero-sum games where a prediction chain would lead to a final outcome?
James, December 19, 2011 at 9:36 PM
Sort of. The value of an outcome to a player is determined by looking at what that player has after that outcome (in terms of things like money, possessions, time, and so forth), figuring out how much each is worth, and adding them up. The value of a game is based on what happens if you state the outcomes of the game in terms of the values of its outcomes, and then figure out what happens. And yes, if you bring things like empathy into the game and make each player's utility function partly dependent on the other's, you can get an infinite series.

Wednesday, October 19, 2011