Other Games - Evolutionary Computation for Modeling and Optimization Daniel Ashlock pdf

In this section, we will touch briefly on several other games that are easily programmable as artificial life systems. Two are standard modifications of Prisoner’s Dilemma; the third is a very different game, called Divide the Dollar.

The payoff matrix we used in Section 6.2 is the classic matrix appearing on page 8 of The Evolution of Cooperation. It is not the only one that game theorists allow. Any payoff matrix of the form given in Figure 6.8 for which S < Y < X < R and S + R < 2X is said to be a payoff matrix for Prisoner’s Dilemma. The ordering of the 4 payoffs is intuitive. The second condition is required to make alternation of cooperation and defection worth less than sustained cooperation. We will begin this section by exploring the violation of that second constraint.

The Graduate School game is like Prisoner’s Dilemma, save that alternating cooperation and defection scores higher, on average, than sustained cooperation. The name is intended to refer to a situation in which both members of a married couple wish to go to graduate school. The payoff for going to school is higher than the payoff for not going, but attending at the same time causes hardship. For the iterated version of this game, think of two preschoolers with a tricycle. It is more fun to take turns than it is to share the tricycle, and both those options are better than fighting over who gets to ride. We will use the payoff matrix given in Figure 6.9.

For the Graduate School game, we must redefine our terms. Complete cooperation consists in two players alternating cooperation and defection. Partial cooperation is exhibited when players both make the cooperative play together. Defection describes two players defecting.

                Player 2
                C        D
Player 1   C   (X,X)    (S,R)
           D   (R,S)    (Y,Y)

Fig. 6.8. General payoff matrix for Prisoner’s Dilemma. (Prisoner’s Dilemma requires that S < Y < X < R and S + R < 2X.)

                Player 2
                C        D
Player 1   C   (3,3)    (0,7)
           D   (7,0)    (1,1)

Fig. 6.9. Payoff matrix for the Graduate School game.
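The two conditions on the payoffs can be checked mechanically. The Python sketch below is an illustration only: the function names are hypothetical, and the values S = 0, Y = 1, X = 3, R = 5 are assumed to be the classic matrix of Section 6.2, which this section does not restate. The Graduate School values come from Figure 6.9.

```python
def is_prisoners_dilemma(S, Y, X, R):
    """True when S < Y < X < R and S + R < 2X (both conditions of Figure 6.8)."""
    return S < Y < X < R and S + R < 2 * X


def alternation_average(S, R):
    """Average per-round score of a player in a pair that alternates
    (C, D) and (D, C), so that each player scores S and R in turn."""
    return (S + R) / 2


classic = dict(S=0, Y=1, X=3, R=5)       # assumed classic values (not from this section)
grad_school = dict(S=0, Y=1, X=3, R=7)   # values from Figure 6.9

for name, m in (("classic", classic), ("graduate school", grad_school)):
    print(name, is_prisoners_dilemma(**m), alternation_average(m["S"], m["R"]))
```

With the Figure 6.9 values, alternation averages 3.5 points per round, which exceeds the 3 points earned by sustained mutual cooperation; this is why a score above 3 signals complete cooperation in the Graduate School game.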

Experiment 6.10 Use the software from Experiment 6.7 with the payoff matrix modified to play the Graduate School game. As in Experiment 6.5, save the final ecologies. Also, count the number of generations in which an ecology has a score above 3; these are generations in which it is clear there is complete cooperation taking place. Answer the following questions.

(i) Is complete cooperation rare, occasional, or common?

(ii) Is the self-play string histogram materially different from that in Experiment 6.6?

(iii) What is the fraction of the populations that have a dominant strategy?

A game is said to be optional if the players may decide whether they will or will not play. Let us construct an optional game built upon Iterated Prisoner’s Dilemma by adding a third move called “Pass.” If either player makes the play “Pass,” both score 0, and we count that round of the game as not played. Call this game Optional Prisoner’s Dilemma. The option of refusing to play has a profound effect on Prisoner’s Dilemma, as we will see in the next experiment.

Experiment 6.11 Modify the software from Experiment 6.5 with n = 150 to work on finite state automata with input and response alphabets {C, D, P}. Scoring is as in Prisoner’s Dilemma, save that if either player makes the P move, then both score zero. In addition to a player’s score, save the number of times he actually played instead of passing or being passed by the other player. First, run the evolutionary algorithm as before, with fitness equal to total score. Next, change the fitness function to be score divided by number of plays. Comment on the total level of cooperation as compared to the nonoptional game and also comment on the differences between the two types of runs in this experiment.
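The scoring rule for Optional Prisoner’s Dilemma can be sketched as follows. This is a minimal illustration rather than the software the experiments refer to: the names `PAYOFF`, `score_round`, `tally`, and `fitness_per_play` are hypothetical, the payoff values are the assumed classic ones (S = 0, Y = 1, X = 3, R = 5), and returning 0 when no rounds were played is an assumption made here to avoid division by zero.

```python
# Payoffs to (player 1, player 2) for the non-passing moves.
# Assumed classic values; they are not stated in this section.
PAYOFF = {
    ("C", "C"): (3, 3),
    ("C", "D"): (0, 5),
    ("D", "C"): (5, 0),
    ("D", "D"): (1, 1),
}


def score_round(move1, move2):
    """Return (score1, score2, played) for one round.
    If either move is "P", both score 0 and the round does not count as played."""
    if move1 == "P" or move2 == "P":
        return 0, 0, False
    score1, score2 = PAYOFF[(move1, move2)]
    return score1, score2, True


def tally(moves1, moves2):
    """Total score of each player and the number of rounds actually played."""
    total1 = total2 = plays = 0
    for m1, m2 in zip(moves1, moves2):
        s1, s2, played = score_round(m1, m2)
        total1 += s1
        total2 += s2
        plays += played  # True counts as 1, False as 0
    return total1, total2, plays


def fitness_per_play(total, plays):
    """Score divided by number of plays; 0.0 if no round was played (assumption)."""
    return total / plays if plays else 0.0


total1, total2, plays = tally("CDPC", "CCCP")
print(total1, total2, plays, fitness_per_play(total1, plays))
```

Keeping the play count separate from the score makes it easy to switch between the two fitness functions the experiment compares.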

At this point, we will depart radically from Iterated Prisoner’s Dilemma to a game with a continuous set of moves. The game Divide the Dollar is played as follows. An infinitely wealthy referee asks two players to write down what fraction of a dollar they would like to have for their very own. Each player writes a bid down on a piece of paper and hands the paper to the referee. If the bids total at most one dollar, the referee pays both players the amount they bid. If the bids total more than a dollar, both players receive nothing.

For now, we will keep the data structure for playing Divide the Dollar simple. A player will have a gene containing 6 real numbers (yes, we will allow fractional cents). The first is the initial bid. The next 5 are the amount to bid if the last payout p (in cents) from the referee was 0, 0 < p ≤ 25, 25 < p ≤ 50, 50 < p ≤ 75, or p > 75, respectively.
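The 6-number gene can be read as a small lookup table keyed on the last payout. The sketch below is one possible reading of the description above, not the book’s software; the function names `next_bid` and `play` are hypothetical, bids are in cents, and no clamping of out-of-range bids is attempted.

```python
def next_bid(gene, last_payout):
    """Pick a bid from a 6-number gene.

    gene[0] is the initial bid; gene[1..5] are the bids to make when the
    last payout p was 0, 0 < p <= 25, 25 < p <= 50, 50 < p <= 75, or p > 75.
    last_payout is None before the first round.
    """
    if last_payout is None:
        return gene[0]
    if last_payout <= 0:
        return gene[1]
    if last_payout <= 25:
        return gene[2]
    if last_payout <= 50:
        return gene[3]
    if last_payout <= 75:
        return gene[4]
    return gene[5]


def play(gene_a, gene_b, rounds=50):
    """Play repeated Divide the Dollar; return the total cash of each player."""
    total_a = total_b = 0.0
    pay_a = pay_b = None
    for _ in range(rounds):
        bid_a = next_bid(gene_a, pay_a)
        bid_b = next_bid(gene_b, pay_b)
        if bid_a + bid_b <= 100.0:      # bids total at most one dollar
            pay_a, pay_b = bid_a, bid_b
        else:                           # bids total more than a dollar
            pay_a = pay_b = 0.0
        total_a += pay_a
        total_b += pay_b
    return total_a, total_b


fair = [50.0] * 6
print(play(fair, fair, rounds=50))
```

Two players who always bid 50 each collect 50 cents in every round, while a pair bidding 48 and 87 collects nothing, since their bids total more than a dollar.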

Experiment 6.12 Modify the software from Experiment 3.1 to work on the 6-number genome for Divide the Dollar given above. Set the maximum mutation size to be 3.0. Use a population size of 36. Replace the fitness function with the total cash a player gets in a round robin tournament with each pair playing 50 times. Run 50 populations, saving the average fitness and the low and high bids accepted in each generation of each population, for 60 generations. Graph the average, over the populations, of the per generation fitness and the high and low bids.

Divide the Dollar is similar to Prisoner’s Dilemma in that it involves cooperation and defection: high bids in Divide the Dollar are a form of defection; bids of 50 (or not far below) are a form of cooperation. Low bids, however, are a form of capitulation, a possibility not available in Prisoner’s Dilemma. Also, in Divide the Dollar the result of one player cooperating (say bidding 48) and one defecting (say bidding 87) is zero payoff for both. From this discussion, it seems that single moves of Divide the Dollar do not map well onto single moves of Prisoner’s Dilemma. If we define cooperation to be making bids that result in a referee payout, we can draw one parallel, however.

Experiment 6.13 Following Experiment 6.7, modify the software from Experiment 6.12 so that it also saves the fraction of bids with payouts in each generation. Run 30 populations as before and graph the average fraction of acceptance of bids per generation over all the populations. Modify the software to use tournament selection with tournament size 6 and do the experiment again. What were the effects of changing the tournament size? Did they parallel Experiment 6.7?

There is an infinite number of games we could explore, but we have done enough for now. We will return to game theory in future chapters once we have developed more artificial life machinery. If you have already studied game theory, you will notice that the treatment of the subject in this chapter differs radically from the presentation in a traditional game theory course. The approach is experimental (an avenue only recently opened to students by large, cheap digital computers) and avoids lengthy and difficult mathematical analyses. If you found this chapter interesting or entertaining, you should consider taking a mathematical course in game theory. Such a course is sometimes found in a math department, occasionally in a biology department, but most often in an economics department.

Problems

Problem 220. In the Graduate School game, is it possible for a finite state automaton to completely cooperate with a copy of itself? Prove your answer. Write a paragraph about the effect this might have on population diversity as compared to Prisoner’s Dilemma.

Problem 221. Suppose we have a pair of finite state automata of the sort we used to play Prisoner’s Dilemma or the Graduate School game. If the automata have n states, what is the longest they can continue to play before they repeat a set of states and responses they were both in before? If we were to view the pair of automata as a single finite state device engaged in self play, how many states would it have and what would be its input and response alphabets?

Problem 222. Take all of the one-state finite state automata with input and response alphabets {C, D}, and discuss their quality as strategies for playing the Graduate School game. Which pairs work well together? Hint: there are 8 such automata.

Problem 223. Essay. Explain why it is meaningless to speak of a single finite state automaton as coding a good strategy for the Graduate School game.

Problem 224. Find an error-correcting strategy for the Graduate School game.

Problem 225. Essay. Find a real-life situation to which Optional Prisoner’s Dilemma would apply and write up the situation in the fashion of the story of the drug dealer and his customer in Section 6.2.

Problem 226. Are the data structures used in Experiments 6.12 and 6.13 finite state automata? If so, how many states do they have and what are their input and response alphabets?

Problem 227. Is a pair of the data structures used in Experiments 6.12 and 6.13 a finite state automaton? Justify your answer carefully.

Problem 228. Essay. Describe a method of using finite state automata to play Divide the Dollar. Do not change the set of moves in the game to a discrete set, e.g., the integers 1–100, and then use that as the automaton’s input and response alphabet. Such a finite state automaton would be quite cumbersome, and more elegant methods are available. It is just fine to have the real numbers in the range 0–100 as your response alphabet; you just cannot use them directly as input.

Problem 229. To do this problem you must first do Problem 228. Assume that misunderstanding a bid in Divide the Dollar consists in replacing the bid b with (100 − b). Using the finite state system you developed in Problem 228, explain what an error-correcting strategy is and give an example of one.

Ordered Structures

The representations we have used thus far have all been built around arrays or vectors of similar elements, be they characters, real numbers, the ships’ systems from Sunburn, or states of a finite state automaton. The value at one state in a gene has no effect on what values may be present at another location, except for nonexplicit constraints implied by the fitness function.

In this chapter, we will work with ordered lists of items called permutations, in which the list contains a specified collection of items once each. Just as we used the simple string evolver in Chapter 2 to learn how evolutionary algorithms worked, we will start with easy problems to learn how systems for evolving ordered genes work. The first section of this chapter is devoted to implementing two different representations for permutations: a direct representation storing permutations as lists of integers 0, 1, . . . , n, varying only the order in which the integers appear, and the random key representation, which stores a permutation as an array of real numbers. To test these representations, we will use them to minimize the number of reversals in a permutation, in effect to sort it, and to maximize the order of a permutation under composition.

The second section of the chapter will introduce the Traveling Salesman problem. This problem involves finding a minimum-length cyclic tour of a set of cities. The third section will combine permutations with a greedy algorithm to permit us to evolve packings of various sizes of objects into containers with fixed capacity; this is an interesting problem with a number of applications.

The last section will introduce a highly technical mathematical problem, that of locating Costas arrays. Used in the processing and interpretation of sonar data, some orders of Costas arrays are not known to exist. The author would be overjoyed if anyone finding one of these unknown arrays would inform him.

The dependencies of the experiments in this chapter are given in Figure 7.1. Notice that there are several sets of experiments that do not share code.

The basic definition of a permutation is simple: an order in which to list a collection of items, no two of which are the same. To work with structures of this type, we will need a bit of algebra and a cloud of definitions.


[Fig. 7.1. Dependency diagram for Experiments 7.1 through 7.27, with a link to Chapter 13. Key to the experiments:]

1 Evolving permutations to maximize reversals.
2 Explore the impact of permutation and population size.
3 Evolve permutations to maximize the permutation’s order.
4 Change permutation lengths when maximizing order.
5 Stochastic hill-climber baseline for maximizing order.
6 Introducing random key encodings for reversals.
7 Maximizing order with random key encodings.
8 Introducing the Traveling Salesman problem.
9 Random-key-encoded TSP.
10 Adding more cities to the TSP.
11 Exploring the difficulty of different city arrangements.
12 Using random city coordinates.
13 Population seeding with the closest-city heuristic.
14 Population seeding with the random key encoding.
15 Closest-city and city-insertion heuristics.
16 Population seeding with the city-insertion heuristic.
17 Testing greedy packing fitness on random permutations.
18 Stochastic hill-climber for greedy packing fitness.
19 Evolving solutions to the Packing problem.
20 Exploring different cases of the Packing problem.
21 Problem case generator for the Packing problem.
22 Population seeding for the Packing problem.
23 Evolving Costas arrays.
24 Varying the mutation size.
25 Varying the crossover type.
26 Finding the largest size array you can.
27 Population seeding to evolve Costas arrays.

Definition 7.1 A permutation of the set N = {0, 1, . . . , n−1} is a bijection of N with itself.

Theorem 1. There are n! := n · (n−1) · · · 2 · 1 different permutations of n items.

Proof:

Order the n items. There are n choices of items onto which the first item may be mapped. Since a permutation is a bijection, there are n−1 items onto which the second item may be mapped. Continuing in like fashion, we see the number of choices of destination for the ith item is n−i+1. Since these choices are made independently of one another with past choice not influencing present choice among the available items, the choices multiply, yielding the stated number of permutations. □
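The counting argument in the proof can be checked for small n by listing the permutations directly. The sketch below is only a sanity check, using Python’s standard `itertools.permutations` and `math.factorial`; the helper name `count_by_choices` is hypothetical.

```python
from itertools import permutations
from math import factorial


def count_by_choices(n):
    """Multiply the number of destinations left for items 1..n,
    that is n, n-1, ..., 1, following the proof of Theorem 1."""
    count = 1
    for i in range(1, n + 1):
        count *= n - i + 1
    return count


for n in range(1, 7):
    listed = sum(1 for _ in permutations(range(n)))  # enumerate every permutation
    assert listed == count_by_choices(n) == factorial(n)
    print(n, listed)
```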

Example 9. There are several ways to represent a permutation. Suppose the permutation f is: f(0) = 0, f(1) = 2, f(2) = 4, f(3) = 1, and f(4) = 3. It can be represented thus in two-line notation:

0 1 2 3 4
0 2 4 1 3

Two-line notation lists the set in “standard” order in its first line and in the permuted order in the second line. One-line notation,

0 2 4 1 3,

is two-line notation with the first line gone. Another notation commonly used is called cycle notation. Cycle notation gives permutations as a list of disjoint cycles, ordered by their leading items, with each cycle tracing how a group of points are taken to one another. The cycle notation for our example is

(0)(1 2 4 3),

because 0 goes to 0, 1 goes to 2 goes to 4 goes to 3 returns to 1.

Be careful! If the items in a permutation make a single cycle, then it is easy to confuse one-line and cycle notation.

Example 10. Here is a permutation of the set {0,1,2,3,4,5,6,7} shown in one-line, two-line, and cycle notation.

Two line: 0 1 2 3 4 5 6 7
          2 3 4 7 5 6 0 1
One line: 2 3 4 7 5 6 0 1
Cycle:    (0 2 4 5 6)(1 3 7)
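Converting one-line notation to cycle notation is a matter of following each unvisited point until it returns to its start. The following Python sketch assumes a permutation is stored as a list `p` with `p[i]` equal to the image of `i`; the function names `to_cycles` and `format_cycles` are hypothetical.

```python
def to_cycles(one_line):
    """Convert one-line notation (one_line[i] is the image of i) into a list
    of disjoint cycles, ordered by their leading (smallest) items."""
    seen = set()
    cycles = []
    for start in range(len(one_line)):
        if start in seen:
            continue
        cycle = []
        x = start
        while x not in seen:       # follow start -> f(start) -> ... back to start
            seen.add(x)
            cycle.append(x)
            x = one_line[x]
        cycles.append(tuple(cycle))
    return cycles


def format_cycles(cycles):
    """Render cycles in the style used in the text, e.g. (0)(1 2 4 3)."""
    return "".join("(" + " ".join(map(str, c)) + ")" for c in cycles)


print(format_cycles(to_cycles([0, 2, 4, 1, 3])))           # (0)(1 2 4 3)
print(format_cycles(to_cycles([2, 3, 4, 7, 5, 6, 0, 1])))  # (0 2 4 5 6)(1 3 7)
```

The two calls reproduce the cycle notation given in Examples 9 and 10.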


A permutation uses each item in the set once. The only real content of the permutation is the order of the list of items. Since permutations are functions, they can be composed.

Definition 7.2 Multiplication of permutations is done by composing them.

(f ∗ g)(x) := f(g(x)). (7.1)

Definition 7.3 The permutation that takes every point to itself is the identity permutation. We give it the name e.

Since permutations are bijections, it is possible to undo them, and so permutations have inverses.

Definition 7.4 The inverse of a permutation f(x) is the permutation f⁻¹(x) such that

f(f⁻¹(x)) = f⁻¹(f(x)) = x.

In terms of the multiplication operation, the above would be written f ∗ f⁻¹ = f⁻¹ ∗ f = e.

Example 11. Suppose we have the permutations in cycle notation f = (02413) and g = (012)(34). Then

f ∗ g = f(g(x)) = (0314)(2),
g ∗ f = g(f(x)) = (0)(1423),
f ∗ f = f(f(x)) = (04321),
g ∗ g = g(g(x)) = (021)(3)(4),
f⁻¹ = f⁻¹(x) = (03142),
g⁻¹ = g⁻¹(x) = (021)(34).
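The products and inverses in Example 11 can be verified with a few lines of Python. This is a sketch under the assumption that a permutation on {0, . . . , n−1} is stored as a list `p` with `p[x]` equal to the image of `x`; the helper names `from_cycles`, `compose`, and `inverse` are hypothetical.

```python
def from_cycles(cycles, n):
    """Build the list form of a permutation of {0, ..., n-1} from its cycles.
    Points not mentioned in any cycle are fixed."""
    perm = list(range(n))
    for cycle in cycles:
        for i, x in enumerate(cycle):
            perm[x] = cycle[(i + 1) % len(cycle)]
    return perm


def compose(f, g):
    """Multiplication as in Equation 7.1: (f * g)(x) = f(g(x))."""
    return [f[g[x]] for x in range(len(g))]


def inverse(f):
    """The permutation that undoes f."""
    inv = [0] * len(f)
    for x, y in enumerate(f):
        inv[y] = x
    return inv


f = from_cycles([(0, 2, 4, 1, 3)], 5)
g = from_cycles([(0, 1, 2), (3, 4)], 5)
e = list(range(5))  # identity permutation

assert compose(f, g) == from_cycles([(0, 3, 1, 4)], 5)       # (0314)(2)
assert compose(g, f) == from_cycles([(1, 4, 2, 3)], 5)       # (0)(1423)
assert compose(f, f) == from_cycles([(0, 4, 3, 2, 1)], 5)    # (04321)
assert compose(g, g) == from_cycles([(0, 2, 1)], 5)          # (021)(3)(4)
assert inverse(f) == from_cycles([(0, 3, 1, 4, 2)], 5)       # (03142)
assert inverse(g) == from_cycles([(0, 2, 1), (3, 4)], 5)     # (021)(34)
assert compose(f, inverse(f)) == compose(inverse(f), f) == e
```

Note that f ∗ g and g ∗ f differ, so the order of the factors matters when multiplying permutations.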

Cycle notation may seem sort of weird at ﬁrst, but it is quite useful. The following deﬁnition and theorem will help you to see why.

Definition 7.5 The order of a permutation is the smallest number k such that if the permutation is composed with itself k times, the result is the identity permutation e. The order of the identity is 1, and all permutations of a finite set have finite order.

Theorem 2. The order of a permutation is the least common multiple of the lengths of its cycles in cycle notation.

Proof:

Consider a single cycle. If we repeat the action of the cycle a number of times less than its length, then its first item is taken to some other member of the cycle. If the number of repetitions is a multiple of the length of the cycle, then each item returns to its original position. It follows that, for a permutation, the order of the entire permutation is a common multiple of its cycle lengths. Since the action of the cycles on their constituent points is independent, it follows that the order is the least common multiple. □
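Theorem 2 gives a fast way to compute the order of a permutation, and it can be compared against the definition by brute force. The sketch below assumes the list representation in which `perm[x]` is the image of `x`; the function names are hypothetical.

```python
from functools import reduce
from math import gcd


def cycle_lengths(perm):
    """Lengths of the disjoint cycles of perm (perm[x] is the image of x)."""
    seen = [False] * len(perm)
    lengths = []
    for start in range(len(perm)):
        length = 0
        x = start
        while not seen[x]:
            seen[x] = True
            x = perm[x]
            length += 1
        if length:
            lengths.append(length)
    return lengths


def lcm(a, b):
    return a * b // gcd(a, b)


def order_by_lcm(perm):
    """Order via Theorem 2: least common multiple of the cycle lengths."""
    return reduce(lcm, cycle_lengths(perm), 1)


def order_by_composition(perm):
    """Order via Definition 7.5: compose perm with itself until the identity appears."""
    identity = list(range(len(perm)))
    power, k = list(perm), 1
    while power != identity:
        power = [perm[x] for x in power]  # one more application of perm
        k += 1
    return k


g = [1, 2, 0, 4, 3]  # (012)(34), cycle lengths 3 and 2
print(order_by_lcm(g), order_by_composition(g))  # 6 6
```

The brute-force version takes time proportional to the order, which can be large, while the cycle-length version needs only one pass over the permutation.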

Definition 7.6 The cycle type of a permutation is a list of the lengths of the permutation’s cycles in cycle notation. The cycle type is an unordered partition of n into positive pieces.

Example 12. If n = 5, then the cycle type of e (the identity) is 1 1 1 1 1. The cycle type of (01234) is 5. The cycle type of (012)(34) is 3 2.

  n   Max Order      n   Max Order      n   Max Order
  1        1        13       60        25      1260
  2        2        14       84        26      1260
  3        3        15      105        27      1540
  4        4        16      140        28      2310
  5        6        17      210        29      2520
  6        6        18      210        30      4620
  7       12        19      420        31      4620
  8       15        20      420        32      5460
  9       20        21      420        33      5460
 10       30        22      420        34      9240
 11       30        23      840        35      9240
 12       60        24      840        36    13,860

Table 7.1. The maximum order of a permutation of n items.

Table 7.1 gives the maximum order possible for a permutation of n items (n ≤ 36). The behavior of this number is sort of weird, growing abruptly with n sometimes and staying level other times. More values for the maximum order of a permutation of n items may be found in the On-Line Encyclopedia of Integer Sequences [51].
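By Theorem 2, the maximum order of a permutation of n items is the largest least common multiple over all cycle types, that is, over all partitions of n. The following brute-force sketch recomputes the entries of Table 7.1 that way; it is an illustration, not a method taken from the text, and the names `partitions` and `max_order` are hypothetical.

```python
from math import gcd


def partitions(n, largest=None):
    """Yield every partition of n as a non-increasing list of positive parts."""
    if largest is None:
        largest = n
    if n == 0:
        yield []
        return
    for part in range(min(n, largest), 0, -1):
        for rest in partitions(n - part, part):
            yield [part] + rest


def lcm(a, b):
    return a * b // gcd(a, b)


def max_order(n):
    """Largest order of any permutation of n items: the maximum, over all
    cycle types (partitions of n), of the lcm of the cycle lengths."""
    best = 1
    for cycle_type in partitions(n):
        value = 1
        for length in cycle_type:
            value = lcm(value, length)
        best = max(best, value)
    return best


for n in range(1, 13):
    print(n, max_order(n))
```

Enumerating partitions is adequate for n ≤ 36, where the number of partitions is still in the thousands, but it grows quickly for larger n.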


Permutation    Order
(012)(34)      6
