Finite State Predictors - Evolutionary Computation for Modeling and Optimization Daniel Ashlo

A ﬁnite state automaton consists of an input alphabet, an response alphabet, a collection of states, a transition function, a response function, an initial state, and an initial response. The states are internal markers used as memory, like the tumblers of a combination lock that “remember” whether the user is currently dialing in the second or third number in the combination. The transition function encodes how the automaton moves from one state to another. The response function encodes the responses produced by the automaton, depending on the current state and input.

An example may help make some of this clear. Consider a thermostat. The thermostat makes a decision every little while and must not change abruptly from running the furnace to running the air-conditioner and vice versa. The input alphabet for the thermostat is {hot, okay, cold}. The response alphabet of a thermostat is {air-conditioner, do-nothing, furnace}. The states are

{ready, heating, cooling, just-heated, just-cooled}. The initial response is do- nothing. The initial state, transition function and response function are shown in Figure 6.2.

The thermostat uses the “just-cooled” and “just-heated” states to avoid going from running the air-conditioner to the furnace (or the reverse) abruptly. As an added benefit, the furnace and air-conditioner don’t pop on and off; the “just” states slow the electronics down to where they don’t hurt the poor machinery. If this delay were not needed, we wouldn’t need to distinguish between states and responses. We could let the states be the set of responses and “do” whatever state we were in. A finite state automaton that does this is called a Moore machine. The more usual type of finite state automaton, with an explicitly separate response function, is termed aMealeymachine. In general, we will use the Mealey architecture.

Notice that the transition functiont(shown in the second column of Figure 6.2) is a function from the set of ordered pairs of states and inputs to the set of states; i.e., t(state,input) is a member of the set of states. The response functionr(in the third column) is a function from the set of ordered pairs of states and inputs to the set of responses, i.e., r(state,input) is a member of the response alphabet.

Colloquially speaking, the automaton sits in a state until an input comes. When an input comes, the automaton generates a response (with its response

146 Evolutionary Computation for Modeling and Optimization Initial State: ready

When current state make a transition and respond

and input are to state with

(hot, ready) cooling air-conditioner (hot, heating) just-heated do-nothing (hot, cooling) cooling air-conditioner (hot, just-heated) ready do-nothing (hot, just-cooled) ready do-nothing (okay, ready) ready do-nothing (okay, heating) just-heated do-nothing (okay, cooling) just-cooled do-nothing (okay, just-heated) ready do-nothing (okay, just-cooled) ready do-nothing (cold, ready) heating furnace (cold, heating) heating furnace (cold, cooling) just-cooled do-nothing (cold, just-heated) ready do-nothing (cold, just-cooled) ready do-nothing

Fig. 6.2.A thermostat as a ﬁnite state automaton.

function) and moves to a new state (which is found by consulting the transition function). The initial state and initial response specify where to start.

For the remainder of this section, the input and response alphabets will both be{0,1}, and the task will be to learn to predict the next bit of an input stream of bits.

0 0/1

1/0

A

_B

Fig. 6.3.A ﬁnite state automaton diagram.

A finite state automaton of this type is shown in Figure 6.3. It has two states: state A and state B. The transition function is specified by the arrows in the diagram, and the arrow labels are of the form input/response. The initial response is on an arrow that does not start at a state and that points to the initial state. This sort of diagram is handy for representing automata on paper. Formally: the finite state automaton’s response function is r(A,0) = 1, r(A,1) = 0, r(B,0) = 1, r(B,1) = 0, and the initial response is 0. Its

transition function ist(A,0) =A,t(A,1) =B,t(B,0) =B,t(B,1) =A. The initial state isA. Initial response:0 Initial state:A State If 0 If 1 A 1→A 0→B B 1→B 0→A

Fig. 6.4. A ﬁnite state automaton table.

Figure 6.4 specifies the finite state automaton shown in Figure 6.3 in tabular format. This is not identical to the tabular format used in Figure 6.2. It is less explicit about the identity of the functions it is specifying and much easier to read. The table starts by giving the initial response and initial state of the finite state automaton. The rest of the table is a matrix with rows indexed by states and columns indexed by inputs. The entries of this matrix are of the form response→state. This means that when the automaton is in the state indexing the row and sees the response indexing the column, it will make the response given at the tail of the arrow, and then make a transition to the state at the arrow’s head.

You may want to develop a computer data structure for representing ﬁnite state automata (FSA). You should deﬁnitely build a routine that can print an FSA in roughly the tabular form given in Figure 6.4; it will be an invaluable aid in debugging experiments.

So that we can perform crossover with ﬁnite state automata, we will describe them as arrays of integers and then use the usual crossover operators for arrays. We can either group the integers describing the transition and response functions together, termed functional grouping, or we can group the integers describing individual states together, termed structuralgrouping. In Example 5, both these techniques are shown. Functional grouping places the integers describing the transition function and those describing the response function in contiguous blocks, making it easy for crossover to preserve large parts of their individual structure. Structural groupings place descriptions of individual states of an FSA into contiguous blocks, making their preservation easy. Which sort of grouping is better depends entirely on the problem being studied.

Example 5.We will change the ﬁnite state automaton from Figure 6.4 into an array of integers in the structural and functional manners. First we strip the ﬁnite state automaton down to the integers that describe it (setting A = 0 andB = 1) as follows:

148 Evolutionary Computation for Modeling and Optimization Initial response:0 Initial state:A State If 0 If 1 A 1→A 0→B B 1→B 0→A 0 0 1 0 0 1 1 1 0 0

To get the structural grouping gene we simply read the stripped table from left to right, assembling the integers into the array

0010011100. (6.1)

To get the functional gene we note that the pairs of integers in the stripped version of the table above are of the form

response state .

We thus take the ﬁrst integer (the response) in each pair from left to right, and then the second integer (the transition state) in each pair from left to right to obtain the gene

0010100110. (6.2)

Note that in both the functional and structural genes the initial response and initial state are the ﬁrst two integers in the gene.

We also want a deﬁnition of point mutation for a ﬁnite state automaton. Pick at random any one of the initial response, the initial state, any transition state, and any response; replace it with a randomly chosen valid value.

Now that we know how to do crossover and mutation, we can run an evolutionary algorithm on a population of finite state automata. For our first such evolutionary algorithm, we will use a task inspired by a Computer Recreations column in Scientific American. Somewhat reminiscent of the string evolver, this task starts with a reference string. We will evolve a population of finite state automata that can predict the next bit of the reference string as that string is fed to them one bit at a time.

We need to define the alphabets for this task and the fitness function. The reference string is over the alphabet{0,1}, which is also the input alphabet and the response alphabet of the automaton. The fitness function is called the String Prediction fitness function, computed as follows. Pick a reference string in{0,1}∗and a number of bits to feed the automaton. Bits beyond the length of the string are obtained by cycling back though the string again. Initialize fitness to zero. If the first bit of the string matches the initial response of the FSA, fitness is +1. After this, we use the input bits as inputs to the FSA, checking the response of the FSA against the next bit from the string; each time they match, fitness is +1. The finite state automaton is being scored on its ability to correctly guess the next bit of the input.

Example 6.Compute the String Prediction ﬁtness of the ﬁnite state automaton in Figure 6.3 on the string 011 with 6 bits.

Step FSM String bit State after Fitness guess guess 0 0 0 A +1 1 1 1 A +1 2 0 1 B - 3 0 0 A +1 4 1 1 A +1 5 0 1 B - Total ﬁtness: 4

The String Prediction fitness function gives us the last piece needed to run our first evolutionary algorithm on finite state automata.

Experiment 6.1 Write or obtain software for randomly generating, print- ing, and handling file input/output of finite state automata as well as the variation operators described above. Create an evolutionary algorithm using size-4 tournament selection, two-point crossover, single-point mutation, and String Prediction fitness. Use the structural grouping for your crossover. Run 30 populations for up to 1000 generations, recording time-to-solution (or the fact of failure) for populations of 100 finite state automata with

(i) Reference string001,6 bits, 4-state FSA, (ii) Reference string001111,12bits, 4-state FSA, (iii) Reference string 001111,12bits, 8-state FSA.

Deﬁne “solution” to consist in having at least one creature whose ﬁtness equals the number of bits used. Graph the fraction of populations that have succeeded as a function of the number of generations for all 3 sets of runs on the same set of axes.

Experiment 6.2 Redo Experiment 6.1 with functional grouping used to rep- resent the automaton for crossover. Does this make a diﬀerence?

Let’s try another fitness function. TheSelf-Driving Length functionis computed as follows. Start with the finite state automaton in its initial state with its initial response. Thereafter, use the last response as the current input. Eventually, the automaton must simultaneously repeat both a state and response. The number of steps it takes to do this is its Self-Driving Length fitness.

Example 7.For the following FSAs with input and response alphabet{0,1}, ﬁnd the Self-Driving Length ﬁtness.

150 Evolutionary Computation for Modeling and Optimization Initial response:1 Initial state:D State If 0 If 1 A 1→B 0→B B 1→A 0→B C 1→C 0→D D 0→A 0→C Time-step by time-step:

Step Response State

1 1 D 2 0 C 3 1 C 4 0 D 5 0 A 6 1 B 7 0 B 8 1 A 9 0 B

So, in time step 9, the automaton repeats the response/state pair 0, B. We therefore put its Self-Driving Length ﬁtness at 8.

Notice that in our example we have all possible pairs of states and responses. We can do no better. This implies that success in the Self-Driving Length ﬁtness function is a score of twice the number of states (at least over the alphabet{0,1}).

Experiment 6.3 Rewrite the software for Experiment 6.1 to use the Self- Driving Length fitness function. Run 30 populations of 100 finite state au- tomata, recording time to success (achieving the maximum possible fitness value) and cutting the automata off after 2000 generations. Graph the frac- tion of populations that succeeded after k generations, showing the fraction of failures on the left side of the graph as the distance below one. Do this experiment for automata with 4,6, and 8 states. Also report the successful strings.

It is easy to describe a finite state automaton that does not use some of its states. The Self-Driving Length fitness function encourages the finite state automaton to use as many states as possible. In Experiment 6.1 the string 001111, while possible for a 4-state automaton to predict, was difficult. The string 111110 would prove entirely impossible for a 4-state automaton (why?) and very difficult for a 6-state automaton.

There is a very large local optimum in Experiment 6.1 for an automaton that predicts the string 111110; automata that just churn out 1’s get relatively

high ﬁtness. Of the automata that churn out only 1’s, many use few states. The more states involved, the easier to have one that is associated with a response of 0, either initially or by a mutation. A moment’s thought shows, in fact, that 1-making automata that use a large number of states are more likely to have children that don’t, and so there is substantial evolutionary pressure to stay in the local optimum associated with a population of FSAs generating 1’s, and using a small number of states to do so. This leaves only extremely low probability evolutionary paths to an automaton that predicts 111110.

Where possible, when handed lemons, make lemonade. In Chapter 5, we introduced the lexical product of fitness functions. When attempting to op- timize for the String Prediction fitness function in difficult cases like 111110, the Self-Driving Length fitness function is a natural candidate for a lexical product; it lends much greater weight to the paths out of the local optimum described above. Let us test this intuition experimentally.

Experiment 6.4 Modify the software from Experiment 6.1 to optionally use either the String Prediction fitness function or the lexical product of String Prediction with Self-Driving Length, with String Prediction dominant. Report the same data as in Experiment 6.1, but running6- and8-state automata with both the plain and lexical fitness functions on the reference string111110using 12 bits. In your write-up, document the differences in performance and give all reasons you can imagine for the differences, not just the one suggested in the text.

Experiment 6.4 is an example of an evolutionary algorithm in which lexical products yield a substantial gain in performance. Would having more states cause more of a gain? To work out the exact interaction between additional states and the solutions present in a randomly generated population, you would need a couple of stiff courses in finite state automata or combinatorics. In the next section, we will leave aside optimization of finite state automata and proceed with coevolving finite state automata.

Problems

Problem 197.Suppose thatAis an alphabet of sizen. Compute the size of the set{s∈A∗:|s| ≤k} for any nonnegative integerk.

Problem 198.How many strings are there in{0,1}2m_{with exactly}_m_ones?

Problem 199.Notice that in Experiment 6.1 the number of bits used is twice the string length. What diﬀerence would it make if the number of bits were equal to the string length?

Problem 200.If we adopt the definition of success given in Experiment 6.1 for a finite state automaton predicting a string, is there any limit to the length of a string on which a finite state automaton with nstates can succeed?

152 Evolutionary Computation for Modeling and Optimization

Problem 201.Give the structural and functional grouping genes for the following FSAs with input and response alphabet {0,1}.

(i) Initial response: 1 Initial state: B State If 0 If 1 A 1→A 1→C B 1→B 0→A C 0→C 0→A (ii) Initial response: 1 Initial state: A State If 0 If 1 A 1→A 0→B B 1→C 1→A C 1→B 0→C (iii) Initial response: 0 Initial state: D State If 0 If 1 A 1→B 1→D B 1→C 0→A C 0→D 1→B D 0→A 0→C (iv) Initial response: 0 Initial state: D State If 0 If 1 A 0→B 0→D B 0→C 1→A C 1→D 0→B D 1→A 1→C

Problem 202.For each of the ﬁnite state automata in Problem 201, give the set of all strings the automaton in question would count as a success if the string were used in Experiment 6.1 with a number of bits equaling twice its length.

Problem 203.Prove that the maximum possible value for the Self-Driving Length ﬁtness function of an FSA with input and response alphabet{0,1}is twice the number of states in the automaton.

Problem 204.Give an example that shows that Problem 203 does not imply that the longest string a finite state automaton can succeed on in the String Prediction fitness function is of length 2nfor ann-state finite state automaton. Problem 205.In the text it was stated that a 4-state automatoncannotsuc- ceed, in the sense of Experiment 6.1, on the string 111110. Explain irrefutably why this is so.

Problem 206.Problems 203, 204, and 205 all dance around an issue. How do you tell whether a string is too “complex” for ann-state ﬁnite state automaton to completely predict? Do your level best to answer this question, over the input and response alphabet {0,1}.

Problem 207.Work Problem 203 over, assuming that the ﬁnite state automaton uses the input and response alphabets {0,1, . . . , n−1}. You will have to conjecture what to prove and then prove it.

In document Evolutionary Computation for Modeling and Optimization Daniel Ashlock pdf (Page 159-167)