Evolution of the Learning Rule - ISR Encyclopedia Of Artificial Intelligence Aug 2008 eBook ELO

Another interesting approach to the development of ANNs by means of EC is the evolution of the learning rule. The idea arises because a training algorithm behaves differently when it is applied to networks with different architectures. Since the expert usually has very little a priori knowledge about a network, it is preferable to develop an automatic system that adapts the learning rule to the architecture and to the problem to be solved.

There are several approaches to the evolution of the learning rule (Crosher, 1993) (Turney, Whitley & Anderson, 1996), although most of them deal only with how learning can modify or guide evolution, and with the relation between the architecture and the connection weights. Few works focus on the evolution of the learning rule itself (Bengio, Bengio, Cloutier & Gecsei, 1992) (Ribert, Stocker, Lecourtier & Ennaji, 1994).

One of the most common approaches is based on setting the parameters of the BP algorithm: learning rate and momentum. Some authors propose methods in which an evolutionary process is used to find these parameters while leaving the architecture constant (Kim, Jung, Kim & Park, 1996). Other authors, on the other hand, propose encoding these BP algorithm parameters together with the network architecture inside the individuals of the population (Harp, Samad & Guha, 1989).
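The first approach, evolving the learning rate and momentum while the architecture stays fixed, can be sketched as follows. This is a minimal illustration and not the method of any of the cited works: the XOR task, the 2-3-1 network, the truncation selection with Gaussian mutation, and the names `train_error` and `evolve` are all assumptions made for the example.

```python
import math
import random

# XOR patterns; the 2-3-1 architecture is fixed, only (learning rate, momentum) evolve.
DATA = [((0.0, 0.0), 0.0), ((0.0, 1.0), 1.0), ((1.0, 0.0), 1.0), ((1.0, 1.0), 0.0)]
N_IN, N_HID = 2, 3

def sigmoid(x):
    x = max(-60.0, min(60.0, x))  # clamp to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-x))

def forward(w_h, w_o, x):
    """Return (inputs + bias, hidden activations + bias, output) for one pattern."""
    xi = list(x) + [1.0]
    hi = [sigmoid(sum(w * a for w, a in zip(row, xi))) for row in w_h] + [1.0]
    y = sigmoid(sum(w * a for w, a in zip(w_o, hi)))
    return xi, hi, y

def train_error(lr, momentum, epochs=200, seed=0):
    """Fitness: final MSE after BP with momentum, always from the same initial weights."""
    rng = random.Random(seed)
    w_h = [[rng.uniform(-1, 1) for _ in range(N_IN + 1)] for _ in range(N_HID)]
    w_o = [rng.uniform(-1, 1) for _ in range(N_HID + 1)]
    v_h = [[0.0] * (N_IN + 1) for _ in range(N_HID)]
    v_o = [0.0] * (N_HID + 1)
    for _ in range(epochs):
        for x, t in DATA:
            xi, hi, y = forward(w_h, w_o, x)
            d_o = (y - t) * y * (1.0 - y)  # output delta
            # Hidden deltas are computed before the output weights are updated.
            d_h = [d_o * w_o[j] * hi[j] * (1.0 - hi[j]) for j in range(N_HID)]
            for j in range(N_HID + 1):
                v_o[j] = momentum * v_o[j] - lr * d_o * hi[j]
                w_o[j] += v_o[j]
            for j in range(N_HID):
                for i in range(N_IN + 1):
                    v_h[j][i] = momentum * v_h[j][i] - lr * d_h[j] * xi[i]
                    w_h[j][i] += v_h[j][i]
    return sum((forward(w_h, w_o, x)[2] - t) ** 2 for x, t in DATA) / len(DATA)

def evolve(pop_size=6, generations=5, seed=1):
    """Evolve (learning rate, momentum) pairs; return the best pair and the error history."""
    rng = random.Random(seed)
    pop = [(rng.uniform(0.01, 1.0), rng.uniform(0.0, 0.9)) for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        ranked = sorted(pop, key=lambda ind: train_error(*ind))
        history.append(train_error(*ranked[0]))
        parents = ranked[: pop_size // 2]  # truncation selection, parents survive
        children = [
            (min(1.0, max(0.01, lr + rng.gauss(0.0, 0.1))),
             min(0.9, max(0.0, mom + rng.gauss(0.0, 0.05))))
            for lr, mom in parents
        ]
        pop = parents + children
    best = min(pop, key=lambda ind: train_error(*ind))
    return best, history

if __name__ == "__main__":
    best, history = evolve()
    print("best (learning rate, momentum):", best)
    print("best error per generation:", history)
```

Because the parents survive into the next generation and the fitness is deterministic, the best error never gets worse from one generation to the next. The second approach would extend each individual with genes describing the architecture, for instance the number of hidden units, and evolve them together with these two parameters.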

FUTURE TRENDS

The evolution of ANNs has been a research topic for several decades. The creation of new EC techniques and, more generally, new AI techniques, together with the improvement of existing ones, allows new methods for the automatic development of ANNs. Although there are methods that develop ANNs more or less automatically, they are usually not very efficient, since evolving architectures, weights and learning rules at once leads to a very large search space; this is the aspect that most needs improvement.

CONCLUSION

The world of EC has provided a set of tools that can be applied to optimization problems. In this case, the problem is to find an optimal architecture and/or weight value set and/or learning rule. The development of ANNs is therefore turned into an optimization problem. As the described techniques show, the use of EC techniques has made possible the development of ANNs without human intervention or, at least, with minimal participation of the expert in this task.

As has been explained, these techniques have some problems. One of them is the permutation problem already described. Another is the loss of efficiency: the more complicated the structure to evolve (weights, learning rule, architecture), the less efficient the system will be, because the search space becomes much bigger. If the system has to evolve several things at once (for example, architecture and weights, so that ANN development is completely automated), this loss of efficiency increases. However, these systems still work faster than the whole manual process of repeatedly designing and training an ANN.
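The growth of the search space can be made concrete with a small calculation. The sketch below assumes a discretised encoding in which every gene takes one of a fixed number of values; the gene counts and value ranges are hypothetical and chosen only for illustration.

```python
from math import prod

def search_space_size(choices_per_gene):
    """Number of distinct genotypes when each gene takes a fixed number of values."""
    return prod(choices_per_gene)

# Hypothetical encodings, chosen only to illustrate the growth.
weights = [16] * 9        # 9 connection weights, 16 quantised levels each
architecture = [2] * 6    # 6 optional connections, each present or absent
learning_rule = [10, 10]  # 10 learning-rate settings and 10 momentum settings

if __name__ == "__main__":
    print("weights only:      ", search_space_size(weights))
    print("architecture only: ", search_space_size(architecture))
    print("learning rule only:", search_space_size(learning_rule))
    print("all at once:       ",
          search_space_size(weights + architecture + learning_rule))
```

Evolving the three parts together multiplies their individual sizes rather than adding them, which is why the loss of efficiency grows so quickly with every additional component.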

REFERENCES

Alba E., Aldana J.F. & Troya J.M. (1993) Fully automatic ANN design: A genetic approach. Proc. Int. Workshop Artificial Neural Networks (IWANN’93), Lecture Notes in Computer Science. (686) 399-404.

Andersen H.C. & Tsoi A.C. (1993) A constructive algorithm for the training of a multilayer perceptron based on the genetic algorithm. Complex Systems 7 (4) 249-268.

Angeline P.J., Saunders G.M. & Pollack J.B. (1994) An evolutionary algorithm that constructs recurrent neural networks. IEEE Trans. Neural Networks. (5) 54-65.

Bengio S., Bengio Y., Cloutier J. & Gecsei J. (1992) On the optimization of a synaptic learning rule. Preprints of the Conference on Optimality in Artificial and Biological Neural Networks.

Cantú-Paz E. & Kamath C. (2005) An Empirical Comparison of Combinations of Evolutionary Algorithms and Neural Networks for Classification Problems. IEEE Transactions on Systems, Man and Cybernetics – Part B: Cybernetics. 915-927.

ANN Development with EC Tools

Crosher D. (1993) The artificial evolution of a generalized class of adaptive processes. Preprints of AI’93 Workshop on Evolutionary Computation. 18-36.

Greenwood G.W. (1997) Training partially recurrent neural networks using evolutionary strategies. IEEE Trans. Speech Audio Processing. (5) 192-194.

Harp S.A., Samad T. & Guha A. (1989) Toward the genetic synthesis of neural networks. Proc. 3rd Int. Conf. Genetic Algorithms and Their Applications. 360-369.

Haykin, S. (1999). Neural Networks (2nd ed.). Englewood Cliffs, NJ: Prentice Hall.

Holland, J.H. (1975) Adaptation in natural and artificial systems. Ann Arbor, MI: University of Michigan Press.

Husbands P., Harvey I., Cliff D. & Miller G. (1994) The use of genetic algorithms for the development of sensorimotor control systems. From Perception to Action. (P. Gaussier and JD Nicoud, eds.). Los Alamitos, CA: IEEE Press.

Kim H., Jung S., Kim T. & Park K. (1996) Fast learning method for backpropagation neural network by evolutionary adaptation of learning rates. Neurocomputing, 11(1) 101-106.

Lovell D.R. & Tsoi A.C. (2002) The Performance of the Neocognitron with various S-Cell and C-Cell Transfer Functions, Intell. Machines Lab., Dep. Elect. Eng., Univ. Queensland, Tech. Rep.

McCulloch W.S., & Pitts, W. (1943) A Logical Calculus of Ideas Immanent in Nervous Activity. Bulletin of Mathematical Biophysics. (5) 115-133.

Miller G.F., Todd P.M. & Hegde S.U. (1989) Designing neural networks using genetic algorithms. Proceedings of the Third International Conference on Genetic Algorithms. San Mateo, CA: Morgan Kaufmann. 379-384.

Montana D. & Davis L. (1989) Training feed-forward neural networks using genetic algorithms. Proc. 11th Int. Joint Conf. Artificial Intelligence. San Mateo, CA: Morgan Kaufmann. 762-767.

Rabuñal, J.R. & Dorado J. (2005) Artificial Neural Networks in Real-Life Applications. Idea Group Inc.

Ribert A., Stocker E., Lecourtier Y. & Ennaji A. (1994) Optimizing a Neural Network Architecture with an Adaptive Parameter Genetic Algorithm. Lecture Notes in Computer Science. Springer-Verlag. (1240) 527-535.

Rumelhart D.E., Hinton G.E. & Williams R.J. (1986) Learning internal representations by error propagation. Parallel Distributed Processing: Explorations in the Microstructures of Cognition. D. E. Rumelhart & J.L. McClelland, Eds. Cambridge, MA: MIT Press. (1) 318-362.

Sietsma J. & Dow R. J. F. (1991) Creating Artificial Neural Networks that generalize. Neural Networks. (4) 1: 67-79.

Turney P., Whitley D. & Anderson R. (1996) Special issue on the baldwinian effect. Evolutionary Computation. 4(3) 213-329.

Whitley D., Starkweather T. & Bogart C. (1990) Genetic algorithms and neural networks: Optimizing connections and connectivity. Parallel Comput., Vol. 14, No 3. 347-361.

Yao X. & Shi Y. (1995) A preliminary study on designing artificial neural networks using co-evolution. Proc. IEEE Singapore Int. Conf. Intelligence Control and Instrumentation. 149-154.

Yao X. & Liu Y. (1998) Toward designing artificial neural networks by evolution. Appl. Math. Computation. vol. 91, no. 1, 83-90.

KEY TERMS

Artificial Neural Networks: Interconnected set of many simple processing units, commonly called neurons, that use a mathematical model representing an input/output relation.

Back-Propagation Algorithm: Supervised learning technique used by ANNs that iteratively modifies the weights of the connections of the network so that the error obtained by comparing the network outputs with the desired ones decreases.

Evolutionary Computation: Set of Artificial Intelligence techniques used in optimization problems, which are inspired by biological mechanisms such as natural evolution.

Genetic Programming: Machine learning technique that uses an evolutionary algorithm to optimise a population of computer programs according to a fitness function, which determines the capability of a program to perform a given task.

Genotype: The representation of an individual as an entire collection of genes, to which the crossover and mutation operators are applied.

Phenotype: Expression of the properties coded by the individual’s genotype.

Population: Pool of individuals exhibiting equal or similar genome structures, which allows the application of genetic operators.

Search Space: Set of all possible situations that the problem we want to solve could ever be in.
