A Comparative Study of EW Algorithm and Genetic Algorithm Based on CMST


Arif Ali¹, Sapana Singh²
¹M.Tech Student, IIMT Meerut (U.P), India
²Assistant Professor, IIMT Meerut (U.P), India
[email protected], [email protected]

ABSTRACT

This paper presents the problems that arise in network design, the capacitated minimum spanning tree (CMST) problem, and the different solutions available for it. It describes the Esau-Williams (EW) algorithm for CMST and the problems with that algorithm.

The Capacitated Minimum Spanning Tree problem is NP-hard. In designing an efficient telecommunication network, the major issue is to find a minimum-cost layout that respects a specified limit on the traffic each link may carry. A genetic algorithm (GA) is an optimization approach in which a population of strings is maintained, each representing a solution to a specified problem. The GA creates new populations from the old by a selection mechanism that allows the fittest to create new strings, which are expected to be closer to the optimum solution to that particular problem. The major advantage of the GA approach is that it doesn't depend on specific knowledge of the problem definition. A detailed study of the EW algorithm shows that EW fails to give an optimal solution to the Dynamic CMST problem in some situations, such as when all the subtrees formed so far have weight just over k/2, where k is the capacity constraint. No two subtrees can then be merged without exceeding the capacity constraint, so EW connects each subtree directly to the root. This may result in almost twice as many subtrees as in an optimal solution. This paper discusses the proposed GA for solving the Dynamic CMST problem. First the two encoding schemes, Prufer and NetKey, are described, followed by the GA operators that are applied. The results obtained from the EW and GA algorithms are analyzed and compared.

1. INTRODUCTION

The Capacitated Minimum Spanning Tree (CMST) problem is NP-hard. In designing an efficient telecommunication network, the major issue is to find a minimum-cost layout that respects a specified limit on the traffic each link may carry. A further consideration is that the network must be updated as it changes over time.

This problem has been studied for a long time, and a number of algorithms have been proposed for it.

The basic problem in network design is centralized computer network design. In this case a central processor and a set of remote terminals with specific demands are given. Between any pair of terminals, or between the central processor and a terminal, there is a link that can be included for a given cost. The network connecting the terminals to the central processor must have the following properties:

• the network should be a tree;

• the traffic on any link of the network should not exceed the capacity constraint;

• the total design cost should be minimized.

When all terminals have unit demand (the demand of every terminal equals one) and all links have the same capacity, the capacitated minimum spanning tree problem is comparatively easier to handle. Terminals with non-unit demand make the task harder: when the demands of the terminals are not all equal to one, the problem becomes the fully generalized CMST problem. A detailed study of the EW algorithm and some other heuristics for the Dynamic CMST problem is given in this paper.

A teleprocessing system may include many terminals at great distances from the computing center. Specification of a communication network for connecting the remote terminals to the central computer constitutes an important network design problem.

2. Capacitated Minimum Spanning Tree

The CMST problem is NP-hard [18]. In computational complexity theory, NP-hard (non-deterministic polynomial-time hard) denotes a class of problems that are, informally, at least as hard as the hardest problems in NP. A problem H is NP-hard if and only if there is an NP-complete problem L that is polynomial-time reducible to H. A problem is designated "NP-complete" (also written NP-C or NPC) if its solutions can be quickly checked for correctness and if an algorithm that solves it could be used to solve all other problems in NP. The CMST problem has important applications in network design, and it also receives attention from the optimization community as a nice example of an "easy to state, but hard to solve" problem. The CMST problem asks for a minimum-cost spanning tree in which the total demand of the vertices in each subtree attached to the root does not exceed the given capacity.

In mathematical terms, the Capacitated Minimum Spanning Tree problem is defined on an undirected graph G = (V, E), where V denotes the set of n vertices and E denotes the set of edges. A vertex r ∈ V is treated as the root (the central vertex), and the capacity is k, with k > 0. The goal is to find a minimum-cost spanning tree in which, for every subtree connected to r, the sum of the weights of all its nodes is at most k.

Each node in a CMST has its specific demand, which determines the flow of traffic on a particular link.

a) Homogeneous demand case:

Every client node has a demand, and the demand of every client node is one.

In this case the problem reduces to finding a spanning tree in which every subtree attached to the root has at most k nodes.

b) Heterogeneous demand case:

This is the non-unit demand case, shown in the figure below. Non-unit means that not every node has a demand of exactly one; the demand of each node may be one or more than one.
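The capacity constraint in both demand cases can be checked mechanically. The following Python sketch is only an illustration: it assumes the tree is stored as a parent map (a hypothetical representation, not one prescribed in this paper) and verifies that the total demand of every subtree attached to the root is at most k.

```python
from collections import defaultdict

def subtree_loads(parent, demand, root):
    """Return the total demand of each subtree attached directly to the root.

    parent maps every non-root node to its parent in the tree and demand maps
    every non-root node to its demand. Both layouts are illustrative choices.
    """
    loads = defaultdict(int)
    for node in parent:
        # Walk up until we reach the child of the root that owns this node.
        top = node
        while parent[top] != root:
            top = parent[top]
        loads[top] += demand[node]
    return dict(loads)

def is_feasible(parent, demand, root, k):
    """True if no subtree attached to the root carries more than k units."""
    return all(load <= k for load in subtree_loads(parent, demand, root).values())

# Heterogeneous (non-unit) demand example with hypothetical nodes b, c, d.
parent = {"b": "r", "c": "b", "d": "r"}
demand = {"b": 1, "c": 1, "d": 2}
print(subtree_loads(parent, demand, "r"))
print(is_feasible(parent, demand, "r", 3))
```

In the homogeneous case every entry of `demand` is 1, so the load of a subtree is simply its number of nodes.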

2.1 Dynamic CMST

Many problems in the field of communication networks are modeled as graph problems. Even though in most cases the input graphs are static (they remain unchanged), there are situations in which graphs are subject to discrete changes.

Networks represented as graphs that experience changes in their topology are called dynamic networks. An interesting problem in dealing with dynamically changing networks is to perform the updates swiftly and efficiently without having to shut down the entire system.

Discrete modifications such as the deletion or addition of nodes are sometimes needed, and they are not easy to carry out in communication network design. There are two types of dynamic graph:

1. Fully dynamic: both the addition and the deletion of nodes or edges are possible.

2. Partially dynamic: only one type of update is possible, either the insertion of nodes or edges or the deletion of nodes or edges.

The insertion of a node is called incremental and the deletion of a node is called decremental.

Our approach is to add a new node to an existing CMST.

Applying changes to a CMST is known as the Dynamic CMST problem. It is an NP-hard problem, because even a single change may require a number of updates to satisfy the constraints of the network design.

2.2 Applications of CMST

The CMST problem arises in the design of communication networks. In this type of problem there must be a central node, and all other nodes are attached to the central node, directly or indirectly, at some cost. Each link incident to the central node has a maximum capacity, which must always be respected to satisfy the network design criteria. CMST has many applications, including the following.

2.3 Other applications

• Telecommunications network design

• Centralized communication network design

In a telecommunication network, the goal is to design a minimum-cost network by installing expensive (fiber-optic) cable along its edges, subject to a capacity constraint k on the cable being used. The cable is priced per unit length.

In the design of an access network, the traffic flow to every client node is controlled by a central node, and access network topology optimization is the problem that arises. This type of problem has some characteristics:

1. It needs a hierarchical network structure.

2. It assumes a simple pattern of traffic demand.

Take the example of BSNL, which has a centralized server and wants to create a minimum-cost network design with a fixed traffic flow on the links. A is treated as the central node.

Fig 1: Fully connected graph for BSNL (nodes A, B, C, D, E, F, G)

All nodes have weight 1, except D, which has weight 2. The total capacity W is 3. The cost matrix is:

Table 1: Cost matrix for 'BSNL' connection

Node   B   C   D   E   F   G
A      5   6   9  10  11  15
B          9   6   6   8  17

Fig 2: Design of Telecommunication Network

The telecommunication network is now designed so that it solves the access network topology optimization problem and satisfies the desired constraints. The total cost of the design is

(A, B) + (B, E) + (A, C) + (C, F) + (F, G) + (A, D)

which is

5 + 6 + 6 + 8 + 8 + 9 = 42

2.4 Problem Statement

In the real world, one usually has to consider both the weight of each edge and the capacity of the network design, based on the demand of each node. The problem arises, for instance, when designing a layout for a telecommunication system: besides the cost of the connections between cities or terminals, the capacity constraint and other factors, such as the time for communication or construction, the complexity of construction, and even the reliability, are all important and have to be taken into consideration.

The objectives of this work are:

• Analysis and comparison of EW and GA for the dynamic capacitated minimum spanning tree.

• Comparison of GA with two encodings, Prufer and NetKey, for the capacitated minimum spanning tree.

The comparison and analysis are performed with test data involving different network sizes (16, 25, 50 and 100 nodes), taking the unit and non-unit demand cases for each network size.

3. EW (Esau & Williams) Algorithm

For the Minimum Spanning Tree problem, Kruskal's algorithm [15] starts with n subtrees, each containing a single node, and then merges them until only one tree is left. In Kruskal's algorithm the merging is based on the proximity of the subtrees. For the CMST problem, Kruskal's algorithm is slightly modified to find a feasible CMST for a given problem. One of the modifications is to connect a subtree directly to the root (central node) when the sum of its node weights reaches the capacity constraint 'k'.

Because in a CMST problem a communication line can transmit only a limited amount of traffic and still satisfy the application requirements, the desired configuration is determined, in part, by the amount of traffic at each remote location. The method described here assumes that the line-loading capabilities are given. A cost proportional to distance suffices as the basic variable in the construction of a most economical configuration. In explaining the algorithm, the cost is taken to be proportional to distance; the algorithm assumes that a maximum-distance configuration is one in which each remote location is connected to the center by a single link.

The EW (Esau & Williams) algorithm [5], [9], [19] is based on savings. Instead of merging two subtrees based on proximity, EW introduces a new concept, 'savings', to decide which two subtrees to merge. To connect two subtrees, EW computes the savings for all subtrees and joins the two subtrees that produce the greatest savings.

The distance between two nodes i and j is denoted dist(i, j). The distances between all nodes are stored in an n × n matrix. On the basis of this distance matrix, the n × n savings matrix is computed using the formula:

Savings(i, j) = dist(i, j) - dist(i, r)

where 'r' is the central node (vertex). The subtrees i and j that show the best savings are connected, provided the merge also satisfies the capacity constraint.
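As an illustration of the savings formula, the sketch below computes the savings matrix in Python. The dictionary layout and the helper names `symmetric` and `savings_matrix` are assumptions made for this example; the distances are those of the five-node example (Table 2), with A as the root.

```python
def symmetric(upper):
    """Expand an upper-triangular cost listing into a full symmetric lookup."""
    dist = {}
    for (u, v), c in upper.items():
        dist.setdefault(u, {})[v] = c
        dist.setdefault(v, {})[u] = c
    return dist

def savings_matrix(dist, root):
    """Savings(i, j) = dist(i, j) - dist(i, r) for all non-root nodes i != j."""
    nodes = [v for v in dist if v != root]
    return {i: {j: dist[i][j] - dist[i][root] for j in nodes if j != i}
            for i in nodes}

dist = symmetric({
    ("A", "B"): 5, ("A", "C"): 2, ("A", "D"): 10, ("A", "E"): 6,
    ("B", "C"): 8, ("B", "D"): 7, ("B", "E"): 11,
    ("C", "D"): 9, ("C", "E"): 15, ("D", "E"): 1,
})
savings = savings_matrix(dist, "A")
print(savings["D"]["E"])   # 1 - 10 = -9, the most negative entry
print(savings["E"]["D"])   # 1 - 6 = -5
```

A negative entry means that linking i to j is cheaper than keeping the direct link from i to the root, so the most negative entry is the best saving.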

Let's take an example. For a given fully connected graph, compute the CMST through EW. First calculate the savings for each node, and then select the link with the best savings to connect. A is the root node, the total capacity is k = 3, and the demand of each node is 1 except C, whose demand is 2.

Table 2: Cost matrix for the 5-node graph

Nodes  B   C   D   E
A      5   2  10   6
B          8   7  11
C              9  15
D                  1

Fig 3: Fully connected graph

1st Iteration

Tradeoff B = (B, D) = dist(B, D) - dist(B, A) = 7 - 5 = 2

Tradeoff C = (C, D) = dist(C, D) - dist(C, A) = 9 - 2 = 7

Tradeoff D = (D, E) = dist(D, E) - dist(D, A) = 1 - 10 = -9

Tradeoff E = (E, D) = dist(E, D) - dist(E, A) = 1 - 6 = -5

The most negative value, which means the best saving, is the one for D, so the link (D, E) is selected. The algorithm then continues to the 2nd iteration, where the tradeoffs are recomputed to find the next best saving link.

2nd Iteration

Tradeoff B = (B, D) = dist(B, D) - dist(B, A) = 7 - 5 = 2 (unchanged)

Tradeoff C = (C, D) = dist(C, D) - dist(C, A) = 9 - 2 = 7 (unchanged)

Tradeoff D = (D, E) = dist(D, E) - dist(D, A) = 1 - 10 = -9

Tradeoff E = (E, D) = dist(E, D) - dist(E, A) = 1 - 6 = -5 (unchanged)

The most negative value is again the one for (D, E), but D and E are already connected, so the tradeoff for D is recomputed. The subtree containing D now reaches the root through the link (E, A), of cost 6.

Recomputed tradeoff for D:

Tradeoff D = (D, B) = 7 - 6 = 1

The most negative value is now the one for (E, D), but this link is already connected, so the tradeoff for E is recomputed:

Tradeoff E = (E, B) = 11 - 6 = 5

Because no negative value is left, all subtrees are connected to the root node, and the CMST cost is 5 + 2 + 6 + 1 = 14.

Fig 4: CMST through EW
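The worked example can be reproduced with a short program. The following Python code is a minimal sketch of a savings-based procedure in the spirit of EW, not a reference implementation: the data structures, the function name `esau_williams`, and the tie-breaking order are assumptions made for illustration. It uses the costs of Table 2, root A, capacity k = 3, and demand 1 for every node except C.

```python
def esau_williams(dist, demand, root, k):
    """Greedy savings-based CMST heuristic (illustrative sketch).

    Every node starts as its own subtree linked to the root. The link (i, j)
    with the most negative tradeoff dist(i, j) - gate(i) is accepted while the
    merged subtree respects the capacity k; gate(i) is the cost of the root
    link of the subtree that currently holds i.
    """
    nodes = [v for v in demand if v != root]
    comp = {v: v for v in nodes}               # subtree label of each node
    gate = {v: dist[root][v] for v in nodes}   # root-link cost per label
    load = {v: demand[v] for v in nodes}       # total demand per label
    edges = []
    while True:
        best = None
        for i in nodes:
            for j in nodes:
                ci, cj = comp[i], comp[j]
                if ci == cj or load[ci] + load[cj] > k:
                    continue
                tradeoff = dist[i][j] - gate[ci]
                if tradeoff < 0 and (best is None or tradeoff < best[0]):
                    best = (tradeoff, i, j)
        if best is None:
            break
        _, i, j = best
        ci, cj = comp[i], comp[j]
        edges.append((i, j))          # subtree ci now hangs off subtree cj
        load[cj] += load[ci]
        for v in nodes:
            if comp[v] == ci:
                comp[v] = cj
    # Each remaining subtree keeps the root link of its label node.
    edges += [(root, c) for c in sorted(set(comp.values()))]
    return edges, sum(dist[u][v] for u, v in edges)

upper = {("A", "B"): 5, ("A", "C"): 2, ("A", "D"): 10, ("A", "E"): 6,
         ("B", "C"): 8, ("B", "D"): 7, ("B", "E"): 11,
         ("C", "D"): 9, ("C", "E"): 15, ("D", "E"): 1}
dist = {}
for (u, v), c in upper.items():
    dist.setdefault(u, {})[v] = c
    dist.setdefault(v, {})[u] = c

edges, cost = esau_williams(dist, {"B": 1, "C": 2, "D": 1, "E": 1}, "A", 3)
print(edges, cost)
```

On this input the sketch accepts only the link (D, E) and then connects the subtrees B, C and {D, E} to the root, which gives the cost of 14 obtained in the example.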

3.1. Problems with the Esau-Williams Algorithm

The problem with this algorithm [4] is that it does not give the optimal result for the CMST problem in all cases.

Consider the case in which, at some point during the algorithm, every subtree formed so far has weight just over k/2, where k is the capacity constraint. No two subtrees can then be merged, because the merged weight would exceed k, so EW connects each subtree directly to the root. The resulting solution may contain almost twice as many subtrees as an optimal solution, as shown in Fig 5.

Fig 5: (i) EW solution, with 3⁺ cost (ii) Optimal solution, with 2⁺ cost
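The blocking situation described above is easy to check numerically. The sketch below uses hypothetical subtree weights and a hypothetical helper `mergeable_pairs`; it lists the pairs of subtrees whose combined weight still fits within k.

```python
from itertools import combinations

def mergeable_pairs(weights, k):
    """Return index pairs of subtrees whose combined weight fits within k."""
    return [(a, b) for a, b in combinations(range(len(weights)), 2)
            if weights[a] + weights[b] <= k]

k = 10
stuck = [6, 6, 6, 6]               # every subtree weighs just over k/2
print(mergeable_pairs(stuck, k))   # empty: no two subtrees can be merged
```

With every weight above k/2 the list is empty, so each subtree ends up with its own link to the root.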

4. Genetic Algorithms

A genetic algorithm (GA) [16] is an optimization approach in which a population of strings is maintained, each representing a solution to a specified problem. The GA creates new populations from the old by a selection mechanism that allows the fittest to create new strings, which are expected to be closer to the optimum solution to that particular problem. In each generation, the GA creates a new set of strings from the bits and pieces of the previous strings, adding random new data to keep the population from stagnating. The end result is a search strategy that is tailored for very large, complex search spaces. Genetic algorithms have been successfully applied to a number of combinatorial optimization problems.

The major advantage of the GA approach [6], [16] is that it doesn't depend on specific knowledge of the problem definition. The success of the algorithm is attributed to various factors, including its powerful parallel search capability, computational simplicity, robustness, global search capability, ability to combine with other heuristic procedures, and independence from solution characteristics such as linear or non-linear constraints and discrete or continuous search spaces.
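The generational loop described above can be sketched in a few lines of Python. This is a generic illustration, not the GA proposed in this paper: the choice of binary tournament selection, one-point crossover, bit-flip mutation, and all parameter values are assumptions, and the fitness function used in the example (the number of 1 bits) is a toy problem.

```python
import random

def tournament(pop, fitness, rng):
    """Binary tournament selection: the fitter of two random strings."""
    a, b = rng.sample(pop, 2)
    return a if fitness(a) >= fitness(b) else b

def genetic_algorithm(fitness, length, pop_size=20, generations=40,
                      mutation_rate=0.02, seed=0):
    """Evolve bit strings of the given length and return the best one seen."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(length)] for _ in range(pop_size)]
    best = max(pop, key=fitness)
    for _ in range(generations):
        nxt = []
        while len(nxt) < pop_size:
            p1 = tournament(pop, fitness, rng)
            p2 = tournament(pop, fitness, rng)
            cut = rng.randint(1, length - 1)        # one-point crossover
            child = p1[:cut] + p2[cut:]
            # Bit-flip mutation adds random new data to avoid stagnation.
            child = [1 - g if rng.random() < mutation_rate else g
                     for g in child]
            nxt.append(child)
        pop = nxt
        best = max(pop + [best], key=fitness)
    return best

best = genetic_algorithm(sum, 16)
print(best, sum(best))
```

Only the fitness function refers to the problem being solved, which reflects the independence from problem-specific knowledge noted above.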

4.1. Operators Used in GA

4.1.1. Encoding a Chromosome

The chromosome should contain information about the solution it represents. To search for optimal solutions, the initial population must first be generated through an encoding. The different encoding schemes are:

4.1.1.1. Binary Encoding

One way of encoding is a binary string. Each bit in the string represents some characteristic of the solution, or it could represent whether or not some particular characteristic is present. Another possibility is that the chromosome contains just 4 numbers, where each number is represented by 4 bits (the highest number therefore being 15).

In the most commonly used form, every chromosome is a string of 1s and 0s.

For example:

Chromosome A: 10110101010001010101

Chromosome B: 01010001010100101000

Binary encoding is efficient but not always natural; sometimes corrections must be made after crossover and mutation to ensure that the genotype means something at the phenotype level. Some examples of problems suited to binary encoding are:

The Knapsack problem: There are objects with given values and sizes, and the knapsack has a given capacity. Select objects to maximize the value in the knapsack without exceeding its capacity.

The Eight Queens problem: There are eight queens. Find a way to place them on a chessboard so that no two queens attack each other.
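To illustrate the kind of correction that binary encoding may need, the sketch below treats a bit string as a knapsack selection and repairs it when the capacity is exceeded. The repair rule (dropping selected objects from the right) and the object values and sizes are hypothetical choices made for this example.

```python
def total_value(bits, values):
    """Value of the objects selected by the binary chromosome."""
    return sum(v for b, v in zip(bits, values) if b)

def repair(bits, sizes, capacity):
    """Drop selected objects from the right until the knapsack fits."""
    bits = list(bits)
    total = sum(s for b, s in zip(bits, sizes) if b)
    for idx in range(len(bits) - 1, -1, -1):
        if total <= capacity:
            break
        if bits[idx]:
            bits[idx] = 0
            total -= sizes[idx]
    return bits

values = [10, 7, 4, 3]
sizes = [5, 4, 3, 2]
child = [1, 1, 1, 0]               # total size 12 exceeds capacity 9
fixed = repair(child, sizes, 9)
print(fixed, total_value(fixed, values))
```

After the repair the chromosome again describes a selection that fits into the knapsack, so its fitness can be evaluated as the total value.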

4.1.1.2. Permutation Encoding

Permutation encoding is generally used in ordering problems, such as the traveling salesman problem (TSP) or a task-ordering problem. Every chromosome is a string of numbers that represents a position in a sequence. In the TSP each number represents a city to be visited.

For example:

Chromosome A: 1 5 3 2 6 5

Chromosome B: 5 3 6 2 4 7

An example of a problem for permutation encoding is the Traveling Salesman problem: There are cities with given distances between them. The traveling salesman has to visit all of them, but he does not want to travel more than necessary. Find a sequence of cities with a minimal traveled distance.
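A small Python sketch may help to show how a permutation chromosome is evaluated and mutated for the TSP. The distance matrix is hypothetical, and swap mutation is only one of several operators that keep a chromosome a valid permutation.

```python
import random

def tour_length(tour, dist):
    """Length of the closed tour that visits the cities in chromosome order."""
    return sum(dist[tour[i]][tour[(i + 1) % len(tour)]]
               for i in range(len(tour)))

def swap_mutation(tour, rng):
    """Swap two positions; the result is still a valid permutation."""
    a, b = rng.sample(range(len(tour)), 2)
    child = list(tour)
    child[a], child[b] = child[b], child[a]
    return child

dist = [[0, 2, 9, 10],
        [2, 0, 6, 4],
        [9, 6, 0, 3],
        [10, 4, 3, 0]]
tour = [0, 1, 3, 2]
print(tour_length(tour, dist))     # 2 + 4 + 3 + 9 = 18
mutant = swap_mutation(tour, random.Random(1))
print(mutant, tour_length(mutant, dist))
```

Because the mutation only exchanges positions, every city still appears exactly once and no repair step is needed.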

4.1.1.3 Value Encoding

Value encoding is used in problems where some complicated values, such as real numbers or characters, are used.

For example:

Chromosome A: 1.232 3.452 2.65 0.454

Chromosome B: ABDDDHSGHGSHGSGSHGSWE

Chromosome C: (back) (right) (left) (forward)

Value encoding is useful for certain specialist problems (e.g. evolving weights for neural networks), but it requires special mutation and crossover mechanisms. An example for this encoding is the neural network problem, in which the architecture of a neural network is given: find the weights between the neurons in the network that produce the desired output from the network.
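As an illustration of the special operators that value encoding requires, the sketch below shows two common choices for real-valued chromosomes, arithmetic crossover and Gaussian mutation. Both operators and their parameters are assumptions made for this example and are not taken from this paper; the second parent is hypothetical.

```python
import random

def arithmetic_crossover(a, b, alpha=0.5):
    """Blend two real-valued chromosomes gene by gene."""
    return [alpha * x + (1 - alpha) * y for x, y in zip(a, b)]

def gaussian_mutation(chrom, rng, sigma=0.1, rate=0.25):
    """Add small Gaussian noise to each gene with probability rate."""
    return [g + rng.gauss(0, sigma) if rng.random() < rate else g
            for g in chrom]

chromosome_a = [1.232, 3.452, 2.65, 0.454]   # Chromosome A from the text
other = [0.5, 1.0, 2.0, 0.1]                 # hypothetical second parent
child = arithmetic_crossover(chromosome_a, other)
print(child)
print(gaussian_mutation(child, random.Random(0)))
```

For the neural network example, each gene would hold one connection weight, and the fitness would measure how close the network output is to the desired output.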

4.1.1.4 Tree Encoding

Tree encoding is used mainly for evolving programs or expressions in genetic programming. In this encoding every chromosome is a tree of some objects, such as functions or commands in a programming language, for example:

Chromosome A: (+ x (/ 5 y))

Chromosome B: (do until step wall)

Fig 6: (a) & (b) Tree encoding for two examples of chromosomes
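A tree chromosome such as Chromosome A can be represented and evaluated with nested tuples. The sketch below assumes that the first operand of Chromosome A is a variable named x; the tuple layout and the `evaluate` helper are illustrative choices.

```python
import operator

OPS = {"+": operator.add, "-": operator.sub,
       "*": operator.mul, "/": operator.truediv}

def evaluate(tree, env):
    """Evaluate an expression tree given as nested (op, left, right) tuples."""
    if isinstance(tree, tuple):
        op, left, right = tree
        return OPS[op](evaluate(left, env), evaluate(right, env))
    if isinstance(tree, str):
        return env[tree]           # a variable: look up its value
    return tree                    # a numeric constant

chromosome_a = ("+", "x", ("/", 5, "y"))
print(evaluate(chromosome_a, {"x": 1, "y": 2}))   # 1 + 5/2 = 3.5
```

Crossover in this representation would exchange subtrees between two chromosomes, which always yields another well-formed expression.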

For designing an optimal communication network for the Dynamic CMST (DCMST) problem, two different encoding schemes are used: Prufer encoding and Network Random Key (NetKey) encoding.

5. Proposed Solution

A detailed study of the EW algorithm shows that EW fails to give an optimal solution to the Dynamic CMST problem under various conditions, for example when all the subtrees formed so far in the process have weight just more than k/2, where k is the capacity constraint. It is then impossible to merge any two subtrees, because the merged subtree would exceed the capacity constraint, so EW connects each subtree directly to the root. This may result in almost twice as many subtrees as in an optimal solution.

We will describe the proposed GA for solving the Dynamic CMST problem.


5.1. Prufer Encoding

Prufer encoding provides a one-to-one correspondence between the spanning trees on n labeled vertices and the set of all sequences of n-2 digits. This means that a sequence of only n-2 digits is needed to uniquely represent a tree with n vertices, where every digit is an integer between 1 and n. This sequence is usually known as the Prufer number. The Prufer number is a value encoding for spanning trees. The encoding length is only n-2, the size of the search space is n^(n-2), and the correspondence between Prufer numbers and spanning trees is a one-to-one mapping. The probability that a randomly produced string represents a spanning tree is always 1, which implies that this encoding represents all possible trees equally and uniquely, and that any initial population or any offspring produced by the crossover and mutation operations is a valid tree.
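The one-to-one mapping can be made concrete with the standard decoding and encoding procedures for Prufer sequences. The Python sketch below is an illustration; the function names and the use of a heap to find the smallest leaf are implementation choices of this example, not details taken from this paper.

```python
import heapq

def prufer_decode(seq, n):
    """Return the n-1 edges of the tree on vertices 1..n encoded by seq."""
    if len(seq) != n - 2:
        raise ValueError("a Prufer sequence for n vertices has n-2 entries")
    degree = [1] * (n + 1)                 # index 0 is unused
    for v in seq:
        degree[v] += 1
    leaves = [v for v in range(1, n + 1) if degree[v] == 1]
    heapq.heapify(leaves)
    edges = []
    for v in seq:
        leaf = heapq.heappop(leaves)       # smallest current leaf
        edges.append((leaf, v))
        degree[v] -= 1
        if degree[v] == 1:
            heapq.heappush(leaves, v)
    edges.append((heapq.heappop(leaves), heapq.heappop(leaves)))
    return edges

def prufer_encode(edges, n):
    """Return the Prufer sequence of the tree on vertices 1..n."""
    adj = {v: set() for v in range(1, n + 1)}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    leaves = [v for v in adj if len(adj[v]) == 1]
    heapq.heapify(leaves)
    seq = []
    for _ in range(n - 2):
        leaf = heapq.heappop(leaves)       # remove the smallest leaf
        neighbor = adj[leaf].pop()
        adj[neighbor].remove(leaf)
        seq.append(neighbor)
        if len(adj[neighbor]) == 1:
            heapq.heappush(leaves, neighbor)
    return seq

tree = prufer_decode([4, 4, 4, 5], 6)
print(tree)
print(prufer_encode(tree, 6))
```

Any sequence of n-2 integers between 1 and n decodes to a tree, which is why crossover and mutation on Prufer numbers never produce an invalid chromosome.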

6. Conclusion:

This paper presents an overview of different methods available for solving the capacitated minimum spanning tree problem.

It also examines whether the solutions given by the heuristics in [3] are optimal. Because these heuristics work dynamically (online), they take less time to compute the addition of a new node, but they do not give results as good as a recomputed CMST. The paper also gives a brief introduction to genetic algorithms, including their different features and how they are implemented. GA is an evolutionary algorithm used here to search for optimal results with the Prufer encoding scheme.

7. REFERENCES

[1] Gengui Zhou, Zhenyu Cao, Jian Cao, “A Genetic Algorithm Approach on Capacitated Minimum Spanning Tree Problem”, University of Technology, China, pages 725-729, 2006.

[2] Lixia Han and Yuping Wang, “A Novel Genetic Algorithm for Degree-Constrained Minimum Spanning Tree Problem”, International Journal of Computer Science and Network Security, vol. 6, no. 7A, pages 50-57, July 2006.

[3] Raja Jothi and Balaji Raghavachari, 2004 “Dynamic Capacitated Minimum Spanning Tree Problem”, Proc. 3rd IEEE International Conference on Networking (ICN).

[4] Raja Jothi and Balaji Raghavachari, 2004 “Revisiting Esau-Williams Algorithm: On the Design of Local Access Networks”, Department of Computer Science, University of Texas at Dallas, pages 104-107.

[5] Raja Jothi and Balaji Raghavachari, 2003 “Design of local access networks”, to appear in 15th IASTED Int. Conf. on Parallel and Distributed Computing and Systems (PDCS), pages 883-888.

[6] Melanie Mitchell, “An Introduction to Genetic Algorithms”, Prentice Hall India, 2002 Edition.

[7] Günter R. Raidl, Christina Drexel, 2002 “A Predecessor Coding in an Evolutionary Algorithm for the Capacitated Minimum Spanning Tree Problem”, Institute of Computer Graphics, Vienna University of Technology, Austria, pages 309-316.

[8] Barbara Schindler, Franz Rothlauf and Hans-Josef Pesch, 2002 “Evolution Strategies, Network Random Keys, and the One-Max Tree Problem”, S. Cagnoni et al. (Eds.): EvoWorkshops 2002, LNCS 2279, pp. 143-152, 2002. Springer-Verlag Berlin Heidelberg.

[9] R. Patterson and H. Pirkul, January 2000 “Heuristic Procedure Neural Networks for the CMST Problem”, School of Management, University of Texas, Dallas, Texas, vol. 27, pages 1171-1200.

[10] H. Chen, Y. Wang, 2000 “An efficient algorithm for generating Prufer codes from labeled trees”, Theory of Computing Systems, vol. 33, pp. 97-105.

[11] Günter R. Raidl, November 2000 “An Efficient Evolutionary Algorithm for the Degree-Constrained Minimum Spanning Tree Problem”, Institute of Computer Graphics, Vienna University of Technology, Austria, pages 104 – 111.

[12] Franz Rothlauf, Armin Heinzl and David E. Goldberg, July 2002 “Network Random Keys – A Tree Network Representation Scheme for Genetic and Evolutionary Algorithms”, Illinois Genetic Algorithms Laboratory, University of Illinois at Urbana-Champaign, Urbana, IL 61801, pages 143-152.

[13] Gengui Zhou, 2008 “Genetic Algorithm Approach on Multi-criteria Minimum Spanning Tree Problem”, Department of Industrial and Systems Engineering, Ashikaga Institute of Technology, Ashikaga 326, Japan, pages 215-218.

[14] R.C. Prim, 1957 “Shortest connection networks and some generalizations”, Bell System Tech. Journal, vol. 36, pages 1389-1401.

[15] J.B. Kruskal Jr., 1956 “On the shortest spanning subtree of a graph and the traveling salesman problem”, In Proc. American Math Soc, volume 7, pages 48-50.

[16] David E. Goldberg, 2009 “Genetic algorithms in search, optimization & machine learning”, Pearson Education, Second Edition.

[17] Holland, J.H., "Adaptation in Natural and Artificial Systems", MIT Press, 1975.

[18] Vibhav Vineet, Pawan Harish, 2009 “Fast minimum spanning tree for large graph”, ACM New York, pages 167-171.

[19] L.R. Esau and K.C. Williams, “On teleprocessing system design”, IBM Systems Journal 5(1966), pages 142-147.

[20] J.C. Bean, June 1992 “Genetics and random keys for sequencing and optimization”, Ann Arbor, MI: Department of Industrial and Operations Engineering, University of Michigan, Technical Report, pages 43-92.