
Spanning trees of a graph

In document Invitation to Discrete Mathematics (Page 184-188)

Graphs: an introduction

5.3 Spanning trees of a graph

A spanning tree is one of the basic graph constructions:

5.3.1 Definition. Let G = (V, E) be a graph. An arbitrary tree of the form (V, E′), where E′ ⊆ E, is called a spanning tree of the graph G. So a spanning tree is a subgraph of G that is a tree and contains all vertices of G.

Obviously, a spanning tree may only exist for a connected graph G.
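The definition can be turned into a small check. The following is only a sketch under stated assumptions: vertices are numbered 0, ..., n−1 with n ≥ 1, the graph is simple (no loops or repeated edges), and the name is_spanning_tree is a hypothetical helper, not something from the text. It relies on the standard fact that a graph on n vertices with n − 1 edges is a tree exactly when it is connected.

```python
def is_spanning_tree(n, graph_edges, tree_edges):
    """Check whether tree_edges is a spanning tree of the simple graph on
    vertices 0..n-1 (n >= 1) whose edges are graph_edges."""
    graph_set = {frozenset(e) for e in graph_edges}
    tree_set = {frozenset(e) for e in tree_edges}
    # The candidate edge set E' must be a subset of E.
    if not tree_set <= graph_set:
        return False
    # A tree on n vertices has exactly n - 1 edges.
    if len(tree_set) != n - 1:
        return False
    # With n - 1 edges, being a tree is the same as being connected.
    adjacency = {v: [] for v in range(n)}
    for e in tree_set:
        u, v = tuple(e)
        adjacency[u].append(v)
        adjacency[v].append(u)
    seen = {0}
    stack = [0]
    while stack:
        u = stack.pop()
        for w in adjacency[u]:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return len(seen) == n
```

For a 4-cycle on the vertices 0, 1, 2, 3, any three of its edges pass the check, while any two of them fail it.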

It is not difficult to show that every connected graph has a spanning tree. We prove it by giving two (fast) algorithms for finding a spanning tree of a given connected graph. In the subsequent sections we will need variants of these algorithms, so let us study them carefully.

5.3.2 Algorithm (Spanning tree). Let G = (V, E) be a graph with n vertices and m edges. We order the edges of G arbitrarily into a sequence (e1, e2, . . . , em). The algorithm successively constructs sets of edges E0, E1, . . . ⊆ E.

We let E0=∅. If the set Ei−1 has already been found, the set Ei is computed as follows:

\[
E_i = \begin{cases}
E_{i-1} \cup \{e_i\} & \text{if the graph } (V, E_{i-1} \cup \{e_i\}) \text{ has no cycle,} \\
E_{i-1} & \text{otherwise.}
\end{cases}
\]

The algorithm stops either if Ei already has n− 1 edges or if i = m, i.e. all edges of the graph G have been considered. Let Et denote the set for which the algorithm has stopped, and let T be the graph (V, Et).
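As a minimal sketch of Algorithm 5.3.2, the following Python code scans the edges in the given order and keeps an edge only if it closes no cycle. The assumptions are ours: vertices are numbered 0, ..., n−1, edges are pairs of distinct vertices, and the function names are hypothetical. The cycle test is deliberately naive. It uses the fact that adding an edge {u, v} to an acyclic graph creates a cycle exactly when u and v are already joined by a path, and it looks for such a path by a plain search.

```python
def connected(n, edges, source, target):
    """Return True if target can be reached from source using edges."""
    adjacency = [[] for _ in range(n)]
    for u, v in edges:
        adjacency[u].append(v)
        adjacency[v].append(u)
    seen = {source}
    stack = [source]
    while stack:
        u = stack.pop()
        if u == target:
            return True
        for w in adjacency[u]:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return False


def spanning_tree_by_edges(n, edges):
    """Scan edges in order; keep those that do not close a cycle."""
    selected = []  # takes the values E_0, E_1, ... in turn
    for u, v in edges:
        # Keep the edge only if its endpoints are not yet joined by a path.
        if not connected(n, selected, u, v):
            selected.append((u, v))
        # Stop early once n - 1 edges have been selected.
        if len(selected) == n - 1:
            break
    return selected
```

With n = 4 and the edge sequence (0, 1), (1, 2), (0, 2), (2, 3), the sketch skips (0, 2) and returns the other three edges. This version rebuilds the adjacency lists for every edge, so it is slow. It is meant to mirror the statement of the algorithm, not to be efficient.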

5.3.3 Proposition (Correctness of Algorithm 5.3.2). If Algorithm 5.3.2 produces a graph T with n − 1 edges then T is a spanning tree of G. If T has k < n − 1 edges then G is a disconnected graph with n − k components.

Proof. According to the way the sets Ei are created, the graph T contains no cycle. If k = |E(T)| = n − 1 then T is a tree according to Exercise 5.1.2, and hence it is a spanning tree of the graph G. If k < n − 1, then T is a disconnected graph whose every component is a tree (such a graph is called a forest). It is easy to see that it has n − k components.

We prove that the vertex sets of the components of the graph T coincide with the vertex sets of the components of the graph G. For contradiction, suppose this is not true, and let x and y be vertices lying in the same component of G but in distinct components of T. Let C denote the component of T containing the vertex x, and consider some path (x = x0, e1, x1, e2, . . . , eℓ, xℓ = y) from x to y in the graph G, as in the following picture:

[Figure: the path from x to y in G, with the vertex xi, the edge e, and the component C of T marked.]

Let i be the last index for which xi is contained in the component C. Obviously i < ℓ, and hence xi+1 ∉ C. The edge e = {xi, xi+1} thus does not belong to the graph T, and so it had to form a cycle with some edges already selected into T at some stage of the algorithm. Therefore the graph T + e also contains a cycle, but this is impossible as e connects two distinct components of T. This provides the desired contradiction. □

Complexity of the algorithm. We have just shown that Algorithm 5.3.2 always computes what it is supposed to compute, i.e. a spanning tree of the input graph. But if we really needed to find spanning trees for some large graphs, should we choose this algorithm and spend our time programming it, or our money by buying some existing code?

To answer such a question is no simple matter, and algorithms are compared according to different, and often contradictory, criteria. For instance, it is important to consider the clarity and simplicity of the algorithm (a complicated or obscure algorithm easily leads to programming errors), the robustness (how do rounding errors or small changes in the input data influence the correctness of the output?), memory requirements, and so on. Perhaps the most common measure of complexity of an algorithm is its time complexity, which means the number of elementary operations (such as additions, multiplications, comparisons of two numbers, etc.) the algorithm needs for solving the input problem. Most often the worst-case complexity is considered, i.e. the number of operations needed to solve the worst possible problem, one expressly chosen to make the algorithm slow, for a given size of input.

For computing a spanning tree, the input size can be measured as the number of vertices plus the number of edges of the input graph. Instead of “worst-case time complexity” we will speak briefly of “complexity”, since we do not discuss other types of complexity.

The complexity of an algorithm can seldom be determined precisely. In order that we could even think of doing it, we would have to determine exactly what the allowed primitive operations are (so, in principle, we would restrict ourselves to a specific computer), and also we would have to describe the algorithm in the smallest details including various routine steps; that is, essentially look at a concrete program. Even if we did both these things, determining the precise complexity is quite laborious even for very simple algorithms. For these reasons, the complexity of algorithms is only analyzed asymptotically in most cases. We could thus say that some algorithm has complexity O(n^(3/2)), another one O(n log n), and so on (here n is a parameter measuring the size of the input).

For a real assessment of algorithms, it is usually necessary to complement such a theoretical analysis by testing the algorithm for various input data on a particular computer. For example, if the asymptotic analysis yields complexity O(n^2) for one algorithm and O(n log^4 n) for another then the second algorithm looks clearly better at first sight because the function n log^4 n grows much more slowly than n^2. But if the exact complexity of the first algorithm were, say, n^2 − 5n and of the second one 20n(log_2 n)^4, the superiority of the second algorithm will only show for n > 5·10^6, and such a superiority is quite illusory from a practical point of view.
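The crossover in this example can be checked numerically. The sketch below evaluates the two hypothetical operation counts, n² − 5n and 20n(log₂ n)⁴, and locates by bisection the point where the first overtakes the second. The function names and the search bracket from 10⁶ to 10⁷ are our own choices, and the bisection assumes the sign change in that bracket is unique.

```python
import math


def cost_quadratic(n):
    """Operation count n^2 - 5n of the first (hypothetical) algorithm."""
    return n * n - 5 * n


def cost_polylog(n):
    """Operation count 20 n (log2 n)^4 of the second (hypothetical) algorithm."""
    return 20 * n * math.log2(n) ** 4


def crossover(lo=10**6, hi=10**7):
    """Smallest n in (lo, hi] where the quadratic count is the larger one,
    assuming the comparison flips exactly once inside the bracket."""
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if cost_quadratic(mid) > cost_polylog(mid):
            hi = mid
        else:
            lo = mid
    return hi
```

Calling crossover() gives a value between 4·10⁶ and 5·10⁶, in line with the rough figure quoted above. Below that size the asymptotically worse algorithm performs fewer operations.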

Let us try to estimate the asymptotic complexity of Algorithm 5.3.2.

We have described the algorithm on a “high level”, however. This doesn’t refer to a prestigious social position but to the fact that we have used, for instance, a test of whether a given set of edges contains a cycle, which cannot be considered an elementary operation even with a very liberal approach. The complexity of the algorithm will thus depend on our ability to realize such a complex operation by elementary operations.

For our Algorithm 5.3.2, we may note that it is not necessary to store all the edge sets Ei, and that all of them can be represented by a single variable (say, a list of edges) which successively takes values E0, E1, . . ..

The only significant question is how to test efficiently whether adding a new edge ei creates a cycle or not. Here is a crucial observation: a cycle arises if and only if the vertices of the edge ei belong to the same connected component of the graph (V, Ei−1). Hence we need to solve the following problem:

5.3.4 Problem (UNIONFIND problem). Let V = {1, 2, . . . , n} be a set of vertices. Initially, the set V is partitioned into 1-element equivalence classes; that is, no distinct vertices are considered equivalent. Design an algorithm which maintains an equivalence relation on V (in other words, a partition of V into classes) in a suitable data structure, in such a way that the following two types of operations can be executed efficiently:

(i) (UNION) Make two given nonequivalent vertices i, j ∈ V equivalent, i.e. replace the two classes containing them by their union.

(ii) (Equivalence testing, FIND) Given two vertices i, j ∈ V, decide whether they are currently equivalent.

A new request for an operation is input to the algorithm only after it has executed the previous operation.

Our Algorithm 5.3.2 for finding a spanning tree needs at most n− 1 operations UNION and at most m operations FIND.

We describe a quite simple solution of Problem 5.3.4. In the beginning, we assign distinct marks to the vertices of V, say the marks 1, 2, . . . , n. During the computation, the marks will always be assigned so that two vertices are equivalent if and only if they have the same mark. Thus, equivalence testing (FIND) is a trivial comparison of marks.

For replacing two classes by their union, we have to change the marks for the elements of one of the classes. So, if the elements of each class are also stored in a list, the time needed for the mark-changing operation is proportional to the size of the class whose marks are changed.

For a very rough estimate of the running time, we can say that no class has more than n elements, so a single UNION operation never needs more than O(n) time. For n − 1 UNION operations and m FIND operations we thus get the bound O(n^2 + m). One inconspicuous improvement is to maintain also the size of each class and to change marks always for the smaller class. For such an algorithm, one can show a much better total bound: O(n log n + m) (Exercise 1). The best known solution of Problem 5.3.4, due to Tarjan, needs time at most O(nα(n) + m) for m FIND and n − 1 UNION operations (see e.g. Aho, Hopcroft, and Ullman [11]), where α(n) is a certain function of n. We do not give the definition of α(n) here; we only remark that α(n) does grow to infinity with n → ∞ but extremely slowly, much more slowly than functions like log log n, log log log n, etc. For practical purposes, the solution described above (with re-marking the smaller class) may be fully satisfactory.
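The mark-based solution with re-marking of the smaller class might look as follows in Python. This is a sketch under our own assumptions: vertices are numbered 0, ..., n−1 instead of 1, ..., n, and the names MarkedPartition, equivalent, union, and spanning_forest are hypothetical. The size of a class is read off the length of its member list, so no separate size counter is kept.

```python
class MarkedPartition:
    """Partition of {0, ..., n-1}: one mark per vertex and, for every
    mark in use, the list of vertices carrying that mark."""

    def __init__(self, n):
        self.mark = list(range(n))                 # distinct marks initially
        self.members = {v: [v] for v in range(n)}  # mark -> class members

    def equivalent(self, i, j):
        """FIND: a single comparison of marks."""
        return self.mark[i] == self.mark[j]

    def union(self, i, j):
        """UNION: merge the classes of i and j, re-marking the smaller one."""
        a, b = self.mark[i], self.mark[j]
        if a == b:
            return  # already equivalent, nothing to merge
        if len(self.members[a]) < len(self.members[b]):
            a, b = b, a  # make b the mark of the smaller class
        for v in self.members[b]:
            self.mark[v] = a
        self.members[a].extend(self.members.pop(b))


def spanning_forest(n, edges):
    """Algorithm 5.3.2 with the cycle test done by comparing marks."""
    partition = MarkedPartition(n)
    selected = []
    for u, v in edges:
        # The edge closes a cycle exactly when u and v share a mark.
        if not partition.equivalent(u, v):
            partition.union(u, v)
            selected.append((u, v))
    return selected
```

For a connected input the returned edges form a spanning tree. Otherwise they form a spanning forest with one tree per component. Unlike the statement of Algorithm 5.3.2, this sketch does not stop early after n − 1 edges, which keeps the loop simple at the cost of some extra mark comparisons.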

Let us present one more algorithm for spanning trees, perhaps even a simpler one.

5.3.5 Algorithm (Growing a spanning tree). Let a given graph G = (V, E) have n vertices and m edges. We will successively construct sets V0, V1, V2, . . . ⊆ V of vertices and sets E0, E1, E2, . . . ⊆ E of edges. We let E0 = ∅ and V0 = {v}, where v is an arbitrary vertex.

Having already constructed Vi−1 and Ei−1, we find an edge ei = {xi, yi} ∈ E(G) such that xi ∈ Vi−1 and yi ∈ V \ Vi−1, and we set Vi = Vi−1 ∪ {yi}, Ei = Ei−1 ∪ {ei}. If no such edge exists, the algorithm finishes and outputs the graph constructed so far, T = (Vt, Et).
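A direct transcription of Algorithm 5.3.5 can be sketched as below. The assumptions are ours: vertices are numbered 0, ..., n−1, edges are pairs of vertices, and the name grow_spanning_tree is hypothetical. At every step the sketch simply rescans the whole edge list for an edge with exactly one endpoint in the current vertex set, so it makes no attempt at efficiency.

```python
def grow_spanning_tree(n, edges, start=0):
    """Grow a tree from `start`; return (tree_vertices, tree_edges)."""
    in_tree = [False] * n
    in_tree[start] = True
    tree_vertices = [start]  # takes the values V_0, V_1, ... in turn
    tree_edges = []          # takes the values E_0, E_1, ... in turn
    while True:
        for u, v in edges:
            # Look for an edge with exactly one endpoint in the tree.
            if in_tree[u] != in_tree[v]:
                new_vertex = v if in_tree[u] else u
                in_tree[new_vertex] = True
                tree_vertices.append(new_vertex)
                tree_edges.append((u, v))
                break
        else:
            # No such edge exists, so the algorithm finishes.
            return tree_vertices, tree_edges
```

On the graph with n = 5 and edges (0, 1), (1, 2), (0, 2), (3, 4), starting from vertex 0, the sketch returns the vertices 0, 1, 2 and the edges (0, 1), (1, 2). The vertices 3 and 4 lie in another component and are never reached.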

5.3.6 Proposition (Correctness of Algorithm 5.3.5). If the algorithm finishes with a graph T with n vertices, then T is a spanning tree of G. Otherwise G is a disconnected graph and T is a spanning tree of the component of G containing the initial vertex v.

Proof. The graph T is a tree because it is connected and has the right number of edges and vertices. If T has n vertices, it is a spanning tree, so let us assume that T has n̄ < n vertices. It remains to show that V(T) is the vertex set of a component of G.

Let us suppose the contrary: let there be an x ∈ V(T) and a y ∉ V(T) connected by a path in the graph G. As in the proof of Proposition 5.3.3, we find an edge e = {xj, yj} ∈ E(G) on this path such that xj ∈ V(T) and yj ∈ V \ V(T). The algorithm could thus have added the edge e and the vertex yj to the tree, and should not have finished with the tree T. This contradiction concludes the proof. □

Remark. The details of the algorithm just considered can be designed in such a way that the running time is O(n + m) (see Exercise 2).

Exercises

1. Prove that if Problem 5.3.4 is solved by the described method (always changing the marks for the smaller class), then the total complexity of n− 1 UNION operations is at most O(n log n).

2. ∗,CS Design the details of Algorithm 5.3.5 in such a way that the running time is O(n + m) in the worst case. (This may require some knowledge of simple list-like data structures.)

3. From Exercise 4.4.7, we recall that a Hamiltonian cycle in a graph G is a cycle containing all vertices of G. For a graph G and a natural number k ≥ 1, define the graph G^(k) as the graph with vertex set V(G) and two (distinct) vertices connected by an edge if and only if their distance in G is at most k.

(a) Prove that for each tree T, the graph T^(3) has a Hamiltonian cycle.

(b) Using (a), conclude that G^(3) has a Hamiltonian cycle for any connected graph G.

(c) Find a connected graph G such that G^(2) has no Hamiltonian cycle.
