Network flow. The Network Flow Problem. History. Overview

(1)

Network flow

History

The RAND Corporation is American think tank composed of scientists created to offer research and analysis to the United States Armed Forces. In the 50’s helped decision concerning the nuclear race, space program, etc. Contributed to the development of many scientific techniques, such as game theory. Among the members were John Nash or John Von Neumann.

A document that was declassified in 1999 describes how RAND studied the Soviet train system. They studied the Soviet ability to transport things from place to place (e.g., Asian side to European side). That is when the foundations of network flow study were laid.

In the Vietnam war, the Vietcong used a series of underground tunnels called the Ho Chi Minh trail. The US was looking for the min cut to efficiently disconnect the south and the north parts of the system. The capacity of an edge is the difficulty of destroying it.

Overview

A typical application of graphs is using them to represent networks of infrastructure e.g. for distributing water, electricity or data. In order capture the limitations of the network it is useful to annotate the edges in the graph with capacities that model how much resource can be carried by that connection.

An interesting property of networks like this is how much of the resource can simultaneously be transported from one point to another - the maximum flow problem. Network flow is important because it can be used to express a wide variety of different kinds of problems. By developing good algorithms for solving network flow, we immediately get algorithms for solving many other problems as well. In this chapter we are going to cover following topics:

. The definition of the network flow problem . The uppermost path algorithm for planar graphs . The Ford-Fulkerson algorithm

. The maxflow-mincut theorem . The bipartite matching problem

The Network Flow Problem

Searching for a maximum flow in optimisation theory is strongly motivated by its practical meaning nowadays.

Several application deal mainly with the transportation of different resources - formally defined as flow - from a particular source to a particular target.

In general, this type of model is called flow problem. A flow problem is defined by the graph representing the network structure, annotated with some more information. For every edge of the graph it is necessary to specify the maximum amount of resource that can be shifted along this edge - it’s capacity c(e). It is also necessary to pick a source and a target node for the flow.

The most basic flow problem is that of finding the maximum possible flow: how much resources can be transported from one node to another.

We begin with a definition of the problem. We are given a directed graph G, a start node s, and a sink node t. Each edge e in G has an associated non-negative capacity c(e), where for all non-edges it is implicitly assumed that the capacity is 0. For example, consider the graph in Figure 1.1.

(2)

Figure 1.1:

Our goal is to push as much flow as possible from s to t in the graph. The rules are that no edge can have flow exceeding its capacity, and for any vertex except for s and t, the flow in to the vertex must equal the flow out from the vertex. That is,

Capacity constraint: On any edge e we have:

f (e) ≤ c(e) Flow conservation: For any vertex v /∈ {s, t}, flow in equals flow out:

X

u

f (u, v) =X

u

f (v, u)

Subject to these constraints, we want to maximize the total flow into t. For instance, imagine we want to route message traffic from the source to the sink, and the capacities tell us how much bandwidth we’re allowed on each edge. E.g., in the above graph, what is the maximum flow from s to t?

Answer: 5. Using “capacity[flow]” notation, the positive flow looks as in Figure 1.2. Note that the flow can split and rejoin itself.

Figure 1.2:

How can you see that the above flow was really maximum? Notice, this flow saturates the a → c and s → b edges, and, if you remove these, you disconnect t from s. In other words, the graph has an “s − t cut” of size 5 (a set of edges of total capacity 5 such that if you remove them, this disconnects the source from the sink). The point is that any unit of flow going from s to t must take up at least 1 unit of capacity in these pipes. So, we know we’re optimal.

We just noticed that in general, the maximum s − t flow ≤ the capacity of the minimum s − t cut. An important property of flows, that is proved as a byproduct of analyzing an algorithm for finding maximum flow, is that the maximum s − t flow is in fact equal to the capacity of the minimum s − t cut. This is called the Maxflow-Mincut Theorem. In fact, the algorithm will find a flow of some value k and a cut of capacity k, which are both optimal!

Definition 1: An s − t cut is a set of edges whose removal disconnects t from s. Or, formally, a cut is a partition of the vertex set into two pieces A and B where s ∈ A and t ∈ B. (The edges of the cut are then all edges going from A to B).

Definition 2: The capacity of a cut (A, B) is the sum of capacities of edges in the cut. Or, in the formal viewpoint, it is the sum of capacities of all edges going from A to B. (Don’t include the edges from B to A.)

Definition 3: It will also be mathematically convenient for any edge (u, v) to define f (v, u) = −f (u, v). This is called skew-symmetry. (We will think of flowing 1 unit on the edge from u to v as equivalently flowing -1 units on

(3)

How can we find a maximum flow and prove it is correct? Here’s a very natural strategy: find a path from s to t and push as much flow on it as possible. Then look at the leftover capacities (an important issue will be how exactly we define this, but we will get to it in a minute) and repeat. Continue until there is no longer any path with capacity left to push any additional flow on. Of course, we need to prove that this works: that we can’t somehow end up at a suboptimal solution by making bad choices along the way. This approach, with the correct definition of “leftover capacity”, is called the Ford-Fulkerson algorithm.

The uppermost path algorithm

For planar graphs much simpler way of finding maximum flow can be used. This method is called the uppermost path algorithm. As an input we need to have a planar drawing of a network in which source and sink are at the circumference of the graph.

Definition 4: We call a directed path p^∗(s, t) the uppermost path if every other path lies in the part of the plane bounded by cycle p^∗(s, t) ∪ (t, s)

Definition 5: Given a flow f in graph G, the residual capacity c_f(u, v) is defined as c_f(u, v) = c(u, v) − f (u, v), where recall that by skew-symmetry we have f (v, u) = −f (u, v).

Step 1: Initialization

Start with zero flow: for all e ∈ E set f (e) = 0 Step 2: Find the uppermost path p^∗, if none exists then stop.

Step 3: Let

D = min cf(e) ∈ p^∗ Step 4: Increase the flow and decrease the capacities along the path p^∗:

f (e) = f (e) + D c_f(e) = c_f(e) − D Step 5: Delete edges with zero residual capacity. Go to step 2.

The Ford-Fulkerson algorithm (1956)

The Ford-Fulkerson Algorithm computes a maximum flow in a iterative manner by starting with a valid flow, and then making adjustments that fulfill the constraints and increase the flow.

An important concept to uderstand the algorithm is the residual graph. This is a graph generated by calculating how the flow along each edge can be modified - each edge in the network graph is replaced by up to two new edges, a forward edge with the same direction that that signifies how much the flow can be increased, and a backward edge storing how much the flow can be reduced.

The algorithm starts with a empty flow (which is always valid) and then repeatedly finds paths in the residual graph from source to target. Adding just enough flow along the path to saturate one edge, which is the one with the lowest capacity, keeps the contraints on the flow fulfilled and strictly increases the flow. These are called augmenting paths

How to Find Paths in the Residual Graph

The Ford-Fulkerson method does not explicitly state how to find the augmenting paths, and works with a path finding algorithm. However, the choice of strategy has an impact on the (worst case) runtime of the algorithm. Depth-first search is a comparatively bad choice, using Breadth-first search (BFS) gives better results. This specific modification is called the Edmonds-Karp algorithm. Another variant with even better performance is the algorithm of Dinic.

(4)

Residual graph

For a network G = (V, E) and flow f , the residual graph G_f = (V⁰, E⁰) of G with respect to f is . V⁰ = V

. Forward Edges:

For each edge e ∈ E with f (e) < c(e), we add e⁰ ∈ E⁰with capacity c(e) − f (e)

. Backward Edges: For each edge e = (u, v) ∈ E with f (e) > 0, we add (v, u) ∈ E⁰with capacity f (e)

Algorithm

Step 1: f (u, v) ← 0 for all edges (u, v)

Step 2: While there is a path p from s to t in G_f, such that c_f(u, v) > 0 for all edges (u, v) ∈ p:

c_f(p) = min{c_f(u, v) : (u, v) ∈ p}

For each edge (u, v) ∈ p recalculate flow and capacity:

Forward edges

f (u, v) ← f (u, v) + c_f(p) (Send flow along the path)

cf(u, v) ← cf(u, v) − cf(p) (The remaining capacity is decreased) Backward edges

f (v, u) ← f (v, u) − cf(p) (The flow might be ”returned” later)

c_f(v, u) ← c_f(v, u) + c_f(p) (The capacity of the backward edge is increased)

Example: Find the maximum flow from the source s to the sink t.

v1 v2

v3 v4

t v5

s

v6 6

7

10 5

9 6

7 9

7 7

The following figures show the residual graph, and the values on the edges represent remaining capacities.

(5)

v1 v2

v3 v4

t v5

s

v6 0

1

10 5

9 6

7 9

1 7 6

6

v1 v2

v3 v4

t v5

s

v6 1

9 5

9 6

7 8

0 7 6

6

7

1

v1 v2

v3

v4

t v5

s

v6 1

9

4 1

7

8 7

6

7

1

5

5 0

v1 v2

v3

v4

t v5

s

v6 7

3

4 1

1

2 1

6

0

7

5

6 6

(6)

v1 v2

v3

v4

t v5

s

v6 7

3

4 1

1

2 1

6

7

5

6 6

The maximum flow can be calculated as a sum of flows added on the single paths (6 + 1 + 5 + 6) or it is a flow of any st-cut, for example the flow from the source s equals to the sum of the flows of edges going from s (which is not depicted in the figures). The flow of a forward edge equals to the capacity of the backward edge (these capacities are in the figures), therefore the total flow is the sum of capacities of backward edges going to s (6+7+5).

Another way of using the same algorithm:

Algorithm

Step 1: By choosing arbitrary paths from source to sink we push maximum flow through every path and we get a graph in which there is a saturated edge on every path from source to sink. Now we have to check if this is the maximum flow and if not we increase the flow.

Step 2: As long as possible we search for an augmenting path from source to sink. Two types of edges can be used on this path, that are selected in the following order (second type is chosen only if the first one is not possible):

a. we can use forward edges (u, v) if f (u, v) < c(u, v)

b. we can use backward edges (v, u) (in opposite direction) if f (u, v) > 0

Step 3: From all edges on augmenting path we choose minimum (in absolute value) a of:

a. flow that can be added to forward edges as min c(u, v) − f (u, v) b. flow that can be removed from backward edges as min f (u, v)

Step 4: We add a to the flow on forward edges and subtract a from backward edges on the augmenting path. We either get at least one saturated forward edge or at least one backward edge with zero flow.

Step 5: The algorithm end if we are not able to find augmenting path from source to sink.

Exercise:

1. For the network in Fig 1.1, find all possible cuts which separate S from t, and evaluate the capacity of each cut.

What is the minimum capacity of any cut?

2. If you can find a cut and flow of the same value, can you be sure that you have found the maximum flow?

3. Each maximum flow defines a minimum capacity cut, using the method discussed in class. Is this minimum capacity cut unique? That is, for a given maximum flow, could there be more than one minimum capacity cut for this flow?

(7)

4. Does each minimum capacity cut define a unique maximum flow

5. We are given a flow network with several sources and several sinks. Explain how to use an algorithm for finding a maximum flow (in a standard flow network) for this case.

6. Given a flow network such that all of the capacities are even numbers, show that the size of the maximum flow is even.

7. Given a digraph and a pair of vertices u, v ∈ V , describe an algorithm that finds the maximum number of edge-disjoint paths from u to v.

8. Given a digraph and vertices u, v ∈ V , describe an algorithm that finds the minimum number of edges needed to be removed so that there would be no path from u to v.

Let’s see how we can use an algorithm for the max flow problem to solve other problems as well: that is, how we can reduce other problems to the one we now know how to solve.

Bipartite matching

Say we wanted to be more sophisticated about assigning groups to homework presentation slots. We could ask each group to list the slots acceptable to them, and then write this as a bipartite graph by drawing an edge between a group and a slot if that slot is acceptable to that group. For example:

This is an example of a bipartite graph: a graph with two sides L and R such that all edges go between L and R. A matching is a set of edges with no endpoints in common. What we want here in assigning groups to time slots is a perfect matching: a matching that connects every point in L with a point in R. For example, what is a perfect matching in the bipartite graph above? More generally (say there is no perfect matching) we want a maximum matching: a matching with the maximum possible number of edges. We can solve this as follows:

Transformation of bipartite matching to maximum flow:

1. Set up a fake “start” node s connected to all vertices in L. Connect all vertices in R to a fake “sink” node t.

Orient all edges left-to-right and give each a capacity of 1.

2. Find a max flow from s to t using Ford-Fulkerson.

3. Output the edges between L and R containing nonzero flow as the desired matching.

This finds a legal matching because edges from R to t have capacity 1, so the flow can’t use two edges into the same node, and similarly the edges from s to L have capacity 1, so you can’t have flow on two edges leaving the same node in L. It’s a maximum matching because any matching gives you a flow of the same value: just connect s to the heads of those edges and connect the tails of those edges to t. (So if there was a better matching, we would not be at a maximum flow).

What about the number of iterations of path-finding? This is at most the number of edges in the matching since each augmenting path gives us one new edge.

(8)

Let’s run the algorithm on the above example. Notice a neat fact: say we start by matching A to 1 and C to 3. These are bad choices, but the augmenting path automatically undoes them as it improves the flow!

Matchings come up in many different problems like matching up suppliers to customers, or cell- phones to cell-stations when you have overlapping cells.