Loop abort: Add a test to stop a loop early.

Tuning Algorithms, Tuning Code

Guideline 4.5 Loop abort: Add a test to stop a loop early.

Experiments using version V1 indicate that this idea could be very effective if a suitable bound can be found: on random uniform graphs the subgraph S is finished after the smallest 2 to 10 percent of edges have been considered, which means that the remaining 90 to 98 percent of edges are nonessential. (This observation applies only to random uniform graphs, but theoretical results mentioned in the Chapter Notes suggest that a similar property holds for general random graphs.) For graphs with uniform edge costs from (0, 1), a bound of x ∈ (0, 1) on the largest essential edge would reduce the number of outer loop iterations from m to xm.

One way to implement the loop abort test is to calculate an upper bound D(S) on the diameter of S, which is the maximal distance between any pair of vertices in S. If ecost is greater than the diameter (or its upper bound), the ES algorithm can stop because all remaining edges must be nonessential.

One way to compute such a bound is to perform a full search of S from vertex s to every other vertex in the graph. Let f be the farthest-away vertex found in that search; twice the distance from s to f is an upper bound on D(S). Any type of search will do: one idea is to rewrite S.distance to perform a full Dijkstra search from s rather than stopping when d is encountered; another is to run a breadth-first search (BFS) from s. The slower Dijkstra search would provide a tighter bound on the diameter, and the faster BFS search would yield a looser bound.
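The two bounding strategies can be compared with a minimal Python sketch. It assumes S is stored as a dictionary of dictionaries, `adj[u][v] = cost`, which is an assumption made here for illustration; the function names `dijkstra_all`, `bfs_all`, and `diameter_bound` are likewise hypothetical. The bound follows from the triangle inequality: for any pair u, v, the distance from u to v is at most the distance from u to s plus the distance from s to v, and neither term exceeds the distance from s to f.

```python
import heapq
import math
from collections import deque


def dijkstra_all(adj, s):
    """Full Dijkstra search: true distance from s to every reachable vertex."""
    dist = {s: 0.0}
    done = set()
    heap = [(0.0, s)]
    while heap:
        w_dist, w = heapq.heappop(heap)
        if w in done:
            continue  # stale heap entry
        done.add(w)
        for z, cost in adj[w].items():
            if z not in done and w_dist + cost < dist.get(z, math.inf):
                dist[z] = w_dist + cost
                heapq.heappush(heap, (dist[z], z))
    return dist


def bfs_all(adj, s):
    """BFS: cost of a fewest-hop path from s to every reachable vertex.

    Each value is an upper bound on the true distance, not the distance itself.
    """
    cost_to = {s: 0.0}
    queue = deque([s])
    while queue:
        u = queue.popleft()
        for v, cost in adj[u].items():
            if v not in cost_to:
                cost_to[v] = cost_to[u] + cost
                queue.append(v)
    return cost_to


def diameter_bound(adj, s, search):
    """Twice the largest value found by `search`, or infinity if S is unconnected."""
    found = search(adj, s)
    if len(found) < len(adj):
        return math.inf
    return 2 * max(found.values())
```

On a three-vertex graph with costs 1 on edges (0,1) and (1,2) and cost 5 on edge (0,2), the Dijkstra-based bound from vertex 0 is 4 and the BFS-based bound is 10, which illustrates the tighter versus looser trade-off on a toy input.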

We employ a small pilot experiment for guidance in choosing a good bounding strategy, by adding a full BFS search inside S.distance, to run before the (inner) main loop begins. The full Dijkstra search can be implemented by commenting out the loop abort test on line 7 of the code in Figure 4.7. Here are some observations from an exploratory experiment to evaluate these two strategies.

• Early in the computation when S is unconnected, both searches return D(S) = ∞, which is no use in the loop abort test. Later, when S is nearly finished, the bounds returned by these searches yield significant reductions in main loop iterations. For example, at n = 1000 the BFS bound is near 0.06 on average; that means that with the loop abort test the main loop executes about 30,000 iterations instead of the full 499,500 iterations, a reduction of 94 percent.

• A full BFS search of S is much faster than a full Dijkstra search. The slightly tighter bounds returned by the full Dijkstra search are not enough to counteract the greater computation time.

• Both BFS and the full Dijkstra search are much more expensive than the partial search (to node d) performed in S.distance. Furthermore, the bounds returned by these searches do not change much from call to call. It is not cost-effective to perform a full search at each invocation of S.distance.
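The iteration counts quoted in the first observation can be checked with a few lines of Python. This is only a worked calculation; the helper name `abort_savings` is hypothetical, and it assumes a complete graph on n vertices, so that m = n(n-1)/2, with a bound x on the largest essential edge cost.

```python
def abort_savings(n, x):
    """Return (m, iterations with abort, fractional reduction) for bound x."""
    m = n * (n - 1) // 2          # edges in a complete graph on n vertices
    iterations = x * m            # outer loop stops after about x*m edges
    reduction = 1.0 - iterations / m
    return m, iterations, reduction


m, iterations, reduction = abort_savings(1000, 0.06)
print(m, round(iterations), round(100 * reduction))
```

With n = 1000 and x = 0.06 this gives m = 499,500 and roughly 30,000 iterations, matching the 94 percent reduction reported above.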

On the basis of these observations, version V2 implements the following search strategy for the loop abort test: (1) Wait until n edges have been added to S to activate the BFS search (so it is more likely to be connected); (2) once the search is activated, in each call to S.distance, check whether BFS has been performed with source vertex s; if not, run the BFS search with s as source. This adds at most n invocations of BFS to the cost of the algorithm. The new code section is sketched in Figure 4.8.
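The V2 strategy can be illustrated with a simplified Python sketch. It is not the book's implementation: the names `essential_subgraph`, `bfs_farthest`, and `dijkstra_distance` are hypothetical, the graph is assumed to be given as a list of `(cost, s, d)` tuples over vertices `0..n-1`, and memoization of distances in D is omitted to keep the example short.

```python
import heapq
import math
from collections import deque


def bfs_farthest(adj, s):
    """Largest BFS path cost from s, or infinity if some vertex is unreachable."""
    cost_to = {s: 0.0}
    queue = deque([s])
    while queue:
        u = queue.popleft()
        for v, cost in adj[u].items():
            if v not in cost_to:
                cost_to[v] = cost_to[u] + cost
                queue.append(v)
    return math.inf if len(cost_to) < len(adj) else max(cost_to.values())


def dijkstra_distance(adj, s, d):
    """Partial Dijkstra search from s that stops when d is extracted."""
    dist = {s: 0.0}
    done = set()
    heap = [(0.0, s)]
    while heap:
        w_dist, w = heapq.heappop(heap)
        if w in done:
            continue  # stale heap entry
        if w == d:
            return w_dist
        done.add(w)
        for z, cost in adj[w].items():
            if z not in done and w_dist + cost < dist.get(z, math.inf):
                dist[z] = w_dist + cost
                heapq.heappush(heap, (dist[z], z))
    return math.inf


def essential_subgraph(n, edges):
    """Return (essential edges, number of edges examined before the abort)."""
    adj = {v: {} for v in range(n)}
    bound = math.inf      # global bound on the largest essential edge cost
    searched = set()      # sources for which BFS has already been run
    essential = []
    examined = 0
    for ecost, s, d in sorted(edges):
        if ecost > bound:
            break         # outer loop abort: remaining edges are nonessential
        examined += 1
        # (1) wait until n edges are in S; (2) run BFS at most once per source
        if len(essential) >= n and s not in searched:
            searched.add(s)
            bound = min(bound, 2 * bfs_farthest(adj, s))
        if dijkstra_distance(adj, s, d) > ecost:
            adj[s][d] = ecost
            adj[d][s] = ecost
            essential.append((ecost, s, d))
    return essential, examined
```

Because each source vertex is added to `searched` whether or not the BFS finds S connected, the sketch performs at most n BFS searches, as in the strategy described above.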

Note the result of the BFS search is memoized on line 0.6. In fact, all distances discovered during BFS search can be memoized (not shown).

Also, if the farthest-away vertex f has distance less than ecost, the edge must be nonessential and the Dijkstra search need not be performed. This test to abort the inner main loop appears on line 0.7.

Another loop abort test can be applied to this inner loop, as follows. Since Dijkstra's algorithm finds vertices in increasing order by distance, each w.dist extracted on line 5 is a lower bound on the distance from s to d. If (w.dist > ecost), then edge (s,d) must be essential: we can abort the Dijkstra search and return the bound w.dist instead of the true distance.

bound = (global) bound on max essential edge cost
procedure S.distance(s, d, ecost) returns
          distance from s to d, or upper bound D[s,d]
0.1   if (D[s,d] <= ecost) return D[s,d];
0.2   if (S.edgeCount() >= n) {
0.3     if (s.bfs has not been performed) {
0.4       <f,f.dist> = S.bfs(s);                    // f is farthest from s
0.5       if (f.dist*2 < bound) bound = f.dist*2;   // save min bound
0.6       if (D[s,f] > f.dist) D[s,f] = f.dist;     // memoize
0.7       if (f.dist <= ecost) return f.dist;       // loop abort
        }
      }
1     For all vertices v: v.status = unseen;
2     Ps.init(s,0)                                  // insert s with distance 0
3     s.status = inqueue;
4     while (Ps.notEmpty()) {
5       <w, w.dist> = Ps.extractMin();
5.1     if (w.dist < D[s,w]) D[s,w] = w.dist;
6       w.status = done;

Figure 4.8. BFS search. The BFS search returns the distance from s to the farthest-away vertex f. Twice this distance is an upper bound on the diameter of S.

4     while (Ps.notEmpty()) {
5       <w, w.dist> = Ps.extractMin();
5.1     if (w.dist < D[s,w]) D[s,w] = w.dist;
5.2     if (w.dist > ecost) return w.dist;          // loop abort
6       w.status = done;

This reduces the number of iterations of that loop but also might increase total cost of the algorithm because fewer memoizations would be performed in each call to S.distance.
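A runnable Python sketch of the inner loop abort follows. It assumes S is stored as a dictionary of dictionaries, `adj[u][v] = cost`; the function name `distance_with_abort` and the extraction counter it returns are hypothetical additions for illustration, and memoization in D is left out.

```python
import heapq
import math


def distance_with_abort(adj, s, d, ecost):
    """Dijkstra search from s toward d that gives up once w.dist exceeds ecost.

    Returns (value, extractions). The value is the true distance when d is
    reached; otherwise it is a lower bound on the distance that already
    exceeds ecost (or infinity if d is unreachable).
    """
    dist = {s: 0.0}
    done = set()
    heap = [(0.0, s)]
    extractions = 0
    while heap:
        w_dist, w = heapq.heappop(heap)
        if w in done:
            continue  # stale heap entry
        extractions += 1
        if w_dist > ecost:
            return w_dist, extractions  # loop abort: edge (s,d) is essential
        if w == d:
            return w_dist, extractions
        done.add(w)
        for z, cost in adj[w].items():
            z_dist = w_dist + cost
            if z not in done and z_dist < dist.get(z, math.inf):
                dist[z] = z_dist
                heapq.heappush(heap, (z_dist, z))
    return math.inf, extractions
```

On a path 0-1-2-3 with unit edge costs, a query from 0 to 3 with ecost = 1.5 stops after three extractions and returns the bound 2.0, whereas with ecost = 10 it runs to completion and returns the true distance 3.0 after four extractions.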

These two loop abort strategies will be evaluated together with the next algorithm tuneup, called filtering.

Filtering. Consider the relax operation on line 9 of Figure 4.7. If (z.dist > ecost), there is no need to insert z into Ps because it cannot affect the decision about whether edge (s,d) is essential or nonessential. If the (inner) loop abort tuneup mentioned earlier is implemented, the loop will stop before this value can be extracted, anyway. This strategy of filtering the data structure saves the cost of some insert operations and speeds up other operations by making the priority queue smaller overall.
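The effect of filtering on the priority queue can be seen in a small Python sketch. It assumes the same dictionary-of-dictionaries representation, `adj[u][v] = cost`, and uses a hypothetical function `is_essential` that reports the essential/nonessential decision together with the number of insert operations, so that the filtered and unfiltered searches can be compared directly.

```python
import heapq
import math


def is_essential(adj, s, d, ecost, use_filter=True):
    """Decide whether edge (s,d) of cost ecost is essential.

    Returns (essential, inserts), where inserts counts priority queue
    insertions, including the initial insertion of s.
    """
    dist = {s: 0.0}
    done = set()
    heap = [(0.0, s)]
    inserts = 1
    while heap:
        w_dist, w = heapq.heappop(heap)
        if w in done:
            continue  # stale heap entry
        if w == d:
            return w_dist > ecost, inserts
        done.add(w)
        for z, cost in adj[w].items():
            z_dist = w_dist + cost
            if z in done or z_dist >= dist.get(z, math.inf):
                continue  # relax does not improve z
            if use_filter and z_dist > ecost:
                continue  # filtered: z cannot affect the decision
            dist[z] = z_dist
            heapq.heappush(heap, (z_dist, z))
            inserts += 1
    # Queue exhausted without reaching d: no path of cost <= ecost exists.
    return True, inserts
```

Filtering is safe here because any path to d of cost at most ecost passes only through vertices whose distances are also at most ecost, so none of them is ever filtered out. On a small graph with a cheap path 0-1-4 and two expensive edges out of vertex 0, the filtered search reaches the same decision with 3 inserts instead of 5.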

In document 9cgmv.A.Guide.to.Experimental.Algorithmics.pdf (Page 121-124)