
7.4 Texas Transportation Network Data Set

The data set is shown in Fig. 7-4. The transportation network has 62 nodes (cities) and 120 edges (major roads). The number of all-pair shortest paths is 62*(62-1)/2 = 1891, i.e., the hypergraph representation for network path queries has 1891 hyperedges. For spatial range queries, the query window is a 100-mile by 100-mile square. The resulting hypergraph representation has 62 nodes (cities) and 420 hyperedges.
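The hyperedge count above can be illustrated with a minimal Python sketch. It assumes a connected, unweighted toy graph (so "shortest" means fewest hops, which is a simplification of the road network) and one hyperedge per unordered node pair; the names `shortest_path`, `path_hyperedges` and the toy graph are hypothetical and do not come from the experiment software.

```python
from collections import deque
from itertools import combinations

def shortest_path(adj, src, dst):
    """Return one fewest-hop path from src to dst using BFS (connected graph assumed)."""
    parent = {src: None}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        if u == dst:
            break
        for v in adj[u]:
            if v not in parent:
                parent[v] = u
                queue.append(v)
    # Walk the parent pointers back from dst to src.
    path = []
    node = dst
    while node is not None:
        path.append(node)
        node = parent[node]
    return path[::-1]

def path_hyperedges(adj):
    """Build one hyperedge (the set of nodes on the path) per unordered node pair."""
    return {(s, t): frozenset(shortest_path(adj, s, t))
            for s, t in combinations(sorted(adj), 2)}

# Toy star-shaped network with 4 nodes, hence 4*3/2 = 6 hyperedges.
toy = {"A": ["B"], "B": ["A", "C", "D"], "C": ["B"], "D": ["B"]}
edges = path_hyperedges(toy)
print(len(edges))        # number of unordered pairs in the toy graph
print(62 * (62 - 1) // 2)  # same formula applied to the 62-node network
```

For the 62-node network the same pair count gives the 1891 hyperedges quoted in the text.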

We explore the following ordering heuristics in the experiments using the DBW cost model (cf. Fig. 7-1): random, original graph traversal/EAFG traversal (BFS/DFS), traversal of the original graph partition tree, traversal of the derived hypergraph partition tree, traversal of the EAFG partition tree, Hilbert SFC and R-Tree traversal. For BFS/DFS traversals, starting at different nodes generates different orderings. We report the minimum, maximum and average over the n BFS/DFS orderings obtained by using each node as the starting node, where n is the number of vertices in the network. For the R-Tree traversal ordering heuristic, we vary the branch factor from 4 to 9. We also evaluate the optimized orderings using the following BDTs: the original graph partition tree, the EAFG partition tree, the hypergraph partition tree and the decomposed R-Tree built by treating graph nodes as geographical points. These heuristic orderings, and the optimized orderings based on the hypergraph representation of network path queries on graph data, are evaluated first on the network path queries. The same orderings are then evaluated on the hypergraph representation of spatial range queries on point data. The results are shown in Table 7-12.
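The min/max/average reporting over the n BFS orderings can be sketched as follows. This is only an illustration: the DBW cost model is defined elsewhere in the document, so the sketch substitutes a hypothetical placeholder cost (`span_cost`, the mean positional span of each hyperedge in the ordering), and the toy graph and hyperedges are made up.

```python
from collections import deque
from statistics import mean

def bfs_order(adj, start):
    """Return the BFS visiting order of a connected graph from the given start node."""
    order, seen, queue = [], {start}, deque([start])
    while queue:
        u = queue.popleft()
        order.append(u)
        for v in adj[u]:
            if v not in seen:
                seen.add(v)
                queue.append(v)
    return order

def span_cost(order, hyperedges):
    """Placeholder cost, NOT the DBW model: mean distance between the first
    and last broadcast positions of the nodes of each hyperedge."""
    pos = {node: i for i, node in enumerate(order)}
    return mean(max(pos[n] for n in e) - min(pos[n] for n in e)
                for e in hyperedges)

def summarize_bfs(adj, hyperedges, cost=span_cost):
    """Evaluate one BFS ordering per start node and report (min, max, avg)."""
    costs = [cost(bfs_order(adj, s), hyperedges) for s in adj]
    return min(costs), max(costs), mean(costs)

# Toy path graph A-B-C-D with two hypothetical query hyperedges.
adj = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C"]}
hyperedges = [{"A", "B"}, {"B", "C", "D"}]
lo, hi, avg = summarize_bfs(adj, hyperedges)
print(lo, hi, avg)
```

Swapping `bfs_order` for a DFS routine, or `span_cost` for the actual cost model, would follow the same pattern.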

From the results we can see that, for network path queries, the orderings from traversal of the graph partition trees (original graph, EAFG and hypergraph) and their optimized orderings achieve much better results than both the graph traversal orderings and the geometric-based heuristic orderings. Among these orderings, traversal of the hypergraph partition tree is the best ordering heuristic. On the other hand, the optimized ordering based on the EAFG partition tree is the best among the three optimized orderings, although they are pretty close. The optimized ordering based on the original graph partition tree has the largest improvement ratio over its graph partition tree traversal ordering heuristic.
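The improvement ratios can be checked with a short calculation using the path-query values reported in Table 7-12. The sketch assumes that "improvement ratio" means the relative cost reduction (heuristic minus optimized, divided by heuristic); that definition is an assumption, since it is not spelled out in this section.

```python
# Path-query costs from Table 7-12: (partition tree traversal, optimized ordering).
path_query_cost = {
    "original graph": (30.69, 22.56),
    "EAFG": (26.37, 22.26),
    "hypergraph": (24.25, 22.74),
}

def improvement_ratio(heuristic, optimized):
    """Relative cost reduction of the optimized ordering (assumed definition)."""
    return (heuristic - optimized) / heuristic

ratios = {name: improvement_ratio(h, o)
          for name, (h, o) in path_query_cost.items()}
for name, ratio in ratios.items():
    print(f"{name}: {ratio:.1%}")
```

Under this definition the ratios are roughly 26%, 16% and 6% for the original graph, EAFG and hypergraph partition trees respectively, which agrees with the observation about the original graph partition tree.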

Table 7-12. Summary of Results of Texas Transportation Network Data Set Under DBW Cost Model

Orderings                                              Path Query   Range Query

n-Rand-min                                             35.55        26.68
n-Rand-max                                             45.10        34.25
n-Rand-avg                                             40.79        30.46
n-EAFG-BFS-min                                         35.16        21.58
n-EAFG-BFS-max                                         42.57        32.24
n-EAFG-BFS-avg                                         39.35        27.97
n-EAFG-DFS-min                                         32.54        18.58
n-EAFG-DFS-max                                         40.23        27.93
n-EAFG-DFS-avg                                         37.31        23.50
Traversal of original graph partition tree             30.69        21.33
Optimization based on original graph partition tree    22.56        20.04
Traversal of EAFG partition tree                       26.37        20.61
Optimization based on EAFG partition tree              22.26        21.16
Traversal of hypergraph partition tree                 24.25        25.19
Optimization based on hypergraph partition tree        22.74        18.69
Hilbert SFC                                            38.63        25.61
Traversal of R-Tree-min                                35.73        17.76
Traversal of R-Tree-max                                39.97        25.92
Traversal of R-Tree-avg                                37.99        21.24
Optimization R-Tree-min                                33.53        13.31
Optimization R-Tree-max                                35.87        22.41
Optimization R-Tree-avg                                34.30        13.49

Since the EAFG and the original graph have the same topology, their breadth-first search orderings and depth-first search orderings are also the same. Among the EAFG traversal ordering heuristics, DFS is better than BFS in all three n-min/max/avg cases. The Hilbert ordering, although better than the maximum of the random orderings, is worse than their average. The average of the R-Tree traversal orderings is slightly worse than the average of the DFS traversals of the EAFG, but better than the average of the BFS traversals of the EAFG. The results suggest that the geometric-based heuristic orderings (R-Tree traversal) and their optimized orderings are not as good as the graph partition based ones (original graph/EAFG/hypergraph partition tree traversal).
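For readers unfamiliar with the Hilbert SFC heuristic, the sketch below shows how points can be ordered along a Hilbert curve. It uses the standard iterative grid-to-index mapping on an n-by-n grid where n is a power of two. The grid size, the point names and the assumption that city coordinates have already been discretized to grid cells are all hypothetical; how the experiments map coordinates to the curve is not specified in this section.

```python
def hilbert_index(n, x, y):
    """Map cell (x, y) of an n-by-n grid (n a power of two) to its Hilbert index."""
    d = 0
    s = n // 2
    while s > 0:
        rx = 1 if x & s else 0
        ry = 1 if y & s else 0
        d += s * s * ((3 * rx) ^ ry)
        # Rotate/flip the quadrant so the sub-curve has the right orientation.
        if ry == 0:
            if rx == 1:
                x, y = n - 1 - x, n - 1 - y
            x, y = y, x
        s //= 2
    return d

def hilbert_order(points, n):
    """Sort point names by the Hilbert index of their grid cell."""
    return sorted(points, key=lambda name: hilbert_index(n, *points[name]))

# Hypothetical cities already snapped to a 4-by-4 grid.
points = {"P": (0, 0), "Q": (3, 0), "R": (1, 1), "S": (2, 2)}
order = hilbert_order(points, 4)
print(order)
```

Because the curve preserves spatial locality, points that are close on the grid tend to be close in the resulting ordering, which is why it is a natural candidate for range queries.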

For spatial range queries, it is interesting to see that the geometric-based ordering heuristics and their optimized orderings, where the optimizations are based on the hypergraph representation of spatial range queries, perform better than the graph partition orderings and their optimized orderings, where the optimizations are based on the hypergraph representation of network path queries. We thus conclude that geometric-based orderings should be used for spatial range queries and graph partition based orderings should be used for network path queries.

We next perform experiments using the ATDataMul cost model on network path queries. Several new ordering heuristics, such as Maximum Spanning Tree, MAX, MAX-LD, NODE-WEIGHT and EDGE-WEIGHT, as discussed in Chapter 5, are available under the MUL scheme but not under the DBW/SEP scheme.

Although the Maximum Spanning Tree based orderings do not make much sense under the DBW cost model, they work well under the ATDataMul cost model since they put nodes (or edges) with larger weights as close to the beginning of a broadcast cycle as possible. Similar arguments can be made for the MAX, MAX-LD, NODE-WEIGHT and EDGE-WEIGHT heuristics.

the Kruskal MST algorithm, although generating the same MST, might have different orderings. For the Prim MST, we set each node in the graph as the source and record the sequence of nodes being visited to obtain n Prim MST orderings. While the MAX and MAX-LD heuristics cannot be extended to hypergraphs easily, the NODE-WEIGHT and EDGE-WEIGHT heuristics can be used for both a regular graph and a hypergraph. We also include the graph partition tree traversal orderings and their optimized orderings. Note that when applying optimizations, the original graph and the EAFG are used only to generate the BDTs, while the hypergraph is still used as the underlying representation in all three optimizations. As in the experiments under the DBW cost model, we also include the Hilbert and the R-Tree traversal ordering heuristics. Again, R-Trees are used only for generating the BDTs. The results are listed in Table 7-13. Numbers that span multiple columns are the same by nature for the types of graphs denoted by those columns.
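The procedure of obtaining n Prim MST orderings, one per source node, can be sketched as follows. The sketch grows a maximum spanning tree with a heap-based Prim's algorithm and records the visiting sequence. The adjacency format, the function names and the toy edge weights are hypothetical; how edge weights are derived for the actual network is described in Chapter 5 and is not reproduced here.

```python
import heapq

def prim_max_order(adj, source):
    """Visiting order of Prim's algorithm growing a maximum spanning tree.

    adj maps each node to a dict {neighbor: weight}; the graph is assumed connected.
    """
    visited = {source}
    order = [source]
    # Negate weights so that Python's min-heap pops the heaviest edge first.
    heap = [(-w, v) for v, w in adj[source].items()]
    heapq.heapify(heap)
    while heap:
        _, v = heapq.heappop(heap)
        if v in visited:
            continue
        visited.add(v)
        order.append(v)
        for nxt, w in adj[v].items():
            if nxt not in visited:
                heapq.heappush(heap, (-w, nxt))
    return order

def all_prim_orders(adj):
    """One Prim ordering per source node, i.e. n orderings for n nodes."""
    return {s: prim_max_order(adj, s) for s in adj}

# Toy weighted graph (weights are made up for illustration).
adj = {
    "A": {"B": 5, "C": 1},
    "B": {"A": 5, "C": 3},
    "C": {"A": 1, "B": 3, "D": 4},
    "D": {"C": 4},
}
orders = all_prim_orders(adj)
for source, order in orders.items():
    print(source, order)
```

All sources yield the same spanning tree on this toy graph, yet the visiting sequences differ, which is why each source contributes its own ordering.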

From the results we can see that the graph partition based heuristic orderings and their optimized orderings remain among the best orderings under the ATDataMul cost model. The Kruskal-MST heuristic on the EAFG and the MAX and MAX-LD heuristics on the original graph/EAFG are slightly better than the remaining heuristic orderings. Although they are still slightly worse than the minimum of the 1000 random orderings, they are better than the average of the 1000 random orderings. Considering the computation cost of these heuristics and the cost of examining 1000 random orderings, they are preferred to random orderings. Although the optimized orderings using R-Trees as BDTs improve the R-Tree traversal ordering heuristics by 5% on average, they are still only comparable to the Kruskal-MST heuristic on the EAFG and the MAX and MAX-LD heuristics on the original graph/EAFG. Thus we do not recommend using the R-Tree as the BDT when optimizing the broadcast ordering for network path query processing.

Table 7-13. Summary of Results of Texas Transportation Network Data Set Under ATDataMul Cost Model

Orderings                               ORGN     EAFG     Hyper

1000-Rand-min                           47.39    47.39    47.39
1000-Rand-max                           53.39    53.39    53.39
1000-Rand-avg                           51.00    51.00    51.00
n-BFS-min                               47.78    47.78    N/A
n-BFS-max                               53.49    53.49    N/A
n-BFS-avg                               50.41    50.41    N/A
n-DFS-min                               48.67    48.67    N/A
n-DFS-max                               53.07    53.07    N/A
n-DFS-avg                               51.35    51.35    N/A
n-Prim-MST-min                          47.89    48.67    N/A
n-Prim-MST-max                          53.09    52.18    N/A
n-Prim-MST-avg                          49.56    50.47    N/A
Kruskal-MST                             49.61    47.01    N/A
MAX                                     47.79    47.79    N/A
MAX-LD                                  47.64    47.64    N/A
NODE-WEIGHT                             52.28    52.00    52.43
EDGE-WEIGHT                             50.91    53.15    51.95
Traversal of partition tree             44.38    47.79    45.01
Optimization based on partition tree    41.76    41.31    41.20
Hilbert SFC                             50.71    50.71    50.71
Traversal of R-Tree-min                 48.50    48.50    48.50
Traversal of R-Tree-max                 51.09    51.09    51.09
Traversal of R-Tree-avg                 50.09    50.09    50.09
Optimization R-Tree-min                 46.04    46.04    46.04
Optimization R-Tree-max                 48.79    48.79    48.79

Chapter 8