• No results found

7. IMPLEMENTATION

7.2 Graph Traversal Strategies

Traversal Strategies in STAPL GL are implemented using algorithmic patterns provided by

STAPL that execute higher-order functions (work-functions) on elements of a view. To express a parallel graph algorithm, users choose a suitable traversal strategy and provide two operators that describe the computation in a fine-grained manner. For example, users can implement a breadth- first traversal (BFS) by providing generic BFS vertex and visitor operators (Figure 7.3(b, c)) to the traversal strategy. Both operators are generic and oblivious to the strategy, allowing separation of algorithm from performance by choosing the appropriate strategy based on the input and system.

Figure 7.3 shows the pseudo-code to demonstrate how the level-synchronous, asynchronous, and KLA versions of BFS differ. From the user’s perspective, the BFS vertex and visit operators for the algorithm do not change (Figure 7.3(b,c)), and neither does the driver of the algorithm (Fig-

ure 7.3(a)). What does change, however, is the VISITfunction (Figure 7.3(d-f)), which determines the conditions to allow the computation to proceed on the target vertex being visited.

A KLA BFS results by providing the generic BFS operators (Figure 7.3(b,c)) to the traversal strategy, using the KLA VISITfunction (Figure 7.3(e)). Lines 3-7 in Figure 7.3(a) can be replaced

by a call to the KLA traversal strategy, which has the same effect. The vertex-operator returns true if it is active for a vertex, and false otherwise. The algorithm terminates when all vertices are inactive (all invocations of vertex-operators return false). The requirements for the KLA BFS visitor operator are the same as for the asychronous BFS visitor operator – it cannot rely on the order in which the vertex is visited and must be resilient to multiple visits.

Hierarchical Paradigm in SGL. The hierarchical graph paradigm allows the execution of vertex- centric algorithms on hierarchical graphs, as it is a drop-in replacement for the standard graph paradigm, as seen in Figure 2.1(c), such that existing vertex and neighbor operators need not be modified to take advantage of the hierarchy, or even be aware of it. Different graph algorithms can take advantage of the hierarchical paradigm simply by swapping the call to kla_paradigm with hierarchical_paradigmand providing it the hierarchical graph. This swap is encapsulated inSTAPL GL’s execute method and the execution policy.

Support for Out-of-Core Processing. To support out-of-core (disk-based) graph processing, we extendSTAPL GL, which supports in-memory graph computations, to allow storing the graph struc-

ture to disk. The graph structure stored in thePGRAPHdistributed container consists of one or more

base containers per location. The subgraphs are stored in the base container, which we allow to be stored in-memory or on-disk. The base container also contains an in-memory hubs-cache, if the optimization is enabled. To allow for loading and storing subgraphs to disk, we extended the

PGRAPH base containers to provide serializing and deserializing capabilities, and implemented the ability to read and write shards. The asynchronous forwarding of updates is processed by the two-level distributed directory that maps vertices to their home-locations.

The API of the STAPL GL does not change, as our technique is transparent to the user. This

v o i d BFS( Graph graph , v e r t e x source , i n t k ) source . c o l o r = GREY;

GraphView a c t i v e _ v e r t i c e s ( graph , source ) w h i l e ( a c t i v e _ v e r t i c e s . s i z e ( ) ! = 0 )

a c t i v e _ v e r t i c e s

= reduce ( map (bfs_op, a c t i v e _ v e r t i c e s , k ) , l o g i c a l _ o r ( ) ) g l o b a l _ f e n c e ( ) ; (a) b o o l bfs_op( v e r t e x v ) i f ( v . c o l o r == GREY) / / A c t i v e i f GREY v . c o l o r = BLACK map_func (Visit( b f s _ v i s i t o r _ o p ( v . d i s t + 1 ) ) , v . n e i g h b o r s ( ) ) / / V i s i t n e i g h b o r s r e t u r n t r u e / / v e r t e x was A c t i v e e l s e r e t u r n f a l s e / / v e r t e x was I n A c t i v e (b) b o o l b f s _ v i s i t o r _ o p( v e r t e x v , i n t new_distance ) i f ( v . d i s t > new_distance ) v . d i s t = new_distance / / update d i s t a n c e v . c o l o r = GREY / / mark t o be processed r e t u r n t r u e / / v e r t e x was updated e l s e r e t u r n f a l s e

(c)

/ / Level−Sync V i s i t −− don ’ t spawn f u r t h e r t a s k s Visit( v i s i t o r v i s, v e r t e x u , i n t k _ c u r r e n t )

v i s( u )

(d)

/ / KLA V i s i t −− r e c u r s i v e l y spawn t a s k s t o depth k Visit( v i s i t o r v i s, v e r t e x u , i n t k _ c u r r e n t ) i f (v i s( u ) ) / / i f u was updated i f ( k _ c u r r e n t % k ! = 0 ) / / i f f i n s i d e async s e c t i o n bfs_op( u ) / / r e c u r s i v e l y c a l l work−f u n c t i o n on u (e) / / Async V i s i t −− r e c u r s i v e l y spawn t a s k s Visit( v i s i t o r v i s, v e r t e x u , i n t k _ c u r r e n t ) i f (v i s( u ) ) / / i f u was updated bfs_op( u ) / / r e c u r s i v e l y c a l l work−f u n c t i o n on u (f)

Figure 7.3: (a) The fine-grained BFS algorithm, with (b) BFS vertex operator and (c) BFS visitor operator. Also shows how changing the visit-functions between level-sync (d), async (f) and k- level-async (e) visit functions changes the behavior of the algorithm from async to KLA to level- sync, using BFS as an example algorithm.