Scalable Network Traffic Visualization By Structural Equivalence Grouping


Qi Liao

Department of Computer Science Central Michigan University, USA

Email: [email protected]

Lei Shi

State Key Laboratory of Computer Science Institute of Software

Chinese Academy of Sciences Email: [email protected]

Chuang Lin

Department of Computer

Science and Technology Tsinghua University, China Email: [email protected]

Abstract—Visualizing large and complex networks, such as social networks and computer networks, is a common yet challenging task. Traditional methods either render every node and edge, which inevitably creates visual clutter and does not scale beyond impractically small networks, or simplify the network by clustering, which oversimplifies the network topology and may not work well in cases such as heterogeneous graphs. As a supplement to the community-based approach, we study and apply a classic sociology method, Structural Equivalence Grouping (SEG). One key feature that distinguishes SEG from traditional proximity-based community detection algorithms is that nodes with similar connectivity behavior and patterns, but not necessarily physically close to each other, may be placed in one group. We have developed a highly interactive visual analytic tool based on SEG and conducted experiments and case studies in real-world scenarios. SEG-based graph visualization reduces the network size by more than 20 times in many cases without losing graph connectivity information. The fuzzy version of SEG and the interactive level-of-detail control allow a desirable balance between visual complexity and structural integrity.

I. INTRODUCTION

The analytic complexity of networks has become increasingly challenging over the past decades due to the ever-growing size of networks and the explosion of information exchanged over the Internet, known as the big data era. The current Internet has evolved in a direction where ubiquitous mobile and sensor devices are interconnected, the so-called Internet of Things (IoT). The growing popularity of social networking is another example of large networks, with one single social network exceeding one billion users.

In the process of understanding these large networks, visualization of the overall connection patterns, presumably as graphs, is vital in many scenarios. For example, network managers and operators need to keep track of the traffic distribution among servers and hosts for better network and virtual-machine optimization during the capacity planning stage. They also need to monitor the latest traffic graphs to increase situation awareness for both more responsive troubleshooting and security-related investigation. In a broader Internet scenario, operators of social networking service (SNS) websites depend on knowledge of the social networks to design effective promotion strategies and online advertising.

Nevertheless, it is highly challenging to visually analyze network traffic graphs consisting of a large number of nodes with complex connection patterns among them. First, we note that the quadratic-complexity force-directed drawing methods for general graphs [1] are generally not capable of calculating a good layout in real time (∼1s), even for a graph with only a few hundred nodes. Second, even if faster large graph layouts can be computed through either optimizations [2]–[4] or new layout models [5], [6], the visual clutter in the node-link representation [as in Figure 1(a)], mainly due to the edge crossings introduced by highly dynamic connections, prevents human users from understanding the network traffic.

Besides large graph drawing algorithms, existing solutions to the large graph analytic challenges include abstracting the complexity of the traffic graph through clustering or community detection [7]–[11]. Though communities are useful in analyzing social networks, these methods may lead to poor results when communities are not prevalent, as in network traffic graphs (Figure 1(b)). In addition, the community-level view hides important context and critical topological details, while the interactive navigation can add overhead to pattern discovery. Another challenge for community detection is how to group heterogeneous nodes, i.e., nodes of different types, in a more meaningful way. We note that many networks are indeed heterogeneous as long as there is at least one semantic attribute on the nodes, e.g., the author-paper-conference network in academic collaborations [12], the host-user-application network in computer communications [13], or semantic and ontology graphs in large heterogeneous social networks [14].

In this paper, we study an alternative node and edge grouping strategy based on the concept of structural equivalence [15], [16], which is well known in sociology and social network research. Rather than detecting proximity-based communities from a network, structural equivalence classifies the network nodes into several categories by the positions they take in the network, depending merely on the network graph. We implement this idea with a Structural Equivalence Grouping (SEG) algorithm that completes in linear time for large network traffic graphs. SEG groups the nodes with the same neighbor set into a larger mega-node and regenerates a compressed graph for subsequent visualization and analysis (Figure 1(c)). Beyond the SEG of undirected graphs, we extend the method to support both directed and weighted graphs. To further control the visual complexity, we develop the fuzzy SEG (FSEG) method, based on the similarity of neighbor sets, with an interactive level-of-detail control. The FSEG implementation has below-quadratic complexity and scales to graphs with $10^5$ edges, while the basic SEG implementation supports networks with up to multi-million edges.

Fig. 2. Compressed graph view through Structural Equivalence Grouping (SEG) in Network Security and Anomaly Visualization (NSAV) tool.

We have designed intuitive visual encodings for the SEG/FSEG-based compressed graphs and introduced several user interactions to accelerate large network visual analytics. Notably, the in-situ interaction design switches between consecutive views (e.g., the compressed and original graphs) in a stable manner through modified layout schemes and staged animations. The graph interaction supports both SEG-defined and manual grouping, which allows a maximum level of customization and extensibility. SEG is integrated into a tool called Network Security and Anomaly Visualization (NSAV) (Figure 2), together with an anomaly panel, a time range selector, and several controls and detail-on-demand panels. We demonstrate through a case study that SEG/FSEG-based NSAV can significantly improve domain users' capability in complex network traffic analysis tasks.

II. RELATED WORK

Current approaches to large graph visual analytics fall into three categories. As discussed earlier, the first class abstracts the complexity of traffic graphs through clustering or community detection [7]–[11]. The second class adopts filtering methods [17], [18], which truncate network graphs by removing unimportant nodes and edges. Nevertheless, it is difficult to maintain a balance between visual complexity and integrity. A mild approach based on edge centrality [17] can still generate cluttered graphs, while an aggressive reduction into a spanning tree [18] loses much of the topology information. The third class tries to alleviate the visual clutter in the views rather than by data reduction. It includes edge bundling techniques that visually cluster groups of edges [19], [20], and viewpoint transition methods through hyperbolic views [21], [22] and fisheye distortions [23]. These methods are orthogonal to the other approaches and may be integrated into our proposed method.

The theory of structural equivalence dates back several decades to the seminal work by social scientists Lorrain and White [15]. Their early studies propose a categorical approach for the algebraic analysis of social networks. The aim is to reduce the social structure for better empirical study of the relationships among individuals. Structural equivalence is defined as the relationship between two individuals who have exactly the same type of relationships with every other individual in the network. In the real world, however, very few individuals in a network share exactly the same relationships, so relaxed definitions of structural equivalence were proposed later. In [24], automorphic equivalence is developed, under which two individuals are equivalent if they can be swapped, together with some related others, while keeping all the network relationships intact. To better capture the notion of social roles, the concepts of structural relatedness [25] and later regular equivalence [16] were proposed. Rather than requiring exactly the same relationships with all others, two individuals are equivalent if they share the same set of neighborhood types, which are recursively defined by regular equivalence.

Another relevant visualization work is graph drawing with modular decomposition [26]. Rooted in graph theory, a module of a graph is a set of nodes such that every node outside the set is either a neighbor of all nodes in the set or a neighbor of none of them. The authors propose an algorithm to draw a graph bottom-up along its modular decomposition tree and show promising results in revealing the graph structure while preserving several layout aesthetics. Another research work [27] develops a method to generate various types of networks from text. In the context of edge compression, the text nodes with identical neighborhoods are collapsed together into groups. Similar work on motif simplification is done in [28], which extracts frequent local structures in graphs, such as fans, connectors and cliques, and renders these subgraphs as common motif glyphs encoding their types, sizes and specifications. In the large graph drawing context, the property of the neighborhood set is applied in an iterative coarsening process that reduces graphs to smaller ones and lays them out recursively by the multi-level drawing approach [2], [3]. A comprehensive survey of large graph visualization may be found in [29].

III. STRUCTURAL EQUIVALENCE GROUPING

SEG aggregates graph nodes with the same neighbor set into groups and constructs a new graph for visualization. For example, in the network traffic graph of Figure 3(a), the host “192.168.2.23” can be combined with the other three surrounding hosts with the same connection pattern. The new graph after SEG (Figure 3(b)) is referred to as a SEG-compressed graph. The SEG-compressed graph has two types of nodes: the single-node remaining from the original graph (drawn hollow) and the mega-node grouped from multiple sub-nodes in the original graph (drawn with filled colors).


Fig. 1. Visual comparison of network analysis using (a) the original network, (b) communities (clustering results) and (c) Structural Equivalence Grouping (SEG). The data source is the VAST Challenge 2012 network traffic dataset (40 hours, 3460 nodes, 48599 edges).

Fig. 3. Concept illustration of Structural Equivalence Grouping (SEG) on an undirected graph: (a) original graph; (b) compressed graph.

We begin the description of the algorithm by formally defining some graph terminology. Let $G = (V, E)$ be a directed, weighted and connected original graph, where $V = \{v_1, \ldots, v_n\}$ and $E = \{e_1, \ldots, e_m\}$ denote the node and link sets. Let $W$ be the graph adjacency matrix, where $w_{ij} > 0$ indicates a link from $v_i$ to $v_j$ and $w_{ij}$ denotes the link weight. Each row of $W$, $R_i = \{w_{i1}, \ldots, w_{in}\}$, is the row vector of node $v_i$ and represents its connection pattern. The compressed graph after SEG is denoted as $G^* = (V^*, E^*)$. The compression rate is defined as $\Gamma = 1 - |V^*|/|V|$ for nodes and $1 - |E^*|/|E|$ for edges, and is used in Section IV.

The basic SEG algorithm treats the graph as simple, undirected and unweighted by setting $w_{ii} = 0$ and $w_{ij} = w_{ji} = 1$ for any $w_{ij} > 0$. On graph $G$, any collection of nodes with the same row vector (including a single outstanding node) is aggregated into a new mega-node/single-node $G_{v_i} = \{v_{i1}, \ldots, v_{ik}\}$. All $G_{v_i}$ form the node set $V^*$ of the compressed graph $G^*$. Let $f_{v_i} = v_{i1}$ denote the first sub-node in $G_{v_i}$. The link set $E^*$ in $G^*$ is generated by replacing all $f_{v_i}$ with $G_{v_i}$ in the original link set and removing all the links not incident to any $f_{v_i}$. SEG is single-pass on any graph, as any two nodes in the compressed graph have different row vectors.

The SEG algorithm supports both directed and weighted graphs. The adjacency matrix $W$ can be transformed to encode bidirectional connections for each node: each row vector $R_i$ ($i = 1, \ldots, n$) becomes $R_i = \{w_{i(-n)}, \ldots, w_{i(-1)}, w_{i1}, \ldots, w_{in}\}$, with $w_{i(-j)} = w_{ji}$ for $j = 1, \ldots, n$. In addition, $W$ can be switched to a weighted matrix by mapping a numeric data attribute of link $(i, j)$ (e.g., flow count in the traffic graph) to $w_{ij}$. To further increase the compression rate, discretization of the link weight is allowed: transform all link weights into $w_{ij} \in (0, 1]$ by either linear or non-linear normalization, then pick a bin count $B$ ($B \geq 1$) and regenerate the link weights by $w_{ij} = \lceil w_{ij} \times B \rceil$.
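The directed and weighted extensions can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the sparse representation of a row vector as a set of (cell, weight) pairs, the `("in", i)` key standing for the negative index $-i$, and the choice of linear normalization by the maximum weight are all assumptions made for illustration.

```python
import math

def discretize(weights, bins):
    """Map positive link weights to integer bins 1..B.

    Assumption: linear normalization by the maximum weight, so that
    w / w_max lies in (0, 1] before the ceiling is applied.
    """
    w_max = max(weights.values())
    return {edge: math.ceil(w * bins / w_max) for edge, w in weights.items()}

def row_vectors(weights, directed=True):
    """Build the sparse row vector R_i of every node.

    `weights` maps (i, j) to w_ij. Each row is a frozenset of (cell, weight)
    pairs so that rows can be compared and hashed directly.
    """
    rows = {}
    for (i, j), w in weights.items():
        rows.setdefault(i, set()).add((j, w))       # outgoing cell w_ij
        rows.setdefault(j, set())
        if directed:
            rows[j].add((("in", i), w))             # incoming cell w_j(-i) = w_ij
        else:
            rows[j].add((i, w))                     # symmetric cell w_ji = w_ij
    return {node: frozenset(cells) for node, cells in rows.items()}

# Hypothetical flow counts: hosts a and b both talk to c, with slightly different volumes.
flows = {("a", "c"): 10, ("b", "c"): 9, ("c", "d"): 1}
binned = discretize(flows, bins=2)
rows = row_vectors(binned)
print(rows["a"] == rows["b"])   # a and b share a row vector once weights are binned
```

With the raw counts the two hosts differ (10 versus 9), so a coarser bin count trades weight precision for a higher compression rate.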

With the basic SEG algorithm, the sub-nodes within a mega-node do not have any intra-group link. This ensures zero connectivity loss in the compressed graph. However, it is also useful to group a clique of nodes with the same external connections. Specific rendering can be applied to differentiate mega-nodes with isolated sub-nodes from those with fully connected sub-nodes. While we may set $w_{ii} = 1$ for diagonal cells, such a straightforward method prohibits the grouping of isolated nodes. To achieve optimal SEG performance, we devise a two-step approach. In the first step, the graph adjacency matrix $W$ is set to $w_{ii} = 0$, allowing the grouping of isolated nodes. In the second step, $W$ is reset to $w_{ii} = 1$ and all the original nodes not aggregated in the first step are grouped again.

A. Fuzzy Structural Equivalence Grouping

SEG is a deterministic algorithm: for the same original graph, it always produces the same compressed graph. In a real scenario of interactive visualization, users may want the flexibility to control the compression rate, trading off visual complexity against precision. To that end, we also develop a Fuzzy Structural Equivalence Grouping (FSEG) algorithm. The basic idea of FSEG is to group nodes whose neighbor sets are similar but not exactly the same. The compression rate can be increased with a bounded loss of accuracy. The key is to define the pairwise similarity score between graph nodes. Here we adopt the standard Jaccard coefficient between two sample sets $A$ and $B$ as the similarity measure, i.e., $J(A, B) = |A \cap B| / |A \cup B|$. For directed and weighted graphs, we introduce a unified Jaccard similarity between nodes $v_i$ and $v_j$ in graph $G$:

$$\rho = \frac{\sum_{\forall k} \min(w_{ik}, w_{jk})}{\sum_{\forall k} \max(w_{ik}, w_{jk})}$$

Note that for a directed graph, $k = -n, \ldots, -1, 1, \ldots, n$. FSEG is achieved by setting a similarity threshold $\xi$. Pairs of nodes with $\rho \geq \xi$ are grouped iteratively.
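The similarity score and the threshold-based grouping can be sketched as follows. This is an illustrative reading rather than the authors' code: rows are assumed to be dictionaries mapping a cell index $k$ to a positive weight, two empty rows are treated as fully similar, and groups are formed greedily around anchor nodes in scan order (the anchor scheme described in Section III-B).

```python
def similarity(row_i, row_j):
    """Unified Jaccard score: sum of min weights over sum of max weights."""
    keys = set(row_i) | set(row_j)
    num = sum(min(row_i.get(k, 0), row_j.get(k, 0)) for k in keys)
    den = sum(max(row_i.get(k, 0), row_j.get(k, 0)) for k in keys)
    return num / den if den else 1.0    # assumption: two empty rows count as identical

def fuzzy_groups(rows, xi):
    """Greedy FSEG sketch: every ungrouped node becomes an anchor and
    absorbs all still-ungrouped nodes whose similarity to it is >= xi."""
    groups, assigned = [], set()
    for anchor in rows:
        if anchor in assigned:
            continue
        group = [anchor]
        assigned.add(anchor)
        for other in rows:
            if other not in assigned and similarity(rows[anchor], rows[other]) >= xi:
                group.append(other)
                assigned.add(other)
        groups.append(group)
    return groups

# Hypothetical unweighted rows: a and b share two of three neighbors.
rows = {
    "a": {"x": 1, "y": 1, "z": 1},
    "b": {"x": 1, "y": 1},
    "c": {"q": 1},
}
print(fuzzy_groups(rows, xi=0.5))   # a and b fall into one group
print(fuzzy_groups(rows, xi=1.0))   # exact matching keeps every node separate
```

For unweighted rows the score reduces to the standard Jaccard coefficient, and a threshold of 1 reproduces the exact grouping of the basic SEG. This pairwise version compares each anchor with every remaining node, so it is quadratic in the worst case.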

In contrast to FSEG, level-of-detail (LOD) control allows users to access more details beyond the fully compressed graph, at the cost of a lower compression rate. The major gain is that it maintains the users' mental map of a certain kind of graph topology. LOD is achieved after the SEG algorithm by re-splitting each aggregated node into smaller mega-nodes of the same size. By default, the SEG compression level without LOD is set to 100%. With a compression level of $\beta$, each mega-node containing $s$ sub-nodes is partitioned into $k$ smaller mega-nodes, where $k = \lfloor 1 + (s - 1) \times (1 - \beta) \rfloor$. For example, a mega-node with $s = 11$ sub-nodes is split into $k = 3$ mega-nodes at $\beta = 0.8$, and stays whole ($k = 1$) at $\beta = 1$.

B. Implementation

The core step of SEG, grouping nodes with the same row vector, is achieved through an appropriate hash function $H(R_i)$ over the row vector identifiers, combined with a hash collision resolution mechanism. The row vector identifier is created by splicing the positive cells in the row into a string. Each positive cell is represented by the destination node ID and the link weight (1 for unweighted graphs). The hash-based implementation has a computational complexity of $O(ND)$, where $N$ is the number of nodes in the original graph and $D$ is the average node degree, which bounds the cost of splicing and hashing each row vector identifier. Since the graph is stored in a sparse format by adjacency lists and most traffic graphs are sparse, $D$ is a bounded constant. The hash-based implementation of the basic SEG therefore has linear complexity and performs well even for very large graphs.
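A minimal sketch of this hash-based grouping, assuming an undirected graph stored as a symmetric adjacency dictionary `{node: {neighbor: weight}}`. The function name, the `neighbor:weight` cell format and the `|` separator are illustrative choices, and node IDs are assumed not to contain those delimiter characters. Python's built-in dictionary plays the role of the hash table with collision resolution.

```python
from collections import defaultdict

def seg_compress(adjacency):
    """Group nodes with identical row vectors; return (groups, compressed edges)."""
    buckets = defaultdict(list)
    for node, neighbors in adjacency.items():
        # Row identifier: the positive cells spliced into one string, in a fixed order.
        identifier = "|".join(f"{n}:{w}" for n, w in sorted(neighbors.items()))
        buckets[identifier].append(node)

    groups = list(buckets.values())
    group_of = {node: idx for idx, members in enumerate(groups) for node in members}

    edges = set()
    for members in groups:
        first = members[0]                  # the first sub-node stands for its group
        for neighbor in adjacency[first]:
            edges.add(frozenset((group_of[first], group_of[neighbor])))
    return groups, edges

# Hypothetical graph: hub h with three leaves a, b, c, plus a chain h - d - e.
graph = {
    "h": {"a": 1, "b": 1, "c": 1, "d": 1},
    "a": {"h": 1}, "b": {"h": 1}, "c": {"h": 1},
    "d": {"h": 1, "e": 1},
    "e": {"d": 1},
}
groups, edges = seg_compress(graph)
print(groups)       # the three leaves share one mega-node
print(len(edges))   # compressed edge count
```

Sorting the cells of each row gives a canonical identifier at the cost of a logarithmic factor in the node degree; a representation that already keeps adjacency lists in a fixed order would avoid that step.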

For the shingle-ordering [30] based FSEG implementation, for each row vector $R_i$, construct the element set $A_i = \{a \mid w_{ia} = 1\}$. Given any permutation $\sigma: \{1, \ldots, n\} \rightarrow \{1, \ldots, n\}$, the shingle of the row vector $R_i$ is defined by $M_\sigma(A_i) = \sigma^{-1}(\min_{\alpha \in A_i} \{\sigma(\alpha)\})$. By the shingle property, the probability that the shingles of sets $A$ and $B$ are identical is precisely their Jaccard coefficient $J(A, B)$. The corresponding FSEG algorithm still works in a greedy manner, scanning the nodes in order. Each newly scanned node without a group is selected as an anchor node, and a new group is created around the anchor node with the other nodes of sufficient similarity. With the shingle-based method, we do not need to compare pairwise similarity. Instead, we pre-compute and record $k$ shingles for each node's neighbor set, using $k$ orthogonal permutation functions. At most $kN$ lists are created, each holding the nodes with the same shingle result ($1 \sim N$) under one permutation function. Then, for each anchor node, we can get the set of sufficiently similar nodes by scanning the $k$ corresponding lists, according to the shingle property. The overall complexity is $O(kND + kN \cdot D^2)$, where $kND$ is the complexity to pre-compute the shingles and $D^2$ is the average size of the 2-hop neighbor set of each node, which is also the upper bound of the average length of each list to scan. In most graphs, the complexity is significantly below $N^2$.
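The shingle computation can be sketched in a few lines of Python. The sketch assumes seeded random permutations in place of the paper's $k$ orthogonal permutation functions, non-empty neighbor sets, and hypothetical function names; it shows how shingles are computed and bucketed, not the full greedy FSEG pass.

```python
import random
from collections import defaultdict

def make_permutations(nodes, k, seed=0):
    """Draw k random permutations of the node IDs, stored as rank tables."""
    rng = random.Random(seed)
    perms = []
    for _ in range(k):
        order = list(nodes)
        rng.shuffle(order)
        perms.append({node: rank for rank, node in enumerate(order)})
    return perms

def shingles(neighbor_set, perms):
    """One shingle per permutation: the neighbor with the smallest rank."""
    return tuple(min(neighbor_set, key=perm.__getitem__) for perm in perms)

def estimate_jaccard(set_a, set_b, perms):
    """Fraction of permutations under which both sets yield the same shingle."""
    sa, sb = shingles(set_a, perms), shingles(set_b, perms)
    return sum(x == y for x, y in zip(sa, sb)) / len(perms)

def shingle_lists(neighbors, perms):
    """Bucket nodes by (permutation index, shingle): at most k*N lists."""
    lists = defaultdict(list)
    for node, nbrs in neighbors.items():
        for p, s in enumerate(shingles(nbrs, perms)):
            lists[(p, s)].append(node)
    return lists

perms = make_permutations(range(20), k=16, seed=1)
neighbors = {"u": {1, 2, 3}, "v": {1, 2, 3}, "w": {7, 8}}
lists = shingle_lists(neighbors, perms)
print(estimate_jaccard(neighbors["u"], neighbors["v"], perms))  # identical sets always agree
print(estimate_jaccard(neighbors["u"], neighbors["w"], perms))  # disjoint sets never agree
```

For an anchor node, the candidates to test are the nodes that share at least one of its $k$ lists; nodes with disjoint neighbor sets never appear in the same list, which is what removes the all-pairs comparison.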

TABLE I

SEG PERFORMANCE ON VAST11 CHALLENGE DATASET.

| Data | Edges (before) | Edges (after) | Rate (edges) | SEG time (s) | Layout time before (s) | Layout time after (s) |
|---|---|---|---|---|---|---|
| undirected, sim=1 | 1613 | 50 | 96.9% | 0.007 | 0.242 | 0.084 |
| undirected, sim=0.8 | 1613 | 39 | 97.6% | 0.012 | 0.242 | 0.088 |
| undirected, sim=0.5 | 1613 | 23 | 98.6% | 0.006 | 0.242 | 0.079 |
| directed, sim=1 | 1613 | 82 | 94.9% | 0.005 | 0.242 | 0.084 |

TABLE II

SEG PERFORMANCE ON VAST12 CHALLENGE DATASET.

| Data | Edges (before) | Edges (after) | Rate (edges) | SEG time (s) | Layout time before (s) | Layout time after (s) |
|---|---|---|---|---|---|---|
| undirected, sim=1 | 48599 | 28 | 99.9% | 0.437 | 3.151 | 0.078 |
| undirected, sim=0.8 | 48599 | 19 | 99.9% | 0.515 | 3.151 | 0.062 |
| directed, sim=1 | 48599 | 1022 | 97.9% | 0.328 | 3.151 | 0.125 |
| directed, sim=0.8 | 48599 | 403 | 99.2% | 0.374 | 3.151 | 0.078 |

IV. EVALUATION

We evaluate the effectiveness of SEG and FSEG through both experiments on compression performance and a case study on real-world network data.

A. Compression Rate

The performance of SEG and FSEG is measured in terms of the visual compression rate (by the number of edges), the compression time, and the layout time before and after compression. All the experiments are carried out on the same 64-bit Windows desktop (Intel Core [email protected] with 8GB memory). Tables I–III show the results on three graph datasets. Notably, on the VAST11 and VAST12 challenge datasets, SEG achieves consistently high compression rates (above 90%) with either the basic algorithm or a fuzzy setting. The compression time is mostly below 0.5 second on graphs with up to the order of $10^4$ edges. The layout time after SEG is significantly reduced from that of the original graph.
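As a quick arithmetic check of how the reported rates follow from the definition of $\Gamma$ in Section III, the short sketch below recomputes the edge compression rate from the edge counts listed in Tables I and II. The function name is illustrative.

```python
def compression_rate(before, after):
    """Gamma = 1 - |E*| / |E| (the same ratio applies to node counts)."""
    return 1.0 - after / before

# (dataset, edges before, edges after) for the basic undirected SEG, sim=1.
reported = [("VAST11", 1613, 50), ("VAST12", 48599, 28)]
for name, before, after in reported:
    rate = compression_rate(before, after)
    print(f"{name}: {rate:.1%} edge compression, {before / after:.0f}x fewer edges")
```

Both datasets shrink by well over the factor of 20 quoted in the abstract.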

We also experiment on one type of extreme-scale graph for the purpose of a scalability study: Twitter social graphs with up to millions of edges (Table IV). The basic SEG compresses the largest graph in half a minute, while FSEG with the shingle implementation supports graphs with up to $10^5$ edges and returns results in only 25 seconds.

TABLE III

SEG PERFORMANCE ON DATACENTER DATASET.

| Data | Edges (before) | Edges (after) | Rate (edges) | SEG time (s) | Layout time before (s) | Layout time after (s) |
|---|---|---|---|---|---|---|
| undirected | 18347 | 2626 | 85.7% | 0.538 | 4.885 | 0.402 |
| undirected, weighted, #bin=10 | 18347 | 2636 | 85.6% | 0.132 | 4.885 | 0.37 |
| undirected, compress=0.8 | 18347 | 5428 | 70.4% | 0.151 | 4.885 | 0.427 |
| undirected, sim=0.5 | 18347 | 1510 | 91.8% | 0.364 | 4.885 | 0.336 |

TABLE IV

SEG PERFORMANCE ON TWITTER SOCIAL NETWORKS.

| Data | Edges (before) | Edges (after) | Rate (edges) | SEG time (s) |
|---|---|---|---|---|
| mention, undirected | 122976 | 66858 | 45.6% | 2.074 |
| mention, sim=0.5, anchor | 122976 | 63089 | 48.7% | 370.121 |
| mention, sim=0.5, shingle | 122976 | 59873 | 51.3% | 24.92 |


Fig. 4. AFC corporate network traffic overviews: (a) original; (b) compressed.

B. Case Study and Discussion

In this section, we first describe the NSAV tool followed by a case study. The NSAV tool reads and automatically correlates several standard network traffic and management data:

• Netflow: the industry standard in network management to collect and monitor network activities. Each Netflow record contains pairwise connections characterized by source and destination IP addresses and port numbers, time, protocols, etc.

• Intrusion Detection System (IDS): e.g., Snort.

• Acceptable Use Policy (AUP): serving as a good-rule set, i.e., activities/services allowed within an enterprise network.

• OS Log: operating system events such as invalid logon attempts, file operations, etc.
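To connect these inputs to the graph model of Section III, the sketch below shows one way flow records could be aggregated into the weighted link set, with the flow count used as $w_{ij}$. The record layout (source IP, destination IP, followed by other fields) is a hypothetical simplification and not the NSAV tool's actual input format.

```python
from collections import Counter

def flows_to_weights(flows):
    """Count flows per (source IP, destination IP) pair; the count becomes w_ij."""
    weights = Counter()
    for src, dst, *_ in flows:      # ports, time and protocol are ignored in this sketch
        weights[(src, dst)] += 1
    return dict(weights)

# Hypothetical records: (src IP, dst IP, src port, dst port, protocol).
records = [
    ("192.168.2.174", "192.168.1.10", 40000, 22, "tcp"),
    ("192.168.2.174", "192.168.1.10", 40001, 23, "tcp"),
    ("192.168.1.2", "192.168.2.23", 53, 5353, "udp"),
]
print(flows_to_weights(records))
```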

The SEG/FSEG-based compressed graph functions as the major view in the central panel of the NSAV tool (Figure 2). Meanwhile, there are several other panels illustrating multi-facets of the network traffic.

• Graph Node/Edge Filtering and Selection: In the lower-left corner of the control panel (Figure 2), the user can filter the graph according to various criteria of the importance scores of nodes and edges (the “#Node” and “#Edge” tabs). Individual nodes can also be quickly selected from the “List” tab.

• Anomaly Panel: There are three sections in the right panel: 1) a list of all the anomaly types that occurred during the specified time range (top panel); 2) mappings between anomaly icons and their anomaly types (middle panel); and 3) anomaly events as timeline plots (bottom panel). Orange and gray colors represent the source and destination, respectively, of an anomalous connection.

• Time Range Selector (double-end slider at the bottom).

We illustrate the effectiveness of SEG in network analysis and situation awareness. The SEG/FSEG-based NSAV tool is applied to the VAST 2011 Mini Challenge-II dataset, which includes the computer network of All Freight Corporation (AFC). Suppose John, the AFC network administrator, is checking the corporate network status over the most recent three days for noteworthy events. He starts by loading the whole network traffic in this period, as shown in Figure 4(a). Because the original network view is too messy, he continues by applying the basic SEG to create a compressed traffic graph, as shown in Figure 4(b). From this graph, he quickly learns some key hosts in the period (e.g., 1.2, 1.141), but still feels a little overwhelmed by the amount of detail that he has to remember. He further simplifies the graph by using FSEG with a similarity score of 0.5. The resulting graph in Figure 2 is clear enough for his overview purpose: the hosts in the central group (1.2, 1.6, 1.14, 2.171-173) and 2.174/175 are all hub nodes.

Fig. 5. The hosts with security holes and the cross-subnet port scans from 192.168.2.174/175.

Fig. 6. DoS attacks against 172.20.1.5 (corporate web server) from 10.200.150.201, 206-209.

Based on the AFC network structure, John bypasses three server machines (1.2, 1.6, 1.14), which routinely communicate with all the hosts for DNS/data services. Based on his domain knowledge, the suspicious behaviors of a hub node, e.g., port scans, are often associated with OS security holes. So John clicks on this anomaly type, which automatically highlights all the hosts with such an anomaly on the graph. To drill down to individual hosts, he splits the fuzzy group and locates 2.171-175 as the threats. He finds that 2.174 and 2.175 are more dangerous due to the higher port scan rate (indicated by the link width) and the cross-subnet floods to 1.10-250, where many hosts do not exist. The screenshot with anomaly views of 2.174/175 is given in Figure 5.

The next critical machine that John examines is the AFC’s external web server (172.20.1.5). With the SEG-compressed graph, this server is easily located by its unique connection patterns. A single click on the host shows up a noteworthy anomaly icon (I) in the morning of the first day, suggesting that there could be Denial-of-Service (DoS) attacks through SIP. John further drills down to that period with the time range selector and highlights the web server’s egocentric traffic graph. Figure 6 confirms the potential DoS attacks from the external hosts 10.200.150.201, 206-209 due to the fact that the anomalies happen simultaneously with the web server.

With the above case study, the SEG-based NSAV tool shows advantage in terms of significantly reducing the complexity of visual analytics of large network traffic graphs. The accuracy of SEG is generally better in understanding the topology. Its performance time is also quicker in drilling down to the details.

V. CONCLUSION

The increasing size of today's large networks poses a big challenge for network researchers and operators who need to analyze and understand the complex relationships in such big graphs. In this paper, we propose using the Structural Equivalence Grouping (SEG) method for better scalability in large network analysis. As a valuable supplement to proximity-based community detection, SEG may offer meaningful methods for grouping nodes in heterogeneous network environments. The network traffic visualization tool developed on SEG and fuzzy SEG provides an interactive option for users to explore the various tradeoffs between level of detail and human cognitive limitations. It is shown that SEG can effectively reduce the visual complexity of real-world networks while still preserving critical topological features of the original graph. Results show that the proposed visual analytic method is effective for general graph analysis and may potentially have a profound impact on many large networks, such as computer and communication networks and social networks. In the future, we would like to extend SEG to support time-varying network graphs.

REFERENCES

[1] E. R. Gansner, Y. Koren, and S. North, “Graph drawing by stress majorization,” in Graph Drawing, 2004.

[2] P. Gajer and S. G. Kobourov, “GRIP: Graph drawing with intelligent placement,” Journal of Graph Algorithms and Applications, vol. 6, no. 3, pp. 203–224, 2002.

[3] Y. Hu, “Efficient and high quality force-directed graph drawing,” Mathematica Journal, vol. 10, no. 1, pp. 37–71, 2005.

[4] D. Archambault, T. Munzner, and D. Auber, “Multi-level graph layout by topological features,” IEEE Trans. Visual Comput. Graphics, vol. 13, no. 2, pp. 305–317, 2007.

[5] Y. Koren, L. Carmel, and D. Harel, “ACE: A fast multiscale eigenvector computation for drawing huge graphs,” in InfoVis, 2002, pp. 137–144.

[6] D. Harel and Y. Koren, “Graph drawing by high-dimensional embedding,” Journal of Graph Algorithms and Applications, vol. 8, no. 2, pp. 195–214, 2004.

[7] M. E. J. Newman, “Fast algorithm for detecting community structure in networks,” Physical Review E, vol. 69, no. 6, p. 066133, Jun 2004.

[8] A. Quigley and P. Eades, “FADE: Graph drawing, clustering and visual abstraction,” in Graph Drawing, 2000, pp. 197–210.

[9] D. Auber, Y. Chiricota, F. Jourdan, and G. Melancon, “Multiscale visualization of small world networks,” in InfoVis, 2003, pp. 75–81.

[10] J. Abello, F. van Ham, and N. Krishnan, “ASK-GraphView: A large scale graph visualization system,” IEEE Trans. Visual Comput. Graphics, vol. 12, no. 5, pp. 669–676, 2006.

[11] D. Archambault, T. Munzner, and D. Auber, “Tugging graphs faster: Efficiently modifying path-preserving hierarchies for browsing paths,” IEEE Trans. Visual Comput. Graphics, vol. 17, no. 3, pp. 276–289, 2011.

[12] Y. Sun, Y. Yu, and J. Han, “Ranking-based clustering of heterogeneous information networks with star network schema,” in Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining, Paris, France, June 28-July 1 2009, pp. 797–806.

[13] Q. Liao, A. Blaich, A. Striegel, and D. Thain, “ENAVis: Enterprise network activities visualization,” in Proceedings of the USENIX 22nd Large Installation System Administration Conference (LISA ’08), San Diego, CA, November 9-14 2008, pp. 59–74.

[14] Z. Shen, K.-L. Ma, and T. Eliassi-Rad, “Visual analysis of large heterogeneous social networks by semantic and structural abstraction,” IEEE Transactions on Visualization and Computer Graphics (TVCG), vol. 12, no. 6, pp. 1427–1439, November 2006.

[15] F. Lorrain and H. C. White, “Structural equivalence of individuals in social networks,” The Journal of Mathematical Sociology, vol. 1, no. 1, pp. 49–80, 1971.

[16] D. R. White and K. P. Reitz, “Graph and semigroup homomorphisms on networks of relations,” Social Networks, vol. 5, no. 2, pp. 193–234, 1983.

[17] Y. Jia, J. Hoberock, M. Garland, and J. C. Hart, “On the visualization of social and other scale-free networks,” IEEE Trans. Visual Comput. Graphics, vol. 14, no. 6, pp. 1285–1292, 2008.

[18] F. van Ham and M. Wattenberg, “Centrality based visualization of small world graphs,” Computer Graphics Forum, vol. 27, no. 3, pp. 975–982, 2008.

[19] D. Holten, “Hierarchical edge bundles: Visualization of adjacency relations in hierarchical data,” IEEE Trans. Visual Comput. Graphics, vol. 12, no. 5, pp. 741–748, 2006.

[20] W. Cui, H. Zhou, H. Qu, P. C. Wong, and X. Li, “Geometry-based edge clustering for graph visualization,” IEEE Trans. Visual Comput. Graphics, vol. 14, no. 6, pp. 1277–1284, 2008.

[21] J. Lamping, R. Rao, and P. Pirolli, “A focus+context technique based on hyperbolic geometry for visualizing large hierarchies,” in CHI, 1995, pp. 401–408.

[22] T. Munzner, “H3: Laying out large directed graphs in 3d hyperbolic space,” in InfoVis, 1997, pp. 2–10.

[23] E. Gansner, Y. Koren, and S. North, “Topological fisheye views for visualizing large graphs,” IEEE Trans. Visual Comput. Graphics, vol. 11, no. 4, pp. 457–468, 2005.

[24] S. P. Borgatti and M. G. Everett, “Notions of position in social network analysis,” Sociol. methodol., vol. 22, pp. 1–35, 1992.

[25] L. D. Sailer, “Structural equivalence: Meaning and definition, computation and application,” Social Networks, vol. 1, no. 1, pp. 73–90, 1978.

[26] C. Papadopoulos and C. Voglis, “Drawing graphs using modular decomposition,” Journal of Graph Algorithms and Applications, vol. 11, no. 2, pp. 481–511, 2007.

[27] F. van Ham, M. Wattenberg, and F. B. Viegas, “Mapping text with phrase nets,” IEEE Trans. Visual Comput. Graphics, vol. 15, no. 6, pp. 1169–1176, 2009.

[28] B. Shneiderman and C. Dunne, “Interactive network exploration to derive insights: Filtering, clustering, grouping, and simplification,” in Graph Drawing, 2012.

[29] T. von Landesberger, A. Kuijper, T. Schreck, J. Kohlhammer, J. J. van Wijk, J.-D. Fekete, and D. W. Fellner, “Visual analysis of large graphs,” EuroGraphics - State of the Art Report, pp. 37–60, 2010.

[30] F. Chierichetti, R. Kumar, and S. Lattanzi, “On compressing social networks,” in KDD, 2009.