3.3 Visualisation Techniques
3.3.1 Social Network Visualisation
A “social network” is defined as a system of agents that interact with each other and have some pattern of contact between them [52, 54]. Social network visualisation is a computational technique that generates images of these social networks, based on the connections and relationships that exist between individual agents. The technique originated in sociology [69] and mathematical graph theory [52], where the idea was developed of using points and lines (called “graphs” or “social network graphs”) to represent systems of agents and their connections. The purpose of the points-and-lines representation is to aid in visualising relational data, which describe how individual agents relate to each other in the social network.
[Diagram labels: Node (individual); Edge (connection/relationship)]
Figure 3.5: Simple diagram showing the components of a social network graph.
Social network visualisation is often used for analysing relational information because its graph representation is easy to understand. The diagram in Figure 3.5 shows the basic components of a social network graph. Each “Node” represents an individual and each “Edge” represents a connection or relationship between two particular individuals. Edges can be drawn with arrows to indicate a one-way (directional) relationship, and with different line thicknesses to indicate the strength of a relationship. For example, a thin line indicates a weak relationship, while a thick line indicates a strong relationship. This type of visual representation enables the user to quickly perceive and understand the network of connected individuals, which is one of the useful features of social network visualisation.
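The components described above can be sketched as a small data structure. This is only an illustrative sketch: the `Edge` class, the `line_width` helper, the pixel range, and the example names are assumptions made for this example and are not taken from any particular visualisation tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Edge:
    """A connection between two individuals (hypothetical structure)."""
    source: str
    target: str
    weight: int = 1          # strength of the relationship
    directed: bool = False   # True would be drawn with an arrow


def line_width(weight, max_weight, min_px=1.0, max_px=6.0):
    """Scale a relationship strength linearly to a line thickness in pixels."""
    if max_weight <= 0:
        return min_px
    return min_px + (max_px - min_px) * (weight / max_weight)


# Nodes are individuals; edges are the relationships between them.
nodes = {"alice", "bob", "carol"}
edges = [
    Edge("alice", "bob", weight=10, directed=True),  # strong, one-way
    Edge("bob", "carol", weight=2),                  # weak, two-way
]

# The strongest relationship gets the thickest line.
max_w = max(e.weight for e in edges)
widths = {(e.source, e.target): line_width(e.weight, max_w) for e in edges}
```

A linear scale is the simplest choice here; a logarithmic scale may be preferable when relationship strengths span several orders of magnitude.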
Analysing The Social Networks In E-mail Traffic
The purpose of using social network visualisation for e-mail traffic analysis is to present the user with an overview of the communication ties between particular e-mail accounts. E-mail traffic log data such as that in Table 3.1 is difficult for the user/analyst to understand when examining it for connections between different e-mail accounts. Social network visualisation helps the user/analyst gain an overview of the social network of e-mail users, so that the communication ties and relationships between particular e-mail accounts can be understood. It also enables the user/analyst to spot areas of interest in the e-mail social network, such as the clustering of e-mail users into distinct social groups or communities [43, 44, 70].
[Flow diagram: E-mail Traffic Data -> Transformation to Relational Data -> Mapping to Visual Representation -> Output onto Graphical Display -> Exploration of Data by the User. Additional labels: Social Network Visualisation; Feedback Parameters from the User.]
Figure 3.6: The social network visualisation process.
To convert e-mail traffic data into social network graphs, the data is processed using the steps shown in Figure 3.6. Firstly, the e-mail traffic data is transformed into relational data, such as that shown in Table 3.2. This relational data is extracted from the traffic data and describes the connections between e-mail accounts. It may also record the number of e-mails sent between e-mail accounts, which can be used to indicate the strength of the relationship. The relational data is then computationally mapped into its visual representation and output onto a graphical display. When creating the visual representation, social network visualisation also applies a layout algorithm (e.g. multidimensional scaling [71, 72] or a spring embedder algorithm [73, 74]) that organises the points and lines in the image, so that the graph is automatically arranged in a manner that is convenient for the user to look at and analyse. The final output is a social network graph of the e-mail traffic data, such as the one shown in Figure 3.7, which was created using a program called GUESS [75].
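The first step, transforming traffic data into relational data, can be sketched as a simple aggregation. The record layout used here, a (sender, recipient, timestamp) tuple, is an assumption made for illustration, as are the function name `to_relational` and the example addresses; the actual log format is the one given in Table 3.1.

```python
from collections import Counter


def to_relational(log_records):
    """Count the e-mails sent for each (sender, recipient) pair.

    Each record is assumed to be a (sender, recipient, timestamp) tuple.
    The timestamp is ignored, so the result carries no temporal detail.
    """
    counts = Counter()
    for sender, recipient, _timestamp in log_records:
        counts[(sender, recipient)] += 1
    # One row per pair: account, associate, number of messages sent.
    return [(s, r, n) for (s, r), n in sorted(counts.items())]


# Hypothetical traffic log entries.
log = [
    ("alice@example.org", "bob@example.org", "2008-01-07 09:15"),
    ("alice@example.org", "bob@example.org", "2008-01-07 11:40"),
    ("bob@example.org", "alice@example.org", "2008-01-08 08:02"),
    ("alice@example.org", "carol@example.org", "2008-01-09 16:30"),
]

rows = to_relational(log)
```

Each resulting row has the same three columns as Table 3.2, and the message count can serve directly as the edge weight.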
This process shows how social network visualisation is useful for providing the user/analyst with a visual representation of e-mail traffic data and an overview of the connections between e-mail accounts.
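The layout step in this process can be illustrated with a simplified force-directed (spring-style) layout: every pair of nodes repels, connected nodes attract, and a cooling "temperature" limits how far a node may move per iteration. This is a minimal sketch written for this example; it is not the specific algorithm of [73, 74], nor the layout used by GUESS. The function name, parameters, and default values are all assumptions.

```python
import math
import random


def spring_layout(nodes, edges, iterations=200, size=1.0, seed=0):
    """Return {node: (x, y)} using a simplified force-directed layout."""
    nodes = list(nodes)
    if not nodes:
        return {}
    rng = random.Random(seed)
    pos = {v: [rng.uniform(0, size), rng.uniform(0, size)] for v in nodes}
    k = size / math.sqrt(len(nodes))  # preferred distance between nodes
    temp = size / 10                  # maximum movement per iteration

    def push(disp, u, v, attract):
        # Apply a force along the line u-v to both endpoints.
        dx = pos[u][0] - pos[v][0]
        dy = pos[u][1] - pos[v][1]
        dist = math.hypot(dx, dy) or 1e-9
        force = -(dist * dist / k) if attract else (k * k / dist)
        fx, fy = dx / dist * force, dy / dist * force
        disp[u][0] += fx
        disp[u][1] += fy
        disp[v][0] -= fx
        disp[v][1] -= fy

    for _ in range(iterations):
        disp = {v: [0.0, 0.0] for v in nodes}
        for i, u in enumerate(nodes):      # repulsion between all pairs
            for v in nodes[i + 1:]:
                push(disp, u, v, attract=False)
        for u, v in edges:                 # attraction along edges
            push(disp, u, v, attract=True)
        for v in nodes:                    # move, capped by temperature
            dx, dy = disp[v]
            length = math.hypot(dx, dy) or 1e-9
            step = min(length, temp)
            pos[v][0] += dx / length * step
            pos[v][1] += dy / length * step
        temp *= 0.95                       # cool down so the layout settles
    return {v: (p[0], p[1]) for v, p in pos.items()}


layout = spring_layout(["a", "b", "c", "d"], [("a", "b"), ("b", "c"), ("c", "d")])
```

The repulsion loop compares every pair of nodes, so the cost per iteration grows quadratically with the number of nodes. Production tools typically use more refined variants.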
Table 3.2: Example of relational data extracted from e-mail traffic data.
E-mail Account Address   Associate Address    Number of E-mail Messages Sent
[email protected]        [email protected]    170
[email protected]        [email protected]    186
[email protected]        [email protected]    105
[email protected]        [email protected]    104
[email protected]        [email protected]    95
[email protected]        [email protected]    36
[email protected]        [email protected]    93
[email protected]        [email protected]    63
[email protected]        [email protected]    104
[email protected]        [email protected]    213
[email protected]        [email protected]    26
[email protected]        [email protected]    145
[email protected]        [email protected]    126
[email protected]        [email protected]    37
[email protected]        [email protected]    92
[email protected]        [email protected]    62
[email protected]        [email protected]    33
[email protected]        [email protected]    62
[email protected]        [email protected]    16
[email protected]        [email protected]    30
[email protected]        [email protected]    23
[email protected]        [email protected]    32
[email protected]        [email protected]    216
[email protected]        [email protected]    108
[email protected]        [email protected]    175
[email protected]        [email protected]    138
[email protected]        [email protected]    128
[email protected]        [email protected]    64
[email protected]        [email protected]    148
[email protected]        [email protected]    188
[email protected]        [email protected]    103
[email protected]        [email protected]    136
Figure 3.7: Social network visualisation of data from Table 3.2.
Limitations of Social Network Visualisation
While social network visualisation is useful for visualising the communication ties and relationships between e-mail accounts, it has a number of limitations. Firstly, when the number of individuals visualised becomes large (e.g. more than 200 individuals), it becomes increasingly difficult for the user to perceive the distinct communication ties and relationships between particular individuals. As the number of individuals increases, the number of visualised connections also increases, resulting in a social network graph that is crowded with communication ties. An example of this is shown in Figure 3.8. This crowding makes it difficult for the user to locate and investigate e-mail accounts that may be exhibiting unusual communication behaviour.
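A short worked calculation helps show why crowding sets in so quickly. It gives only an upper bound, assuming every pair of individuals could be connected and counting each pair once; real e-mail networks are usually much sparser. The symbols $n$ and $E_{\max}$ are introduced here for illustration.

```latex
% Maximum number of distinct ties among n individuals (undirected pairs)
E_{\max}(n) = \binom{n}{2} = \frac{n(n-1)}{2}

% n = 200 (the threshold mentioned above)
E_{\max}(200) = \frac{200 \times 199}{2} = 19\,900

% n = 355 (the network in Figure 3.8)
E_{\max}(355) = \frac{355 \times 354}{2} = 62\,835
```

The number of possible ties therefore grows roughly with the square of the number of individuals, and it doubles again if each direction of communication is drawn as a separate arrow.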
Figure 3.8: Example of a large network of 355 e-mail users.
Another limitation of social network visualisation is that the graphs produced are not ideal for representing the temporal information contained in e-mail traffic data. The temporal aspect of e-mail traffic data is important, given that the relationship between e-mail accounts does not always remain the same and may fluctuate due to the occurrence of particular events. The information displayed in social network graph images (e.g. the name/address of the individual, the number of connections, the strength of the connections) cannot show temporal changes using the points-and-lines representation. To overcome this limitation, time-series visualisation was used in this research as another visualisation technique for analysing e-mail traffic.