• No results found

Using adjacency matrices to lay out larger small-world networks

N/A
N/A
Protected

Academic year: 2021

Share "Using adjacency matrices to lay out larger small-world networks"

Copied!
14
0
0

Loading.... (view fulltext now)

Full text

(1)

Citation: Gibson, Helen and Vickers, Paul (2016) Using adjacency matrices to lay out larger

small-world networks. Applied Soft Computing, 42. pp. 80-92. ISSN 1568-4946

Published by: Elsevier

URL:

http://dx.doi.org/10.1016/j.asoc.2016.01.036

<http://dx.doi.org/10.1016/j.asoc.2016.01.036>

This version was downloaded from Northumbria Research Link:

http://nrl.northumbria.ac.uk/25672/

Northumbria University has developed Northumbria Research Link (NRL) to enable users to

access the University’s research output. Copyright

©

and moral rights for items on NRL are

retained by the individual author(s) and/or other copyright owners. Single copies of full items

can be reproduced, displayed or performed, and given to third parties in any format or

medium for personal research or study, educational, or not-for-profit purposes without prior

permission or charge, provided the authors, title and full bibliographic details are given, as

well as a hyperlink and/or URL to the original metadata page. The content must not be

changed in any way. Full items must not be sold commercially in any format or medium

without formal permission of the copyright holder. The full policy is available online:

http://nrl.northumbria.ac.uk/policies.html

This document may differ from the final, published version of the research and has been

made available online in accordance with publisher policies. To read and/or cite from the

published version of the research, please visit the publisher’s website (a subscription may be

required.)

(2)

ContentslistsavailableatScienceDirect

Applied

Soft

Computing

jo u r n al h om ep a g e :w w w . e l s e v i e r . c o m / l o c a t e/ a s o c

Using

adjacency

matrices

to

lay

out

larger

small-world

networks

Helen

Gibson

a,∗

,

Paul

Vickers

b,1

aCENTRIC,SheffieldHallamUniversity,CantorBuilding,153ArundelStreet,SheffieldS12NU,UK

bNorthumbriaUniversity,DepartmentofComputerScienceandDigitalTechnologies,PandonBuilding,CamdenStreet,Newcastle-upon-TyneNE21XE,UK

a

r

t

i

c

l

e

i

n

f

o

Articlehistory: Received4January2015 Receivedinrevisedform 12November2015 Accepted20January2016 Availableonline29January2016 Keywords:

Smallworldnetworks Adjacencymatrix Nodeattributes Graphvisualization Targetedprojectionpursuit

a

b

s

t

r

a

c

t

Manynetworksexhibitsmall-worldproperties.Thestructureofasmall-worldnetworkischaracterized byshortaveragepathlengthsandhighclusteringcoefficients.Fewgraphlayoutmethodscapturethis structurewellwhichlimitstheireffectivenessandtheutilityofthevisualizationitself.Herewepresent anextensiontoournovelgraphTPPlayoutmethodforlayingoutsmall-worldnetworksusingonlytheir topologicalpropertiesratherthantheirnodeattributes.TheWatts–Strogatzmodelisusedtogenerate avarietyofgraphswithasmall-worldnetworkstructure.Communitydetectionalgorithmsareusedto generatesixdifferentclusteringsofthedata.Theseclusterings,theadjacencymatrixandedgelistare loadedintographTPPand,throughuserinteractioncombinedwithlinearprojectionsoftheadjacency matrix,graphTPPisabletoproducealayoutwhichvisuallyseparatestheseclusters.Theselayoutsare comparedtothelayoutsoftwoforce-basedtechniques.graphTPPisabletoclearlyseparateeachofthe communitiesintoaspatiallydistinctareaandtheedgerelationshipsbetweentheclustersshowthe strengthoftheirrelationship.Asasecondarycontribution,anedge-groupingalgorithmforgraphTPPis demonstratedasameanstoreducevisualclutterinthelayoutandreinforcethedisplayofthestrength oftherelationshipbetweentwocommunities.

©2016TheAuthors.PublishedbyElsevierB.V.ThisisanopenaccessarticleundertheCCBYlicense (http://creativecommons.org/licenses/by/4.0/).

1. Introduction

Small-worldnetworksareacommonlyoccurringgraph

struc-ture characterized by short average path lengths and high

clusteringcoefficients [1].This meansthateven when the

net-workislargethereareveryfewstepsbetweeneachpairofnodes.

Despitetheirprevalenceveryfewmethodsareabletolaythem

outsuchthattheirstructureiscommunicatedoptimally[2].

Exam-plesofnetworksthatdisplaythesecharacteristicsinthereal-world

includesocialnetworks[3],biologicalnetworks[4]andeven

geo-physicalones[5].

Thehighclustering coefficientofsmall-worldnetworks

pro-videsaninterestingproblemforlayoutparticularlyasithasbeen

shownthatusersseektolayoutgraphssuchthattheirclustered

structureisapparent[6].Thus,ensuringthattheclustered

struc-tureofthegraphiswellrepresentedinthelayoutseemscrucial

forenhancingusers’understandingofthegraphinthecontextof

itsclustersandtherelationshipsbetweenthem.vanHamandvan

Wijk[7]haveproposedoneofthefewsmall-worldnetworkspecific

∗ Correspondingauthor.Tel.:+4401142256819;fax:+4401142256702. E-mailaddresses:h.gibson@shu.ac.uk(H.Gibson),

paul.vickers@northumbria.ac.uk(P.Vickers).

1 Tel.:+4401912437614..

layoutmethodswhileGibsonandFaith[8]putforwardamethod

basedonnode-attributedata.

Howclustersarerepresentedinagraphisextremely

impor-tantasthehumanperceptualsystemwillnaturallycauseusersto

assumethatthereisarelationshipbetweennodesthatareplaced

closetogether[9].Forgraphsthatarehighlyclusteredadheringto

thisprinciplewhenlayingoutthegraphiskeyfor

communicat-ingthestructureofthenetwork.Generally,force-directedlayouts

donotproduceanaccuraterepresentationofsmall-worldnetwork

structures.Thisisbecausetheytrytooptimizethelayouttohave

uniformedgelengthsbutlongeredgelengthsareusuallyrequired

toseparateclusters[2].

Force-directed methods such as LinLog (linear-logarithmic)

[2,10]andOpenOrd(asuccessortoVxOrd)[11]dotrytooptimize

forclusteringbasedontopologywhilethereareothermethods

thatuseattributesorpre-computedclusterings.MuelderandMa’s

treemap[12]andspace-filling[13]approachesuseapre-computed

clustering,whilethegroup-in-the-boxlayout[14]cantakeanyuser

inputorpre-computedclustering.

graphTPP(graphtargetedprojectionpursuit)[8]isamethod,

encapsulatedin an interactivesoftware program, that has

pre-viouslybeenusedfor thelayoutofsmall-worldnetworksusing

node-attributes.Thegraphislaidout,assistedbydirectuser

inter-action, accordingtothespecifiedclustering resultingin aclear

visual separation of the clusters. Here we propose that rather

http://dx.doi.org/10.1016/j.asoc.2016.01.036

(3)

thanusingnode-attributes,theadjacencymatrixofthegraphcan

replacethemultidimensionalmatrixofattributesandby

signif-icantlyincreasingthesizeofthegraph(intermsofthenumber

ofnodes)demonstratesthatgraphTPPisscalablebeyondthevery

smallexamplesusedinthepreviouswork.

Thepaperproceedsasfollows.Section2discussesrelatedwork

onthelayoutofsmall-worldnetworks.Section 3introducesthe

Watts–Strogatzmodelthatisusedforcomputingthesmall-world

networksandthecommunitydetectionalgorithmsusedbythe

lay-out.Section 4presentsthegraphTPPlayoutsofthesmall-world

networksforbothoneandtwo-dimensionalWatts–Strogatz

mod-elswhilstcomparingthemtoresultsobtainedusingtheOpenOrd

and ForceAtlas layout algorithms. It also introduces an

edge-groupingtechniqueforreducingvisualcluttercausedbytheedges

inthegraphTPPlayout.Section5discussestheresultsandthe

lim-itationsofthisworkwhilealsorecommendingdirectionsforfuture

work.Section6thenconcludesthepaper.

ThisresearchshowsthatgraphTPPoutperformsOpenOrd[11]

and ForceAtlas [15] as a method for laying out small-world

networkswheretheaimistooptimizethelayoutforthe

communi-tiesdetectedthroughvariouscommunitydetectionalgorithms.The

maincontributionthenisthedemonstrationthatgraphTPPcanbea

viablelayoutmethodevenwhenthereisnotypicalnode-attribute

dataavailableuponwhichtobasethelayout.

2. Relatedwork

2.1. Small-worldnetworks

Small-worldnetworksarecharacterizedbyshortaveragepath

lengths(theshortestpathbetweenanypairofnodes)and high

clusteringcoefficients(e.g.,insocialnetworksthiswouldbethe

number of friendsa person haswho are alsofriends, i.e., they

completethetriangle).Theaveragepathlengthonlygrows

loga-rithmicallywiththenumberofnodes,whiletherearemanycliques

ornearcliques.

Oneofthemostnotableexamplesofasmall-worldnetworkis

fromMilgram’s[16]sixdegreesofseparationexperimentwhich

proposedthatmostpeopleintheUnitedStatesatthattimewere

separatedbyonlysixpeopleinachainoffriendship.Recently,Boldi

andVigna[17]havemodelledthedegreeofseparationinthe

Face-bookgraphtobeonly3.74.Therearemultipleotherexamplesof

small-worldnetworksoccurringinthereal-worldincludingsocial

networks[3],biologicalnetworks[4]andgeophysicalnetworks[5].

AlbertandBarabási[4]havehypothesizedthattheprevalenceof

networksinbiologywithsmall-worldpropertiesisduetotheir

inherentstructuraladvantages.

There are a number of ways to model a small-world

net-work,themostpopularbeingtheWatts–Strogatzmodel[1].The

Watts–Strogatzmodelrequirestheconstructionofaregularring

latticefollowedbyrandomrewiringoftheedgesaccordingtoa

rewiringprobabilityp.Thisproducesagraphwithshortaverage

pathlengthsandahighclusteringcoefficient;however,compared

toreal-worldsmall-worldgraphsthedegreedistributiondoesnot

tendtobescale-free(i.e.,doesnotfollowapowerlawdistribution).

TheBarabási-Albert[4]modeldoesproduceascale-freenetwork

butitdoesnotexhibittheclusteringpropertiesthatareintegral

tosmall-worldnetworksandessentialforthisvisualization

tech-nique.

Giventhatsmall-worldnetworksaresuchacommonly

occur-ringgraphstructureitissurprisingthatlittleattentionhasbeen

paidtotheirvisualization.Theirstructuralpropertiesarenotwell

suitedtogeneralforce-basedlayoutmethodssincethese

meth-odsoftentrytomapshortestpathlengthstoEuclideandistances.

Theshortaveragepathlengthspossessedbysmall-worldnetworks

resultinhighdegreenodesbeingplacedclosetothecentreofthe

graphandthisencouragesthewell-knownhairball-stylelayoutto

form(foranexamplein thispaperseeFig.9(c)).Further,many

force-basedtechniquesalsotrytooptimizethelayoutaccordingto

certainaestheticcriteria.Onesuchcriterionisuniformityofedge

length;however,longedgelengthsarerequiredtoseparate

clus-tersandassuchtheforce-stylelayoutsdonotwellrepresentthe

clustersformedbythesmall-worldnetwork[2].

TenyearshavepassedsinceAuberetal.[18]commentedthat

the“thestructuralpropertiesofsmall-worldnetworkshavenot

yetbeenfullyexploitedfromavisualizationperspective”buttoour

knowledgestillveryfewtechniquesexistthatdealspecificallywith

layoutofgraphsexhibitingsmall-worldproperties.Auberetal.’s

[18] multi-scale small-world layout is one solution. It requires

decomposingthe graph in a hierarchical mannerby rating the

strengthofeachedgeandremovingtheso-called‘weaker’ones.

Thishelpswithexploringtheindividualclusters.

Othersmall-worldnetworklayoutsincludevanHamandvan

Wijk’s[7]thatusesanadaptedversionofNoack’s[2]force-model

tofurtherseparatetheclusterswhilevanHamandWattenberg

[19]createasparsebackboneofthegraphbyremovingedgeswith

lowbetweennesscentrality,layingoutthegraphatthisleveland

thenproceedingtoaddedgesbackin.Topolayout[20]alsodetects

small-worldfeatureswhileHiMap[21]hasadaptedKamadaand

Kawai’s[22]algorithmtoproduceclusteredlayouts.

Recently,GibsonandFaith[8]usedaprojectionbasedtechnique

tolayoutsmall-worldnetworksbasedonnode-attributesaiming

toproduceaclusteredlayout.Hereweextendthattechniquetoa

muchlargerclassofsmall-worldnetworksandnolongerrelyon

thenode-attributesbutinsteadusethetopologicalstructureofthe

graphitself.

2.2. Graphlayout

There are many hundreds of solutions to the graph layout

problemeachproposingadifferentmethodtohighlightdifferent

featuresofthegraph.Whileforce-basedtechniquesareextremely

popular,theyaregenerallynotsuitableforsmall-worldnetworks

forthereasonsdetailedabove.Thereare,however,afew

force-basedmethodsthattrytooptimizeforclusteringand,hence,are

potentialcandidatesforthelayoutofsmall-worldnetworks.

Noack’s[2,10]LinLoglayoutsareonesuchexampleandthey

havebeenappliedtothelayoutofsmall-worldnetworks.The

Lin-Logtechniqueignorestheuniformedgelengthcriterioninorder

toseparateclusters.Thenamederivesfromthefactthatthemodel

haslinearattractiveforcesbutlogarithmicrepulsiveforces.Noack

hasalready shownthatthemethodcanbetterseparateagraph

intoclusterscomparedtotheFruchterman–Reingoldmethod[23].

OpenOrd[11]isanotherforce-basedmethodthataimsto

encour-ageclusteringbasedonthesimulatedannealingalgorithm.Itisa

multi-levelmethodthatcutslongedgestoreducethedomination

oftherepulsiveforceswhichimprovesclustering.Itisahighly

scal-ablealgorithmthatcanlayoutgraphswithhundredsofthousands

ofnodes.

Similar to some force-based methods are those that use

dimensionreduction.Bothclassicalanddistancemultidimensional

scaling(MDS)usethedissimilaritymatrixoftheshortestpaths

betweennodepairstogeneratetheembeddingwhilePivotMDS

[24] is a samplingapproximation technique toclassical scaling

wherenodesarepositionedaccordingtothepositionsofasubset

ofpivotnodes.High-dimensionalembedding(HDE)[25]issimilar

toPivotMDSbutthefinalstepusesprincipalcomponentanalysis

toprojectthelayoutonto2Dspace.

Other methods that make use of dimension reduction are

(4)

projectionforthelayoutandPEx-graph[27]whichcaneitheruse

attributesorthegraph’sconnectivityforlayout.

Becausesmall-worldnetworksareinherentlyclustered,

cluster-basedlayout techniques arealso appropriate. Methods suchas

Group-in-a-box[14]requireaclusteringtobepre-definedbutthey

thenplaceeachclusterintoitsownboundedboxandthenodes

ineachbox(cluster)canhavetheirownlayoutapplied.Muelder

andMaalsoproposedtreemap[12]andspace-fillingapproaches

[13]whichusecommunitydetectionalgorithmstopre-compute

aclusteringfirstandthenusethecommunitiestodeterminethe

decompositionintothetreemaportheorderofplacementforthe

space-fillingmethod.

2.3. TargetedprojectionpursuitandgraphTPP

Targeted projection pursuit (TPP) [28,29] is an open-source

exploratory, visual, interactive dimension reduction technique

incorporatedintoapieceofsoftware,knownastheTPPtool,that

providesaGUIfront-endthatpresentsbothareal-time

visualiza-tionofthecurrentprojectionandtheabilityfortheusertointeract

withthisprojectiondirectly.2Thetoolitselfisbuiltontopofthe

dataminingsoftwareWeka[30]andthroughtheTPPtooltheuseris

abletoexplorethespaceofpossiblelinearprojectionsfroma

high-dimensionalspaceontotwodimensions.Thetechniquefocuseson

threeareas:findingaprojectionthatgroupsthepointsinto

speci-fiedclusters,identifyingthediscriminatorydimensionsthatcanbe

usedtodescribeandanalyzetheclustersand,thirdly,identifying

outliers.

Throughtheinteractiveuserinterface,theusercanseparate

nodesintoclustersordragthemaboutin thetwo-dimensional

spacetofittheirintuitionorunderstandingofthedata.The

under-lyingTPPalgorithmwillthensearchfortheprojectionthatbest

matchesthetargetviewthattheuserwantstosee.

Formally,asetofnentitiesisdescribedbythen×kmatrixX

thatdefineseachentity’spositionink-dimensionalspace.Ak×2

projectionmatrixPmapstheentitiesontotwodimensionalspace.

Whentheuserdefinesann×2targetviewT,TPPsearchesfora

projectionthatminimizesdifferencebetweenthistargetviewand

theprojection.Thatis

minT−XP. (1)

Theprojectionmatrixisfoundbytrainingasingle-layer

percep-tronartificialneuralnetworkwithkinputsandtwooutputs.The

nrowsoftheoriginalmatrixareexaminedinorderandstandard

back-propagationisusedtotrainthenetworktogeneratetherows

ofthetargetmatrixTaccordingtoaleast-squarescalculation.Once

convergenceisreachedtheoriginaldataistransformedintothe2D

viewwheretheconnectionweightbetweeneachinputneuronand

thetwooutputneuronsgivestheweightofeachdimensioninthe

finalprojectionandthustheprojectionmatrix.

graphTPP leverages theTPP algorithm for graph layout and

extendstheoriginalTPPtoolsoftwaretoimportgraphdataand

updates thevisualdisplay and controlpanelto facilitate

view-ingand interactionwiththe graph.The originalmotivation for

graphTPPwastousetheattributesofthenodesfor the

dimen-sionreductionprocessthatcreatesthelayout.Eachnode-attribute

woulddescribea dimensionofthedataand givena clustering,

eitherpre-definedorthroughthek-meansalgorithm,theaimwas

tolayoutagraphinaclusteredfashionsuchthatthelayoutwould

beadirectresultoftheattributes,bringingthegraph’s

topologi-calstructureanditsattributestogether.Thiswouldfacilitatethe

2 https://code.google.com/p/targeted-projection-pursuit/.

understandingofthegraph’sstructurefromthepointofviewof

theattributes.

Clustering,in particular,hasbeenidentified asanimportant

structuralfeaturetoberepresentedinlayoutwithusersfavouring

itovertraditionalaestheticcriteria[6]whiletheuseofattributes

inlayoutcanbeusedtocreateadeeperunderstandingofthegraph

[31].graphTPPexpresslyaimstoseparatethegraphintoclusters.

TheuseofgraphTPPhereisbasedonthesameprincipleofusing

ahigh-dimensionaldatamatrixthatdescribessomefeaturesofthe

graph’snodesbuttheattributematrixisreplacedbytheadjacency

matrix(seestartofSection4).Theadjacencymatrixassociateseach

nodewithacolumnandarow.Anentryismadeinposition(i,j)

ifnodeiisconnectedtonodej.Forexample,inFig.1the

high-lightednodesandedgeshowninthegrapharecircledinthetable,

i.e.,theadjacencymatrix.Thislayoutmethodhasmoreincommon

withtheforcetechniquesasbothrelyonthetopologicalstructure

forlayout.graphTPPisthenabletoleveragealltheinteractive

fea-turesoftheoriginalTPPtoolsuchastheautomaticseparationof

pointsintoclusters,directuserinteractionaswellasanumberof

optionsforadjustingthevisualfeaturesdisplayedonthelayout

(suchasnodesize,colour,shapeetc.).graphTPPalsoincorporates

newoptionsforcontrollingthevisualfeaturesincludingedgeshape

andappearance,nodelabellingandsomefilteringoptionsnot

uti-lizedbythegraphsandlayoutsinthispaper.Furtherexplanation

oftheprocessofusinggraphTPPinthispaperisgiveninSection

4.1.

3. Small-worldnetworks

3.1. Networkgeneration

The small-world networks are generated according to the

Watts–StrogatzmodelasimplementedinRpackageigraph[32].

TheWatts–Strogatzmodelaimstogenerateagraphwithahigh

clusteringcoefficientandashortaveragepathlength,thus

simu-latingthecharacteristicsofasmall-worldnetwork.

ThebasicWatts–Strogatzsmall-worldmodel assumesa ring

latticeofnnodeswithkconnectionspernode.Eachedgeisthen

randomlyrewiredwithaprobabilitypwherep=0woulddescribea

regularnetworkandp=1describesacompletelyrandomnetwork.

ThegraphhasthestructuralpropertiesL(p)whichdescribespath

lengthandC(p)whichdescribestheclusteringcoefficient.When

p=0thevalueofLgrowslinearlywithnandiswellclustered(high

L(p)andhighC(p))toformalarge-worldnetwork.Whenp=1the

randomnetworkispoorlyclustered(lowL(p)andlowC(p))andL

growslogarithmicallywithn.However,forintermediatevaluesof

p,L(p)isalmostassmallasLrandomwhileC(p)≫Crandom.Hence,it

approximatesthesmall-worldpropertiesofashortaveragepath

lengthandahighclusteringcoefficient.

TheigraphpackageinR[33]includesanimplementationofthe

Watts–Strogatzmodelwhichwasexecutedbyvaryingthe

follow-ingparameters

watts.strogatz.game(dim, size, neighbours, p)

wheredim=dimensionofthelattice(1foraringlattice);size

=thenumber ofnodesin eachdimension;neighbours=num.

nearestneighbourconnections;p=therewiringprobability.

Table1 shows thehow theparameters werevaried. A

con-stantrewiringprobabilityof 0.05wasusedwhileonly oneand

two-dimensionalmodelswereconsidered.

3.2. Communitydetection

Innetworkscience,communitydetectionisthepartitioningof

agraphintoclustersorcommunities.Generallythepartitionsare

(5)

Table1

Theparametersusedtogeneratethesmall-worldnetworksandthenumberofcommunitiesdetectedbythecommunitydetectionalgorithms.

Parameters Communitydetectionalgorithms

Nodes Edges Dim Size Nearest

neighbours

Rewiring probability

Edge betweenness

Fastgreedy Leading eigenvector

Spinglass Walktrap Label propagation 500 5000 1 500 10 0.05 12 3 12 9 9 21 1000 5000 1 1000 5 0.05 18 6 27 10 20 74 1500 7500 1 1500 5 0.05 22 9 34 10 31 102 2000 4000 1 2000 2 0.05 51 46 45 10 93 243 2000 10,000 1 2000 5 0.05 30 7 47 10 38 150 3000 15,000 1 3000 5 0.05 – 10 – 10 – – 4000 20,000 1 4000 5 0.05 – 13 – 10 – – 5000 25,000 1 5000 5 0.05 – 14 – 10 – – 400 4800 2 20 3 0.05 7 4 8 9 7 1 400 2400 2 20 2 0.05 8 4 7 8 9 10 625 1250 2 25 1 0.05 18 12 14 10 12 56 625 3750 2 25 2 0.05 11 4 10 10 10 25 900 5400 2 30 2 0.05 12 4 11 10 12 22 1225 7350 2 35 2 0.05 12 4 14 10 14 34 1600 9600 2 40 2 0.05 15 4 12 10 16 46

nodesinthesamecommunitythantheyaretothoseoutsidetheir community. Communitydetectionis usually only basedonthe topologicalpropertiesofthegraphratherthanonattributes.Some algorithmsimposealimitonthenumberofcommunitiesthegraph ispartitionedintowhileothersplacenorestrictionsonthenumber. Atthemostbasiclevel,identifyingcommunitiesinthegraph helpsin understanding thegraph’sstructure which canenable theclassificationofanode’sstructuralposition,iftherearenodes thatareontheperipheryofaclusterorpotentiallyoverlapping clusters. The communities may also communicate the hierar-chical organization of thegraph [34]. In this paper we usesix

differentcommunitydetectionalgorithmsincludedintheigraph

package.Theseare edge-betweenness[35](hierarchical

decom-position based on iteratively removing edges with the lowest

edge-betweennesscentrality),fast-greedy[36](mergesnodesinto

communitiesinagreedymannerbasedonoptimizingthe

modu-larityfunction),leadingeigenvector[37] (iterativelydividesthe

graphintocommunitiesbasedonthesignsoftheeigenvectorthat

correspondstothelargesteigenvalueofthemodularitymatrix),

Walktrap[38](mergescommunitiesbasedonshortrandomwalks

sincestayingwithinaclusterismorelikelythanleavingit),

Spin-glass[39] (statisticalphysicsapproachwhere a nodetakesone

ofthe nspin states andstates changes dependingonthestate

of neighbouring nodes) and label propagation [40] (each node

assigned oneof thek labelsand nodestakethemost common

labeloftheirneighbours).Furtherexplanationsofthecommunity

detectionalgorithmsaregiveninthesupplementalmaterialanda

detailedreviewcanbefoundinFortunato[34].

4. Layoutandcomparison

GibsonandFaith[8]demonstratedgraphTPP’scapabilitiesfor

thelayoutofsmall-worldnetworks.Inthatcasethenetworkwas

smallwithonly30nodesclusteredintofourgroups.However,it

indicatedthepotentialofgraphTPPasalayoutmethodfor

small-worldnetworks.Hereweextendthatworkbyapplyingittothe

layoutofmuchlargernetworks.

Moreimportantly,sinceartificiallygeneratednetworksdonot

haveattributesthegraph’sadjacencymatrixisusedinstead.This

meansthat,likeforce-basedmethods,thelayoutnowonlyrelies

onthetopologicalstructureofthegraph.Itiscomparedtotwo

force-basedtechniquesasimplementedinthegraphlayout

soft-wareGephi:Martinetal.’s[11]OpenOrdlayoutmethodandthe

ForceAtlas[15].

The ForceAtlas layout is included as a comparison to a

generalnon-optimized for clustering force-based layout whose

performance should be better than the Fruchterman–Reingold

algorithmbutworsethanLinLogintermsofclustering.Becauseof

systemunavailabilityitwasnotpossibletocompareperformance

againstvanHamandvanWijk’sframework[7].

The networks were generated using the Watts–Strogatz [1]

model intheigraphpackage inR.Variousclusteringsare

com-putedthroughcommunitydetectionalgorithmsandrecorded(see

Table1).Eachgraphcreatedwastheresultofrunningone

particu-larinstanceofWatts–Strogatzmethod.Duetotherandomrewiring

probabilityeachrunofthismethod,evenwiththesame

param-eters,willproduce aslightly differentgraph,interms ofwhich

nodesareconnectedtoeachother;however,thisdifferencedoes

not affectthevalidity oftheseresultsas each graphstill hasa

small-worldstructure.Thecommunitydetectionalgorithmswere

executedwiththeirdefaultparametersexceptfor theSpinglass

methodwhichwasrunwithamaximumoftenspinstateswhich

limitedthenumberofcommunitiestoten.Thiswasdonetoensure

thattherewasatleastonealgorithmthatwasgeneratinga

clus-tersetthatwasnotexcessivelylargeandagraphTPPlayoutcould

beproduced.Outofthesixcommunitydetectionalgorithmsthe

Edge-Betweenness,Fast-Greedy,LeadingEigenvectorand

Walk-trapalgorithmsconsistentlyproducethesamecommunitywhen

runoverthesamegraph,intermsofnumberandsizeof

communi-tiescreated,whiletheSpinglassandLabelPropagationalgorithms

produceslightlydifferentresultseachtime.However,these

differ-encesarenotlargeanditwasconsideredthattestingusingone

particularexampleforthesetwoalgorithmswasstillsufficientfor

demonstrationpurposes.

4.1. Datapreparationandimport

Once thegraphs had been createdin Rand each nodehad

beenassignedacommunityusingeachofthecommunity

detec-tionalgorithmsthecommandget.adjacencywasusedtocreate

theadjacencymatrixofeachgraphandsixadditionalcolumnswere

addedidentifyingthecommunityeachnodebelongedtoforeach

communitydetectionalgorithm. ThiswasthenexportedtoCSV

readytobeconvertedintotheARFFfileformatsupportedbyWeka.

Twofurtherfiles,asimplelistofnodesandthecommunitiesthey

belongtoandanedgelistwasalsoexported.AnARFFfileis

cre-atedbyloadingtheCSVfileintoWekaandensuringthatthefirstn

columns,wherenisthenumberofnodes,aredescribedasnumeric

columns(asthesedefinethedimensionstobeusedinthe

projec-tion).Thenextsixcolumnsareclass-typecolumnswhichdefine

themembershipofnodestoparticularcommunitiesforeach

(6)

Fig.1. Thegraphshownin(a)hasitsadjacencymatrixin(b).Aconnectionbetween twonodesisindicatedbya‘1’attheintersectionofthecorrespondingrowand col-umn.Twonodes8and10arelinkedandhighlightedinboldin(a),thecorresponding valuesintheadjacencymatrixarethencircledin(b).

representingthenode’sID.ThisfileisthenexportedfromWekaas

anARFFfileandcanbeimportedintographTPP.Theedgelistfile

canthenalsobeimportedsothatgraphTPPisabletorepresentthe

edgesinthegraph.

OncetheARFFandedgelistfileareimportedintographTPPthe

usercanselectaparticularsetofcommunitiestotrytoseparate

thenodes.Inthefollowingcasesonceacommunitydetection

algo-rithmhadbeenselectedthe‘Separatepoints’buttonwaspressed

andhelddowntobeginanautomaticseparationofthe

commu-nities.When suchan action is chosen thegraphTPP algorithm

identifies a number of points in space (based on the number

ofcommunities toseparate) where thesepoints aremaximally

separatedfromoneanother.Thisbecomesthetargetprojection

andtheautomaticlayoutprogressesbytryingtoachievethis

max-imalseparation.Ifauserbelievesthattheautomaticseparationis

notachievingthedesiredlevelofseparationtheycanselectasetof

nodes,eitherindividually,bycommunityorusingarectangle

selec-tion.Onceselected,theusercanattempttomovethisnodearound

inordertoachievesomethingclosertotheirdesiredtargetview.

Fig.2showshowthegraphandthelayoutappearingraphTPP.The

figurespresentedinthispaperarebasedontheSVGfileexported

directlyfromgraphTPP.Thecodeanddatafilesneededtorecreate

theseandthefollowingstepscanbefoundathttps://github.com/

helengibson/graphTPP.

TheForce-AtlaslayoutandtheOpenOrdlayoutswereproduced

byloadingthenodesfile(withthecommunityattributes)andthe

edgefileintothegraphlayoutsoftwareGephiandrunningeach

methodwiththeirdefaultparametersandexportingtoSVGusing

Gephi’sexporttool.

The nextsection exploresthe layoutsproduced usingthese

threelayoutsforanumberofdifferentsizedgraphsandcommunity

detectionalgorithms.

4.2. One-dimensionalmodels

Thefirstnetworkproducedfromtheone-dimensionalmodel

contained 500 nodes and 5000 edges. The 5000 edges were

producedbyrequiringeachnodetoconnecttoitstennearest

neigh-boursandthesmall-worldgraphwasproducedbyusingarewiring

probabilityof0.05.

Eachcommunitydetectionalgorithmwasrunonthegraphanda

setofclusterswasproducedforeachalgorithm.Thesixalgorithms

producedclustersetsofvaryingsizesrangingfrom3to21.The

Walktrapcommunitydetectionalgorithm[38]isbasedonrandom

walksandpartitionedthegraphinto9clusters.Fig.3showsthe

graphwiththisWalktrapcommunitydetectionalgorithmapplied

foreachofthethreelayoutmethods.Table1inthe

supplemen-talmaterialshowsthelayoutforallofthecommunitydetection

algorithmsforthisgraph.

Eachlayoutisclearlyinfluencedbytheoriginalringstructure

fromwhichthenetworkwasgenerated.However,onlygraphTPP

isabletoseparatetheclusterscorrectlyandintotheirowndistinct

area.ForceAtlasalsoseparatestheclusterscorrectlybutthe

snake-likelayoutmeansthattheyleadontooneanother.Thismakes

analysisofhoweachclusterisconnectedtotheothersdifficult.

Colourisalsorequiredtodeterminewhereeachclusterstartsand

ends.TheOpenOrdmethodalsoshowstheclustersleadingonto

oneanotherbutthenodesarenowpositionedintotightgroups.The

algorithmactuallysubdividesthemintosmallerclustersthanthose

detectedbytheWalktrapalgorithm.Again,itisnotclearwithout

thecolouringtowhichclustereachofthesesub-clustersbelongs.

However,seeingtheseclusterscouldalsostimulatefurther

inves-tigationintowhytheyaresub-dividedupbythismethodand,in

particular,thesignificanceofwheretheclustersoverlap.

graphTPPisnotonlyatoolforlayout,italsofacilitatesanalysis

ofthegraphaccordingtoitsattributecomposition.Fig.4shows

thecontributionthatfourdifferentnodesmaketothelayout.In

Fig.4(a)and(b)therednodesarethosepossessingthatparticular

attribute(inthiscasethatmeansnodeswhichareconnectedtothat

particularnode).Thosehighlightedattributesareclearlytwoofthe

maincontributorstothatspecificcluster,i.e.,thatnodeconnects

tomanynodeswithinthatcluster.Theattributeshighlightedin

Fig.4(c)and(d)showthesenodesconnectequallytoadjacent

clus-ters.Thisindicatesthatthosetwonodesmaybebridgesbetween

thetwocommunities.

While a graph with 500 nodes is a significant increase on

thegraph usedin Gibsonand Faith[8]it is stillfairly small.A

(7)

Fig.2.AscreenshotofthegraphTPPlayoutinterfacewherethenumberofnodesis500andthenumberofedgesis500.Therighthandpanelshowssomeoftheoptions availabletouserswhenlayingoutthegraph.

impactontheabilitytoseparatenodesintoclearlydefined

clus-tersfor both theForceAtlasand OpenOrdalgorithmsas shown

in Fig. 5. With 3000 nodes and 15,000 edges graphTPP is still

able to separate the clusters clearly. In this case there are

10 clusters which were again determined using the Walktrap

algorithm. The clear separation enables some analysis of the

strengthoftherelationshipsbetweendifferentcommunities,

fur-therevidencedinFig.7.

Fig.3.Thesmall-worldgraphwith500nodesand5000edgeslaidoutusingthethreedifferentalgorithmsusingtheWalktrapcommunitydetectionalgorithm.Theoriginal ringlatticestructureisclearlyvisibleinallthreelayouts.

(8)

Fig.4. Thesamesmall-worldgraphwith500nodesand5000edgesasinFig.3(a).Fourdifferentattributesareselectedandthenodesthathavethoseattributesare highlighted.Arednodeindicatesthatthatnodehasthatparticularattribute.Images(a)and(b)showhowforthosetwoexampleattributes,othernodeswiththesame attributesfallinthesameclusterwhileimages(c)and(d)showhownodesinadjacentclusterssometimesshareattributes.(Forinterpretationofthereferencestocolorin thisfigurelegend,thereaderisreferredtothewebversionofthisarticle.)

However,for thisgraph itismore difficulttoidentifynodes

whichcontributethemosttothedeterminationofcluster

member-ship.Thisisbecausetheaveragedegreeofeachnodeis10whereas

eachclustercontains,onaverage,300nodes.Thisdoesnotaffect

theclusteringabilityofthegraphTPPalgorithmthough.

Forthis graph(Fig.5), thevolumeofedgesin thegraphTPP

layoutoverwhelmstheoverall layoutofthegraphandits

clus-teredstructure.Thismeansthatwhilethenodesarewellclustered,

thesheernumber ofedgesimpedestheability tointerpretthe

strengthoftherelationshipsbetweentheclusters.Section4.3

dis-cussesasolutiontothisproblemasanextensionofthegraphTPP

tool.

Beyond3000–4000nodesitbecomesdifficultbothto

gener-atethegraphsandcalculatethecommunities.Thelargenumber

ofattributesthatincludingeverynodeasanattributebringsled

toslowsystemperformance;runninggraphTPPonatypical

desk-topcomputermeantthatitwasimpracticaltoattempttolayout

graphsof5000nodesandabove.Ofcourse,increasedprocessing

powerwouldmitigatethiseffect.Futureworkshouldalsofocus

oncarryingoutadetailedperformanceanalysisofthegraphTPP

algorithmtolookforopportunitiesforcodeoptimization.Forthis

typeofgraphstructure∼3000nodesisthecurrentpracticallimit

forgraphTPPoperatinginaregulardesktopenvironment.Further

layoutsforgraphswith1000,1500and2000nodescanbefoundin

thesupplementarymaterial.

4.3. Anedge-groupingmethod

Whenexperimentingwiththislayoutmethoditquicklybecame

clearthatthevolumeofedgestobedisplayedwasimpactingon

graphTPP’sefficacy.Forexample,inFig.5(a)theclustersarewell

separatedbutthenumberofedgesmakesitdifficulttointerpret

howstronglyeachclusterisconnected.Thus,whilethelayoutwas

successfulinpositioningthenodes,dealingwiththeedgeswas

becomingaproblemanddistractingfromtheinterestingstructures

ondisplay.

Edge-bundlingalgorithmsgroupedgestoreducevisualclutter.

Multipleedge-bundlingalgorithmshavebeenproposedpreviously

[41–43]buthereweuseanedge-groupingmethodthatmakesuse

oftheintrinsicpropertiesofthegraphTPPlayout,namelythe

clus-ters.Thecentroidofeachclusterisfoundandanedgeisdrawnfrom

eachnodetothecentroidofitsownclusterwithitsexactposition

offsetslightlyfromthecentroidinordertoallowthethicknessof

theedge-grouptobeproportionaltothenumberofedgesit

con-tains.Whenthebundlereachesthecentroidoftheotherclusterthe

edgessplitfromtheedge-groupandcompletetheedge,asshown

inFig.6.UnlikeinFig.6,intheactualgraphTPPlayoutthereare

usuallynodesplacedoverthecentroidpositionandsotheartefact

ofwheretheedgesmeettheedge-groupishidden.Thebundled

viewisintendedtoimprovetheoverviewofthegraphratherthan

(9)

Fig.5.Thesmall-worldgraphwith3000nodesand15,000edgeslaidoutusingeachofthethreelayoutalgorithmsbasedonthefastgreedycommunitydetectionalgorithm withnineclusters.HereweclearlyseethatgraphTPPistheonlylayoutalgorithmthatisabletoseparatetheclusterssuccessfullywhileboththeForceAtlasandtheOpenOrd layoutsproduceamuchmoretangledlayout.

Fig.7showstheedge-groupedlayoutthatcorrespondstothe

layoutinFig.5(a).Thevisualclutterissignificantlyreducedandit

ismucheasiertoseerelationshipstrengthsbetweenthevarious

clusters.

Usingaslightlysmallerexample,with2000nodesand10,000

edgesandalsowithfewerclusters(7comparedto10),the

clar-ity the edge-groupsoffer becomes more obvious. Fig. 8 shows

thegraphTPP layoutfor this graphwiththeoriginal edgesand

withthegroupededgesside-by-side.In Fig.8(a)itisvery

diffi-culttodistinguishthatthevolumeofedgesbetweentheclusters

differsdepending on which pair of clustersis chosen whereas

in Fig. 8(b) it is much clearer. For example, the edge-groups

linkedtocluster2inFig.8(b)appearthickerthanthose

belong-ingtomostoftheotherclusters.Table2showsthetotalnumber

of edges betweeneach cluster in this 2000 node graph which

Fig.6.Exampleofthegroupingmethod.Edgesaredrawnfromeachnodetoa posi-tionslightlyoffsetfromthecluster’scentroid.Fromheretheedgesaregroupedand drawntothecentroidofthecorrespondingclusterwherethegroupthensplitsapart againtomaketheindividualconnections.

supports this conclusion since the values show that cluster 2

has either the highest or second highest number of

connec-tionstoeachoftheothersixclusters.Theedge-groupedversion

of graphTPP was able to show this cluster’s high connectivity

Fig.7.ThecorrespondinggraphTPPedge-groupedlayoutforthesmall-worldgraph with3000nodesand15,000edgesshowninFig.5(a).Theedge-groupsreducethe clutterondisplayandmakethegrapheasiertointerpretwhilekeepingtheclustered structure.

(10)

Fig.8.Theone-dimensionalmodellayoutwith2000nodesand10,000edges.Thenodesaregroupedintosevencommunitiesbythefast-greedycommunitydetection algorithm.(a)TheregulargraphTPPlayoutwhile(b)thelayoutwithgroupededges.(c)and(d)Thegraphgroupedbythesameclusterswitheachclusterindividuallylaid outusingHarelandKoren’s[44]fastmultiscalemethodinNodeXL.(c)Usesbundlededgeswhile(d)usescombinededges.

immediatelywhiletheoriginalgraphTPPversioncouldonlycluster

thenodesbutcouldnotexpresslydistinguishhowwellconnected

theywere.

Fig.8(c)and(d)alsoshowsthis2000nodegraphlaidoutin

NodeXL[45]usingtheGroup-in-a-boxlayout[14]forthemain

lay-outandHarelandKoren’sfastmulti-scalelayout[44]forthewithin

boxlayouts.Thisisamethodthatrigidlyadherestotheclustered

structurebyconstrainingeachclustertoitsownbox.Fig.8(c)uses

anedge bundlingalgorithmtodisplaytheedgeswhileFig.8(d)

usesacombinededgemethodwhichdoesnottakeintoaccount

thenumberofedgesbetweeneachcluster.Intermsof

understand-ingtherelationshipsbetweentheclustersneitherlayoutisableto

dothisaswellasgraphTPPalthoughthelayoutinFig.8(d)isable

toshowsomeofthewithin-clusterstructure.

4.4. Two-dimensionalmodels

Thetwo-dimensionalmodelincrementallyincreasesthe

num-berofnodesalongeachdimension.Forexample,asizeof20gives

400nodes(20×20).Herewe testmodelsuptoa maximumof

1600nodes (40×40). In this model,setting the nearest

neigh-boursparametertotwobalancedtherequirementofhavingenough

edgessothattheycouldbeusedasattributeswithoutoverloading

thegraph.

Table2

Numberofedgesbetweeneachclusterforthe7clustersdefinedbythefast-greedy algorithmforthegraphwith2000nodesand10,000edges.Thefinalcolumnshows thetotalnumberofnodesineachcluster.

Cl. 1 2 3 4 5 6 7 Size 1 1520 72 59 50 44 19 46 334 2 72 1758 71 75 54 24 36 384 3 59 71 1782 75 61 16 39 389 4 50 75 75 1347 39 20 27 298 5 44 54 61 39 1337 38 33 293 6 19 24 16 20 38 441 9 102 7 46 36 36 27 33 9 908 200

Theinitial400nodemodelhad4800edgesandthecommunity detectionalgorithmsdividedthegraphintosetsofclustersof rea-sonablesizesforeachofthealgorithm;thealgorithmsyielded7,4, 8,9,7and1communitiesrespectively(seecorrespondingrowin Table1).Usingtheleadingeigenvectorcommunitydetection

algo-rithm[37],Fig.9showstheForceAtlaslayoutstrugglingtoproduce

anythingotherthanwhatcouldbedescribedasahairballlayout

andalthoughtheOpenOrdalgorithmshowsaclearstructure,itis

onethatdoesnotseemtocorrespondtoanyofthecommunity

detectionalgorithmsusedinthiscase(seeTable6inthe

supple-mentalmaterialforfurthercomparisons).graphTPPwasagainable

(11)

Fig.9.Thesmall-worldgraphwith400nodesand4800edgeslaidoutusingeachofthelayoutthreealgorithmsbasedontheleadingeigenvectorcommunitydetection algorithmwhichdetectseightclusters.graphTPPisclearlyabletoclusterthegraphintothecommunitiesandthegroupededgesin(b)providesusefuladditionalinformation thatdemonstrateshowstronglyeachoftheclustersareconnected.

edgesbringsoutthisstructureevenmoreclearly,preventingthe

edgesfromoverpoweringthegraph.

Thegraphsproduced fromthetwo-dimensionalmodel have

alowerlimit onthemaximumnumberof nodesthatgraphTPP

canlayoutcomparedtotheone-dimensionalmodel.Inthiscase

1225nodeswith7350edgeswasthemaximum.Ascanbeseenin

Fig.10,whichdisplaystheclusteringfoundbytheedge-betweeness

algorithm,ForceAtlaswasagainunabletoproducealayoutthat

clearlyrepresentedthestructureofthegraph.TheOpenOrd

algo-rithm is able to show some structure although it divides the

nodesintomanydifferentcommunities.graphTPPdisplaysalayout

which clearly respects thecommunities detectedby the

edge-betweennessalgorithm.Thisisparticularlyusefuliftheaimisto

understandtherelationshipsbetweenthedifferentcommunities

becausethethickerthebundlethegreaterthenumberofedges

betweenanytwocommunities.Furtherexamplesoflayoutsare

showninthesupplementalmaterialforgraphswith400,625and

900nodes.

InthissectionwehaveshownhowgraphTPPisabletolayout

small-worldgraphswhichrespectacommunityclustered

struc-tureinsuchawaythatthecommunitystructureiscommunicated

throughthelayoutofthegraph.Wehavealsodemonstrated,in

termsofthevisualseparationbetweeneachcluster,thatgraphTPP

isabletoproducethistypeoflayoutmorereliablyandconsistently

thantwootherlayoutmethods:ForceAtlasandOpenOrd.Thus,

layingoutasmall-worldnetworkwithgraphTPPallowsusto(1)

viewtheclusteredstructureofthegraphvisually;and(2)fromthis

clearvisualseparation,wecanbetterunderstandtherelationships

betweendifferentclusters.Oneofthemainaimsofgraphlayoutis

tobetterunderstandtherelationshipsbetweenthedifferentnodes

inthegraphandtodothisvisually.Therefore,sincegraphTPP

pro-videsthisabilitytodothis,throughaninteractiveplatform,ithas

thepotentialtobeappliedtoanumberofdifferent,real-world,

small-worldnetworksand generateinsightsaboutsuchgraphs.

Thus,although theForce-AtlasandOpenOrdwereoftenableto

producelayoutsthatappearedaestheticallypleasingtheselayouts

werenotabletocommunicatetheclusteredstructureinthesame

waythatgraphTPPwas.Furthermore,whilegraphTPPhasits

lim-itsintermsofthesizeofgraphthatitcanlayout,thedegradation

inlayoutqualityforlagergraphswasmuchmoreapparentforthe

Force-AtlasandOpenOrdgraphsthanitwasforgraphTPPdespite

thefactthatoverallForce-AtlasandOpenOrdareabletolayout

graphsofalargersize.

5. Discussion

Thestructure ofa small-worldnetwork shouldmake it

per-fectlysuitedtovisualization.Representingthecommunitiesthat

formwithinthegraphshouldbeaprincipalaimforlayoutifwe

wantthevisualizationtocommunicatethisstructureeffectively.

TheoriginalgraphTPP reliedlargelyonnode-attributestodrive

thelayoutmethods[8].Thehighlyclusteredstructurethat

char-acterizessmall-worldnetworksmeantthattherewaspotentialto

adaptthislayoutmethodbyreplacingthenode-attributematrix

withthesymmetricadjacencymatrixofthegraph.Sincetheedges,

andhenceentriesintheattributematrix,shouldbeconcentrated

withinclustersgraphTPPwasagoodcandidateforgraphlayout.

In thispaper wehave shownthat forsmall-worldnetworks

generated by the Watts–Strogatz model in both one and

(12)

Fig.10.Thesmall-worldgraphwith1225nodesand7350edgeslaidoutusingeachofthelayoutthreealgorithmsbasedontheedge-betweenessdetectionalgorithm.

presentation,visualdistinctionofclustersand representationof

theclusters)inclusteringthenodesintotheirappropriate

prede-finedcommunitiesand,indoingso,producingagraphlayoutthat

outperformedotherlayoutalgorithmsintermsofrepresentingthis

structure.

Whileboththeforce-basedmethodsinitiallyclearlyshowedthe

structureofthecommunitiesintheirlayouts,asthegraphsgrew

largerandmorecomplexthequalityofthetechniquesdegraded.

Evenwiththesmallestofthetwo-dimensionalmodelsthe

ForceAt-lasmethodwasnot abletoproduce a layoutwithany kindof

structurewhiletheOpenOrdmethodusuallyfurtherdividedthe

clustersorevendividedthemcompletelydifferently.

The edge-grouped graphTPP layout further enhanced the

appearanceandutilityof thegraphTPPlayout byremovingthe

visualcluttercausedbytheedgeswhichhadatendencytooverload

therepresentation.Theedge-groupsprovideafurtheradvantageby

showinghowstronglyeachclusterisconnectedtotheothersand

thiscanbeusedtodeterminethemoreimportantclustersinthe

graphandthosewhichhavemoreinfluence.

5.1. Limitationsofsmall-worldnetworklayoutwithgraphTPP

Despite the advantages of the graphTPP layout we have

describedinthelastfewsections,therearestillanumberof

limi-tationsassociatedwiththegraphTPPmethodoflayingoutgraphs.

Whenimporting dataintographTPPthereis nolimit onthe

numberofclustersintowhichthenodescanbedividedorhow

manyitcanattempttospatiallyseparate;however,alimitation,

especiallywiththissmall-worldstructure,isthattheclustersbegin

tobecomeequallyspacedoverthedisplayspace.Thisresultsin

graphTPPlosingsomeofitspowerinbeingabletoshowthe

struc-tureofthegraph.

OneofthegoalsofusingTPPandtheattribute-basedgraphTPPis

toidentifythemostcommonattributesthatoccurineachcluster.In

typicalusesofTPPonlyafewattributesemergeasbeingsignificant

andsotheseattributescanbeusedtodefinetheclusterandform

ausefulpartoftheanalysis.Alimitationinthiscase,sinceweare

nowusingthenodesasattributesandtheedgesastheattribute

values,isthatthesetofsignificantattributesforeachclusteris

muchlarger.Thisisbecauseeachnodeonlyconnectstoafewother

nodesand,therefore,eachattributeonlyhasafewnon-zerovalues.

Intermsofanalysisthismeansthatwhenaclusterisanalyzedits

mostsignificantattributesaremostlikelytobethenodescontained

inthatclusterthustellinguslittlenewaboutthegraphfroman

attributeanalysisperspective.

Theclusteringofthenodesalsocausesafurtherissue.Ascan

beseeninTable2thenumberofwithinclusteredgesisfarhigher

thanbetweenanyoftheclusters.Thisisagoodthinginthesense

thatitiswhatmakesthegraphTPPlayoutalgorithmworksowell

forthistypeofgraphbutthistightclusteringcomesattheexpense

ofbeingabletoseeanyofthesewithinclusteredgesandinstead

theyareobscuredbythenodes.However,thisisnotjustaproblem

limitedtographTPPandcanoccurinanylayoutthatclustersnodes

closelytogether.TheGroup-in-a-boxlayoutshowedtheopposite

problemtothiswherethewithinclusterstructurewasclearerbut

therelationshipsbetweentheclusterslessso.

Oneofthedifficultiesofrepresentingthesesmall-worldgraphs

and,inparticular,theiredgesisthateachcommunityisusually

connectedtoeveryothercommunity.Thismeansthateveninthe

(13)

makethegraphlookclutteredanddisorganized.Ineffect,this

bun-dledlayoutisactuallythelayoutofasmallcliquewhereeachcluster

wouldberepresentedbyasinglenodeandconnectedtoeveryother

node.Thisproblemisexacerbatedasthenumberofcommunities

increases.TheadvantagethatthegraphTPPlayoutmaintainsover

otherlayoutsisthattheproximityoftheclustersisrelatedtothe

topologyandeachonecanbeanalyzedforeachnode’s

contribu-tion.Apotentialsolutionwouldbetofilteredgesbasedonbundle

sizeandonlyshowedgeswhicharepartofabundleoveracertain

size.

Alimitationonthecomputationsideisthatduetothevolumeof

attributesgraphTPPisnotabletokeepupwiththecalculationsand

soalthoughthelayoutisstillinteractivetheinteractionnolonger

happensinreal-timeandthereisagapbetweentheuser

instruct-ingthenodestomoveandwhentheyactuallymove.Theyalsotend

to‘jump’acrossthescreenratherthanmovesmoothly.This

dimin-ishestheinteractionexperienceofgraphTPPandtheintermediate

stagesofinteractiveexploration.

5.2. Futureresearchdirections

Asmentionedintheprevioussection,theabilitytofilterout

someoftheweakerbundlededgesmayalsoaidtheclarityand

usefulnessofthelayout.Theseedgeswouldstill beusedinthe

projectionbutnotdisplayed.

Furtherinvestigationintoothermethodsbeyondmatrix

dia-grams[46]asawaytoshowthedensityofwithinclusteredges

wouldaidnotonlythisvisualizationmethodbutalsoanyothers

whichgroupnodesintoclustersbutthensufferfromtheocclusion

ofwithinclusteredges.Thiswouldallowtheusertoinvestigate

ifthereareanysub-patternswithintheclusterorparticular

fea-turesthatmayhavenotbeenapparentwhenthenodesaretightly

clusteredtogether.

Determiningifthereisaspecificsetoftasksthataremosteasily

accomplishedwiththistypeoflayoutwouldalsoprovidean

advan-tage.Inparticular,thiskindoflayoutseemstosupportoverview

basedtasksandinthiscase,itwouldmeanthattheuserwould

knowthattheycouldlayoutthegraphusingthegraphTPP

tech-niquewhentheyhad aspecificqueryorlineofinvestigationin

mind.Thispaperhasalreadyshownthatassessingthestrengthof

theconnectivitybetweentwoclustersisonesuchpotentialtask.

6. Conclusions

Thispaperhaspresentedanextensiontothesmall-worldspilot

studypresentedinGibsonandFaith[8]wheregraphTPPwasused

tolayout asmall-world network usingnodeattributes.In this

case the number of nodes hasbeensignificantly increased but

graphTPPstillshows asuperiorability toproducea layoutthat

reflectstheclusteredstructureofthegraphcomparedtotwo

force-basedmethods.Theadditionoftheedgebundlingmethodmeans

thatthestrengthofassociationbetweentwocommunitiesisclearly

visible.Theuseofthecommunitydetectionalgorithmsresultedin

reasonablyequallysizedclusterswhichalsoaidedtheanalysisand

visualization.Italsoshowedthatsmall-worldnetworksare

com-patiblewithre-purposingedgesasattributevalues,thusgraphTPP

could,intheory,beappliedasaviablelayoutoptionforanygraph

regardlessoftheexistenceofpre-existingattributesornot.

7. Supplementalmaterial

7.1. Communitydetectionalgorithms

Amoredetaileddescriptionofeachofthecommunitydetection

algorithmscanbefoundinthesupplementarymaterial.

7.2. Alllayouts

Furtherexamplesoflayoutsofgraphofdifferentsizesclustered

usingthecommunitydetectionalgorithms,asdetailedinTable1,

canbefoundinthesupplementarymaterial.

AppendixA. Supplementarydata

Supplementarydataassociatedwiththisarticlecanbefound,in

theonlineversion,athttp://dx.doi.org/10.1016/j.asoc.2016.01.036.

References

[1]D.J.Watts,S.H.Strogatz,Collectivedynamicsof‘small-world’networks,Nature 393(1998)440–442.

[2]A.Noack,Anenergymodelforvisualgraphclustering,in:G.Liotta(Ed.), Graph Drawing,Lecture Notes in Computer Science,vol. 2912, Springer Berlin/Heidelberg,2003,pp.425–436.

[3]O.Sporns,D.R.Chialvo,M.Kaiser,C.C.Hilgetag,Organization,developmentand functionofcomplexbrainnetworks,TrendsCogn.Sci.8(9)(2004)418–425.

[4]R.Albert,A.-L.Barabási,Statisticalmechanicsofcomplexnetworks,Rev.Mod. Phys.74(1)(2002)47–97.

[5]X.-S.Yang,Small-worldnetworksingeophysics,Geophys.Res.Lett.28(13) (2001)2549–2552,http://dx.doi.org/10.1029/2000GL011898.

[6]F.vanHam,B.Rogowitz,Perceptualorganizationinuser-generatedgraph lay-outs,IEEETrans.Vis.Comput.Graph.14(6)(2008)1333–1339.

[7]F.vanHam,J.vanWijk,Interactivevisualizationofsmallworldgraphs,in:IEEE SymposiumonInformationVisualization,2004,pp.199–206.

[8]H.Gibson,J.Faith,Node-attributegraphlayoutforsmall-worldnetworks,in: 15thInternationalConferenceonInformationVisualisation,2011,pp.482–487.

[9]C.McGrath,J.Blythe,D.Krackhardt,Seeinggroupsingraphlayouts, Connec-tions19(2)(1996)22–29.

[10]A.Noack,Energymodelsforgraphclustering,J.GraphAlgorithmsAppl.11(2) (2007)453–480.

[11]S.Martin,W.M.Brown,R.Klavans,K.W.Boyack,OpenOrd:anopen-source toolboxforlargegraphlayout,in:Proc.SPIE7868,2011,pp.786–806.

[12]C.Muelder,K.-L.Ma,Atreemapbasedmethodforrapidlayoutoflargegraphs, in:2008IEEEPacificVisualizationSymposium,2008,pp.231–238.

[13]C.Muelder,K.L.Ma,Rapidgraphlayoutusingspacefillingcurves,IEEETrans. Vis.Comput.Graph.14(6)(2008)1301–1308.

[14]E.M.Rodrigues,N.Milic-Frayling,M.Smith,B.Shneiderman,D.Hansen, Group-in-a-boxlayoutformulti-faceted analysisof communities,in:ThirdIEEE ConferenceonSocialComputing,2011.

[15]M.Jacomy,S.Heymann,T.Venturini,M.Bastian,ForceAtlas2,Agraph lay-outalgorithmforhandynetworkvisualization(draft).http://www.medialab. sciencespo.fr/publications/JacomyHeymannVenturini-ForceAtlas2.pdf. [16]S.Milgram,Thesmallworldproblem,Psychol.Today2(1)(1967)60–67.

[17]P.Boldi,S.Vigna,Fourdegreesofseparation,really,CoRRabs/1205.5509. [18]D.Auber,Y.Chiricota,F.Jourdan,G.Melancon,Multiscalevisualizationofsmall

worldnetworks,in:IEEESymposiumonInformationVisualization,2003,p.10.

[19]F.VanHam,M.Wattenberg,Centralitybasedvisualizationofsmallworld graphs,Comput.Graph.Forum27(3)(2008)975–982.

[20]D.Archambault,T.Munzner,D.Auber,Topolayout:multilevelgraphlayoutby topologicalfeatures,IEEETrans.Vis.Comput.Graph.13(2)(2007)305–317.

[21]L.Shi,N.Cao,S.Liu,W.Qian,L.Tan,G.Wang,J.Sun,C.-Y.Lin,Himap:Adaptive visualizationoflarge-scaleonlinesocialnetworks,in:IEEEPacificVisualization Symposium,2009,PacificVis’09,2009,pp.41–48.

[22]T.Kamada,S.Kawai,Analgorithmfordrawinggeneralundirectedgraphs,Inf. Process.Lett.31(1)(1989)7–15.

[23]T.M.Fruchterman,E.M.Reingold,Graphdrawingbyforce-directedplacement, Softw.:Pract.Exp.21(11)(1991)1129–1164.

[24]U.Brandes,C.Pich,Eigensolvermethodsforprogressivemultidimensional scal-ingoflargedata,in:M.Kaufmann,D.Wagner(Eds.),GraphDrawing,Lecture NotesinComputerScience,vol.4372,SpringerBerlin/Heidelberg,2007,pp. 42–53.

[25]D.Harel,Y.Koren,Graphdrawingbyhigh-dimensionalembedding,J.Graph AlgorithmsAppl.8(2)(2004)207–219.

[26]M.Dörk,S.Carpendale,C.Williamson,Visualizingexplicitandimplicitrelations ofcomplexinformationspaces,Inf.Vis.11(January(1))(2012)5–21.

[27]R.M.Martins,G.F.Andery,H.Heberle,F.V.Paulovich,A.AndradeLopes,H. Pedrini,R.Minghim,Multidimensionalprojectionsforvisualanalysisofsocial networks,J.Comput.Sci.Technol.27(4)(2012)791–810.

[28]J.Faith,R.Mintram,M.Angelova,Targetedprojectionpursuitforvisualizing geneexpressiondataclassifications,Bioinformatics22(21)(2006)2667–2673.

[29]J. Faith, Targeted projection pursuit for interactive exploration of high-dimensionaldata sets, in:11th International Conferenceon Information Visualization,2007,pp.286–292.

[30]M.Hall,E.Frank,G.Holmes,B.Pfahringer,P.Reutemann,I.H.Witten,TheWEKA dataminingsoftware:anupdate,ACMSIGKDDExplor.Newslett.11(1)(2009) 10–18.

(14)

[31]J.Heer,A.Perer,Orion:Asystemformodeling,transformationand visualiza-tionofmultidimensionalheterogeneousnetworks,VisualAnalyticsScience andTechnology(VAST),2011.

[32]G. Csardi, T.Nepusz, Theigraphsoftware package forcomplex network research,InterJ.ComplexSyst.(2006)1695,URLhttp://igraph.sf.net. [33]RCoreTeam,R:ALanguageandEnvironmentforStatisticalComputing,R

Foun-dationforStatisticalComputing,Vienna,Austria,2013,URL http://www.R-project.org.

[34]S.Fortunato,Communitydetectioningraphs,Phys.Rep.486(3–5)(2010) 75–174.

[35]M.E.J.Newman,M.Girvan,Findingandevaluatingcommunitystructurein networks,Phys.Rev.E69(2004)026113.

[36]A.Clauset,M.E.J.Newman,C.Moore,Findingcommunitystructureinverylarge networks,Phys.Rev.E70(2004)066111.

[37]M.E.J.Newman,Findingcommunitystructureinnetworksusingthe eigenvec-torsofmatrices,Phys.Rev.E74(2006)036104.

[38]p. Yolum,T.Güngör,F. Gürgen,C.Öztura(Eds.),Computerand Informa-tionSciences–ISCIS2005,LectureNotesinComputerScience,vol.3733, 2005.

[39]J.Reichardt,S.Bornholdt,Statisticalmechanicsofcommunitydetection,Phys. Rev.E74(2006)016110.

[40]U.N.Raghavan,R.Albert,S.Kumara,Nearlineartimealgorithmtodetect com-munitystructuresinlarge-scalenetworks,Phys.Rev.E76(2007)036106.

[41]D.Holten,Hierarchicaledgebundles:visualizationofadjacencyrelationsin hierarchicaldata,IEEETrans.Vis.Comput.Graph.12(5)(2006)741–748.

[42]D.Holten,J.J.vanWijk,Force-directededgebundlingforgraphvisualization, Comput.Graph.Forum28(3)(2009)983–990.

[43]W. Cui,H. Zhou,H.Qu, P.C. Wong,X.Li, Geometry-based edge cluster-ing forgraphvisualization,IEEETrans.Vis.Comput.Graph.14(6)(2008) 1277–1284.

[44]D.Harel,Y.Koren,Afastmulti-scalemethodfordrawinglargegraphs,in: ProceedingsoftheWorkingConferenceonAdvancedVisualInterfaces,AVI ’00,2000,pp.282–285.

[45]M.A.Smith,B.Shneiderman,N.Milic-Frayling,E.MendesRodrigues,V.Barash, C.Dunne,T.Capone,A.Perer,E.Gleave,Analyzing(socialmedia)networkswith NodeXL,in:ProceedingsoftheFourthInternationalConferenceon Communi-tiesandTechnologies,2009,pp.255–264.

[46]N.Henry,J.-D.Fekete,M.J.McGuffin,Nodetrix:ahybridvisualizationofsocial networks,IEEETrans.Vis.Comput.Graph.13(6)(2007)1302–1309.

HelenGibsonisaresearcheratSheffieldHallamUniversity.ShehasaBScin Math-ematicsfromEdinburghUniversityandstudiedforherPhDongraphandnetwork visualizationatNorthumbriaUniversity.SheiscurrentlyworkingontheUnity andAthenaprojectswhichareinvestigatingtechnologicalsolutionsforcommunity policinganddisastermanagementrespectively.

PaulVickersisaUKCharteredEngineer,holdsaBScdegreeinComputer Stud-iesfromLiverpoolPolytechnic,andaPhDinSoftwareEngineering&HCIfrom LoughboroughUniversity.HeiscurrentlyReaderinComputerScienceand Com-putationalPerceptualizationatNorthumbriaUniversitywhereconductsresearch intovisualizationandauditorydisplayandtheirapplications.

References

Related documents

More worrisome is the possibility of non-random migration, which is not a real concern with our mortality results (given we measure outcomes at ages 5-9, just a few years after

Acknowledging the lack of empirical research on design rights, our paper wishes to investigate the risk of piracy and the perceptions of the registered and unregistered design

Off eastern Newfoundland, the depth-averaged ocean temperature ranged from a record low during 1991 (high NAO index in preceding winter), a near record high in 1996 (following

The positive and signi…cant coe¢ cient on the post shipment dummy in the fourth column implies that prices charged in post shipment term transactions are higher than those charged

Meanwhile, there are commercial remedies for various cancers (for example, bowel, lung, stomach) and for bacilli and other infectious agents; there are also extracts of glands,

Of the counties in the South East region, Kent experienced the largest increase in population in both real and percentage terms between 2013 and 2014 growing by +16,800 people

Directors of the Head Pain Center are headache specialists, and center ser- vices include comprehensive evaluation, pharmacologic and nonpharmacologic therapy, relaxation and

Theme 1 – Background to Clinical Research Competence 1C: Study Design Target &amp; Date for Review Evidence of Achievement Level 1. Aware of competency; understanding of