Multi-touch Table User Interfaces for Collaborative Visual Software Analytics

by

Craig Anslow

A proposal submitted to the Victoria University of Wellington in fulfilment of the requirements for the degree of Doctor of Philosophy in Computer Science.

Victoria University of Wellington

Abstract

Most software visualization systems and tools are designed from a single-user perspective and are bound to the desktop, IDEs, and the web. These design decisions do not allow users to collaboratively analyse software or easily interact with and navigate software visualizations. We are building collaborative, interactive, multi-touch software visualization applications for multi-touch tables. Our user studies will outline the strengths and weaknesses of designing multi-touch software visualization applications and inform how users collaboratively conduct visual software analytics with multi-touch table user interfaces.

List of Figures

2.1 Information Visualization Examples.
2.2 Areas of software visualization [290].
2.3 Software Structure Visualization Examples.
2.4 Software Behaviour Visualization Examples.
2.5 Software Evolution Visualization Examples.
2.6 Visual Analytics Tools.
2.7 Jigsaw views applied to the JRuby application [260].
2.8 Visual analytics in software product assessments [303], using SolidFX [302].
3.1 Example X3D Software Visualizations.
3.2 X3D Animation Routing Event Model.
3.3 Basic chart visualization types.
3.4 Specialised ((a) and (b)) and Externally ((c) and (d)) created visualization types.
3.5 Tag Cloud Visualization.
3.6 Tree Map Visualization of the ordering of words used in the class names from the Java API version 1.6.
3.7 Words used in the class names from our software corpus which contains 91 applications.
3.8 Java Class to Package Relationships in Java 1.6.
3.9 Tool suite to create our System Hot-spot Views.
3.10 The java and javax packages of the Java API 1.6.
3.11 The org packages of the Java API 1.6.
3.12 OptIPortal - Visualization Cluster displaying our visualizations.
4.1 Recent pioneering multi-touch examples.
4.2 Optical based multi-touch table - basic arrangement of components.
4.3 Multi-touch surface lighting techniques [299].
4.4 Multi-touch surface lighting techniques continued [299].
4.5 TUIO Architecture [135].
4.6 Echtler and Klinker [79] multi-touch software architecture.
4.7 Community Core Vision (CCV) multi-touch detection and tracking software.
4.8 MT-Mini Touch Pad prototype.
4.9 Initial surface materials.
4.10 Initial lights, camera, and filter.
4.11 The Elvis Multi-touch Table.
4.12 The Elvis multi-touch table, configuration and demo application.
4.13 The Elvis multi-touch table example applications.
4.14 Multi-touch System Hot-spot Views example.

Chapter 1

Introduction

Designing quality software is hard. Understanding what quality software looks like is much harder. Despite the spread of software development and software usage, we have almost no dependable data on how software is actually written in practice. Understanding the shape of existing software is an important step to understanding what good quality software looks like [19]. This proposal addresses an approach to understanding existing software using multi-touch table user interfaces and visual software analytic techniques.

The Lego Hypothesis [222, 297] says that software can be put together like Lego out of small interchangeable components. Software constructed according to this theory should show certain kinds of structure: components should be small and should refer only to a small number of closely related components.
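As a minimal sketch of what checking for this kind of structure could look like, the following Python fragment flags components that are large or that refer to many other components. The `Component` type, the example data, and both thresholds are hypothetical illustrations and are not taken from [222, 297]; the sketch also does not attempt to judge whether the referenced components are closely related.

```python
from dataclasses import dataclass, field


@dataclass
class Component:
    """A hypothetical software component, for example a class."""
    name: str
    size: int  # assumed size measure, e.g. lines of code
    refers_to: set = field(default_factory=set)


def violates_lego_shape(component, max_size=200, max_refs=5):
    """Return True if the component is large or refers to many others.

    Both thresholds are illustrative assumptions, not published values.
    """
    return component.size > max_size or len(component.refers_to) > max_refs


def lego_violations(components, max_size=200, max_refs=5):
    """Sorted names of components that do not fit the small, sparse shape."""
    return sorted(
        c.name for c in components
        if violates_lego_shape(c, max_size, max_refs)
    )


# Made-up example data.
components = [
    Component("Parser", 120, {"Lexer", "Ast"}),
    Component("GodObject", 4000, {"Parser", "Lexer", "Ast", "Ui", "Db", "Net"}),
    Component("Lexer", 80, set()),
]
violations = lego_violations(components)
```

Here only `GodObject` is reported, because it exceeds both the assumed size limit and the assumed reference limit.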

At the Programming Languages Grand Challenges panel at POPL 2009, Greg Morrisett (Professor of Computer Science at Harvard University) claimed that one of the great neglected areas in programming languages research is the bridge between Programming Languages and Human-Computer Interaction. This area of research can be seen as the evaluation of the usability of programming languages and tools.

We have a corpus [228] of software which is an organised collection of open source Java software systems intended to be used for empirical studies to evaluate various software design quality properties. The primary goal is to provide a resource that supports reproducible studies of software. The current release of the corpus contains 100 open-source Java software systems, 23 systems with multiple versions, with 400 versions total. We need better techniques for understanding the large amounts of data generated from measuring the software design quality metrics of the software in the corpus.

We believe that applying software visualization techniques to programming languages and software will help researchers and developers to understand programming languages and software. Understanding the shape of existing software is a crucial first step to understanding how software systems have been built [19].

Existing software visualization research to date has only considered Graphical User Interfaces (GUI) and Virtual Reality. Natural User Interfaces (NUI) such as multi-touch tables and surface computing are positioned as the next major evolution in computing and user interfaces [274]. Just as GUIs delivered unique interaction capabilities compared to command-line interfaces, Seow et al. [274] believe multi-touch and surface computing will also bring unprecedented interaction experiences and capabilities to computing. We want to explore this new NUI paradigm with regard to multi-touch table interfaces and software visualization. A significant barrier to exploring multi-touch table applications is the cost of the necessary hardware and software.

Our proposal is to build our own low-cost multi-touch table, use existing open source multi-touch detection software, and create prototypes that use existing software visualization techniques [67, 290, 354] extended with multi-touch features. We will then conduct studies of users collaboratively performing visual analytics [309] using the software visualization application to explore how software from the Qualitas Corpus [228] and the Java Standard API is actually written in practice and how it has evolved over time. Our user studies will outline the strengths and weaknesses of designing multi-touch software visualization applications and inform how users collaboratively conduct visual software analytics with multi-touch table user interfaces.

1.1 Contributions

Our intended contributions are as follows.

• Build a multi-touch table for collaborative visual software analytics.

• Demonstrate multi-touch table software visualization prototypes of Java software.

• Conduct studies of users collaboratively performing visual analytics with our multi-touch table software visualization prototypes.

1.2 Structure of the Proposal

This proposal is structured as follows:

• Background (§2) covers the background to our research including information visualization, software visualization, visual analytics, software corpora, and software metrics.

• Experimental Software Visualization (§3) presents some results we have produced thus far experimenting with existing information visualization tools, toolkits, and programming languages to conduct visual software analytics.

• Multi-touch Technologies (§4) gives an overview of the hardware and software needed to create a multi-touch table, illustrated with some example systems. We then outline our experience of creating multi-touch tables and multi-touch software visualization prototypes.

• Research Plan (§5) describes our research methodology, a work plan with milestones to complete our research, and an outline of the final thesis document. We then give an overview of our intended contributions (§1.1). Finally, we list our research outputs to date.

Chapter 2

Background

The more we learn about past mistakes, the better are our chances to avoid them in the future – and build better software at lower cost, Zimmerman et al. [356].

The main reasons for wanting to reuse and maintain software are to save on time, effort, and costs in both development and maintenance of quality software [223]. For software reuse, the developer will not have to implement a new solution to an old problem. For software maintenance, the effort of refactoring code and fixing bugs will be reduced. The reason for reverse engineering is to break a piece of software down to understand it, either to build a copy of the software or to improve the software [23]. Re-engineering is the subsequent modification of the software once it has been reverse engineered, usually to add new functionality or to correct errors.

Understanding the shape of existing software is a crucial first step to understanding how software systems have been built [19]. Developers face the task of understanding software when they want to reuse, maintain, reverse engineer, or re-engineer a piece of software. Visualizing the source code and run-time of software can give a greater insight into the structure and behaviour of software, and will be able to help developers in these tasks [290].

In this chapter we begin with an overview of information visualization, and some techniques and tools (§2.1). Next we describe the nature of software visualization, and present some example systems (§2.2). An overview of visual analytics, an emerging area, is then given (§2.3). Finally, in the last section we outline existing software corpora on which we could conduct visual software analysis, followed by tools that can measure the software from these corpora (§2.4).

2.1 Information Visualization

Card et al. [47] describe information visualization as the use of computer-supported, interactive, visual representations of abstract data to amplify cognition. Even after producing a visual representation, the following issues must be addressed: exploration, navigation, and interpolation of the data [48]. Several overviews on information visualization techniques and tools exist [20, 47, 55, 103, 285, 326].

The theory of the visual display of quantitative information [314] consists of principles that generate design options and that guide choices among the design options. Tufte [314] describes graphical excellence of quantitative information as the well designed presentation of interesting data – a matter of substance, of statistics, and of design. Graphical excellence consists of complex ideas communicated with clarity, precision, and efficiency. This results in a visualization displaying the greatest number of ideas, in the shortest time and in the smallest space possible.

An early example of graphical excellence is the original London Underground map designed by Harry Beck in 1933, see Figure 2.1(a). Typically, most people will use the map as a visualization tool for planning a journey from one station to another and finding a feasible route between them. People may memorise their route by colour or the direction of the lines involved and any intermediate stations. The internal model created by memorising a route is known as a cognitive map [285]. Beck based the map on electrical circuit diagrams, so it does not reflect the geography of the city above. The revolutionary design, with minor modifications and additions, still remains today.

Ben Shneiderman [279] created a visual design guideline called the visual information seeking mantra, which says show an overview of the data first, then zoom and filter, and finally show details-on-demand. He then proposed a task by data type taxonomy which has seven data types (1-, 2-, 3-dimensional data, temporal and multi-dimensional data, and tree and network data) and seven tasks which a user can perform (overview, zoom, filter, details-on-demand, relate, history, and extract). This visual information seeking mantra is one of the very few methodical guidelines for designing information visualizations and it is the most widely cited [62]. There are, however, other user task heuristics [5, 6, 167, 316, 357, 358] but they are not as useful for evaluating usability, focus on low level tasks, or are domain specific. Each of these other user task heuristics has components which overlap Shneiderman's [279] mantra.
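The taxonomy above can be written down as a small data structure, which may be useful when checking a tool against it. The following Python sketch encodes the seven data types and seven tasks as listed in the text; the function names and the example tool are hypothetical and are not part of [279].

```python
from enum import Enum


class DataType(Enum):
    ONE_DIMENSIONAL = "1-dimensional"
    TWO_DIMENSIONAL = "2-dimensional"
    THREE_DIMENSIONAL = "3-dimensional"
    TEMPORAL = "temporal"
    MULTI_DIMENSIONAL = "multi-dimensional"
    TREE = "tree"
    NETWORK = "network"


class Task(Enum):
    OVERVIEW = "overview"
    ZOOM = "zoom"
    FILTER = "filter"
    DETAILS_ON_DEMAND = "details-on-demand"
    RELATE = "relate"
    HISTORY = "history"
    EXTRACT = "extract"


# The mantra: overview first, zoom and filter, then details-on-demand.
MANTRA = (Task.OVERVIEW, Task.ZOOM, Task.FILTER, Task.DETAILS_ON_DEMAND)


def unsupported_tasks(supported):
    """Tasks from the taxonomy that a tool lacks, in taxonomy order."""
    return [task for task in Task if task not in supported]


def follows_mantra(supported):
    """True if every step of the mantra is among the supported tasks."""
    return all(task in supported for task in MANTRA)


# Hypothetical tool that supports everything except extraction.
example_tool = set(Task) - {Task.EXTRACT}
missing = unsupported_tasks(example_tool)
```

For the hypothetical tool, `missing` contains only `Task.EXTRACT`, and `follows_mantra(example_tool)` holds because the four mantra steps are all supported.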

Visualization in 2D has been heavily explored [47]. One should not assume that a 3D visualization is automatically superior to a 2D visualization. Answering the question “which is better, 3D or 2D?” is difficult, if not impossible [86]. When deciding on using 3D there should be clear important sub-tasks for which 3D is clearly beneficial over 2D [326]. Simply increasing a visualization from 2D to 3D is unlikely to improve task performance unless extra and greater controls of 3D data in the visualization are created.

(a) Original London Underground map, reproduced by kind permission of London’s Transport Museum © Transport for London.

(b) Many Eyes, Barack Obama’s inaugural speech.

Figure 2.1: Information Visualization Examples.

Increasingly the web is becoming a platform of choice for many general purpose visualization tools. Many Eyes [64, 318, 319] is a web site provided by IBM research that has collaborative visualization services, see Figure 2.1(b). The web site allows users to upload ASCII data sets, visualize them, comment on each other’s visualizations, and then discuss their discoveries with other people.

Other similar information visualization web tools include Swivel¹, Data360², and DataPlace³. These other tools operate similarly to Many Eyes but provide less sophisticated visualizations. There are some key characteristics of all of these web visualization tools. First, they require end-users to register with the web site. Second, an end-user can upload data in ASCII or spreadsheet format. Third, end-users can modify their data online. Fourth, multiple visualizations can be created from the data and at any time. Finally, end-users can comment on the visualizations.

2.2 Software Visualization

Software visualization is defined as the application of information visualization in software engineering and can show the structure of software, run-time behaviour, and representation of source code [290]. The goal of software visualization is to help users comprehend software systems and to improve the productivity of the software development process [68]. Software visualization is essentially situated at the intersection of information visualization, software engineering, human computer interaction, graphics, and cognitive psychology [180]. Several overviews on software visualization exist [67, 68, 290, 354].

Software visualization comprises the following sub-fields: algorithm visualization, visual programming, programming by demonstration, and program visualization [290]. Figure 2.2 shows the relationships between the different types of software visualization fields.

Algorithm visualization is the visualization of the higher level abstractions which describe software, whereas algorithm animation [142, 225] is dynamic algorithm visualization. Algorithm animation communicates how an algorithm works by graphically displaying its fundamental operations. Brown and Hershberger [39] claim that creating effective visualizations is an art, not a science. Flowcharts are an example of static algorithm visualization, while visualizations of sorting algorithms are an example of algorithm animation. Program visualization is the visualization of actual program code or data structures in either static or dynamic form.

¹ http://www.swivel.com ² http://www.data360.org ³ http://www.dataplace.org

Figure 2.2: Areas of software visualization [290].

We now give an overview of some of the tools that have been used to investigate three aspects of software, following Diehl [68]: structure (§2.2.1), behaviour (§2.2.2), and evolution (§2.2.3). We then provide some information on software visualization evaluation methods (§2.2.4).

2.2.1 Structure Visualization

Structure refers to the static parts and relations of software. In this section we cover program source code and static structure, 3D software visualization, software metrics, and UML.

Source Code and Static Structure

sv3D [180, 181, 182] is a framework for visualizing source code and related attributes. For visualizations, the framework uses 3D metaphors based on the SeeSoft [82] pixel representation and the 3D File Maps [234]. The framework uses transparency, elevation, and special 3D manipulators to overcome occlusion. The authors apply to the framework Shneiderman’s [279] seven high level user needs that an information visualization application should support. There is no support for extraction and querying features, however. sv3D is implemented using Qt for the user interface and Open Inventor for the rendering components. They also show how sv3D can be combined with an information retrieval tool [351] to enrich source code searching and browsing in MS Visual Studio. sv3D has since been integrated with Eclipse [193]. Their earlier work used immersive environments [175, 176], for which they created a visual language that defines a formal mapping from object-oriented languages to a visualization in virtual reality. The language only supports syntactic and static features of a program.

(a) Code Crawler - Metrics Visualization [166].

(b) Code City - Metrics Visualization [336].

Figure 2.3: Software Structure Visualization Examples.

Numerous other systems have looked at visualizing the static structure of software and other aspects as well. SolidFX [300, 302] is an IDE for reverse engineering of C/C++ programs and provides many advanced visualization techniques to explore attributes of a code base such as call graphs, metrics, and UML diagrams. J3Browser [1, 2, 3] explores Java class relations and their other tool VisMOOS [89, 90] is an Eclipse plug-in. Hierarchical Net [16, 17] visualizes the structure of large software systems as software landscapes. VizzAnalyzer [172] is a framework designed for reverse engineering. VizzAnalyzer also has a built-in tool Vizz3D [207], which can be used as a standalone tool for visualizing class and package interaction, program evolution, and program quality. NosePrints [209, 210] visualizes code smells. Enhance [276] provides information about exception handling constructs and exceptions’ flow from the quantitative, the flow, and the contextual perspectives. Telea et al. [301] provide an open toolkit for visualizing telecommunications software for the purposes of reverse engineering. Clack [334] visualizes the structure of network routers.

3D Software Visualization

Teyseyre and Campo [306] provide an overview of 3D software visualization. Koike et al. [151, 152, 153, 155] described the significance of visualizing software information in three dimensional space and the problems of 2D visualization. This work also introduced the concept of a 3D class library browser to show method inheritance. The class hierarchy was represented as a tree in the X-Y plane and methods of each class were shown in the Z axis with the same X-Y coordinates.
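The layout idea described above, a class tree in the X-Y plane with each class's methods stacked along Z at the same X-Y coordinates, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the tree layout used here (leaves placed left to right, parents centred over their subclasses) and all names are hypothetical and are not taken from Koike et al.

```python
def layout_class_browser(root, children, methods):
    """Place classes in the X-Y plane and stack their methods along Z.

    children: dict mapping a class name to a list of subclass names
    methods:  dict mapping a class name to a list of method names
    Returns (class_pos, method_pos) with 2D and 3D coordinates.
    """
    class_pos = {}
    method_pos = {}
    next_x = 0  # next free column for a leaf class

    def place(cls, depth):
        nonlocal next_x
        kids = children.get(cls, [])
        if kids:
            for kid in kids:
                place(kid, depth + 1)
            # Centre a parent over its subclasses (assumed layout rule).
            x = sum(class_pos[k][0] for k in kids) / len(kids)
        else:
            x = float(next_x)
            next_x += 1
        y = float(-depth)  # deeper classes sit lower in the plane
        class_pos[cls] = (x, y)
        # Methods share the class's X-Y position and differ only in Z.
        for z, name in enumerate(methods.get(cls, []), start=1):
            method_pos[(cls, name)] = (x, y, float(z))

    place(root, 0)
    return class_pos, method_pos


# Made-up hierarchy for illustration.
children = {"Object": ["Shape"], "Shape": ["Circle", "Square"]}
methods = {"Shape": ["area", "draw"], "Circle": ["area"]}
class_pos, method_pos = layout_class_browser("Object", children, methods)
```

Because an inherited or overridden method keeps its class's X-Y position, following a method across the hierarchy amounts to comparing Z stacks of neighbouring classes.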

Other early research has been done in 3D for visualizing Lisp programs [170], different features of a program [231, 232, 233], the layout and structuring of object oriented software in three dimensions as directed graphs (GraphVisualizer3D [86, 327, 328] and NestedVision3D [208]), web-enabled visualizations of complex SELF programs [72, 73, 108], visualizing call graphs [352], design patterns [46], and software architectures [84].

Some researchers have even explored different 3D visualization metaphors for source code comprehension. These metaphors include 3D cities (Software World [147, 148, 149], Component City [54], 3D City [206], and CodeCity [336]), a 3D solar system metaphor [96], 3D self organizing maps [36], and 3D computer game engines (Quake2 [146] and Quake3 [158]).

Software Metrics

Code Crawler [163, 165, 166] is a language independent reverse engineering tool which combines metrics and software visualization techniques such as Polymetric Views [164], see Figure 2.3(a). CodeCity [336], a tool that stems from Code Crawler and Polymetric Views, uses a 3D city metaphor to display additional kinds of metric information, see Figure 2.3(b). A further study [337] extended CodeCity to focus on disharmony maps to look at the quality of the system design by focusing on design flaws. The Mondrian toolkit [192] aims to bring the Polymetric Views closer to the code by extending existing programming languages to use embedded scripts in their programs to create the visualizations, as opposed to using another tool to generate the visualizations. Softwarenaut [173, 174] also uses Polymetric Views but focuses on the dependencies between modules. Barrio [69] is another tool that looks at dependencies between Java modules using clustering techniques.
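To make the idea of combining metrics with a visual representation concrete, the sketch below maps three metrics per class onto the width, height, and shade of a box. It assumes the common description of polymetric views in which node dimensions and colour encode metric values; the choice of metrics (number of attributes, number of methods, lines of code), the `Box` type, and the scaling are illustrative assumptions, not Code Crawler's actual implementation.

```python
from dataclasses import dataclass


@dataclass
class Box:
    name: str
    width: float
    height: float
    grey: int  # 0 = black, 255 = white


def polymetric_boxes(classes, min_side=4.0):
    """Map three metrics per class onto box width, height and shade.

    classes: dict of class name -> (attributes, methods, lines_of_code)
    """
    max_loc = max((loc for _, _, loc in classes.values()), default=0)
    boxes = []
    for name, (attrs, meths, loc) in sorted(classes.items()):
        # Larger classes get a darker shade; guard against all-zero sizes.
        shade = 255 if max_loc == 0 else round(255 * (1 - loc / max_loc))
        boxes.append(Box(name, min_side + attrs, min_side + meths, shade))
    return boxes


# Made-up metric values for two classes.
boxes = polymetric_boxes({"Order": (3, 10, 400), "Money": (2, 4, 100)})
```

In this example the larger class `Order` is drawn as a taller, wider, black box, while `Money` is smaller and lighter, so outliers in any of the three metrics stand out at a glance.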

Churcher et al. [59, 141] use VRML for software visualization and mainly focus on object-oriented metrics. They visualize inheritance structures with cone trees, inheritance structures with metrics, hierarchies with tree maps [128, 278], web sites [104], class cohesion [58], and object-oriented metrics and class clusters [120]. Various other systems and research groups have also looked at visualizing object-oriented metrics including Lagrein [125, 126, 127], CrocCosmos [168, 169], MetaViz [254, 255, 256], CocoViz [30, 31, 32, 33], and Langelier et al. [160, 161].

UML Diagrams

Many systems throughout the literature have looked to reverse engineer systems in order to generate UML diagrams of large software systems [68]. Recent systems have taken a similar approach, but because software has become more complex, software visualization research has aimed to improve the understanding of UML diagrams by augmenting them with new features including areas of interest [44], textures [45], automatic layout of UML use case diagrams [81], semantic zooming [88], and digital pens and paper to create UML diagrams which can then be transferred to tabletop displays [63]. Sharif and Maletic [277] have also studied the effect of layout on the comprehension of UML class diagrams.

Visualizations of various UML diagrams such as class, object, sequence, and collaboration diagrams have been explored in 3D [66, 76, 92, 94, 188, 189, 229, 353]. Displaying UML in 3D – which is intended to be drawn on 2D surfaces – does not scale well once there are many nodes in a world. Text is also hard to render in 3D. When text is rotated in 3D it is hard for a user to view what is meant to be displayed. Rather than representing strict UML diagrams in 3D, some research has been conducted that represents UML diagrams as 3D geon diagrams [49, 117, 118, 119]. The geon diagrams are made from 3D primitives such as cones, spheres, cylinders, and boxes. Another related area is visualizing CRC cards in 3D [262].

2.2.2 Behaviour Visualization

Behaviour refers to the execution of the program with real and abstract data. Behaviour data is collected at run time by instrumenting the code, debugger interfaces, byte code injection or extending virtual machines. In this section we cover algorithm animation, visualizations from execution traces, software architecture, and finite state machines.

Algorithm Animation

Algorithm animation has been quite useful for education and for research into the design and analysis of algorithms [290]. Sorting Out Sorting [12, 13] was the first teaching film on algorithm animation and described nine sorting algorithms. Various software visualization systems have been produced over the years; some early examples include Balsa [37, 41] (the first real-time interactive algorithm animation system), Zeus [38] (a follow up to Balsa), Tango [286, 289], Polka [288] (a follow up system to Tango), Pavane [259], and Tarraingím [199, 200, 201] (a tool for visualizing Self programs). Some more recent examples include Blumenkrants et al. [29], who look at narrative algorithm visualizations; Alspaugh et al. [4], who explain algorithms using scenario visualizations; SIV [102], which visualizes inter-dependencies between scenarios; Lumière [22], for visualizing scheduling based algorithms; a visualization of the computation tree of the Tutte Polynomial algorithm [310]; and HDPV [296], which visualizes C/C++ and Java programs to understand recursion and the effect of programming errors such as buffer overflow.

Stasko and Wehrli [291] identified the need for three-dimensional graphics in software visualization. They list the basic requirements for 3D computation visualization, define three categories for characterising visualizations, and discuss their system for supporting 3D animation development by programmers. Other systems have explored the use of 3D graphics for algorithm animation. Some of these systems include Pavane [61, 259], Polka3D [291], Zeus3D [40], 3D-AAPE [95], JCAT [195, 196], and Alice [65].

(a) Jinsight - Debugging Visualization [212].

(b) BLOOM, Spiral views of the stack (sampled during execution) [235].

Figure 2.4: Software Behaviour Visualization Examples.

Execution Trace Visualization

Jinsight [212, 213, 214, 215, 217, 218, 275], which stems from a wide range of work from IBM, is a tool for visualizing and analysing the execution of Java programs and is useful for performance analysis, memory leak diagnosis, debugging, or any task in which a user needs to better understand what a Java program is really doing. Figure 2.4(a) shows a brief overview of some of the visualizations produced by Jinsight. Follow up systems have looked at visualizing the execution patterns of web services [216] and streaming applications [211].

BLOOM [234, 235, 236, 237, 246, 247, 248] is a framework for understanding software through visualization, see Figure 2.4(b). BLOOM provides facilities for static and dynamic data collection and offers a wide range of data analysis. The system includes a visual query language for specifying what information should be visualized. All these are used in conjunction with a back end that supports a variety of 2D and 3D visualization strategies. Other systems include JIVE [238, 239] for visualizing Java programs in action and JOVE [249] which provides slightly more detailed information about where execution is occurring. A follow up system focuses on more specific user abstractions [240]. Another system [282] focuses on virtual machine code (IBM’s Jikes RVM) and how to optimise it as opposed to user code. More recent systems include DYPER [241, 242, 243] and DYMEM [244, 245] (an extension to DYPER). DYPER does controlled performance analysis of Java systems and can obtain a variety of performance metrics including CPU usage, IO, sockets, heap utilization, memory allocations, phase analysis, and reaction analysis. DYMEM provides a visualization of object ownership from the memory of a running process.

TraceCrawler [97, 98] and CCJUN [350] explore visualizing feature traces in 3D of object instantiations and method sends to find which classes and objects are most active during the execution of a feature, which patterns of activity are common in feature behaviour, and which are specific to one feature. Bohnet et al. [34, 35, 320] also do dynamic analysis to look for features in C/C++ programs. Koike et al. [151, 152, 153, 155] have looked at visualizing large trace files in 3D of computer processes from a number of computers running in parallel and communicating with each other. Storer et al. [293] have developed a tool for teaching object-oriented programming concepts to introductory level computer science courses. The tool provides Java3D visualizations of the execution of Java programs including representation of classes, objects, references, and method execution.

Software Architecture

SoftArch [99] instruments classes; method calls and events are then captured at run-time to produce architecture diagrams of the actual implementation. Pounamu [355] is another system which can produce static or dynamic software architecture notation in XML. Developers of SoftArch and Pounamu have built a number of other tools that convert the XML notation about a program into the Graph eXchange Language (GXL) [343] and can then convert the GXL information into SVG or VRML [292]. Knight and Munro [150] have also explored using GXL to create visualizations for program comprehension.

Finite State Machines

The Rube [114, 145] framework uses VRML [329] to produce finite state machines of programs. Pradel and Gross [224] also use finite state machines to show the sequences in which API methods have been used in applications from the DaCapo Benchmarks [27, 28].

2.2.3 Evolution Visualization

Evolution refers to the process of developing software and focuses on the changes of the program code over time to improve the software and eliminate bugs. In this section we cover visualizing software archives.

SeeSoft [82] and SeeSys [15] visualize various textual features of evolving large and complex software systems using the space filling technique which tries to convey as much information as possible with as few pixels as possible. The features include software metrics, number and scope of modifications, number and types of bugs, and dynamic program slices, see Figure 2.5(b). The tools support a number of different views including line, pixel, file summary and hierarchical representations.
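A line-oriented, space filling view of this kind can be approximated by reducing every source line to a single cell whose intensity encodes a per-line statistic. The sketch below is a hypothetical illustration of that reduction and is not SeeSoft's implementation; the function name, the input format, and the linear normalisation are all assumptions.

```python
def line_view(files):
    """Reduce each source line to one intensity cell, one column per file.

    files: dict of file name -> list of per-line statistic values
           (for example, days since the line was last modified)
    Returns dict of file name -> list of intensities in [0, 1].
    """
    values = [v for lines in files.values() for v in lines]
    lo, hi = (min(values), max(values)) if values else (0, 0)
    span = hi - lo
    return {
        # A constant statistic maps every line to 0.0 to avoid dividing by zero.
        name: [0.0 if span == 0 else (v - lo) / span for v in lines]
        for name, lines in files.items()
    }


# Made-up per-line statistics for two files.
view = line_view({"a.c": [0, 5, 10], "b.c": [10]})
```

A renderer could then draw each file as a thin column with one short row per line, using the intensity to pick a colour, so that thousands of lines fit on one screen.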

Code Swarm [203] organically visualizes the commit history of open source projects using animations programmed in Processing [91, 230] and displayed as videos. Figure 2.5(a) shows a snapshot in time of visualizing the Eclipse IDE project. The developers and files of a project are represented as moving elements. Files are coloured differently for source code and documentation. Non-active files or developers eventually fade away. A bar chart at the bottom left is a reminder of the history of events.

Other tools also visualize software archives and other aspects. Voinea et al. [321, 322, 323, 324] describe a suite of tools, CVSGrab and CVSscan, for mining software artifacts, which display various artifacts using some advanced visualization techniques. CCVisu [24, 25, 26] uses a method for computing clustering layouts of software systems for which the change history is available. WhiteCoats [191] visualizes the evolution of software from CVS repositories. VRCS [154] visualizes software revision histories using the Z axis as a time axis to represent the different revisions of each file. Panas [205] visualizes the evolution of the signatures of software binaries to find malicious code. Theron et al. [307] visualize the evolution of baselines and revisions of artifacts from software repositories. Langelier et al. [162] use different views and animation to show structural and control version metrics of evolving software. Other evolutionary work has included the evolution of UML diagrams such as class diagrams [131, 132, 315] and model transformations [325].

(a) Code Swarm [203] visualizing the revision history of Eclipse.

(b) SeeSoft [82].

Figure 2.5: Software Evolution Visualization Examples.

2.2.4 Evaluation of Software Visualization

There is no silver bullet for evaluating software visualizations [105] nor any benchmarks [177] to determine how effective a software visualization is. However, there are various information visualization and software visualization design guidelines [197, 279], taxonomies [138, 204, 225, 226, 227, 258], and frameworks [7, 9, 74, 75, 178, 294] that can be used to evaluate algorithm animations, software visualizations, and software visualization tools.

The most common way to evaluate a visualization with users is to use evaluation metrics such as task completion time and number of errors. However, these methods appear insufficient to quantify the quality of an information visualization system. A newly established workshop⁵ aims to explore novel information visualization evaluation methods beyond time and errors, and to structure the knowledge on evaluation in information visualization around a schema, where researchers can easily identify unsolved problems and research gaps.

In a recent survey [156, 157] based on questionnaires completed by 111 researchers from software maintenance, re-engineering, and reverse engineering, 40% found software visualization absolutely necessary for their work and another 42% found it important but not critical. 7% think that software visualization is at least relevant and 6% that they can do without it but it is nice to have. Only 1% believe software visualization is not an issue at all. Finally, 4% did not answer the question. From the same survey, relatively few people consider software visualization their primary research (11%) or at least a substantial part of their research (18%). Many people do software visualization research every now and then (20%); however, the largest group primarily uses or integrates existing software visualization tools developed by others (33%). The survey did not ask what kind of software visualization tools were used.

⁵ BEyond time and errors: novel evaLuation methods for Information Visualization

Some recent studies [271, 272, 273] classified desirable features and lessons learned from a number of software visualization tools for corrective maintenance. Several features were strongly desired by all users and many of these features were provided by an increasing number of tools including IDE integration, scalability, multiple views, and query support. 3D and animation were less desirable. Others [163, 263] also report on lessons learned from creating their own software visualization tools.

Of particular interest to us are software visualization tools that support collaboration. Storey et al. [295] propose that collaborative software visualization can improve team software maintenance. They reviewed a number of existing software visualization tools and found that most of them rarely support any form of collaborative authoring and sharing of views. They recommend that designers of software visualization tools for software maintenance consider the social and collaborative aspects when building tools to help improve collaboration and usability, and adopt Computer Supported Cooperative Work (CSCW)⁶ methodologies for evaluating collaborative visualizations.

2.3 Visual Analytics

Visual analytics is an emerging research field that has evolved out of the fields of information visualization and scientific visualization. The goal of visual analytics [309] is the creation of tools and techniques to enable people to: synthesize information and derive insight from massive, dynamic, ambiguous, and often conflicting data; detect the expected and discover the unexpected; provide timely, defensible, and understandable assessments; and communicate assessment effectively for action.

Visual analytics is a multidisciplinary field that includes the following focus areas [309]:

• Analytical reasoning techniques that enable users to obtain deep insights that directly support assessment, planning, and decision making.

• Visual representations and interaction techniques that take advantage of the human eye’s broad bandwidth pathway into the mind to allow users to see, explore, and understand large amounts of information at once.

• Data representations and transformations that convert all types of conflicting and dynamic data in ways that support visualization and analysis.

• Techniques to support production, presentation, and dissemination of the results of an analysis to communicate information in the appropriate context to a variety of audiences.

⁶ CSCW is a generic term, which combines the understanding of the way people work in groups with the enabling technologies of computer networking, and associated hardware, software, services and techniques [342].

2.3.1 Visual Analytics Versus Information Visualization

The new term visual analytics has confused people as to what the difference is between visual analytics and information visualization [139]. Visual analytics differs from information visualization in a number of ways. Visual analytics is more than just visualization and can be seen as an integral approach to decision-making, combining visualization, human factors, and data analysis [139]. Visual analytics is a more applied field with emphasis on transferring and adapting visualization technologies from the research and development communities to professional societies [347]. Research in information visualization has mainly concentrated on the process of producing views and creating valuable interaction techniques for a given domain of data.

Card et al. [47] declare that the purpose of information visualization is insight and Thomas and Cook [309] propose that the purpose of visual analytics is to enable and discover insight [52]. Visual analytics is concerned with improving the process of analytical reasoning through interactive visual interfaces [50].

Wong and Thomas [347] have analysed the content of the papers from the Visual Analytics and InfoVis conferences⁷ from 2006-2008 using a visual analytics tool [344]. They found that there were a few clusters of visual analytics papers that overlapped in content with information visualization papers but there was a large cluster of visual analytics papers that had no overlap.

Keim et al. [140] extend Shneiderman’s [279] information seeking mantra for information visualization (§2.1) to visual analytics. The steps involved in the extended mantra are analyse first, show the important, zoom, filter and analyse further, and then show details on demand.

2.3.2 Research Challenges for Visual Analytics

There are a number of challenges facing the visual analytics community [308]. Some of the challenges that are related to our research are: creating accessible, walk-up-usable, widely distributable analysis applications that bring the benefits of visual discovery to as broad a user base as possible [144]; developing techniques that scale to a variety of display form factors to take advantage of whatever capabilities are available to support analysis and collaboration [257]; and developing techniques that scale gracefully from a single user to a collaborative (multi-user) environment [257].

These challenges lead to collaborative visual analytics, which is of interest to our research. Collaboration is a characteristic of nearly all visual analytics work [221]. There has been little research done on how people collaborate and interact with visual analytics tools [221, 252]. Heer and Agrawala [106] present some design considerations for collaborative visual analytics which include a list of collaborative visualization tasks, techniques to improve shared context and awareness, and suggestions for increasing engagement and allocating effort. Isenberg et al. [121, 122] are exploring the use of multi-touch tables for interacting with and sharing information visualizations during collaboration of small co-located teams.

2.3.3 Visual Analytics Tools

There are a number of visual analytics tools, some general purpose and some domain specific. Some notable tools include In-Spire [344], WireVis [51], and Jigsaw [287]. None of these tools focus on the domain of software; they instead use structured and unstructured document collections and are more general purpose visual analytic tools. Information on other visual analytics tools can be found elsewhere [144, 309].

Figure 2.6(a) shows a number of views from In-Spire [344], one of the first general purpose visual analytics tools. The top right view shows the galaxy visualization where dots represent documents and clusters represent central topics or themes. The lower right shows the theme view visualization which provides an overview and shows a relief map where the highest peaks represent the most prevalent topics in the collection.

Figure 2.6(b) shows a view of WireVis [51] which is a financial fraud visual analytics tool created in collaboration with the Bank of America. WireVis uses keywords found in bank transaction records augmented with linked visualizations that give fraud analysts simultaneous information overviews and detailed exploratory capabilities. The top left displays a heatmap, top right search by example, lower left a "strings and beads" time display, and lower right a keyword graph.

(a) In-Spire [344].

(b) WireVis [51].

Figure 2.6: Visual Analytics Tools.

Figure 2.7 shows an example of the Jigsaw [287] tool applied to the JRuby application [260]. Although Jigsaw is not designed to be used for visualizing software it is flexible enough so that software can be analyzed.

The Document View (Figure 2.7(a)) shows the entities from a selected document in the collection of documents, (e.g. Java classes within the JRuby application). The bottom left panel in the Document View indicates which document (Java file) is selected in the collection and how often the document has been viewed. A word cloud at the top shows the frequency of entities across the whole document collection. The bottom view shows the Java class file.

(a) Document View [260].

(b) List View [260].

Figure 2.7: Jigsaw views applied to the JRuby application [260].

The List View (Figure 2.7(b)) shows the entities of the document collection on the left part of the display, ordered by frequency, and the documents (Java files) on the right. In this figure the IRubyObject class is selected on the left, highlighted in yellow, and linked with associations on the right to the documents in which this class appears. The colouring scheme for the classes on the left shows the frequency with which they appear with each other. For instance, the stronger the orange the more frequently that entity appears with the IRubyObject class, and white if it does not appear at all. It is also possible to select multiple entities on the left to show what documents they appear in together.
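The colouring described above depends on how often entities appear together in documents. The following is a minimal sketch of such a co-occurrence count, assuming each document has already been reduced to a set of entity names. The function name `cooccurrence_counts` and the sample documents are hypothetical illustrations and are not taken from Jigsaw.

```python
from collections import Counter

def cooccurrence_counts(documents, selected):
    """Count, for each entity, the documents it shares with `selected`."""
    counts = Counter()
    for entities in documents.values():
        if selected in entities:
            # Every other entity in this document co-occurs once with `selected`.
            counts.update(e for e in entities if e != selected)
    return counts

# Hypothetical documents (Java files) mapped to the entities they mention.
documents = {
    "RubyArray.java": {"IRubyObject", "RubyArray", "Ruby"},
    "RubyHash.java": {"IRubyObject", "RubyHash", "Ruby"},
    "Parser.java": {"Parser", "Lexer"},
}

counts = cooccurrence_counts(documents, "IRubyObject")
```

A view could then map higher counts to a stronger colour and a count of zero to white; that mapping is left out here because the source does not specify it.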

2.3.4 Visual Software Analytics

Since the beginnings of software visualization research the field has focused primarily on algorithm animation (1980s), software architecture (1990s), and software evolution and mining from software repositories (2000s)⁸. Most software visualization systems [68, 290, 354] in the past have focused on visualizing just one piece of software at one time, using one or more visualization techniques, and are designed for single users. Diehl [68] claims that visual analytics has yet to reach software visualization. We believe that applying visual analytics to software will help developers to understand software better.

We view visual software analytics as a superset of information visualization, software visualization, and empirical software engineering. The visualizations will help provide insight into software, using multiple visualization techniques at once (e.g. tree maps, focus + context, node-link diagrams), as well as various data representations (e.g. metrics, revision history, class hierarchy, and micro-patterns [93]).

Anslow et al. [11], Ruan [260], and Telea and Voinea [303] present some work in this new area of applying visual analytics to software. They all use a combination of static, dynamic, and evolution analysis techniques and visualization tools for supporting decision-making of industrial software and open source projects. Work by Anslow et al. [11] is described in Chapter 3 and work by Ruan [260] is presented in Figures 2.7(a) and 2.7(b). The main difference between the works is that Anslow et al. [11] and Ruan [260] have focused on open source Java programs, used existing analysis, information visualization, and visual analytics tools, and experimented with large screen computer displays and multi-touch technologies, whereas Telea and Voinea [303] have used proprietary C/C++ programs and their own commercial software visualization tool SolidFX [300, 302].

Figure 2.8 shows sample views by Telea and Voinea [303] using their software visualization tool SolidFX [300, 302] applied to a C project with 3.5 million lines of code (1881 files) and one million lines of code in headers (2454 files) in 15 releases [303]. Figure 2.8(a) shows the team assessment analysis view with the top image showing the number of modification requests per file. Files with more than 30 modification requests are coloured red and are spread over many packages. The bottom image shows the same structure but coloured by team, which clearly shows that Team A (red) has developed most of the files with many modification requests. Figure 2.8(b) shows the evolution of changes in documents over time. The top half shows all documents including documentation files (in grey) while the bottom half only shows source documents (in yellow). New files are added to the top of the stack while older files are at the bottom. Red dots represent changes in documents. What can be learned from this view is that the number of source documents increases in sync with the number of other files (i.e. developers of this code base document the code as they develop it).

⁸ Stephan Diehl, Dagstuhl Seminar: Information Visualization - Human-Centered Issues in

2.4 Software Measurement

In order to create visualizations of software we need a corpus of software written in object-oriented programming languages and methods to measure the software. A key observation leading to this proposal is that software that could comprise such corpora has become available for study only in the last decade. Free and open-source software (FOSS) that is freely accessible over the Internet is one source of corpora that could be used. Having the software corpus is not enough to create software visualizations; we also need to measure the software contained within the collection using software metrics analysis tools. We now describe some existing software corpora, software metrics, and software metrics analysis tools for generating data for our visualizations.

2.4.1 Software Corpora

The Qualitas Corpus [228] is organised as a collection of software systems intended to be used for empirical studies in software engineering. The primary goal is to provide a resource that supports reproducible studies of software. The current release of the Corpus contains 100 open-source Java software systems, 23 systems with multiple versions, with 400 versions total. Using the corpus we have studied various attributes such as how inheritance [305] and fields [304] are used in Java programs. Other studies have included looking at power laws [19] and cycles [190] in Java programs.

(a) Team Assessment Analysis View.

(b) Documentation Evolution.

Figure 2.8: Visual analytics in software product assessments [303], using SolidFX [302].

The DaCapo Benchmarks [27, 28] provide a set of benchmark open source Java software projects. The benchmarks include a diverse and widely used set of nontrivial applications. They are suitable for empirical software engineering research as they are a controlled, tractable workload amenable to analysis and experiments. The 14 projects include avrora, batik, eclipse, fop, h2, jython, luindex, lusearch, pmd, sunflow, tomcat, tradebeans, tradesoap, and xalan. The majority of these benchmark applications are also in the Qualitas Corpus.

Ohloh⁹ is a web site that aims to characterise open source software development projects. The site retrieves data from revision control repositories and provides statistics such as longevity of projects, their licenses and software metrics such as source lines of code and commit statistics. The site does not provide metrics about the structure of the software.

There exist some open source code search engines which operate over software corpora and existing web sites. Sourcerer [14, 171] indexes source code found on the web using various metrics techniques and presents the results using CodeRank which is a technique for prioritising search results. Other free commercially developed search engines include Google Code Search¹⁰, Krugle¹¹, and Koders¹². These search engines allow users to search for files or code, different languages, licenses, and specific code features such as classes, methods, and interfaces, and code examples.

2.4.2 Software Metrics

A software metric measures some property of a piece of software such as the number of lines of code [85]. Applying software metrics can help determine the quality of software. Lanza et al. [166] claim that there is no magic software metric that has been found and consider the definition of a universal software design quality metric as the holy grail of software engineering. Chidamber and Kemerer [57] provide the most widely cited suite of metrics. These metrics include WMC (Weighted Methods per Class), DIT (Depth of Inheritance Tree), NOC (Number of Children), CBO (Coupling Between Objects), RFC (Response for a Class), and LCOM (Lack of Cohesion in Methods).
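As an illustration of two of these metrics, the sketch below computes DIT and NOC over a small class hierarchy. It assumes single inheritance and counts the root class as depth 0. The hierarchy and the function names are hypothetical and are not taken from any of the tools discussed here.

```python
def depth_of_inheritance(cls, parents):
    """DIT: number of ancestors on the path from `cls` to the root."""
    depth = 0
    while parents.get(cls) is not None:
        cls = parents[cls]
        depth += 1
    return depth

def number_of_children(cls, parents):
    """NOC: number of immediate subclasses of `cls`."""
    return sum(1 for parent in parents.values() if parent == cls)

# Hypothetical single-inheritance hierarchy: class -> superclass (None = root).
parents = {
    "Shape": None,
    "Polygon": "Shape",
    "Circle": "Shape",
    "Square": "Polygon",
}

dit_square = depth_of_inheritance("Square", parents)
noc_shape = number_of_children("Shape", parents)
```

Real metric tools differ in conventions, for example whether java.lang.Object counts towards DIT, so values from this sketch are not directly comparable with tool output.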

There exist a number of software metric tools (standalone or as plug-ins for IDEs) for analysing software. Some of these tools include SemmleCode¹³ [317], SciTools Understand¹⁴, Structure101¹⁵, and Creole¹⁶. SemmleCode and SciTools have a number of built-in features to collect metrics about software application(s) and allow further customisation. These tools employ basic visualization techniques such as graphs and tree structures, but neither of them uses any specific software visualization techniques. Structure101 and Creole mainly focus on dependencies between entities and both have plug-ins to Eclipse.

⁹ http://www.ohloh.net

¹⁰ http://www.google.com/codesearch

¹¹ http://www.krugle.org

¹² http://www.koders.com

¹³ http://semmle.com/

Gil and Maman [93] created micro-patterns, which are related to software metrics, to classify Java class implementations, while Singer et al. [281] created nano-patterns to characterize and classify Java methods. They both have analysis tools for detecting these kinds of patterns in Java programs, but neither of them has been applied to large software corpora or the Java Standard API. Sourcerer [14, 171] can also search for micro-patterns.

¹⁴ http://www.scitools.com/products/understand/

¹⁵ http://www.headwaysoftware.com/products/structure101/

¹⁶ http://www.thechiselgroup.org/creole

Chapter 3

Experimental Software Visualization

We are interested in understanding what Java software looks like to help with software reuse, maintenance, and re-engineering. We believe creating software visualizations will help developers to understand software. We are interested in using the Qualitas Corpus (§2.4) as that is the most comprehensive collection of software for use in empirical studies and the collection we have the most experience with. We have used existing information visualization techniques, general purpose information visualization tools, tool-kits, and markup languages to create a range of software visualizations from the Java software in the Qualitas Corpus and the Java Standard API. In this chapter we present some preliminary results of our experimental non multi-touch software visualization and highlight the strengths and weaknesses of our software visualizations.

3.1 Extensible 3D (X3D) Graphics

Our first experiment is a summary from Craig Anslow’s Masters Thesis [7] and paper [9] presented at the ACM International Symposium on Software Visualization (SoftVis). We wanted to see if X3D [43, 330] can support a range of 3D software visualization techniques to determine if the technology is viable for use in software visualization. More precisely we wanted to experiment with automatically creating X3D software visualizations over the web, evaluate X3D’s animation and interactivity aspects, examine the text, layout, and extensibility features, test the integration capabilities, and analyse the performance display capabilities.

Figure 3.1 shows some of our example software visualizations implemented in X3D. Figure 3.1(a) shows bubble, selection, and insertion sort algorithms all animating at once. The combined views allow a user to see both the current state of the array and the history of an algorithm’s execution. The animation replicates a similar example by Najork and Brown [196]. Figure 3.1(b) shows a documentation related visualization that has a 3D UML class diagram of a Java program and the associated Javadoc. When a user clicks on a class or package in the visualization in the left frame the associated Javadoc entity is displayed in the right frame. The UML diagram replicates a similar example by McIntosh et al. [188, 189]. We now discuss our evaluation results.

(a) X3D Elementary Sorting Algorithm Animation.

(b) X3D UML Class Diagram showing 100 classes integrated with Javadoc.

Figure 3.1: Example X3D Software Visualizations.

3.1.1 Design

X3D is a free open standards file format and run-time architecture to represent, communicate, and deploy 3D scenes and objects over the web using XML. The X3D specification is comprised of components which contain nodes (e.g. geometry) that are declared in a scene graph. Content can be created using text editors, X3D editors (e.g. X3D-Edit, a Netbeans plug-in [42]), digital content creation tools (3DS Max, Maya, Blender), or XSLT transformations. There are language bindings to ECMAScript and Java. We used the free versions of three of the main X3D browser implementations (in order of preference): BS Contact VRML/X3D Player¹ (6MB download), Octaga Player² (5MB), and Flux Player³ (1.5MB). Each of these X3D browsers operates on Windows and can be plugged into Mozilla Firefox and Microsoft Internet Explorer or operate stand-alone. There is also the Web3D Consortium’s stand-alone open source test-bed implementation Xj3D⁴ (12MB).

X3D content can be rendered in either OpenGL or DirectX. Some of these X3D browsers do not implement all of the X3D specification nor do they make the Scene Access Interface (SAI) run-time API available. This makes it hard for developers to create consistent X3D software visualizations across all X3D browsers.

3.1.2 Graphical Capability

A rich set of graphical elements exist to create high quality visual pictures required in software visualizations. Points and lines can be implemented using nodes from the 2D geometry component, and areas and volumes from the 3D geometry component. Shapes have size, height, radius, colour, and transparency fields, and can be animated to change in a software visualization. Shapes can be orientated in any order (e.g. translated then rotated) and change position during a software visualization. When nodes that are connected in a graph visualization are moved in a scene, scripting is required to preserve the node-link relationships. Text can also be displayed using the shape node. Lighting (directional light) and environment (background) effects can be applied to a scene. Textures can be applied to shapes using images (.png, .gif, and .jpg), sound (.wav, .mp3), or video (.mpg). Sound is used in Figure 3.1(a) to signify ordering of elements. Videos have been used for providing additional information in our UML diagrams when an entity is selected. Node prototyping can be used to extend X3D. Developers specify node prototypes and then use prototype instances which are then mapped to geometry nodes.

¹ http://www.bitmanagement.de/

² http://www.octaga.com

³ http://mediamachines.wordpress.com/flux-player-and-flux-studio/

⁴ http://www.xj3d.org/

3.1.3 Visualization Techniques

There is no specific software visualization component or library. We replicated a range of software visualization techniques in X3D including algorithm animations (Figure 3.1(a)), 3D UML diagrams (class, package, and sequence diagrams), documentation related visualizations (source code, API Javadoc – Figure 3.1(b), and video visualizations), and execution trace visualizations (3D compound shapes and 3D information visualization metaphors). The data for our visualizations have been encoded using three different approaches: in the X3D scene, transformed from XML execution traces into X3D geometry primitives, and as node prototype instances. Our visualizations can display Java and C++ programs. X3D can be used to represent program synchronization (Figure 3.1(a)). Multiple views can be accomplished by displaying the data in different positions in the scene (Figure 3.1(a)) or integrating external web pages (Figure 3.1(b)).

Figure 3.2: X3D Animation Routing Event Model.

X3D relies too heavily on the routing event model (Figure 3.2) for animation. ROUTE directives are used between each event in the model. A ROUTE directive takes input from one node’s field and outputs values to another node’s field (e.g. timer to a position interpolator then to a geometry node such as a cube). The design of the model is very cumbersome as ROUTE directives require the explicit name of each node as opposed to an instance based model for object-oriented methods.
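To make the routing chain concrete, the following sketch generates a minimal X3D scene in which a timer drives a position interpolator that in turn moves a box. It is an illustration only and not our prototype tool. The DEF names (Clock, Mover, Cube), the key values, and the function name are assumptions chosen for the example; the node and field names are standard X3D.

```python
import xml.etree.ElementTree as ET

def build_animated_scene():
    """Build a minimal X3D scene in which a box slides along the x axis."""
    x3d = ET.Element("X3D", profile="Interactive", version="3.2")
    scene = ET.SubElement(x3d, "Scene")

    # Geometry node: a box wrapped in a named Transform so it can be moved.
    transform = ET.SubElement(scene, "Transform", DEF="Cube")
    shape = ET.SubElement(transform, "Shape")
    ET.SubElement(shape, "Box", size="1 1 1")

    # Timer and interpolator, the first two stages of the routing chain.
    ET.SubElement(scene, "TimeSensor", DEF="Clock", cycleInterval="4", loop="true")
    ET.SubElement(scene, "PositionInterpolator", DEF="Mover",
                  key="0 1", keyValue="0 0 0 2 0 0")

    # Each ROUTE must name its source and target nodes explicitly.
    ET.SubElement(scene, "ROUTE", fromNode="Clock", fromField="fraction_changed",
                  toNode="Mover", toField="set_fraction")
    ET.SubElement(scene, "ROUTE", fromNode="Mover", fromField="value_changed",
                  toNode="Cube", toField="set_translation")
    return x3d

scene_xml = ET.tostring(build_animated_scene(), encoding="unicode")
```

Even this single animated box needs two ROUTE directives and three named nodes, which suggests why the model becomes cumbersome for visualizations with thousands of animated elements.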

3.1.4 Performance

Using our prototype tool [8], our smaller XML execution traces take less than a few seconds⁵ to generate an X3D software visualization and our larger traces (10-50MB) take less than two minutes to produce 10K-100K nodes. The longest time spent in creating our web software visualizations is the rendering of the X3D scenes rather than the style-sheet transformation. It takes less than 10 seconds to render about 10K nodes, three minutes for 50K nodes, but up to 10 minutes to render 100K nodes. The size of the files ranged from less than 100KB for our small visualizations to 10-18MB for our large visualizations. We converted the visualizations to the X3D binary format which reduced the file sizes by about 75%.

⁵ Performance transformation and rendering timings of both client and server were measured

3.1.5 User Control and Navigation

X3D supports basic user controls such as start and stop buttons for temporal control in algorithm animations (Figure 3.1(a)). No specific software visualization user controls exist in X3D. More complicated controls such as speed, pause, fast forward, rewind, and step in algorithm animations require the use of scripts. X3D has very good 3D user navigation support for software visualization including the following techniques: walk, examine, fly, look-at, slide, and pan. A user can change the navigation type, the speed of navigation, and viewpoint at run-time using the X3D browsers’ user control menus.

3.1.6 User Tasks

Users can gain an overview of the entire software visualization if a viewpoint is defined that contains the whole data set. Users can zoom using the navigation controls or by selecting a pre-defined viewpoint. Filtering can be achieved using the boolean filter field or by changing the transparency, scale, or size values of a node once a user clicks or moves a user control. Details-on-demand can be provided through the level-of-detail or switch nodes, which display additional information about a node once a user clicks or moves within a certain distance from a node in the scene. Showing relationships among items in a visualization requires the use of scripting. There is no built-in support for creating a history of user actions or for extracting sub-collections of information.

Our detailed evaluation can be found elsewhere [7]. In summary, the advantages of X3D for software visualization are rich graphics, extensibility, and XML integration. The disadvantages of X3D are lack of software visualization user controls, a primitive animation model, and weak support for filtering and layout. Nonetheless we encourage software visualization developers to adopt X3D if they need 3D for the web.

3.2 Google Visualization API

We want to see how effective the Google Visualization API⁶ is for use in software visualization. We evaluate the API based on our experience of creating software visualizations with it and by applying a check-list style framework for evaluating graphics technologies for use in software visualization [7]. The goal is to experiment with creating visualizations over the web, evaluate the animation and interactivity aspects, examine the text, layout, and extensibility features, test the integration capabilities, and analyze the performance display capabilities of the visualizations. We now summarize the results of the evaluation.

3.2.1 Design

The Google Visualization API is designed to support visualizations over the web with any compliant server-side data source. The API is implemented as a JavaScript library, interoperates with Google spreadsheets, and can be integrated with the Google Web Toolkit. Developers can use the API so long as they agree to the terms of service, but no modifications can be made to the core of the library, nor can it be used offline. Error handling in the API is limited but third-party browser plugins can be used for debugging. Developers proficient in JavaScript can become competent with the API within a couple of days.

Supported data input formats include: Google Spreadsheets, HTML pages with data tables, JSON or CSV file formats, or web accessible databases. There are custom built PHP, Python, and .Net libraries to access data stored in external databases. Alternatively data can be encoded in a data table within a JavaScript function. A SQL-like query language exists which allows developers to perform various data manipulations with a query to the data source and is independent of the implementation of any specific data source. The API cannot write data to any data source.
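As a sketch of the JSON route, the code below encodes a small table of counts in the cols/rows layout that, to the best of our understanding, the API accepts as a data table literal. The column ids, labels, function name, and frequencies are all made up for illustration and are not measured from any application.

```python
import json

def to_data_table(counts):
    """Encode {dependency kind: frequency} as a data table JSON literal."""
    return {
        "cols": [
            {"id": "kind", "label": "Dependency kind", "type": "string"},
            {"id": "count", "label": "Frequency", "type": "number"},
        ],
        "rows": [
            # One row per kind; each cell is an object holding its value "v".
            {"c": [{"v": kind}, {"v": count}]}
            for kind, count in sorted(counts.items())
        ],
    }

# Made-up frequencies for illustration only.
counts = {"InvokeVirtual": 120, "Field": 45, "Parameter": 80, "Return": 30}
table_json = json.dumps(to_data_table(counts))
```

A server-side script could return a string like `table_json` to a page that passes it to the JavaScript library, so the same data could feed several chart types.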

3.2.2 Graphical Capability

Visualizations are rendered in Scalable Vector Graphics (SVG) in Firefox and in Vector Markup Language (VML) in Microsoft Internet Explorer. Visualizations that support animation use Flash. 2D is supported, while some of the charts can display 2.5 dimensions. Distortion-oriented techniques are possible, including focus+context.

3.2.3 Visualization Techniques

No specific software visualization techniques [68] or graphs/node-link diagrams are supported. 17 visualization types exist including: tables, spreadsheet-like charts, hierarchical charts, maps, and more sophisticated temporal displays. Most of the focus of these visualizations is on spreadsheet charts since the underlying data source is structured data.

We illustrate some of the possible uses of the API with a data-set showing the frequency of dependency types as expressed in the byte code of JGraph version 5.10.2.0 (a graph drawing application). Figure 3.3 shows pie, area, and bar charts all using data from the same Google Spreadsheet. A dependency exists from one type (class or interface) to another if the other is referred to in the byte-code of the first. The kind of dependency, such as InvokeVirtual or Get, indicates what kind of instruction the dependency is part of, or how it is being referred to (e.g. Parameter, Return, Field). The charts all show that there are a lot of abstract classes and that there is high use of accessor and mutator methods.

The motion chart (Figure 3.4(a)), implemented in Flash, allows users to explore several indicators or trends over time. The chart shows the version release history of JGraph against Jung, another graph drawing application. The chart allows a user to press play to animate the data points evolving over time. It shows that JGraph was created before Jung and that by the end of 2008 JGraph had reached version five while Jung was still at version two. It is also evident that JGraph has had more development iteration cycles than Jung, but this does not measure the relative code quality of the two applications.

The annotated timeline chart (Figure 3.4(b)) uses the same data set as the motion chart. The chart allows data points to be annotated with, for example, the date and version of a beta release, or the point at which different versions of a language (such as Java 1.4 and 1.6) were used in the application. The bottom part of the chart has features to zoom in and display adjustable segments of time within the data set, and then slide the segment over the whole time series.

Figure 3.3: Basic chart visualization types: (a) Pie Chart, (b) Area Chart, (c) Bar Chart.

Developers can extend the API with their own visualizations, such as the Tag Cloud and Magic Table, in addition to the predefined ones. Currently there are six visualization types created by non-Google developers, which range from basic counting to more domain-specific areas such as heat maps. The tag cloud (Figure 3.4(c)) displays text in alphabetical order, with font sizes that depend on the frequency of the words in the data set. The magic table (Figure 3.4(d)) displays the input data source as a table. If a cell has a numerical value then filled bars can be used to represent the values using a colour ordering. A graphical fish eye view can be enabled, which displays zoomed-in values of cells when the mouse hovers over a selection of cells and distorts the neighbouring cells.
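The tag cloud behaviour described above (alphabetical ordering, with font size driven by word frequency) can be sketched as follows. The linear scaling between a minimum and maximum font size is an assumption made for illustration; the text does not state how the Tag Cloud visualization maps frequency to size, and the word counts are hypothetical.

```python
def tag_cloud(frequencies, min_size=10, max_size=36):
    """Return (word, font_size) pairs in alphabetical order.

    Font size is interpolated linearly between min_size and max_size
    according to each word's frequency (an assumed scaling rule).
    """
    low, high = min(frequencies.values()), max(frequencies.values())
    span = high - low or 1  # avoid division by zero when all counts are equal
    return [
        (word, round(min_size + (count - low) / span * (max_size - min_size)))
        for word, count in sorted(frequencies.items(),
                                  key=lambda kv: kv[0].lower())
    ]


# Hypothetical word counts, for illustration only.
words = {"Graph": 40, "Cell": 25, "Edge": 10, "View": 25}
for word, size in tag_cloud(words):
    print(f"{word}: {size}pt")
```

Other scaling rules, such as logarithmic scaling, would fit the same description; linear interpolation is simply the shortest to state.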

The data within a visualization is independent of the display, and data in the visualizations can still be referenced. Basic layout features are supported, but no advanced techniques such as force-directed layout exist. The motion chart is the only visualization that supports animation. Displaying multiple views of different parts of the data set is possible. No video or sound capabilities are supported.

Figure 3.4: Specialised ((a) and (b)) and externally created ((c) and (d)) visualization types: (a) Motion Chart, (b) Annotated Time Line Chart, (c) Tag Cloud, (d) Magic Table.

3.2.4

Performance

We tested the visualizations using Microsoft Internet Explorer (version 7.0) and Mozilla Firefox (version 3.0.6) on Windows, Firefox on NetBSD, and Safari (version 3.2.1) on Mac OS X. There were no perceptible differences in rendering the visualizations on the various browsers. However, the motion chart lagged noticeably when the trail of items option was selected. Google Spreadsheets can be used as input to the visualizations, but a spreadsheet is limited to 200K cells, and spreadsheets imported from other formats into Google Spreadsheets are limited to approximately 1MB. We have not yet used any large data sets stored in external data sources. It takes approximately five seconds⁷ for one visualization to render. We embedded 12 visualizations on one web page and it took approximately 20 seconds for the visualizations to display.
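The figures reported above lend themselves to two small checks: whether a table fits within the spreadsheet cell limit, and the average time per visualization on the 12-chart page. The sketch below assumes that "200K" means 200,000 cells; the function names are hypothetical.

```python
SPREADSHEET_CELL_LIMIT = 200_000  # assumes "200K cells" means 200,000


def fits_cell_limit(rows, columns, limit=SPREADSHEET_CELL_LIMIT):
    """Check whether a rows x columns table stays within the cell limit."""
    return rows * columns <= limit


def mean_render_time(total_seconds, visualizations):
    """Average render time per visualization on a shared page."""
    return total_seconds / visualizations


print(fits_cell_limit(50_000, 4))   # 200,000 cells: exactly at the limit
print(fits_cell_limit(50_001, 4))   # 200,004 cells: over the limit
print(round(mean_render_time(20, 12), 2))  # seconds per chart, 12-chart page
```

The per-chart average on the shared page is well below the five seconds measured for a single visualization, which may suggest that part of the single-chart time is a start-up cost shared across charts; the measurements above are too coarse to confirm this.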

3.2.5

User Control and Navigation

Users can select a point or value in the visualization, which creates a pop-up box listing the label type and the value. For example, the pie chart allows users to select pieces of the pie (either on the pie or from the legend), which pop out from the main core of the pie. Users cannot rotate or resize visualizations, since these properties are all defined when the visualization is first created. Users cannot manipulate objects within the visualization to move them to different parts of the screen. Users can only interact with the visualization using a mouse. The API provides listeners that can respond to events triggered in the visualization; for example, an alert window can appear when a user clicks an item. Navigation within a visualization is not supported. If a visualization is larger than the current screen size then the web browser scroll bar is required for navigation. Viewpoints of specific locations in a visualization are not supported either.
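The listener mechanism mentioned above follows the familiar observer pattern. The sketch below illustrates the idea in Python rather than in the API's own JavaScript. The `Visualization` class, its method names, and the "select" event name are hypothetical stand-ins, not the API's actual interface.

```python
class Visualization:
    """Hypothetical stand-in for a chart that notifies listeners of events."""

    def __init__(self):
        self._listeners = {}

    def add_listener(self, event_name, callback):
        """Register a callback to run whenever event_name is triggered."""
        self._listeners.setdefault(event_name, []).append(callback)

    def trigger(self, event_name, payload):
        """Call every listener registered for event_name with the payload."""
        for callback in self._listeners.get(event_name, []):
            callback(payload)


alerts = []  # collects messages in place of a browser alert window
chart = Visualization()
chart.add_listener(
    "select",
    lambda item: alerts.append(f"Selected {item['label']}: {item['value']}"),
)

# Simulate a user clicking an item in the visualization.
chart.trigger("select", {"label": "InvokeVirtual", "value": 120})
print(alerts)
```

Events without a registered listener are simply ignored, which matches the idea that a visualization works unchanged whether or not the page reacts to user selections.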

3.2.6

User Tasks

All the visualizations provide an overview and show details on demand for the user, and some support zooming into items of interest, such as the annotated timeline chart (Figure 3.4(b)). Only one externally developed visualization allows uninteresting items to be filtered out. Showing relationships among items in a visualization is not supported, a history of user actions cannot be saved, and it is not possible to extract sub-collections of information from a visualization.

⁷ Performance timings were conducted on a Dell Optiplex GX745 workstation with NetBSD, 2.8GHz Core Duo processor, 2GB RAM, and ATI x1300 Graphi
