Data Visualization. Scientific Principles, Design Choices and Implementation in LabKey. Cory Nathe Software Engineer, LabKey

Academic year: 2021

(1)

Data Visualization

Scientific Principles, Design Choices and Implementation in LabKey

Catherine Richards, PhD, MPH Staff Scientist, HICOR

[email protected]

Cory Nathe

Software Engineer, LabKey [email protected]

(2)

Outline

o Scientific Principles and Design Choices

o Implementation in LabKey

(3)

Scientific Principles and Design Choices

o Why use data visualizations

o Choosing the best chart type and visual attributes

o Incorporating design best practices

(4)

Why use data visualizations?

o Leverage visual system to absorb large amounts of information very quickly
    • Identify patterns or outliers

o Inspire new questions

(5)
(6)

Data visualizations show patterns that tables do not

o Average X = 9

o Average Y = 7.5

o Y = 3 + 0.5X (same linear model)
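
The averages and fitted line listed on this slide match the published values of Anscombe's quartet. Assuming that is the dataset behind the slide (the slide itself does not name it), the following minimal sketch recomputes the summary statistics in plain JavaScript. The helper names `mean` and `linearFit` are illustrative, not part of any library.

```javascript
// Anscombe's quartet: four small datasets (assumed to be the data behind this slide).
const sharedX = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5];
const quartet = [
  { name: "I", x: sharedX, y: [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68] },
  { name: "II", x: sharedX, y: [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74] },
  { name: "III", x: sharedX, y: [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73] },
  { name: "IV", x: [8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8], y: [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89] },
];

// Arithmetic mean of an array of numbers.
function mean(values) {
  return values.reduce((sum, v) => sum + v, 0) / values.length;
}

// Ordinary least-squares fit of y = intercept + slope * x.
function linearFit(x, y) {
  const mx = mean(x);
  const my = mean(y);
  let sxy = 0;
  let sxx = 0;
  for (let i = 0; i < x.length; i++) {
    sxy += (x[i] - mx) * (y[i] - my);
    sxx += (x[i] - mx) * (x[i] - mx);
  }
  const slope = sxy / sxx;
  return { slope: slope, intercept: my - slope * mx };
}

// Summarize each dataset: the numbers agree closely even though the shapes differ.
const summaries = quartet.map((d) => {
  const fit = linearFit(d.x, d.y);
  return { name: d.name, meanX: mean(d.x), meanY: mean(d.y), slope: fit.slope, intercept: fit.intercept };
});

summaries.forEach((s) => {
  console.log(
    s.name + ": mean X = " + s.meanX.toFixed(2) + ", mean Y = " + s.meanY.toFixed(2) +
    ", Y = " + s.intercept.toFixed(2) + " + " + s.slope.toFixed(2) + "X"
  );
});
```

The four summaries are nearly identical, so a table of statistics cannot distinguish the datasets; only plotting the points reveals how different they are.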

(7)
(8)

Scientific Principles and Design Choices

o Why use data visualizations

o Choosing the best chart type and visual attributes

(9)

Chart Types

(10)
(11)

Visual Attributes

o Data encoding: mapping data to visual attributes

o Process
    • Choose data dimensions to graph
    • Classify data types

(12)

Data Dimensions

o Unique information

(13)

Data Dimensions

o Most common
    • Visualizations with 3 or 4 data dimensions

o Rare
    • Visualizations with 6, 7, or more data dimensions

(14)

Data Types

o Nominal

o Ordinal

o Quantitative
    • Interval
    • Ratio

(15)

Data Types

o Nominal (labels)
    • Fruits: apples, oranges, pears

o Ordinal
    • Restaurant inspection grades: A, B, C

o Quantitative
    • Interval (location of zero arbitrary)
        • Dates
        • Location
    • Ratio (zero fixed)
        • Physical measurement: weight, height

(16)

Operations Permitted with Data Types

o Nominal (labels)
    • Operations: =, ≠

o Ordinal
    • Operations: =, ≠, <, >, ≤, ≥

o Interval (location of zero arbitrary)
    • Operations: =, ≠, <, >, ≤, ≥, - (subtraction)
    • Can measure distances or spans

o Ratio (zero fixed)
    • Operations: =, ≠, <, >, ≤, ≥, -, / (division), * (multiplication)
    • Can measure ratios or proportions
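
The list above is cumulative: each data type permits every operation of the types before it, plus a few more. A minimal sketch of that rule as a lookup, where the names `LEVELS`, `NEW_OPERATIONS`, `permittedOperations`, and `isPermitted` are illustrative and not part of any LabKey API:

```javascript
// Data types ordered from least to most permissive.
const LEVELS = ["nominal", "ordinal", "interval", "ratio"];

// Operations that each data type adds on top of the previous ones.
const NEW_OPERATIONS = {
  nominal: ["=", "≠"],
  ordinal: ["<", ">", "≤", "≥"],
  interval: ["-"],
  ratio: ["/", "*"],
};

// All operations permitted for a data type, including inherited ones.
function permittedOperations(level) {
  const index = LEVELS.indexOf(level);
  if (index === -1) {
    throw new Error("Unknown data type: " + level);
  }
  return LEVELS.slice(0, index + 1).flatMap((name) => NEW_OPERATIONS[name]);
}

// True when the operation is meaningful for values of the given data type.
function isPermitted(level, operation) {
  return permittedOperations(level).includes(operation);
}

console.log("interval:", permittedOperations("interval").join(" "));
console.log("divide dates?", isPermitted("interval", "/"));
console.log("divide weights?", isPermitted("ratio", "/"));
```

A check like this could be used to guard a chart configuration, for example refusing to compute a ratio of two dates while allowing the span between them.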

(17)

Visual Attributes

(18)

Science of Data Viz

o Psychophysics
    • Branch of psychology that deals with relationship between physical stimuli and sensory response

(19)

Ranking of Elementary Perceptual Tasks

(20)

Length-Position Experiment

(21)

Length-Position Experiment

Most accurate

Cleveland & McGill. JASA. 1984; 79(387): 531-554.

(22)

Ranking of Elementary Perceptual Tasks

(23)

Chart Types

(24)

Chart Types

(25)

Chart Types

(26)

Chart Types

(27)

Chart Types

(28)

Chart Types

(29)

Scientific Principles and Design Choices

o Why use data visualizations

o Choosing the best chart type and visual attributes

(30)

Incorporating Design Best Practices

o Graphic design
    • Color theory
    • Typography

o Tufte’s Rules

(31)

Tufte’s Rules

1. Reduce chart-junk and increase data-to-ink ratio

2. Maximize contrast

3. Use readable labels

4. Don’t repeat yourself

5. Instead of legends label data series (points) directly

6. Avoid smoothing and 3D

7. Sort for comprehension
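
As one illustration of rule 7, categories can be ordered by their values before plotting so that the ranking is readable at a glance. A minimal sketch, assuming rows with hypothetical `group` and `value` fields and invented numbers (the function name `sortForComprehension` is also illustrative):

```javascript
// Hypothetical utilization rates per group (illustrative values only).
const rows = [
  { group: "Clinic C", value: 0.42 },
  { group: "Clinic A", value: 0.77 },
  { group: "Clinic D", value: 0.15 },
  { group: "Clinic B", value: 0.58 },
];

// Return a new array ordered from largest to smallest value,
// leaving the input array untouched.
function sortForComprehension(data, valueKey) {
  return data.slice().sort((a, b) => b[valueKey] - a[valueKey]);
}

const sortedRows = sortForComprehension(rows, "value");
sortedRows.forEach((row) => {
  console.log(row.group + ": " + Math.round(row.value * 100) + "%");
});
```

Sorting a copy keeps the original row order available for views where another order, such as alphabetical, is more useful.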

(32)

Tufte’s Rules

(33)

Tufte’s Rules

(34)

Tufte’s Rules

(35)

Tufte’s Rules

(36)

Tufte’s Rules

(37)

Tufte’s Rules

(38)

Tufte’s Rules

(39)

Tufte’s Rules

(40)

Outline

o Scientific Principles and Design Choices

o Implementation in LabKey

(41)

LabKey Built-in Reports

o For non-developers
    • Plotting tools built into LabKey Data Regions
    • Rendered using LabKey Visualization API (built on the D3.js library)
    • Examples: box plot, scatter plot, time chart

o For developers
    • JavaScript Views
    • R Reports (Rserve/knitr)
    • Advanced View (invoke command line program)
    • Module Reports (using LABKEY.Report.execute)

o Shown in Data Views Browser
    • Customize grouping, label, thumbnail, etc.
    • Control visibility (private vs. shared)

(42)

LabKey Data API Access

o Access data from study dataset, external schema, list, etc.

o LabKey Client APIs
    • Examples: JavaScript, Java, Perl, Python, Rlabkey, SAS Macros, HTTP Interface
    • Secure, auditable, programmatic access to data and services
    • Exporting data grid as a Script
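
As a rough illustration of programmatic access through the JavaScript client API, the sketch below builds a configuration for `LABKEY.Query.selectRows`. The schema, query, and column names are hypothetical placeholders, and the helper `buildSelectRowsConfig` is not part of LabKey. The call itself only runs on a page where the `LABKEY` object is loaded; the exact callback arguments should be checked against the LabKey API reference.

```javascript
// Build the request config separately so it can be inspected outside a LabKey page.
function buildSelectRowsConfig(onRows) {
  return {
    schemaName: "study",           // assumption: data lives in a study schema
    queryName: "Physical Exam",    // hypothetical dataset name
    columns: ["ParticipantId", "date", "Weight_kg"], // hypothetical column names
    success: function (data) {
      // The response is expected to carry the result rows in data.rows.
      onRows(data.rows);
    },
    failure: function (errorInfo) {
      console.error("Query failed:", errorInfo && errorInfo.exception);
    },
  };
}

// Only issue the request when the LabKey client library is available.
if (typeof LABKEY !== "undefined" && LABKEY.Query) {
  LABKEY.Query.selectRows(
    buildSelectRowsConfig(function (rows) {
      console.log("Retrieved " + rows.length + " rows");
    })
  );
}
```

Because the request runs with the signed-in user's permissions, the same code returns only the rows that user is allowed to see.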

(43)

LabKey JavaScript Visualization API

o Shapes / Geoms:
    • Point / Bin
    • Path
    • ErrorBar
    • BoxPlot / BarPlot

o Interactions:
    • Callback function for point click
    • Callback function for mouse over/out
    • Brushing (1D, 2D)

o Plot Helpers
    • PieChart
    • LeveyJenningsPlot
    • SurvivalCurvePlot
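
A hedged sketch of how a Point geom and a point-click callback might be combined into a scatter plot with `LABKEY.vis.Plot`. The field names `age` and `weight`, the element id `plotDiv`, the sample rows, and the helper `buildScatterConfig` are all hypothetical. The config property names (`renderTo`, `aes`, `scales`, `hoverText`, `pointClickFn`) follow the LabKey Visualization API as best recalled and should be verified against the API reference before use.

```javascript
// Build a plain config object; the LabKey layer is attached later,
// only when the library is actually loaded.
function buildScatterConfig(renderTo, rows) {
  return {
    renderTo: renderTo,        // id of the target <div> (hypothetical)
    rendererType: "d3",
    width: 600,
    height: 400,
    labels: {
      main: { value: "Weight by age" },
      x: { value: "Age" },
      yLeft: { value: "Weight" },
    },
    data: rows,
    aes: {
      x: "age",                // hypothetical field names
      yLeft: "weight",
      // Text shown when hovering over a point.
      hoverText: function (row) {
        return "Age: " + row.age + ", Weight: " + row.weight;
      },
      // Callback invoked when a point is clicked.
      pointClickFn: function (event, row) {
        console.log("Clicked point", row);
      },
    },
    scales: {
      x: { scaleType: "continuous", trans: "linear" },
      yLeft: { scaleType: "continuous", trans: "linear" },
    },
  };
}

// Illustrative rows only; real data would come from a LabKey query.
const sampleRows = [
  { age: 31, weight: 70 },
  { age: 45, weight: 82 },
];
const scatterConfig = buildScatterConfig("plotDiv", sampleRows);

// Render only inside a LabKey page where the visualization library exists.
if (typeof LABKEY !== "undefined" && LABKEY.vis) {
  scatterConfig.layers = [
    new LABKEY.vis.Layer({ geom: new LABKEY.vis.Geom.Point({ size: 4 }) }),
  ];
  new LABKEY.vis.Plot(scatterConfig).render();
}
```

Keeping the config as a plain object makes it easy to swap the geom (for example, a Path or BoxPlot layer) without touching the data mapping.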

(44)

LabKey Visualization - Live Demo

JavaScript-based charts from LabKey Demo Study
    • Data Region > Charts/Views menu
    • Generic Chart (box/scatter plot)
    • Time Chart
    • JavaScript View
    • Reports Webpart

(45)
(46)
(47)
(48)
(49)
(50)

Examples (1 of 3)

Panorama - Levey-Jennings report, Pareto plot

(51)

Examples (2 of 3)

Dataspace - scatter with gutter plots

(52)

Examples (3 of 3)

HIDRA Argos - pie chart, survival curve, bar plot, timeline report

Argos is an application developed in partnership with Fred Hutch. The Timeline report was created by the Oncoscape Core team and is maintained by Lisa McFerrin. Oncoscape is supported by Fred Hutch and STTR.

(53)

Outline

o Scientific Principles and Design Choices

o Implementation in LabKey

(54)

HICOR IQ - Overview

o Regional Oncology Informatics Platform

o GOAL: to provide patients, payers, providers and health systems with transparent information to support decision-making in cancer care

(55)

HICOR IQ - Overview

o The initial launch includes a limited set of reports based on the ASCO 2012 Choosing Wisely recommendations

o The initial functionality allows users to select metrics of interest, configure plots based on regional or clinic views, and generate reports categorized by sub-groups

(56)

HICOR IQ - Live Demo

o Data Views direct link to different metrics

o Configure report (apply filters, switch chart type)

o Bar plot, Scatter plot, Time plot

o Population size, filters, exclusions

(57)
(58)
(59)
(60)
(61)

HICOR IQ - Implementation

o Collaboration between HICOR and LabKey
    • Iterative layout and user experience design
    • D3 code creation for plot rendering

o Custom Java module
    • New database schema and tables
    • Use of OLAP cube for accessing measures and dimensions
    • Plots generated with the dimple JavaScript library (built on D3)

o Additional data security
    • Data cannot be directly accessed from schema browser

(62)

HICOR IQ - Code Example

renderPlot: function () {
    ...

    //initialize the svg
    svg = dimple.newSvg("#" + this.renderId, fullWidth, fullHeight);

    //create the chart component and set margins
    chart = new dimple.chart(svg, data);
    chart.setBounds(margin.l, margin.t, plotWidth, plotHeight);

    //configure the x-axis
    x = chart.addCategoryAxis("x", "Group");
    x.floatingBarWidth = 20;

    //configure the y-axis
    y = chart.addMeasureAxis("y", "Value");
    y.showGridlines = false;
    y.ticks = 4;
    y.overrideMax = 1.0;
    y.tickFormat = "%";

    //add a bar series to the plot
    s = chart.addSeries(null, dimple.plot.bar);

    //sorting the x-axis variable
    x.addOrderRule("Group");

    //render the chart as an svg and remove the dimple title
    chart.draw();
    x.titleShape.remove();

    //use D3 to update some content and add titles
    this.renderTitle(svg, fullWidth, 0);
    this.styleAxis(svg, x, y, margin);

    //define the content of the bar hover tooltip
    this.overrideTooltipText(s, data, function (row) {
        return [
            "Group: " + row.Group,
            "Utilization: " + row.Value
        ];
    });
}

(63)

HICOR IQ - Code Example

var y = d3.scale.linear()
    .range([height, 0]);
y.domain([0, 1.00]);

var axis = d3.svg.axis()
    .scale(y)
    .orient("left")
    .tickValues([0, .25, .5, .75, 1])
    .tickFormat(function (d) { return d * 100 + "%"; });

...

svg.append("g")
    .attr("class", "y axis")
    .call(axis)
    .style("font-weight", "bold")
    .style("font-family", "Arial")
  .append("text")
    .attr("class", "ylabel")
    .attr("y", -20)
    .attr("x", -40)
    .attr("dy", ".71em")
    .text(label);

(64)

HICOR IQ - Future

o Allow new metric definition and data loading

o Split module for security (server) vs. plotting (client)

o Identification of “My Clinic” for comparison in scatter plot

o Include static reports

o Clinic / Payor dashboard report

o Better organization of sub-metrics

(65)

Thank You

Any questions?

Catherine Richards, PhD, MPH Staff Scientist, HICOR

[email protected]

(soon to be Director, Scientific and User Engagement at Aetion)

Cory Nathe

Software Engineer, LabKey [email protected]
