UniGR Workshop: Big Data «The challenge of visualizing big data»

Academic year: 2021

(1)

Dept. ISC

Informatics, Systems & Collaboration

UniGR Workshop: Big Data

«The challenge of visualizing big data»

Dr Ir Benoît Otjacques

Deputy Scientific Director ISC

(2)

The Future is Data-based

(3)

Who we are

+/- 30 members (all MSc, MEng or PhD in Computer Science)

Network of Partners from Luxembourg and abroad

Funding from

Ministry of Research

EU / National research programs (FNR)

Contract research (private/public)

Outputs

Scientific

Papers

R&D Studies

Proof-of-concept

Prototypes

Professional

Applications

Applied Research

Fundamental research

ISC

(4)

Mission

Use of computer science to ease the understanding of complex big data coming from multiple and heterogeneous sources, primarily by using visual representations accessed via any type of device in various contexts of use.

(5)

Interactive Visualization of Data

Data Provisioning

Data Processing & Analysis

Software Tools Delivery

More than Graphics:

Usable software tools

More than Data:

Consider Meta-data

More than Preprocessing:

Visual Analytics

Scope

One of the largest teams in Europe focused on this topic (> 20 permanent positions)

(6)

What we do

Scientific Vis

Computer Graphics

Medical Imaging

CAD/CAM

Virtual Reality

(7)

What we do

Infovis

Visual Analytics

Visual Data Mining

Data Analytics

Abstract Data

(8)

www.calluna.lu

(9)

What we do

Domain agnostic

(10)

What we do

Business / Science

Field expert

Field Question

Generic Problem

Reuse / Adapt / Invent Potential generic solution(s)

Instantiate a Generic Solution

Solution usable on the field

Our Group

How to analyse my network of friends?

How to analyse network data?

Graph drawing, dynamic graphs, adjacency matrices, graph clustering…

Multi-level graph drawing with semantic labelling

Web-based app with interactive visualization of social network contacts

(11)

Raw Data

Formatted & Structured Data

Processed Data

Visual Representation

Data Acquisition

User Interaction

Data Analysis & Mining Algorithms

Drawing & Rendering Algorithms

User with a problem to solve

Infovis & Visual Analytics

What does Big Data change?

(12)

2 major challenges in Visual Analytics

What’s the problem?

Small / mid-sized static data: well studied

Small / mid-sized dynamic data: open issues type B

Big static data: open issues type A

Big dynamic data: highly challenging

(A and B) >> A+B

• Scalability

• Dynamics

(13)

Big Static Data

What’s the VA problem? It’s Big!

Heterogeneous high volume data sources

Scalability of data provisioning HW/SW infrastructure

Scalability of mining algorithms

Scalability of visual representations

Software engineering issues

How to run queries on distributed systems to explore big data sets?

How to visualize a million multi-variate items on a screen?

How to lower the time needed to run a clustering algorithm on xGbytes?

How to design an interactive user interface loading big data in < 1 sec?
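One common way to approach the "million items on a screen" question is to aggregate the items into screen-sized bins before drawing, so that the number of marks depends on the display rather than on the data. The sketch below is only an illustration of that general idea, not a technique taken from the slides; the function name `bin_points` and the grid dimensions are assumptions made for the example.

```python
import random
from collections import Counter


def bin_points(points, width, height, x_range, y_range):
    """Count 2D points per grid cell; returns a Counter keyed by (col, row).

    Assumes every point lies inside x_range and y_range (hypothetical helper).
    """
    x_min, x_max = x_range
    y_min, y_max = y_range
    counts = Counter()
    for x, y in points:
        # Map data coordinates to integer cell indices; clamp the upper edge
        # so that x == x_max falls into the last column instead of overflowing.
        col = min(int((x - x_min) / (x_max - x_min) * width), width - 1)
        row = min(int((y - y_min) / (y_max - y_min) * height), height - 1)
        counts[(col, row)] += 1
    return counts


if __name__ == "__main__":
    rng = random.Random(0)
    points = [(rng.random(), rng.random()) for _ in range(100_000)]
    grid = bin_points(points, 200, 100, (0.0, 1.0), (0.0, 1.0))
    # At most 200 * 100 cells have to be drawn, whatever the input size.
    print(len(grid), "cells instead of", len(points), "marks")
```

The cell counts can then be mapped to colour or opacity. The trade-off is that individual items are no longer visible, so interaction (zooming, filtering) is typically needed to recover detail.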

(14)

36000 French “Communes” on a single screen

Weighted by population size, spatially constrained

What if data processing is running in the background?

What if the user wants seamless navigation in the data set?

Can this map be generated in <0.1 sec on a classic laptop?

How a competing algo scales…

(15)

Dynamic Mid-sized Data

What’s the VA problem? Data changes!

Heterogeneous data streams

Dynamic data provisioning HW/SW infrastructure

Evolution of mining algorithms

Evolution of visual representations

Software engineering issues

How to aggregate data streams?

How to visualize a continuously changing data structure?

How to adapt clustering algorithms to consider dynamic data?

How to design an interactive user interface continuously fed by data?
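As a minimal sketch of the stream aggregation question above, the example below merges two timestamped streams and keeps a running mean over a sliding time window. It is an illustration under stated assumptions, not the approach used by the group: the class name `SlidingWindowMean`, the `span` parameter and the sample readings are all hypothetical, and each stream is assumed to arrive already sorted by timestamp.

```python
import heapq
from collections import deque


class SlidingWindowMean:
    """Running mean of the readings received during the last `span` time units."""

    def __init__(self, span):
        self.span = span
        self.items = deque()  # (timestamp, value) pairs, oldest first
        self.total = 0.0

    def add(self, timestamp, value):
        """Insert a reading, evict expired ones, return the current mean."""
        self.items.append((timestamp, value))
        self.total += value
        # Drop readings whose timestamp is `span` or more behind the newest one.
        while self.items[0][0] <= timestamp - self.span:
            _, old_value = self.items.popleft()
            self.total -= old_value
        return self.total / len(self.items)


if __name__ == "__main__":
    # Two hypothetical sources, each sorted by timestamp.
    stream_a = [(0, 10.0), (2, 14.0), (4, 18.0)]
    stream_b = [(1, 20.0), (3, 30.0)]
    window = SlidingWindowMean(span=3)
    # heapq.merge interleaves the sorted streams lazily, in timestamp order.
    for t, v in heapq.merge(stream_a, stream_b):
        print(t, round(window.add(t, v), 3))
```

Each update costs amortised constant time, which matters when the view has to be refreshed continuously. Real streams would additionally need handling for out-of-order or missing readings, which this sketch leaves out.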

(16)

Clustering of streams

[Figure: W(t_1), W(t_2), W(t_3), …, W(t_i), …, W(t_n) along a time axis]

What if a MDS projection must be computed in real time to visualize the clusters?

What if the user wants to adapt clustering parameters at run time?

What if the connection to a data stream is lost?

[Figure: V(t_1), V(t_2), V(t_3), …, V(t_i), …, V(t_n); C1(t_i), C2(t_i), C3(t_i); C1a(t_{i+1}), C1b(t_{i+1}); W(t_{i+1}), V(t_{i+1})]

Mental map? Update frequency?
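To make the idea of clustering a stream more concrete, here is a minimal sketch of sequential (online) k-means, in which each incoming point nudges its nearest centroid instead of triggering a full recomputation. This is a well-known textbook technique swapped in for illustration; the slides do not say which algorithm is used, and the class name `OnlineKMeans` and its initial centroids are assumptions. Splitting or merging clusters over time is not covered.

```python
import math


class OnlineKMeans:
    """Sequential k-means: centroids are updated one point at a time."""

    def __init__(self, initial_centroids):
        self.centroids = [list(c) for c in initial_centroids]
        self.counts = [0] * len(self.centroids)  # points absorbed per cluster

    def update(self, point):
        """Assign `point` to its nearest centroid and move that centroid."""
        nearest = min(
            range(len(self.centroids)),
            key=lambda i: math.dist(point, self.centroids[i]),
        )
        self.counts[nearest] += 1
        # A step of 1/n keeps the centroid equal to the mean of its points;
        # the initial position is treated as a seed, not as an observation.
        step = 1.0 / self.counts[nearest]
        centroid = self.centroids[nearest]
        for d, x in enumerate(point):
            centroid[d] += step * (x - centroid[d])
        return nearest


if __name__ == "__main__":
    model = OnlineKMeans([(0.0, 0.0), (10.0, 10.0)])
    for p in [(1.0, 1.0), (3.0, 3.0), (9.0, 9.0)]:
        print(p, "-> cluster", model.update(p), model.centroids)
```

Because centroids move gradually, the positions shown to the user change gradually as well, which is one possible way to limit disruption of the mental map. Replacing the 1/n step with a constant step would let old data fade, at the cost of noisier centroids.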

(17)

Big Dynamic Data

My God! Data are big and are changing!

Solutions for type A and type B problems often do not work for (A and B) problems

Pre-computation (batch mode) is available for big static data sets, but for streams?

Real-time fusion of data streams: still possible with 10^n heterogeneous streams?

Stability of mental maps of the user?

Aggregation strategy for multiscale data wrt time and wrt space?

What if the user device is a smartphone with poor computing resources?

(18)

How/when to update it? How/when to compute it?

How not to lose the user? How to interact with it?

(19)

Enabling decisions through Visual Analytics

Rethinking / adapting existing algorithms and techniques w.r.t. Big Data

Big Systems

Big Data

Visual Analytics

techniques

Batch Interactive Streaming

Data Provisioning

(20)

Collaborations

Your Scientific / Business Problem

• Data Provisioning is an issue

• Data Visualization is an issue

• Data Analytics is an issue

• You need a software tool to do this

(21)

Before joining ISC, its members were there…

(22)
(23)

Conclusion…

We are here today to join our respective forces …

to face a BIG challenge

(24)

Contact:

Dr Ir Benoît Otjacques

[email protected]
