
From Distributed Computing to Distributed Artificial Intelligence


Academic year: 2021


(1)

From Distributed Computing to Distributed Artificial Intelligence

Dr. Christos Filippidis, NCSR Demokritos

(2)

Big Data and the Fourth Paradigm

The two dominant paradigms for scientific discovery:

● Theory

● Experiments

Large-scale computer simulations emerged as the third paradigm in the 20th century.

The fourth paradigm, which seeks to exploit information buried in massive datasets, has emerged as an essential complement to the three existing paradigms.

The complexity and challenge of the fourth paradigm arise from the increasing rate, heterogeneity, and volume of data generation.

● The Large Hadron Collider (LHC) currently generates tens of petabytes of reduced data per year

● Observational and simulation data in the climate domain are expected to reach exabytes by 2021

(3)

LHC Data Challenge

Starting from this event (particle collision) …

You are looking for this “signature”…

Data Collection → Data Storage → Data Processing

● Selectivity: 1 in 10^13

Like looking for 1 person in a thousand world populations!

Or for a needle in 20 million haystacks!
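The two comparisons can be checked with a few lines of arithmetic. This is only a rough sketch: it reads the selectivity figure as 1 in 10^13 and assumes a world population of about 8 billion, which the slide does not state.

```python
# Selectivity quoted on the slide: one interesting event in 10^13.
SELECTIVITY = 10**13

# Assumption (not on the slide): roughly 8 billion people on Earth.
WORLD_POPULATION = 8 * 10**9

# "1 person in a thousand world populations"
people_in_thousand_worlds = 1000 * WORLD_POPULATION
ratio = SELECTIVITY / people_in_thousand_worlds

# "a needle in 20 million haystacks": implied number of straws per haystack
HAYSTACKS = 20 * 10**6
straws_per_haystack = SELECTIVITY // HAYSTACKS

print(f"10^13 is {ratio:.2f} times a thousand world populations")
print(f"Implied straws per haystack: {straws_per_haystack:,}")
```

Under these assumptions both analogies land on the same order of magnitude as 10^13.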

(4)

Experiments: CMS, ATLAS, LHCb

● ~15 petabytes / year

● ~10^10 events / year

● ~10^3 batch and interactive users

● ~20,000,000 CDs / year

[Figure: height comparison. Mt. Blanc (4.8 km), Concorde (15 km), CD stack with 1 year of LHC data (~20 km), balloon (30 km)]
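The CD comparison can be sanity-checked from the slide's own numbers. The sketch below assumes decimal units (1 petabyte = 10^15 bytes) and derives the capacity and thickness per disc that the figures imply; the variable names are illustrative only.

```python
# Figures taken from the slide.
DATA_PER_YEAR_BYTES = 15 * 10**15   # ~15 petabytes (assuming 1 PB = 10^15 bytes)
CDS_PER_YEAR = 20_000_000           # ~20 million CDs
STACK_HEIGHT_MM = 20 * 1000 * 1000  # ~20 km expressed in millimetres

# Capacity each CD would need to hold, in megabytes (10^6 bytes).
implied_mb_per_cd = DATA_PER_YEAR_BYTES // CDS_PER_YEAR // 10**6

# Thickness each CD would need to have for the stack to reach ~20 km.
implied_mm_per_cd = STACK_HEIGHT_MM / CDS_PER_YEAR

print(f"Implied capacity per CD: {implied_mb_per_cd} MB")
print(f"Implied thickness per CD: {implied_mm_per_cd} mm")
```

The implied values, about 750 MB and 1 mm per disc, are close to a real CD (roughly 700 MB and 1.2 mm thick), so the figures on the slide are mutually consistent.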

(5)
(6)

Definition of Grid systems

Collection of geographically distributed heterogeneous resources

“Most generalized, globalized form of distributed computing”

“An infrastructure that enables flexible, secure, coordinated resource sharing among dynamic collections of individuals, institutions and resources”

(7)
(8)
(9)
(10)
(11)
(12)
(13)
(14)
(15)
(16)
(17)
(18)
(19)

Exascale Challenges

● Current petascale systems are unlikely to scale to exascale environments, due to the disparity among computational power, machine memory and I/O bandwidth

● Exascale simulations will not be able to write enough data out to permanent storage to ensure a reliable analysis

● Current Grid infrastructures are not user friendly and are far from efficient for small groups and individuals

● Grid infrastructures, when implemented by HEP VOs, tend to be centralized from the data point of view

(20)

IKAROS Platform

[Figure: IKAROS platform architecture. Labels: Android .apk clients; Data/Metadata-Collector; Ikaros-EG plugin; “job” creation; Content provider; mobile devices; Wi-Fi, 3G; mobile-Grid]

(21)

Elastic Transfer (eT)

● Create your Personal Storage Cloud

● Transfer files directly from your workstation to another PC

● Third-party data transfer

● Flexible data & storage sharing

● You are on the road, behind fifteen firewalls, and want to share a web application you're developing locally, or just quickly share a set of files with someone (Reverse HTTP)

(22)

Nice! So, now can I...

● Discover whether corruption in politics is a location-based issue?

● Check what is the best route to a house by the sea, with low rent?

● Find the ideal husband/wife?

● Determine how to improve my

(23)

Well, you kind of can...

If you

● can read through petabytes of information

● can determine what is useful and what is not

● can contact 30 different organizations hosting the data

● have experts to combine the data

● can visualize the results in a meaningful way

(24)
(25)

Bits and pieces

● If you had individual people producing simple statements

● People need food

● Souvlaki is food

● Souvlaki contains meat

● Decipherable by machines

● <people, need, food>

● <souvlaki, is, food>

● <souvlaki, contains, meat>

● Could computers combine knowledge to be “intelligent”?

● <?, need, meat>: Who needs meat?
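A minimal sketch of how a machine might answer the last question from the three statements. The `match` and `who_needs` helpers and the chaining rule are illustrative assumptions, not a method taken from the talk, and the rule is a deliberately naive heuristic rather than sound logical inference.

```python
from typing import List, Optional, Set, Tuple

Triple = Tuple[str, str, str]

# The three statements from the slide, as <subject, predicate, object> triples.
FACTS: Set[Triple] = {
    ("people", "need", "food"),
    ("souvlaki", "is", "food"),
    ("souvlaki", "contains", "meat"),
}

def match(facts: Set[Triple], s: Optional[str] = None,
          p: Optional[str] = None, o: Optional[str] = None) -> List[Triple]:
    """Return the triples that agree with every position that is not None."""
    return sorted(
        t for t in facts
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )

def who_needs(facts: Set[Triple], thing: str) -> List[str]:
    """Naive rule (an assumption for illustration): X needs `thing` if X needs
    some category, an item belongs to that category, and the item contains `thing`."""
    answers = {s for s, _, _ in match(facts, p="need", o=thing)}  # stated directly
    for item, _, _ in match(facts, p="contains", o=thing):
        for _, _, category in match(facts, s=item, p="is"):
            answers |= {s for s, _, _ in match(facts, p="need", o=category)}
    return sorted(answers)

print(who_needs(FACTS, "meat"))
```

No single statement answers the query; the answer only appears once the three statements are combined, which is the point the slide is making.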

(26)

Distributed Artificial Intelligence to the rescue!

(27)
(28)

How does it work?

● You use MACHINES (agents will do fine...)!

● You query LOTS of resources...

● With BILLIONS of small statements

● You REASON upon them

● You provide answers in realistic time

(29)

Challenges

Data providers speak different languages

Data providers can go offline

Even knowing who to ask is a problem

Responding in time can be challenging

(30)

SemaGrow: Distributed, Heterogeneous,

Semantic Query Processing

● Distributed queries over SPARQL endpoints

● On-the-fly mapping across data provider languages

● Adaptive to problematic data providers

● Allows complex queries
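The ideas on this slide can be illustrated with a toy federation. This is a hedged sketch and not SemaGrow's actual API: in-memory lists stand in for SPARQL endpoints, and the provider names, `PREDICATE_MAP`, the data and the timeout are all hypothetical.

```python
import time
from concurrent.futures import ThreadPoolExecutor, wait

# Hypothetical mapping from a shared predicate name to each provider's own term.
PREDICATE_MAP = {
    "provider_a": {"contains": "contains"},
    "provider_b": {"contains": "hasIngredient"},
    "provider_c": {"contains": "contains"},
}

# Toy stand-ins for remote endpoints: (triples, simulated response delay in seconds).
PROVIDERS = {
    "provider_a": ([("souvlaki", "contains", "meat")], 0.0),
    "provider_b": ([("moussaka", "hasIngredient", "meat")], 0.0),
    "provider_c": ([("gyros", "contains", "meat")], 1.0),  # slow provider
}

def ask(provider, predicate, obj):
    """Query one provider, translating the predicate into its vocabulary."""
    data, delay = PROVIDERS[provider]
    time.sleep(delay)  # simulate network latency
    local_predicate = PREDICATE_MAP[provider][predicate]
    return [s for s, p, o in data if p == local_predicate and o == obj]

def federated_subjects(predicate, obj, timeout=0.2):
    """Ask all providers in parallel and skip those that miss the deadline."""
    pool = ThreadPoolExecutor(max_workers=len(PROVIDERS))
    futures = {pool.submit(ask, name, predicate, obj): name for name in PROVIDERS}
    done, not_done = wait(futures, timeout=timeout)
    pool.shutdown(wait=False)  # do not block on providers that are still running
    results = sorted(s for f in done for s in f.result())
    skipped = sorted(futures[f] for f in not_done)
    return results, skipped

results, skipped = federated_subjects("contains", "meat")
print("answers:", results)
print("skipped providers:", skipped)
```

The query is phrased once in a shared vocabulary, translated per provider, and answered from whichever providers respond in time. A real system would also need source selection and query planning, which this sketch leaves out.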

(31)

Summary

● Distributed computing allows

● Generating amazing amounts of data

● Handling amazing amounts of data

● Computational availability and fail-over

● On-demand computation power

● Security

● Distributed artificial intelligence allows

● Asking complex questions over data

● Combining data

● Generating knowledge

(32)

From Distributed Computing to Distributed Artificial Intelligence

Dr. Christos Filippidis, NCSR Demokritos

Dr. George Giannakopoulos, NCSR Demokritos
