
Big Picture :: FOSS4G-Europe :: ©2014 Jacobs University

A Big Picture for Big Data

FOSS4G-Europe, Bremen, 2014-07-15

Peter Baumann

Jacobs University | rasdaman GmbH

Our Standards Involvement

• ISO: member, SC32 / WG3 SQL; SC32 Big Data Study Group; OGC liaison, TC211

• Open Geospatial Consortium: co-chair, BigData.DWG, WCS.SWG, Coverages.DWG; co-founder, Temporal.DWG

• Charter Member, OSGeo

• Research Data Alliance: co-chair, Big Data Interest Group and Geospatial Interest Group

• member, ERCIM Expert Group Big Data

• member, Belmont Forum, WP 3 Harmonization of global environmental data infrastructure

• council member, CGI / IUGS

• founding member and secretary, CODATA Germany

Big Data Market Forecast

Oracle Reference Architecture

Think Big Analytics RA

Ulitzer New Enterprise Reference Architecture

Apache Big Data Framework

NIST Big Data Reference Architecture*

[DRAFT NIST Big Data Interoperability Framework: Vol. 6, Reference Architecture, April 23, 2014]

Reference Architectures?

Static

Mixing concept, architecture, interfaces, tools

often 3-tier variants

My Reference Architecture ;-)

[Figure: machine 2 machine; src: OGC & own]

...for Sensor, Image, Simulation, Statistics Data = spatio-temporal data cubes

A Data Structure Centric View

Stock trading: 1-D sequences (i.e., arrays)

Social networks: large, homogeneous graphs

Ontologies: small, heterogeneous graphs

Climate modelling: 4D/5D arrays

Satellite imagery: 2D/3D arrays (+irregularity)

Genome: long string arrays

Particle physics: sets of events

Bio taxonomies: hierarchies (such as XML)

Documents: key/value stores: sets of unique identifiers + whatever

etc.

Analyzing domain standards (like OGC WCS) can help

Embracing Variety

Observation: (Big) Data means many things, at least:

• Heterogeneous data types

• Heterogeneous ways to access (download, extract, analyze, mix, ...)

Proposition: Query languages as common client/server interfaces

• Sets: SQL

• Arrays: SQL/MDA

• Graphs: SPARQL, graph QLs

• Hierarchies: XQuery/XPath
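
The proposition above can be read as a lookup from data structure category to query language. A minimal Python sketch of that reading follows; the names `QUERY_LANGUAGE` and `language_for` are hypothetical and not part of the talk.

```python
# Hypothetical lookup mirroring the list above:
# one query language family per data structure category.
QUERY_LANGUAGE = {
    "sets": "SQL",
    "arrays": "SQL/MDA",
    "graphs": "SPARQL",  # or another graph query language
    "hierarchies": "XQuery/XPath",
}


def language_for(structure: str) -> str:
    """Return the query language proposed for a data structure category."""
    try:
        return QUERY_LANGUAGE[structure.lower()]
    except KeyError:
        raise ValueError(f"no query language registered for {structure!r}") from None
```

A client or mediator could use such a table to route a request to the matching query engine; how the routing is done in practice is left open here.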

Sample Domain-Specific Queries

Arrays: data cube paradigm

    SELECT la.scene.red - la.scene.nir FROM LandsatArchive as la
    WHERE avg_cells( la.green > 17 )

Hierarchies: XPath

    /book/chapter[ @title = "Introduction" ]

Graphs:

• Graph QLs (Cypher):

    MATCH (n:Crew)-[r:KNOWS*]-m WHERE n.name='Neo'

• SPARQL:

    SELECT ?name ?email WHERE {
    ?person rdf:type foaf:Person. ?person foaf:name ?name. ?person foaf:mbox ?email. }
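
To make the data cube paradigm concrete, the array query's two building blocks can be sketched in plain Python: a cell-wise difference of two bands, and an aggregate over a boolean mask. This is only an illustration under assumptions: the nested-list representation, the sample values, and the reading of `avg_cells` over a comparison as "fraction of cells that satisfy it" are not taken from the slides.

```python
from typing import List

Band = List[List[float]]  # a 2-D array of cell values, stored row by row


def cellwise_difference(red: Band, nir: Band) -> Band:
    """Subtract nir from red cell by cell, as in `la.scene.red - la.scene.nir`."""
    return [[r - n for r, n in zip(red_row, nir_row)]
            for red_row, nir_row in zip(red, nir)]


def avg_cells(mask: List[List[bool]]) -> float:
    """Average over all cells; for a boolean mask this is the fraction of True cells.

    Assumption: this is one plausible reading of avg_cells applied to a comparison.
    """
    cells = [1.0 if cell else 0.0 for row in mask for cell in row]
    return sum(cells) / len(cells)


# Made-up 2x2 sample bands, purely for illustration.
red = [[10.0, 20.0], [30.0, 40.0]]
nir = [[1.0, 2.0], [3.0, 4.0]]
green = [[5.0, 18.0], [25.0, 16.0]]

diff = cellwise_difference(red, nir)
green_fraction = avg_cells([[g > 17 for g in row] for row in green])
```

The point of the query language is that the server, not the client, iterates over the cells, so it is free to tile, parallelize, or otherwise optimize the evaluation.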

Sample Integrated Queries

Arrays + sets (SQL)

    select encode( scene.nir, "image/tiff" )
    from LandsatScenes
    where acquired between "1990-06-01" and "1990-06-30"

Arrays + hierarchies (XQuery)

    for $c in doc("WCPS")//coverage/[ some( $c.nir > $c.red)] return
    <id> { $c/@id } </id>

Major Consortia for Big Data Standardization

ISO/IEC JTC 1/SC 32: Data management & interchange

• SC32 Big Data Analytics Study Group

ITU-T SG13: Cloud computing for Big Data

W3C: Web & Semantic related standards for markup, structure, query, semantics, & interchange

Open Geospatial Consortium (OGC): Geospatial related standards for the specification, structure, query, and processing of location related data

• BigData.DWG, http://external.opengeospatial.org/twiki_public/BigDataDwg/

Organization for the Advancement of Structured Information Standards (OASIS): Information access and exchange.

• structured data content; security, Cloud computing, SOA, Web services, the Smart

Major Consortia for Big Data Standardization /2

Research Data Alliance (RDA)

• Big Data Analytics Interest Group: use case based

• Geospatial Interest Group

Transaction Processing Performance Council

• Benchmarks for Big Data Systems; Specification of TPC Express Benchmark™ for Hadoop

TM Forum

• Enable enterprises, service providers and suppliers to continuously transform to succeed in the digital economy

• Share experiences to solve critical business challenges including IT transformation, business process optimization, big data analytics, cloud management, and cyber security.

Potential Gaps in Standardization

• Big Data use cases, definitions, vocabulary, reference architectures

• Specifications & standardization of metadata including provenance

• Application models (batch, streaming, etc.)

• Query languages for diverse data types (XML, RDF, JSON, multimedia, etc.) and Big Data operations (e.g. matrix operations)

• Domain-specific languages

• Semantics of eventual consistency

• Efficient network protocols

• Ontologies & taxonomies

• Security, privacy, access controls

• Analytics, resource discovery, mining

• Data sharing & exchange

• Data storage

• Human consumption (visualization!)

• Energy measurement for Big Data

Conclusion

Query language concept successful...

• ...if & when data model adequate

Query language as general client/server paradigm

• Client can hide behind visual interface

• Server can optimize

• Agnostic of low-level protocol: KVP, SOAP, REST, ...
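
The protocol-agnostic point can be illustrated with a small Python sketch that ships one and the same query text over two different transports, a key/value pair (KVP) URL and a JSON body as a REST-style service might accept. The endpoint, the parameter name `query`, and the JSON layout are hypothetical and chosen only for illustration; they are not taken from any particular standard or server.

```python
import json
from urllib.parse import parse_qs, urlencode, urlsplit

# Hypothetical endpoint and parameter name, used only for illustration.
ENDPOINT = "https://example.org/service"
QUERY = "SELECT la.scene.red - la.scene.nir FROM LandsatArchive as la"


def as_kvp_url(query: str, endpoint: str = ENDPOINT) -> str:
    """Carry the query as a key/value pair in a GET URL."""
    return f"{endpoint}?{urlencode({'query': query})}"


def as_rest_body(query: str) -> str:
    """Carry the same query as a JSON document for a POST request."""
    return json.dumps({"query": query})


kvp_url = as_kvp_url(QUERY)
rest_body = as_rest_body(QUERY)

# Decoding either transport yields the identical query text on the server side.
query_from_kvp = parse_qs(urlsplit(kvp_url).query)["query"][0]
query_from_rest = json.loads(rest_body)["query"]
```

Only the envelope differs between the two; the query itself, and hence what the server can parse and optimize, stays the same.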

More options for future

• Security, uncertainty, etc.

Open research: next-gen mediators as "glue"
