320473 Databases & Web Applications Lab
320454 Big Data Project A
Instructor: Peter Baumann
email:
Big Science Data
[OGC Ocean Science Interoperability Experiment;
image source: Mbari]
OGC Coverage Types
Coverage = digital representation of a space/time-varying phenomenon
• n-D
«FeatureType» AbstractCoverage, with subtypes:
• MultiPointCoverage
• MultiCurveCoverage
• MultiSurfaceCoverage
• MultiSolidCoverage
• GridCoverage
• RectifiedGridCoverage
• ReferenceableGridCoverage
Facing the Coverage Deluge
coverage server
sensor feeds [OGC SWE]
Taming the Coverage Deluge
coverage server
Let’s Take a Closer Look...
Divergent access patterns for ingest and retrieval
Server must mediate between access patterns
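A minimal Python sketch of the mediation idea. It assumes, purely for illustration, that data arrive as one 2-D slice per time step while a client asks for the time series at a single location. The class `TinyCoverageStore` and its methods are hypothetical and not part of any rasdaman or OGC interface.

```python
from typing import List

Slice2D = List[List[float]]  # one time step: rows of values


class TinyCoverageStore:
    """Hypothetical store: ingest by time slice, retrieve by location."""

    def __init__(self) -> None:
        self._slices: List[Slice2D] = []

    def ingest(self, time_slice: Slice2D) -> None:
        # Ingest pattern: append one complete x/y slice per time step.
        self._slices.append(time_slice)

    def time_series(self, x: int, y: int) -> List[float]:
        # Retrieval pattern: cut across all stored slices at one location.
        return [s[y][x] for s in self._slices]


store = TinyCoverageStore()
store.ingest([[1.0, 2.0], [3.0, 4.0]])  # t = 0
store.ingest([[5.0, 6.0], [7.0, 8.0]])  # t = 1
series = store.time_series(x=1, y=0)    # values at (1, 0) over time
print(series)
```

The ingest side writes along x/y while the query reads along t; a real server would reorganize storage so that neither pattern has to scan everything.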
Our Research
Large-Scale Scientific Information Services (L-SIS) Research Group
flexible, scalable services on massive multi-dimensional scientific data
• Particular focus: n-D arrays
• Massive = multi-TB … multi-PB per object
Results:
• rasdaman array DBMS (www.rasdaman.org), demo at www.earthlook.org
• Geoservice standards: OGC WCS suite, http://external.opengeospatial.org/twiki_public/CoveragesDWG/WebHome
"raster data manager": Array Database = SQL + n-D arrays
• "tile streaming" architecture: scaling from laptop to cloud
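To illustrate why tiling helps a subset query, here is a small Python sketch. It is not rasdaman's implementation: the tile size `TILE`, the dictionary layout, and both function names are assumptions made for this example. The point is only that a box query needs to touch the tiles it overlaps, not the whole array.

```python
from typing import Dict, List, Tuple

TILE = 2  # hypothetical tile edge length

Grid = List[List[int]]


def make_tiles(array: Grid) -> Dict[Tuple[int, int], Grid]:
    """Partition a 2-D array into TILE x TILE tiles keyed by (tile_x, tile_y)."""
    tiles: Dict[Tuple[int, int], Grid] = {}
    for ty in range(0, len(array), TILE):
        for tx in range(0, len(array[0]), TILE):
            tiles[(tx // TILE, ty // TILE)] = [
                row[tx:tx + TILE] for row in array[ty:ty + TILE]
            ]
    return tiles


def tiles_for_subset(x0: int, x1: int, y0: int, y1: int) -> List[Tuple[int, int]]:
    """Indices of the tiles overlapping the inclusive box [x0:x1, y0:y1]."""
    return [
        (tx, ty)
        for ty in range(y0 // TILE, y1 // TILE + 1)
        for tx in range(x0 // TILE, x1 // TILE + 1)
    ]


grid = [[x + 4 * y for x in range(4)] for y in range(4)]  # 4 x 4 test array
tiles = make_tiles(grid)
needed = tiles_for_subset(0, 1, 2, 3)  # only these tiles must be read
print(len(tiles), needed)
```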
rasdaman: Scalable Array Analytics
www.rasdaman.org
rasdaman Web visitors
select img.green[x0:x1,y0:y1] > 130
[Diedrich et al 2001]
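What the query above computes can be mimicked in plain Python: cut out a box from the green channel and compare every cell with 130, yielding a boolean array. The sketch assumes inclusive box bounds and a row-major `img[y][x]` layout of (red, green, blue) tuples; the function name `green_above` is made up for this example.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]  # (red, green, blue)


def green_above(img: List[List[Pixel]], x0: int, x1: int, y0: int, y1: int,
                threshold: int = 130) -> List[List[bool]]:
    """Boolean mask: green > threshold over the inclusive box [x0:x1, y0:y1]."""
    return [
        [img[y][x][1] > threshold for x in range(x0, x1 + 1)]
        for y in range(y0, y1 + 1)
    ]


img = [
    [(0, 120, 0), (0, 131, 0), (0, 200, 0)],
    [(0, 130, 0), (0, 255, 0), (0, 10, 0)],
]
mask = green_above(img, x0=1, x1=2, y0=0, y1=1)
print(mask)
```

The result has the same extent as the requested box, one boolean per cell.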
Use Case: Big Earth Data Analytics
Up to 130 TB databases for all Earth sciences + planetary science
• EU FP7-INFRA, 3 years, EUR 5.85 million
Platform: rasdaman; strictly open standards
EarthServer
• Planetary Science: Mars geology
• Cryospheric Science: landcover mapping
• Oceanography: marine model runs + in-situ data
• Geology: geological models
• Airborne Science: high-altitude drones
• Atmospheric Science: climate variables
Database Visualization
select encode(
  struct {
    red:   (char) s.image.b7[x0:x1,x0:x1],
    green: (char) s.image.b5[x0:x1,x0:x1],
    blue:  (char) s.image.b0[x0:x1,x0:x1],
    alpha: (char) scale( d.elev, 20 )
  },
  "image/png"
)
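The query above assembles an RGBA image from three satellite bands plus an elevation-derived alpha channel. A rough Python sketch of that per-pixel composition follows. It assumes all four bands already have the same extent (the scaling step is left out), and it clamps values into 0..255, which is an assumption of this example and may differ from how the cast in the query behaves. All names are hypothetical.

```python
from typing import List, Tuple

Band = List[List[float]]
RGBA = Tuple[int, ...]


def to_byte(value: float) -> int:
    # Assumption: clamp into 0..255; the query's (char) cast may behave differently.
    return max(0, min(255, int(value)))


def compose_rgba(red: Band, green: Band, blue: Band, alpha: Band) -> List[List[RGBA]]:
    """Combine four equally sized 2-D bands into one array of RGBA tuples."""
    return [
        [
            tuple(to_byte(band[y][x]) for band in (red, green, blue, alpha))
            for x in range(len(red[0]))
        ]
        for y in range(len(red))
    ]


pixels = compose_rgba(
    red=[[10, 300]],
    green=[[20, 0]],
    blue=[[30, -5]],
    alpha=[[255, 128]],
)
print(pixels)
```

Encoding the resulting array as PNG is a separate step that the query delegates to `encode`.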
ad-hoc federation
mixed hardware
Parallel / Distributed Query Processing
Dataset A, Dataset B, Dataset C, Dataset D
select
  max((A.nir - A.red) / (A.nir + A.red))
  - max((B.nir - B.red) / (B.nir + B.red))
  - max((C.nir - C.red) / (C.nir + C.red))
  - max((D.nir - D.red) / (D.nir + D.red))
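The query above lends itself to parallel evaluation: each `max(...)` term depends on one dataset only, so it can be computed where that dataset resides, and only four scalars need to be combined. The Python sketch below imitates this with a thread pool standing in for separate servers. The sample values and the function name `max_ndvi` are invented for illustration.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List

Dataset = Dict[str, List[float]]


def max_ndvi(ds: Dataset) -> float:
    """Maximum of (nir - red) / (nir + red); assumes nir + red is never zero."""
    return max((n - r) / (n + r) for n, r in zip(ds["nir"], ds["red"]))


# Invented sample data, one small dataset per (hypothetical) server.
A = {"nir": [3.0, 1.0], "red": [1.0, 1.0]}
B = {"nir": [5.0], "red": [3.0]}
C = {"nir": [9.0], "red": [7.0]}
D = {"nir": [1.0], "red": [1.0]}

# Each partial result is computed independently; map() preserves input order.
with ThreadPoolExecutor() as pool:
    max_a, max_b, max_c, max_d = pool.map(max_ndvi, [A, B, C, D])

# Only the four scalars are combined at the end.
result = max_a - max_b - max_c - max_d
print(result)
```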
Secured Archive Integration
First-ever direct, ad-hoc mix from protected NASA & ESA services
in OGC WCS/WCPS Web client (EarthServer + CobWeb)
Next: On-Board Query Intelligence
Democratize direct data access
Summary
Project work
• embedded in international projects & collaborations
Present
Publish
Big Picture
320302 Databases and Web Applications
• Fall lecture, undergrad + grad
• Advanced course in spring: Information Architectures
320473 Databases and Web Applications Lab
• Lab, grad
320454 Big Data Project A
• Project, grad
New meeting slot:
Project Task
Pick a topic
• http://www.faculty.jacobs-university.de/pbaumann/iu-bremen.de_pbaumann//Courses/ResearchTopics/
Perform the task according to plan:
Spec document 20% -- Sep 26
Oct 03
Prototype 1: breakthrough implementation 20% -- Oct 17
Prototype 2: ready for benchmark 20% -- Oct 31
Benchmark results 20% -- Nov 14
Publication 10% -- Nov 28
Resources
rasdaman website
• www.rasdaman.org
demo
• www.earthlook.org
Our publications
• http://www.faculty.jacobs-university.de/pbaumann/iu-bremen.de_pbaumann//pubs.php
Instructor:
• p.baumann@...
Main Evaluation Criteria
Complete with respect to requirements
Solid engineering
• bug-free, project & code documentation, coding quality, ...
User-friendliness and appealing look & feel
Complexity (in absolute terms and in comparison to other teams' work)
Good writeup
• Specification, documentation, paper
Project/Lab Topics