© Fraunhofer-Institut für Intelligente Analyse- und Informationssysteme IAIS
PICASSO Big Data Expert Group
Semantic Web Layer Cake 2001
http://www.w3.org/2001/10/03-sww-1/slide7-0.html
• Monolithic, based on XML
• Focus on heavyweight semantics (ontologies, logic, reasoning)
The Semantic Web Layer Cake 2015 –
“A Little Semantics Goes a Long Way”
[Figure: Semantic Web layer cake 2015. Labels: Unicode, URIs; XML, JSON, CSV, RDB, HTML; RDF; RDF/XML, JSON-LD, CSV2RDF, R2RML, RDFa; RDF Data Shapes, RDF Schema; Vocabularies, Ontologies; SKOS, Thesauri; Logic, SWRL, Rules; SPARQL; (Access control), Signature, Encryption (HTTPS/CERT/DANE)]
• Lingua franca of data integration with many technology interfaces (XML, HTML, JSON, CSV, RDB, …)
• Focus on lightweight vocabularies, rules, thesauri etc.
• Less “invasive”
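A minimal Python sketch of the "lingua franca" idea above: tabular CSV data is lifted into RDF triples (serialized as N-Triples) so that it can be linked with data from other interfaces. The base IRI, the vocabulary IRI, the column names and the helper `csv_to_ntriples` are hypothetical and not taken from the slides; a real mapping would more likely rely on CSVW/CSV2RDF or R2RML tooling than on this hand-written conversion.

```python
import csv
import io

# Hypothetical IRIs, chosen only for illustration.
BASE = "http://example.org/sensor/"
VOCAB = "http://example.org/vocab#"


def csv_to_ntriples(csv_text, id_column):
    """Turn each CSV row into N-Triples lines.

    One subject per row (built from id_column), one predicate per
    remaining column. Assumes column names and ids are IRI-safe.
    """
    triples = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        subject = f"<{BASE}{row[id_column]}>"
        for column, value in row.items():
            if column == id_column or value == "":
                continue  # skip the key column and empty cells
            # Escape characters that are special inside N-Triples literals.
            escaped = (
                value.replace("\\", "\\\\")
                .replace('"', '\\"')
                .replace("\n", "\\n")
                .replace("\r", "\\r")
            )
            triples.append(f'{subject} <{VOCAB}{column}> "{escaped}" .')
    return triples


sample = "id,location,unit\ns1,Bonn,celsius\ns2,Berlin,\n"
for line in csv_to_ntriples(sample, "id"):
    print(line)
```

Every value is emitted as a plain string literal here; datatypes and links to shared vocabularies would be the next step in a real integration.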
INTEGRATING BIG DATA &
LINKED DATA
Blueprint of the Data Aggregator Platform
Follows typical Lambda Architecture
Integrated on top of existing Big Data distribution
+ Semantic Layer (retaining semantics using the Linked Data approach)
[Figure: Data Aggregator Platform blueprint. Labels: Input data (Stream, Spatial, Social, Statistical, Temporal, Transactional, Imagery); Data Storage; Batch Layer; Speed Layer; Real-time data & Transactions; Batch View; Real-time View; message passing; BDE Platform & Intelligence (Big Data Analytics, In-stream Mining); Applications & Showcases (Real-time dashboards, Domain-specific BDE apps)]
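The Lambda Architecture named above can be sketched in a few lines of Python: a batch layer recomputes a view from all stored data, a speed layer keeps an incremental view of recent input, and queries merge both. This is a toy illustration under assumed names (`LambdaSketch`, `ingest`, `run_batch`, `query`), not the BDE platform's implementation, and it counts events per category only.

```python
from collections import Counter


class LambdaSketch:
    """Toy Lambda Architecture: batch view + real-time view, merged at query time."""

    def __init__(self):
        self.master_data = []            # append-only store of all input events
        self.batch_view = Counter()      # recomputed from the full master data
        self.realtime_view = Counter()   # incremental counts since the last batch run

    def ingest(self, key):
        """Store the event and update the speed layer immediately."""
        self.master_data.append(key)
        self.realtime_view[key] += 1

    def run_batch(self):
        """Recompute the batch view from scratch; the real-time view restarts empty."""
        self.batch_view = Counter(self.master_data)
        self.realtime_view.clear()

    def query(self, key):
        """Answer from both views so recent events are never missed."""
        return self.batch_view[key] + self.realtime_view[key]


platform = LambdaSketch()
for event in ["spatial", "spatial", "social"]:
    platform.ingest(event)
platform.run_batch()
platform.ingest("spatial")           # arrives after the batch run
print(platform.query("spatial"))     # batch count plus real-time count
```

Recomputing the batch view from the complete master data keeps the sketch simple and tolerant of errors in the speed layer, at the cost of repeated work.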
Adding a Semantic Layer to Data Lakes
Semantic Data Lake
• Central place for model, schema and data historization
• Combination of scale-out (cost reduction) and semantics (increased control & flexibility)
• Grows incrementally (pay-as-you-go)
[Figure: Semantic Data Lake. Labels: Inbound Data Sources; Inbound Raw Data Store; Data Lake (order of magnitude cheaper scalable data store); Knowledge Graph for Relationship Definition and Meta Data; Frontend to Access Relationship and KPI Definition / Documentation; Frontend to Access (ad hoc) Reports; Outbound Data Delivery to Target Systems; Outbound and Consumption (Management Accounting, Regulatory Reporting, Risk, Treasury, Accounting)]
[1] Wrobel, Voss, Köhler, Beyer, Auer: Big Data, Big Opportunities - Anwendungssituation und Forschungsbedarf. Informatik
[2] Debattista, Lange, Scerri, Auer: Linked 'Big' Data. IEEE/ACM Big Data Computing BDC 2015: 92-98
[Figure labels: JSON-LD, CSVW, R2RML]
Vocabulary-based Integration facilitates Data-driven
Businesses
The work on the Industrial Data Space is complementary to, and closely interlinked with, the Plattform Industrie 4.0
[Figure labels: Retail 4.0, Banking 4.0, Insurance 4.0, …, Industrie 4.0; Industrie 4.0: focus on the manufacturing industry; Smart Services; Transmission, Networks; Real-time Systems; Industrial Data Space: focus on data; Data]
The Industrial Data Space Initiative
Community of >30 large German and European Companies
Pre-competitive, publicly funded innovation project involving 11 Fraunhofer institutes to develop the IDS reference architecture
Current signatories of the MoU to support the Industrial Data Space
Semantic Data Linking for Enterprise Data Value Chains
                         | Data Lake                 | Industrial Data Space                        | Pure Internet
                         | centralized, monopolistic | federated, secure, “trusted”, standard-based | completely decentralized, open, insecure
Data management          | Central repository        | Decentralized                                | Decentralized
Data ownership           | Central                   | Decentralized                                | Decentralized
Data linking             | Single provider           | Federated, on demand                         | Missing
Data security            | Bilateral                 | Certified system                             | Bilateral
Market structure         | Central provider          | Role system                                  | Unstructured
Transport infrastructure | Internet                  | Internet                                     | Internet
Basic principles of the Industrial Data Space
• On-demand interlinking
• Linked Light Semantics
• Security with Industrial Data Container
• Certified roles
Industrial Data Space:
On Demand Interlinking
[Figure: Services A to G interlinked on demand across Enterprises 1 to 6]
All data stays with its owner and is controlled and secured. Only on request for a
Linked Light Semantics
A lightweight approach for data interlinking
Classical enterprise systems
• Fixed data schema
• Global enforcement
• Closed
• Manual transformation
• High cost
Linked Light Semantics
• Reference vocabularies
• Bridge between local representations
• Intelligent and structured, interlinked
• Automatic translation/mapping
• Lightweight
Internet / WWW
• Web pages
• Only links
• Completely open
• Lack of standardization
• No structure
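The middle column above (reference vocabularies as a bridge between local representations, with automatic translation/mapping) can be illustrated with a short Python sketch. Two companies keep their own field names; each only declares how its fields map to shared reference terms, and records are translated via those terms. All field names, the `ref:` terms and the helper functions are hypothetical and chosen only for illustration.

```python
# Hypothetical mappings from local field names to shared reference terms.
MAPPING_A = {"artNr": "ref:productId", "gewicht_kg": "ref:weightKg"}
MAPPING_B = {"sku": "ref:productId", "weight": "ref:weightKg"}


def to_reference(record, mapping):
    """Lift a local record to reference terms; unmapped fields stay local."""
    return {mapping[field]: value for field, value in record.items() if field in mapping}


def from_reference(ref_record, mapping):
    """Lower a reference-term record into another party's local field names."""
    inverse = {term: field for field, term in mapping.items()}
    return {inverse[term]: value for term, value in ref_record.items() if term in inverse}


def translate(record, source_mapping, target_mapping):
    """Translate between two local schemas via the reference vocabulary."""
    return from_reference(to_reference(record, source_mapping), target_mapping)


record_a = {"artNr": "4711", "gewicht_kg": 2.5, "intern": "x"}
print(translate(record_a, MAPPING_A, MAPPING_B))
```

With n participants this needs n mappings to the reference vocabulary instead of a pairwise transformation for every pair of schemas, which is the cost argument against manual transformation in the left column.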
Industrial Data Space
[Figure: Industrial Data Space architecture. Labels: Internet; Upload / Download / Search; Apps; Vocabulary; Industrial Data Space Broker (Clearing, Registry, Index); Industrial Data Space App Store; Company A; Internal IDS Connector; External IDS Connector; Third Party Cloud Provider; Upload; Download]
Industry 4.0
Semantic Administrative Shell &
Reference Architecture for Industry 4.0 (RAMI4.0)
Administrative Shell (Verwaltungsschale) provides a digital identity for arbitrary Industry 4.0 components (e.g. sensors, actuators/robots), exposing data covering the whole life cycle
Reference Architecture for Industry 4.0 (RAMI4.0) provides a conceptual framework for implementing comprehensive Industry 4.0 scenarios
We have implemented both concepts, along with a number of IEC and ISO standards, in a comprehensive information model ready to be implemented in production environments
Summary
Challenges and Opportunities - Interoperability and Standardization
• Adding a semantic layer to Big Data technology
• Integrating Linked Data and Big Data technology
• Towards Enterprise Knowledge Graphs and Data Spaces