IOT & Big Data: The Future
Information Processing Architecture
Dr. Michael Faden
Dirk Weise
Agenda
1. Introduction
2. Terms and Definitions
3. Reference Architecture
4. Architecture Building Blocks
5. Architecture Patterns
Discussions about Big Data
Is our use case Big Data or not?
Do we have to use Hadoop or NoSQL for Big Data?
Do we miss out if we don't do Big Data?
What shall we do with all that sensor data?
Understand Drivers for Big Data Undertakings
A business scenario bridges three drivers end to end:
• Opportunity – There is something one could do.
• Capability – There is something we can do.
• Business demand – There is something we need to do.
[Diagram labels: Opportunity, Business Scenario, Business Demand]
BIG Respelled
Business Value: Investment in Big Data, be it in personnel or technology, has to be justified by convincing business demand.
Integration: Big Data sources have to be integrated with the enterprise architecture semantically, technologically and procedurally.
Governance: Big Data must not dilute the enterprise information model. Business processes have to be aligned with data source properties (e.g. timeliness, reliability).
Separate Concepts from Implementation
Required capability
• Conceptual – requirements capture and high-level solution
• Independent of technology, organisation and processes
Provided capability
• Concrete – detailed solution and implementation
• Specific to technology, organisation and processes
[Diagram labels: Required Capability, Provided Capability, What?]
Terminology
• Architecture
  • Fundamentals of a system
  • Context, structure, abstraction, process
• Architecture View
  • System seen from the perspective of specific concerns
  • Comprehensible, comprehensive
• Building Block
  • Potentially reusable component of an architecture
  • Relevant, interrelated, fractal
Sources: ISO/IEC 42010, Systems and software engineering – Architecture description; TOGAF Version 9.1
Our Approach
• Reference architecture: the canvas for our architecture
• ABBs (Architecture Building Blocks): commonly found requirements
• SBBs (Solution Building Blocks): commonly applied solutions
Reference Architecture
Architecture Building Blocks
Solution Building Blocks
Method & Patterns
Requirements view
Implementation view
Reference Architecture
Data Sources
Data Ingestion
Data Factory & Managed Enterprise Information
Discovery Lab
Information Provisioning
Information Query & Visualisation
Governance
Architecture Building Blocks
Data Sources: Data Engines & Poly-structured Sources; Structured Data Sources; Master & Reference Data Sources; Content
Data Ingestion: Data Source Connectivity & Capturing; Raw Data in Motion; Raw Data at Rest
Data Factory & Managed Enterprise Information: Data Factory Layer; Managed Information Layer; Access & Performance Layer
Discovery Lab: Advanced Analysis & Data Science Tools; Discovery Lab Sandboxes
Information Provisioning: Virtualisation & Query Federation
Information Query & Visualisation: Prebuilt & Ad-hoc BI Assets; Information Services; Export (push); Query (pull)
Governance: Legal Compliance; Information Quality & Accountability; Security; Metadata Management; Master Data Management
Architecture Patterns – Schema on Write
Areas: Data Sources; Data Ingestion; Data Factory & Managed Enterprise Information; Discovery Lab; Information Provisioning; Information Query & Visualisation; Governance
Building blocks: Data Source Connectivity & Capturing; Raw Data in Motion; Raw Data at Rest; Factory; Managed Information
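To make the pattern name concrete: schema on write means the structure is enforced at the moment data is written, so records that do not fit never reach the managed store. The following is only a minimal sketch under assumptions. The record type `SensorReading`, its fields and the helpers `to_reading` and `write` are hypothetical and do not come from the slides.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class SensorReading:
    """Hypothetical target schema for the managed store."""
    device_id: str
    measured_at: datetime
    temperature_c: float


def to_reading(raw: dict) -> SensorReading:
    """Apply the schema at write time; raise ValueError if the record does not fit."""
    try:
        return SensorReading(
            device_id=str(raw["device_id"]),
            measured_at=datetime.fromisoformat(raw["measured_at"]),
            temperature_c=float(raw["temperature_c"]),
        )
    except (KeyError, TypeError, ValueError) as exc:
        raise ValueError(f"record rejected at write time: {exc!r}") from exc


def write(store: list, raw: dict) -> None:
    """Only records that pass the schema check are appended to the store."""
    store.append(to_reading(raw))
```

Usage: calling `write(store, {"device_id": "s1", "measured_at": "2015-06-01T12:00:00", "temperature_c": "21.5"})` stores a typed record, while a record with a missing or unparseable field raises `ValueError` and leaves the store unchanged.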
Architecture Patterns – Classic BI
Areas: Data Sources; Data Ingestion; Data Factory & Managed Enterprise Information; Discovery Lab; Information Provisioning; Information Query & Visualisation; Governance; Organisation
Building blocks: Structured Data Sources; Master & Reference Data Sources; Data Source Connectivity & Capturing; Factory; Managed Information; Access & Performance Layer; Query (pull); Prebuilt & Ad-hoc BI Assets
Governance: Legal Compliance; Information Quality & Accountability; Security; Metadata Management; Master Data Management
Organisation: IT Operations; Business Stakeholders; BI Competence Centre
Architecture Patterns – Classic BI (in its own words)
Areas: Data Sources; Data Ingestion; Data Factory & Managed Enterprise Information; Information Provisioning; Information Query & Visualisation; Governance
Building blocks: Structured Data Sources; Master & Reference Data Sources; Data Source Connectivity & Capturing; ETL incl. Staging & Cleansing; ODS incl. Historisation; Core DWH; Data Marts; Advanced Analytics; Query (pull); Reporting & Ad-hoc Queries; BI Portal etc.
Governance: Legal Compliance; Information Quality & Accountability; Security; Metadata Management; Master Data Management
Architecture Patterns – Schema on Read
Areas: Data Sources; Data Ingestion; Data Factory & Managed Enterprise Information; Discovery Lab; Information Provisioning; Information Query & Visualisation; Governance; Organisation
Building blocks: Data Source Connectivity & Capturing; Raw Data at Rest
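To make the pattern name concrete: schema on read means raw data is stored untouched and each reader applies its own structure at query time. The following is only a minimal sketch under assumptions. The JSON line format, the field names and the helpers `ingest` and `read_temperatures` are hypothetical and do not come from the slides.

```python
import json

# Raw Data at Rest: lines are kept exactly as they arrived, valid or not.
raw_store = []


def ingest(line: str) -> None:
    """Store the raw line without any validation or transformation."""
    raw_store.append(line)


def read_temperatures(store):
    """Apply one reader's schema at query time.

    Lines that do not fit this particular schema are skipped, not deleted,
    so another reader with a different schema can still use them.
    """
    for line in store:
        try:
            record = json.loads(line)
            yield record["device_id"], float(record["temperature_c"])
        except (ValueError, KeyError, TypeError):
            continue
```

The design trade-off compared with schema on write: ingestion never fails, but every reader has to cope with records that do not match its expectations.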
Architecture Patterns – Lambda Architecture
Areas: Data Sources; Data Ingestion; Data Factory & Managed Enterprise Information; Discovery Lab; Information Provisioning; Information Query & Visualisation; Governance
Building blocks: Data Source Connectivity & Capturing; Raw Data at Rest; Raw Data in Motion; Factory (shown twice); Managed Information (shown twice); Access & Performance Layer
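In the Lambda Architecture as generally described, a batch path recomputes views from all stored raw data, a speed path keeps an incremental view of recent events, and queries merge both. The following is only a minimal sketch of that idea. It assumes that the two Factory and Managed Information instances on the slide correspond to the batch and speed paths, and all names (`build_batch_view`, `SpeedView`, `query`) are hypothetical.

```python
from collections import Counter


def build_batch_view(master_dataset):
    """Batch path: recompute event counts per device from all raw data at rest."""
    return Counter(event["device_id"] for event in master_dataset)


class SpeedView:
    """Speed path: incrementally count events not yet covered by the batch view."""

    def __init__(self):
        self.counts = Counter()

    def on_event(self, event):
        # Called for each event arriving as raw data in motion.
        self.counts[event["device_id"]] += 1

    def reset(self):
        # Called once a new batch view has absorbed the recent events.
        self.counts.clear()


def query(device_id, batch_view, speed_view):
    """Serving step: merge the batch result with the recent increments."""
    return batch_view[device_id] + speed_view.counts[device_id]
```

Usage: build the batch view periodically from the stored events, feed every new event to `SpeedView.on_event`, and answer queries with `query(...)` so that results include data the last batch run has not yet seen.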
Architecture Patterns – Discovery Lab
Areas: Data Sources; Data Ingestion; Data Factory & Managed Enterprise Information; Discovery Lab; Information Provisioning; Information Query & Visualisation; Governance; Organisation
Building blocks: Data Source Connectivity & Capturing; Raw Data at Rest; Raw Data in Motion; Managed Information; Advanced Analysis & Data Science Tools; Discovery Lab Sandboxes
Organisation: IT Operations; Business Stakeholders; BI Competence Centre
Solutions – Data Ingestion
Architecture: Data Source Connectivity & Capturing; Raw Data in Motion; Raw Data at Rest; Data Factory
Solutions:
• An RDBMS Way (synchronous): DB Upsert …; DB Triggers; DB Staging Tables. Similar: CDC
• An MS Way (asynchronous): EventHub; StreamInsight; SQL Server. Similar: ESB
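The "DB Upsert" item of the synchronous RDBMS way can be illustrated as follows. This is only a minimal sketch under assumptions: it uses an in-memory SQLite database as a stand-in for whichever RDBMS is meant, and the table `reading`, its columns and the helper names are hypothetical.

```python
import sqlite3


def open_store():
    """Create an in-memory database with a hypothetical table of latest readings."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE reading ("
        " device_id TEXT PRIMARY KEY,"
        " temperature_c REAL NOT NULL)"
    )
    return conn


def upsert_reading(conn, device_id, temperature_c):
    """Update the row if the device is known, otherwise insert it.

    The call is synchronous: it returns only after the transaction
    has been committed (or raises if it was rolled back).
    """
    with conn:  # commits on success, rolls back on exception
        cur = conn.execute(
            "UPDATE reading SET temperature_c = ? WHERE device_id = ?",
            (temperature_c, device_id),
        )
        if cur.rowcount == 0:
            conn.execute(
                "INSERT INTO reading (device_id, temperature_c) VALUES (?, ?)",
                (device_id, temperature_c),
            )
```

Usage: calling `upsert_reading` twice for the same device leaves exactly one row holding the most recent value. Many databases also offer a native single-statement upsert; the update-then-insert form is used here only because it does not depend on a particular SQL dialect.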
Solutions – Data Factory (Integration)
Architecture: Data Store to Data Store
Data stores: Data Store (RDBMS); Data Store (Hadoop); Data Store (NoSQL)
Solutions by target:
• To RDBMS: DB Link, JDBC; SQOOP, JDBC; Hive, Pig, Drill; SPARQL
• To Hadoop: SQOOP, JDBC; Hive, Pig, Drill; SPARQL
• To NoSQL: SPARQL; SPARQL; SPARQL
Solutions – Data Factory (Interpretation)
Architecture and solutions (MEI = Managed Enterprise Information):
• Data Capture: Raw Data in Motion, Raw Data at Rest → Raw Data
• Normalisation: Raw Data → Parsing, NLP, OCR, Schema Application → MEI
• Identity Resolution: Raw Data → Matching Algorithms, Reference Data, Operational Data → MEI
• Data Cleansing: Raw Data → Cleansing Algorithms, Reference Data → MEI
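The identity resolution step above combines raw data, a matching algorithm and reference data. The following is only a minimal sketch under assumptions: it uses approximate string matching from Python's standard `difflib` module as a stand-in for whatever matching algorithm is meant, and the function name, the sample names and the `cutoff` value are hypothetical.

```python
from difflib import get_close_matches


def resolve_identity(raw_name, reference_names, cutoff=0.8):
    """Map a raw name to its closest reference entry, or None if nothing is close enough.

    Matching is done on trimmed, case-folded strings; the original
    spelling from the reference data is returned.
    """
    normalised = {name.casefold(): name for name in reference_names}
    hits = get_close_matches(
        raw_name.strip().casefold(), list(normalised), n=1, cutoff=cutoff
    )
    return normalised[hits[0]] if hits else None
```

Usage: with reference data `["Acme Corporation", "Globex GmbH"]`, the misspelled raw value `"  ACME Corporaton "` resolves to `"Acme Corporation"`, while an unrelated name such as `"Initech"` resolves to `None` and would need manual handling or a different matching rule.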
Solution Building Block
Data Sources
Data Ingestion
Data Factory & Managed Enterprise Information
Discovery Lab
Information Provisioning
Information Query & Visualisation
Governance
Products shown: Azure, HDInsight, Power BI, ML Studio