DICE Horizon 2020 Project Grant Agreement no. 644869
http://www.dice-h2020.eu Funded by the Horizon 2020 Framework Programme of the European Union
DICE: Quality-Driven Development of Data-Intensive Cloud Applications
G. Casale, D. Ardagna, M. Artac, F. Barbier,
E. Di Nitto, A. Henry, G. Iuhasz, C. Joubert,
J. Merseguer, V. I. Munteanu, J. F. Pérez,
D. Petcu, M. Rossi, C. Sheridan, I. Spais,
D. Vladušič
MiSE 2015@ICSE, Florence, Italy, May 17 2015
Overview and goals
o MDE often features quality assurance (QA) techniques for developers
o How should quality-aware MDE support data-intensive software systems?
o Existing models and QA techniques largely ignore properties of data
o Characterize the behavior of new technologies
o DICE: a quality-aware MDE methodology inspired by DevOps for data-intensive cloud applications
Motivation
o Software market rapidly shifting to Big Data
  § 32% compound annual growth rate in EU through 2016
  § 35% of Big Data projects are successful [CapGemini 2015]
o European call for software quality assurance (QA)
  § ISTAG: call to define environments “for understanding the consequences of different implementation alternatives (e.g. quality, robustness, performance, maintenance, evolvability, ...)”
o QA evolving too slowly compared to the trends in software development (Big Data, Cloud, DevOps, ...)
DataInc example
o DataInc is a small software vendor selling cloud-based environmental software
o DICEnv, a warning system for floods in rural regions
  o monitoring local environmental conditions
  o fetching precipitation data from a satellite image stream
o DICEnv exploits Big Data technologies and cloud capacity for online water simulations and MapReduce for batch processing of historical data
o DICEnv is a critical system:
  o is expected to remain up 24/7
  o should quickly ramp up data intake rates, as well as memory and compute capacities, to update the hazard management control room more frequently
DataInc example
o The contract requires delivering an initial version of DICEnv within 3 months serving a small area, increasing coverage on a monthly basis
o Challenges:
  o How to implement a complex cloud application in such a short time?
  o How to satisfy all the quality requirements?
  o What architecture should be adopted?
Quality-Aware MDE Today
[Figure: diagram with labels Domain Models, Platform-Independent Model, Architecture Model, Platform-Specific Model, Platform Description, MARTE, QA Models (Analytical Models, Cost-Quality Models), Code generation (C#, Java, C++)]
Quality-Aware MDE Today
Issues at PIM layer:
• static characteristics of data
• dynamic characteristics of data
• data dependencies
DICEnv modeling issues:
• individual dependencies between components and data streams
• relationships between compute and memory requirements
• lack of an explicit annotation for data characteristics
Quality-Aware MDE Today
Issues at PSM layer:
• heterogeneity of Big Data technologies
• automatic translation of PSM models into deployment plans
QA tools limitations:
• contention at processing resources, with limited features for memory consumption
• fork and join are complex to describe analytically while preserving tractability
A Holistic Approach: DICE
[Figure: diagram with labels Domain Models, Platform-Independent Model, Architecture Model, Platform-Specific Model, Platform Description, DICE, MARTE, Data Awareness, Big Data QA Models, DICE IDE, Deployment & Continuous Integration, Continuous Validation, Continuous Monitoring]
Embracing DevOps
o Software development process is evolving
  § Developer: “I want to change my code”
  § Operator: “I want systems to be stable”
o ...but code changes are the cause of most instabilities!
o DevOps closes the gap between Dev and Ops
  § Lean release cycles with automated tests and tools
  § Deep modelling of systems is the key to automation
[Figure: Agile Development and DevOps in relation to Business, Dev, and Ops]
Embracing DevOps
o QA must become lean as well
  § Continuous quality checks and model versioning
o Modelling of the operations
  § Dev needs awareness of infrastructure and costs
o Continuous feedback
  § Forward and backward model synchronisation
  § Tracking of self-adaptation events (e.g. auto-scaling)
o Big Data coming from continuous monitoring
  § QA has its own Big Data, use machine learning?
Benefits
o Tackling skill shortage and steep learning curves
  § Data-aware methods, models, and OSS tools
o Shorter time to market for Big Data applications
  § Cost reduction, without sacrificing product quality
o Decrease development and testing costs
  § Select optimal architectures that can meet SLAs
o Reduce number and severity of quality incidents
  § Iterative refinement of application design
DICE Platform Independent Model (DPIM)
DICE Platform and Technology Specific Model (DTSM)
DICE Platform, Technology and Deployment Specific Model (DDSM)
DICE Profile: PIM Level
o Functional approach to data to be expanded
o Data dependencies
  § graph relationships between data, archives and streams
o QA focuses on quantitative aspects of data
o Static characteristics of data
  § volumes, value, storage location, replication pattern, consistency policies, data access costs, known schedules of data transfers, data access control / privacy, ...
o Dynamic characteristics of data
  § cache hit/miss probabilities, read/write/update rates, burstiness, ...
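As an illustration only, the static and dynamic data characteristics listed above could be attached as annotations to the nodes of a data dependency graph. The sketch below is hypothetical: the class names, field names, and numeric values are assumptions made for this example and are not taken from the DICE profile.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class StaticDataProfile:
    """Hypothetical static annotations (names are illustrative, not DICE's)."""
    volume_gb: float
    storage_location: str
    replication_factor: int


@dataclass
class DynamicDataProfile:
    """Hypothetical dynamic annotations (names are illustrative, not DICE's)."""
    read_rate_per_s: float
    write_rate_per_s: float
    cache_hit_probability: float

    def backend_read_rate(self) -> float:
        # Only reads that miss the cache reach the backing store.
        return self.read_rate_per_s * (1.0 - self.cache_hit_probability)


@dataclass
class DataNode:
    """A data source, archive, or stream in the dependency graph."""
    name: str
    static: StaticDataProfile
    dynamic: DynamicDataProfile
    depends_on: List["DataNode"] = field(default_factory=list)

    def stored_volume_gb(self) -> float:
        # Storage footprint once every replica is counted.
        return self.static.volume_gb * self.static.replication_factor


# Made-up values for a DICEnv-like stream and the archive derived from it.
stream = DataNode(
    "satellite_stream",
    StaticDataProfile(volume_gb=500.0, storage_location="eu-west", replication_factor=3),
    DynamicDataProfile(read_rate_per_s=200.0, write_rate_per_s=50.0, cache_hit_probability=0.75),
)
archive = DataNode(
    "historical_archive",
    StaticDataProfile(volume_gb=2000.0, storage_location="eu-west", replication_factor=2),
    DynamicDataProfile(read_rate_per_s=10.0, write_rate_per_s=1.0, cache_hit_probability=0.5),
    depends_on=[stream],
)
```

Keeping static and dynamic annotations in separate types mirrors the split on the slide: the former can be read off a design, while the latter usually have to be measured or estimated.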
DICE Profile: PSM Level
o Need for technology-specific abstractions
  § Hadoop: number of mappers and reducers, ...
  § In-memory DBs: peak memory and variable threading
  § Streaming: merge/split/operators, networking, ...
  § Storage: supported operations, cost/byte, ...
  § NoSQL: consistency policies, ...
o Generation of deployment plan
  § Proposed Chef + TOSCA extension
o Interest is in both private and public clouds
DICE QA: Quality Dimensions
o Reliability
  § Availability
  § Fault-tolerance
o Efficiency
  § Performance
  § Time behaviour
  § Costs
o Safety
  § Risk of harm
  § Privacy & data protection
DICE QA: Tools
o Discrete-event simulation: assess reliability and efficiency in Big Data applications, accounting for stochastic evolution of the environment
  o stochastic Petri nets or queueing networks, rely on simulation
o Formal verification tools: assess safety risks in Big Data applications, e.g. find design flaws causing order and timing violations in message and state sequences
  o temporal logic formulae and bounded model checking, satisfiability modulo theories solvers
  o quantifier-elimination techniques to extend temporal logic-based verification
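To make the discrete-event simulation idea concrete, here is a minimal sketch of an event-driven simulation of a single-server FIFO queue with exponential inter-arrival and service times (an M/M/1 queue). It is an assumption-laden toy, not the DICE simulation tool: the function name, parameters, and rates are all chosen for this example.

```python
import heapq
import random
from collections import deque


def simulate_mm1(arrival_rate, service_rate, num_jobs, seed=0):
    """Simulate num_jobs jobs through one FIFO server; return mean response time."""
    rng = random.Random(seed)
    events = []          # min-heap of (event time, tie-breaker, kind)
    seq = 0              # tie-breaker so heap entries always compare
    heapq.heappush(events, (rng.expovariate(arrival_rate), seq, "arrival"))

    waiting = deque()    # arrival times of jobs queued behind the server
    in_service = None    # arrival time of the job currently being served
    arrivals = 0
    completed = 0
    total_response = 0.0

    while completed < num_jobs:
        now, _, kind = heapq.heappop(events)
        if kind == "arrival":
            arrivals += 1
            if arrivals < num_jobs:
                # Schedule the next arrival.
                seq += 1
                heapq.heappush(
                    events, (now + rng.expovariate(arrival_rate), seq, "arrival"))
            if in_service is None:
                # Server idle: start service immediately.
                in_service = now
                seq += 1
                heapq.heappush(
                    events, (now + rng.expovariate(service_rate), seq, "departure"))
            else:
                waiting.append(now)
        else:
            # Departure: record response time, then serve the next queued job.
            total_response += now - in_service
            completed += 1
            if waiting:
                in_service = waiting.popleft()
                seq += 1
                heapq.heappush(
                    events, (now + rng.expovariate(service_rate), seq, "departure"))
            else:
                in_service = None

    return total_response / num_jobs


# Illustrative rates: utilization 0.5, for which M/M/1 theory gives a
# steady-state mean response time of 1 / (service_rate - arrival_rate) = 2.0.
estimate = simulate_mm1(arrival_rate=0.5, service_rate=1.0, num_jobs=50000)
```

For this simple queue a closed-form result exists, which makes it a convenient sanity check; simulation becomes the practical choice once the model includes features such as fork-join or non-exponential behaviour.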
DICE QA: Tools
o Architecture optimization tool: find architectural improvements to optimise costs and quality
  o decomposition-based analysis approach
  o resort to fluid approximation of stochastic models
o Feedback analysis: automated extraction from the monitored data of key parameters required to define simulation and verification models
  o extract model parameters through log mining and statistical estimation methods
  o break down resource consumption into its atomic components on the end-to-end path of requests
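One common way to estimate a model parameter from monitoring data, shown here as a sketch and not necessarily the method used in DICE, is to fit the utilization law U = D * X by least squares, where U is the measured utilization of a resource, X the measured throughput, and D the unknown service demand per request. The function name and the sample measurements are assumptions for illustration.

```python
def estimate_service_demand(throughputs, utilizations):
    """Least-squares fit of U = D * X through the origin; returns D.

    throughputs:  measured requests per second in each monitoring window
    utilizations: measured resource utilization (0..1) in the same windows
    """
    if len(throughputs) != len(utilizations) or not throughputs:
        raise ValueError("need two non-empty samples of equal length")
    sum_xx = sum(x * x for x in throughputs)
    if sum_xx == 0:
        raise ValueError("all throughput samples are zero")
    sum_xu = sum(x * u for x, u in zip(throughputs, utilizations))
    # Minimizing sum((u - D * x)^2) over D gives D = sum(x * u) / sum(x * x).
    return sum_xu / sum_xx


# Made-up monitoring windows: utilization grows linearly with throughput,
# so the fitted demand is about 0.01 seconds of resource time per request.
demand = estimate_service_demand([10.0, 20.0, 40.0], [0.1, 0.2, 0.4])
```

The fitted demand can then be used as the mean service time of the corresponding station in a queueing model.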
DICE Project
http://www.dice-h2020.eu
o Horizon 2020 Research & Innovation Action
  § Quality-Aware Development for Big Data applications
  § Feb 2015 - Jan 2019, 4M Euros budget
  § 9 partners (Academia & SMEs), 7 EU countries