• No results found

Cost-optimized, Policy-based Data Management in Cloud Environments

N/A
N/A
Protected

Academic year: 2021

Share "Cost-optimized, Policy-based Data Management in Cloud Environments"

Copied!
42
0
0

Loading.... (view fulltext now)

Full text

(1)

Cost-optimized, Policy-based Data Management in Cloud Environments

Databases and Information Systems Research Group University of Basel

Ilir Fetai

Filip-Martin Brinkmann

(2)

1-2

Current State in the Cloud:

A “zoo” of different DBMS

• Current DBMS differ in:

– APIs

– Guarantees (Ex: consistency models, replication, etc.) – Limitations (Ex: no support for archiving, etc.)

– Costs

(3)

DMS “wish list”

• DMS = Data Management System

Construct “my” DBMS via declarative specifications – Declarative specifications = Policies

• Policy management needed

– A way to express requirements of applications, transactions, end-users, etc.,

and monitor their fulfilment – Examples:

Control data consistency

Specify possible data replication locations

Control data archiving

Control costs!

(4)

1-4

Scenario

Client:

– use services

– have different requirements

Service Provider:

– provides different services

– uses underlying DMS infrastructure – has requirements on data

management

Cloud DMS Provider:

– provides the DMS infrastructure

(5)

Agenda

• Policy-based and modular Data Management Systems (POLAR DMS)

• Cost-effective & Policy-based Data Management

– Data Consistency – Data Replication – Data Archiving

• Architecture of POLAR DMS

• Conclusion

(6)

1-6

Agenda

• Policy-based and modular Data Management Systems (POLAR DMS)

• Cost-effective & Policy-based Data Management

– Data Consistency – Data Replication – Data Archiving

• Architecture of POLAR DMS

• Conclusion

(7)

POLAR DMS - Requirements

• Requirements: User-friendly &

understandable policy-based and modular DMS

– Specify policies at different levels: Cloud provider,

application/service provider, but even end-customer

– DMS adjusts at runtime based on policies

– Minimal core

– Implement DMS functionality as exchangeable modules

(8)

1-8

POLAR DMS - Requirements

• Requirements: User-friendly &

understandable policy-based and modular DMS

– Specify policies at different levels: Cloud provider,

application/service provider, but even end-customer

– DMS adjusts at runtime based on policies

– Minimal core

– Implement DMS functionality as exchangeable modules

(9)

Policies in POLAR DMS

• End-user policies

– May override default transaction policies

– Ex: paid services

• Transaction-policies

– May override default application policies

• Application-policies

– Applications may specify default policies valid for entire application

• System-policies

– The underlying system may also

(10)

1-10

Conflicting Policies

• POLAR DMS must also handle policy conflicts!

(11)

Modularity in POLAR DMS

Build your own DMS by composing it of modules

• Service Provider:

– Application - Developers:

Specify policies at transaction level. The system may need to adjust at runtime (load

modules, etc.)

– Application - Architects:

Choose the modules you need for your architecture or specify application-wide policies

• Cloud DMS Provider

– Administrators: Specify

system policies and configure the DMS

(12)

1-12

Cost-effective & Policy-based Data Management

• POLAR DMS provides

– Policy Mechanism – Modular Architecture

• In the context of POLAR DMS

– Analyse a concrete subset of possible data management policies

– Data management aspects: consistency, replication & archiving

(13)

Agenda

• Policy-based and modular Data Management Systems (POLAR DMS)

• Cost-effective & Policy-based Data Management

– Data Consistency – Data Replication – Data Archiving

• Architecture of POLAR DMS

• Conclusion

(14)

1-14

Agenda

• Policy-based and modular Data Management Systems (POLAR DMS)

• Cost-effective & Policy-based Data Management

– Data Consistency – Data Replication – Data Archiving

• Architecture of POLAR DMS

• Conclusion

(15)

Data Consistency

• Data Consistency Level

– A contract between DMS provider and customer

– Customer: From application developer to end-customer

• A range of consistency levels exists

– Strong consistency: Low availability, but more “user-friendly”

– Weak Consistency: High availability, but less “user-friendly”

EC SI 1SR

(16)

1-16

Data consistency in the Cloud

• Current Cloud Data Systems provide weak consistency

– More oriented towards scalability

• If strong consistency needed, use traditional DBMS

– Are less scalable due to the “consistency overhead”

• Disadvantages of current solutions

– Different DBMS types for different consistency levels

– No way to further influence behaviour of data consistency via policies!

• Kraska et al [3]: Consistency rationing

– Categorize data based on their consistency requirements: 1SR, EC, Adaptive

Our approach: Adaptive consistency based on policies at

transaction level!

(17)

Cost-based Consistency Control - C3

• Idea:

– Implement a range of consistency levels – Provide a policy API at transaction level

– Adjust consistency at runtime based on the policies - C3 [1]

• Current Consistency Levels (CL): 1SR, SI and EC

– Cost models for 1SR, SI and EC defined in [2]

– 1SR & SI generate no or rarely inconsistencies, but are expensive – EC is cheap, but may generate high inconsistency costs

• Policies: Cost-budget (CB) for a transaction & inconsistency costs

(IC)

(18)

1-18

C3 Architecture

• Provides an API at transaction level (CB, IC)

– Different combinations possible

– Enforce a consistency level by setting

• Is based on a modular architecture

– Implement new protocols – Define their cost models

– C3 will choose the “cheapest” consistency protocol IC =∞

(19)

ECO-1SR

• C3: Provides a “meta-consistency” level

– improve transaction costs & performance by switching between concrete consistency levels

• Can the concrete levels be improved?

– Again, based on policies

– Work-in-Progress: Efficient & cost-optimized 1SR → ECO-1SR

(20)

1-20

ECO-1SR vs. Traditional 1SR

• Traditional implementation of 1SR

– Two-Phase-Locking (2PL) and Two- Phase-Commit (2PC)

– All replicas synchronized eagerly before transaction returns to the user

• ECO-1SR

Optimize transaction execution

Optimize transaction commit:

combination of eager & lazy replica synchronization

(21)

ECO-1SR vs. Traditional 1SR

• Traditional implementation of 1SR

– Two-Phase-Locking (2PL) and Two- Phase-Commit (2PC)

– All replicas synchronized eagerly before transaction returns to the user

• ECO-1SR

– Optimize transaction execution

– Optimize transaction commit:

combination of eager & lazy replica synchronization

(22)

1-22

ECO-1SR Optimizations

• Transaction execution: which replica is best suited for the transaction execution

– Choose the replica with highest capacity and freshness, and lowest cost

• Transaction commit: how many and which replicas to synchronize eagerly

– How many: Depending on data popularity (i.e. transaction popularity) – Which: Choose the one with highest capacity, highest popularity and

lowest cost

– All other replicas updated lazily

– Active refresh in the case transactions access stale data

→ The 'right' choice reduces the overhead for providing 1SR

(23)

Agenda

• Policy-based and modular Data Management Systems (POLAR DMS)

• Cost-effective & Policy-based Data Management

– Data Consistency – Data Replication – Data Archiving

• Architecture of POLAR DMS

• Conclusion

(24)

1-24

Cost-based Data Replication

• How to provide efficient and cost-optimized replication?

– What is the best number of replicas ? – What is the best replica type ?

– What is the best replica location ?

Example 1

– In the next period many high important transactions expected

– High importance: high profit or high penalty if constraints violated!

– Should be more replicas be generated? What replica types are more cost- effective?

Example 2

– Legal issues: Data can be stored/replicated only in specific locations – Specify replication policies per data-object or application

(25)

Agenda

• Policy-based and modular Data Management Systems (POLAR DMS)

• Cost-effective & Policy-based Data Management

– Data Consistency – Data Replication – Data Archiving

• Architecture of POLAR DMS

• Conclusion

(26)

Data Archiving - Motivation

• Assumption: freshness-aware key-value store → timestamped/versioned data items

→ freshness-centric operations

• Observation:

– concurrently different versions of the same data item by design – different applications/clients: different freshness requirements

Key: CharliesAccount Value: 200.00

Timestamp: 3

Key: CharliesAccount Value: 200.00

Timestamp: 3

Key: CharliesAccount Value: 1000.00

Timestamp: 1

Key: CharliesAccount Value: 500.00

Timestamp: 2 lazy

replication lazy

replication eager replication

(27)

Data Archiving - Motivation

• Assumption: freshness-aware key-value store → timestamped data items

→ freshness-centric operations

• Observation:

– concurrently different versions of the same data item by design – different applications/clients: different freshness requirements

• Idea: not deal with, but exploit these properties

• Goals:

– archive cloud data in situ – allow for

(28)

K/V Store Infrastructure

• Example:

freshness-aware K/V store for webshop

• Few writes, many reads

→ few update, many read servers

• Read-semantics

– read()

latest local data sufficient

Update

server Update

server

Update server

Read

server Read

server

Read server Read

server

webshop

read(items on stock) update(items on stock)

itemsOnStock : 100 timestamp: 1 eager

replication

lazy/pull replication

(29)

K/V Store Infrastructure II

• Example:

freshness-aware K/V store for a webshop

• Few writes, many reads

→ few update, many read servers

• Read-semantics

– read()

latest local data sufficient

readNotOlderThan(time t)

when certain freshness level required

possibly route / refresh necessary

operation is feasible, however

Update

server Update

server

Update server

Read

server Read

server

Read server Read

server

readNotOlderThan(7) itemsOnStock : 100 timestamp: 1

itemsOnStock : 80 timestamp: 3 route()

itemsOnStock : 10 timestamp: 7 refresh()

(30)

K/V Store Infrastructure III

• Other read-semantics are necessary as well.

• However, they are either

– expensive or

– impossible. (Or even both!)

• Examples:

– readNotYoungerThan(t)

“How is the stock of a book developing?”

– readBetween(t1, t2)

“How was it yesterday?”

– readAllBetween(t1, t2)

“How exactly did it fluctuate during our sales week?”

Data has to be reconstructed via – local logs or

– route/“defresh”.

Update

server Update

server

Update server

Read

server Read

server

Read server Read

server

webshop

readNotYoungerThan(2)

itemsOnStock : 100 timestamp: 1

itemsOnStock : 80 timestamp: 3 route/defresh()

log

(31)

Archive Infrastructure

• An archive helps “rolling back time”

• Different possible archive layouts possible:

Update

server Update server

Update server

Read

server Read

server

Read server

Archive update

server

Archive read server

Read server Update

server Update

server

Update server

Read

server Read

server

Read server

Archive update

server

Archive read server

Read server

Archive read server Archive

update server

Archive read server

(32)

1-32

Data Archiving with Policies

• Control over archiving is necessary

– Which data?

– How fast?

– How widely spread?

– What about successive updates?

– How long? → minimal, maximal TTL – Who may access data?

– How much $ to spend?

– etc.

→ Policies are needed

(33)

Data Archiving with Policies

• Control over archiving is necessary

– Which data?

– How fast?

– How widely spread?

– What about successive updates?

– How long? → minimal, maximal TTL – Who may access data?

– How much $ to spend?

– etc.

Data Policy

Key: CharliesAccount

Speed: eager

Spread: max

Successive updates: infinite

TTLmin: 10 years

(34)

1-34

Read*-Semantics

• Rich(er) read*-semantics can be handled efficiently with the archive

– without archive:

read()

readNotOlderThan(t) – with archive:

readNotYoungerThan(t)

readBetween(t1, t2)

readAllBetween(t1,  t2)

Update server

Read server

Read server Read

server

webshop

Update server

Production Archive

t = 2 t = 3 t = 5

Read server

t = 1..4 t = 3..5

readAllBetween(2,4) readAllBetween(3,4)

Step 1 Step 2

Step 3

(35)

Putting the Pieces Together

• High-level, conceptual vision of modular & policy-based DMS

• Concepts of

– Data Consistency – Data Replication – Data Archiving

• Recall: we want to reach

– policy-based and – modular DMSs

• In order to put the pieces together: we built a concrete architecture

(36)

1-36

Agenda

• Policy-based and modular Database Management Systems (POLAR DMS)

• Cost-effective & Policy-based Database Management

– Data Consistency – Data Replication – Data Archiving

• Architecture of POLAR DMS

• Conclusion

(37)

Architecture

of POLAR DMS

(38)

1-38

POLAR DMS

Architecture of POLAR DMS

• Based on OSGi

– Module-based framework for Java – Facilitates run-time changes of

modules

• Basic architecture

– Implement core modules: Data Access, Routing, etc.

– Implement additional functional

modules: C3, ECO-1SR, Data Archiving, etc.

• Policy controlled module selection

– Finding optimal module selection and configuration

core extensions

policy system

S3 Access ECO-1SR C3

(39)

Agenda

• Policy-based and modular Data Management Systems (POLAR DMS)

• Cost-effective & Policy-based Data Management

– Data Consistency – Data Replication – Data Archiving

• Architecture of POLAR DMS

• Conclusion

(40)

1-40

Conclusion

• POLAR DMS consists of a unified model for:

categorizing,assessing and

creating Data Management Systems (DMS)

• This is achieved by a policy-driven framework architecture

• We built concrete functional modules, enabling cost-effective and policy-based

– Data Consistency – Data Replication – Data Archiving

• We provide a concrete architecture of POLAR DMS

(41)

Thank You!

Questions?

(42)

1-42

References

[1]: Ilir Fetai and Heiko Schuldt, “Cost-Based Data Consistency in a Data-as-a-Service Cloud Environment”. Proceedings of the 5th

International Conference on Cloud Computing (CLOUD 2012), Honolulu, HI, USA, 2012/6.

[2]: Ilir Fetai and Heiko Schuldt, “Cost-Based Adaptive Concurrency Control in the Cloud”. Technical Report, Department of Mathematics and Computer Science, University of Basel, 2012/2.

• [3]: Kraska et al, “Consistency rationing in the cloud: pay only when it

matters”. Proc. VLDB Endow., 2009.

References

Related documents

Ms Panopoulou presented in brief the new developments as in the updated Strategic Report 2014 and asked for its official approval. Therefore, the document was

Leblon “is more serene and somewhat more upscale than Ipanema, and the beach is more family-oriented.” The Baixo Baby kiosk has toys for tots.—TripAdvisor Member, Rio de

• Uluru–kata tjuta National Park - A World Heritage Living Cultural Landscape • Uluru- Kata Tjuta National Park – a World Heritage area.. • Uluru–kata tjuta National Park

illustrate a better adsorption mechanism, which is related to an improved bounding between the MB and the adsorbent particles (Table 2).. The adsorption kinetics of the

Consuming guava leaves (Psidii folium) extracts with dose 500 mg for 14 days have a significant effect on changes in hemoglobin and thrombocyte levels in

 If at any stage in the process to provide a National IP Service, a Service Taker withdraws its application for such Service, the Operator will be charged the full