Top Five Challenges Of Cloud Computing


Academic year: 2020


Unit 02

Alekhya.P

Asst Professor

(2)

Top Five Challenges Of Cloud Computing

1. Security and Privacy

2. Service Delivery and Billing

3. Interoperability and Portability

4. Reliability and Availability

(3)

3. Cloud applications and paradigms

• Existing cloud applications and new opportunities

• Architectural styles for cloud applications

• Coordination based on a state machine model – the Zookeeper

• The MapReduce programming model

• Clouds for science and engineering

• High performance computing on a cloud

• Legacy applications on a cloud

• Social computing, digital content, and cloud computing

Cloud Computing - RCIS May

(4)

Cloud applications

• Ideal applications for cloud computing:

– Web services;

– Database services;

– Transaction-based services - the resource requirements of transaction-oriented services benefit from an elastic environment where resources are available when needed and where one pays only for the resources it consumes.

• Applications unlikely to perform well on a cloud:

– Applications with a complex workflow and multiple dependencies, as is often the case in high-performance computing.

– Applications which require intensive communication among concurrent instances.

– When the workload cannot be arbitrarily partitioned.


(5)

Existing and new application opportunities

• Three broad categories of existing applications:

– Processing pipelines;

– Batch processing systems;

– Web applications.

• Potentially new applications

– Batch processing for decision support systems and business analytics.

– Mobile interactive applications which process large volumes of data from different types of sensors.

– Science and engineering could greatly benefit from cloud computing, as many applications in these areas are compute-intensive and data-intensive.

(6)

Processing pipelines

• AWS Data Pipeline is a web service that helps you reliably process and move data between different AWS compute and storage services, as well as on-premises data sources, at specified intervals.

• Indexing large datasets created by web crawler engines.

• Data mining - searching large collections of records to locate items of interest.

• Image processing:

– image conversion, e.g., enlarge an image or create thumbnails;

– compress or encrypt images.

• Video transcoding from one video format to another, e.g., from AVI to MPEG.

• Document processing:

– convert a large collection of documents from one format to another, e.g., from Word to PDF;

– encrypt the documents;

– use Optical Character Recognition (OCR) to extract text from digital images of documents.

(7)

Batch processing applications

Batch processing is the execution of a series of programs ("jobs") on a computer without manual intervention.

• Jobs are set up so they can be run to completion without human interaction. All input parameters are predefined through scripts, command-line arguments, control files, or job control language.

• Generation of daily, weekly, monthly, and annual activity reports for retail, manufacturing, and other economic sectors.

• Processing, aggregation, and summaries of daily transactions for financial institutions, insurance companies, and healthcare organizations.

• Processing billing and payroll records.

• Management of software development, e.g., nightly updates of software repositories.

• Automatic testing and verification of software and hardware systems.

(8)

Web access

• Sites for online commerce

• Sites with a periodic or temporary presence.

– Conferences or other events.

– Active during a particular season (e.g., the holiday season) or during income tax reporting.

• Sites for promotional activities.

• Sites that "sleep" during the night and auto-scale during the day.

(9)

Architectural styles for cloud applications

• Based on the client-server paradigm. Often clients and servers communicate using Remote Procedure Calls (RPCs).

• Stateless servers - view a client request as an independent transaction and respond to it; the client is not required to first establish a connection to the server.

• Simple Object Access Protocol (SOAP) - application protocol for Web applications; message format based on XML. Uses TCP or UDP transport protocols.

• Representational State Transfer (REST) - software architecture for distributed hypermedia systems. Supports client communication with stateless servers; it is platform independent, language independent, supports data caching, and can be used in the presence of firewalls.

(10)

• In computing, a stateless protocol is a communications protocol that treats each request as an independent transaction that is unrelated to any previous request, so that the communication consists of independent pairs of request and response. A stateless protocol does not require the server to retain session information or status about each communications partner for the duration of multiple requests. In contrast, a protocol which requires keeping internal state on the server is known as a stateful protocol.

• Examples of stateless protocols include the Internet Protocol (IP), which is the foundation for the Internet, and the Hypertext Transfer Protocol (HTTP).
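The difference between stateless and stateful handling can be illustrated with a minimal sketch. The handler names below (`stateless_add`, `StatefulAccumulator`) are hypothetical and exist only for this example; they are not part of any protocol or library discussed in the slides.

```python
def stateless_add(request):
    """Stateless handler: the request carries everything needed to answer it."""
    return {"result": request["a"] + request["b"]}


class StatefulAccumulator:
    """Stateful handler: the server keeps a running total for each client."""

    def __init__(self):
        self.totals = {}  # session state retained between requests

    def handle(self, client_id, value):
        # The reply depends on what the same client sent earlier.
        self.totals[client_id] = self.totals.get(client_id, 0) + value
        return {"result": self.totals[client_id]}
```

Repeating the same request to `stateless_add` always gives the same reply, so any server replica can answer it; `StatefulAccumulator` gives different replies to identical requests because it remembers the session.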

(11)

Architectural styles for cloud applications

• ORB (Object Request Broker) - the middleware which facilitates communication of networked applications. The ORB at the sending site transforms the data structures used internally by a sending process to a byte sequence and transmits this byte sequence over the network; the ORB at the receiving site maps the byte sequence to the data structures used internally by the receiving process.

• The Common Object Request Broker Architecture (CORBA) was developed in the early 1990s to allow networked applications developed in different programming languages and running on systems with different architectures and system software to work with one another.

• At the heart of the system is the Interface Definition Language (IDL), used to specify the interface of an object; the IDL representation is then mapped to a set of programming languages including C, C++, Java, Smalltalk, Ruby, Lisp, and Python. Networked applications pass CORBA objects by reference and pass data by value.
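The transformation an ORB performs between internal data structures and a byte sequence can be sketched with Python's standard `struct` module. This is not CORBA and not an ORB; the record layout (`RECORD_FORMAT`) and the function names are assumptions made only for illustration.

```python
import struct

# Hypothetical record layout: 32-bit integer id, 64-bit float, 8-byte name.
# "!" selects network byte order, so sender and receiver agree on the layout
# regardless of their native architectures.
RECORD_FORMAT = "!id8s"


def marshal(record_id, value, name):
    """Sending site: map the internal data to a byte sequence."""
    return struct.pack(RECORD_FORMAT, record_id, value, name.encode("ascii"))


def unmarshal(payload):
    """Receiving site: map the byte sequence back to internal data."""
    record_id, value, raw_name = struct.unpack(RECORD_FORMAT, payload)
    # struct pads short names with NUL bytes; strip them again.
    return record_id, value, raw_name.rstrip(b"\x00").decode("ascii")
```

Fixing the byte order and field widths in one shared format string plays the role that an interface definition plays in a real broker: both sites interpret the bytes the same way.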


(12)

• The Web Services Description Language (WSDL) (see http://www.w3.org/TR/wsdl) was introduced in 2001 as an XML-based grammar to describe communication between end points of a networked application.

• REST almost always uses HTTP to support all four CRUD (Create, Read, Update, Delete) operations.
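A minimal sketch of how the four CRUD operations are commonly mapped onto HTTP methods follows. The mapping used here (POST to create, GET to read, PUT to update, DELETE to delete) is a widespread convention and an assumption of this example; the `ResourceStore` class is hypothetical and runs entirely in memory, without a real HTTP server.

```python
class ResourceStore:
    """Toy in-memory resource collection addressed by URL-like keys."""

    def __init__(self):
        self.resources = {}

    def handle(self, method, key, body=None):
        """Return a (status_code, body) pair for one independent request."""
        if method == "POST":                  # Create
            if key in self.resources:
                return 409, None              # Conflict: already exists
            self.resources[key] = body
            return 201, body                  # Created
        if key not in self.resources:
            return 404, None                  # Not Found
        if method == "GET":                   # Read
            return 200, self.resources[key]
        if method == "PUT":                   # Update
            self.resources[key] = body
            return 200, body
        if method == "DELETE":                # Delete
            del self.resources[key]
            return 204, None                  # No Content
        return 405, None                      # Method Not Allowed
```

Each call to `handle` is self-contained: the method and key identify the operation and the resource, so no session state is needed between requests.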

(13)

Figure: Remote procedure call (RPC) between a client and a server. The client calls the procedure and waits for the reply (blocking state); the server receives the request, starts process execution (executing state), sends the reply, and waits for the next execution.

(14)
(15)

Coordination - ZooKeeper

• Cloud elasticity means that computations and data are distributed across multiple systems; coordination among these systems is a critical function in a distributed environment.

• ZooKeeper

– distributed coordination service for large-scale distributed systems;

– high throughput and low latency service; zookeeper.apache.org

– implements a version of the Paxos consensus algorithm;

– open-source software written in Java with bindings for Java and C.

– the servers in the pack communicate and elect a leader;

– a database is replicated on each server; consistency of the replicas is maintained;

– a client connects to a single server, synchronizes its clock with the server, and sends requests and receives responses and watch events through a TCP connection.

(16)

Cloud Computing - RCIS May 2013

Figure: The ZooKeeper service: a pack of servers, each with several connected clients.

(17)

ZooKeeper communication

• Messaging layer - responsible for the election of a new leader when the current leader fails.

• The messaging protocol uses:

– packets - sequence of bytes sent through a FIFO channel,

– proposals - units of agreement, and

– messages - sequence of bytes atomically broadcast to all servers.

• A message is included in a proposal and it is agreed upon before it is delivered.

• Proposals are agreed upon by exchanging packets with a quorum of servers, as required by the Paxos algorithm.

(18)

ZooKeeper communication (cont’d)

Messaging layer guarantees

– Reliable delivery: if a message m is delivered to one server, it will eventually be delivered to all servers;

– Total order: if message m is delivered before message n to one server, m will be delivered before n to all servers;

– Causal order: if message n is sent after m has been delivered by the sender of n, then m must be ordered before n.
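The total order guarantee can be checked on a snapshot of per-server delivery logs with a short sketch. The function name and the representation of a log as a list of message identifiers are assumptions of this example, not part of ZooKeeper.

```python
def consistent_total_order(logs):
    """Check a snapshot of delivery logs (one list of message ids per server).

    Servers may lag behind one another, so the check requires every log to be
    a prefix of the longest log: no two servers have delivered any pair of
    messages in a different order.
    """
    longest = max(logs, key=len)
    return all(log == longest[:len(log)] for log in logs)
```

A server that has delivered only `["m"]` is consistent with one that has delivered `["m", "n"]`, whereas logs `["m", "n"]` and `["n", "m"]` violate total order.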


(19)

Shared hierarchical namespace similar to a file system, with znodes instead of inodes:

/
  /a
    /a/1
    /a/2
  /b
    /b/1
  /c
    /c/1
    /c/2

Znode state information

(20)

ZooKeeper service guarantees

• Atomicity - a transaction either completes or fails.

• Sequential consistency of updates - updates are applied strictly in the order they are received.

• Single system image for the clients - a client receives the same response regardless of the server it connects to.

• Persistence of updates - once applied, an update persists until it is overwritten by a client.

• Reliability - the system is guaranteed to function correctly as long as the majority of servers function correctly.
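The reliability condition above is a majority rule, which a short calculation makes concrete. The helper names are hypothetical; the arithmetic is only the definition of a strict majority.

```python
def quorum_size(n_servers):
    """Smallest number of servers that forms a strict majority."""
    return n_servers // 2 + 1


def tolerated_failures(n_servers):
    """Servers that may fail while a majority still functions correctly."""
    return n_servers - quorum_size(n_servers)
```

With 5 servers the quorum is 3 and up to 2 failures are tolerated; adding a sixth server raises the quorum to 4 without tolerating more failures, which is why odd ensemble sizes are the usual choice.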


(21)

ZooKeeper API

• Seven operations:

– create - add a node at a given location on the tree;

– delete - delete a node;

– exists - check whether a node exists at a given location;

– get data - read data from a node;

– set data - write data to a node;

– get children - retrieve a list of the children of the node;

– synch - wait for the data to propagate.
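To make the tree operations concrete, here is a toy in-memory model of a znode tree. It is a sketch only: the class `ZnodeTree` and its method names are hypothetical and do not reproduce the signatures of the real ZooKeeper client library. An `exists` check is included because the ZooKeeper API provides one; `synch` is omitted because a single in-memory copy has nothing to propagate.

```python
class ZnodeTree:
    """Toy single-process model of a hierarchical znode namespace."""

    def __init__(self):
        self.nodes = {"/": b""}  # path -> data; the root always exists

    def create(self, path, data=b""):
        parent = path.rsplit("/", 1)[0] or "/"
        if path in self.nodes:
            raise KeyError(f"node already exists: {path}")
        if parent not in self.nodes:
            raise KeyError(f"parent does not exist: {parent}")
        self.nodes[path] = data

    def delete(self, path):
        if self.get_children(path):
            raise ValueError(f"node has children: {path}")
        del self.nodes[path]

    def exists(self, path):
        return path in self.nodes

    def get_data(self, path):
        return self.nodes[path]

    def set_data(self, path, data):
        if path not in self.nodes:
            raise KeyError(f"no such node: {path}")
        self.nodes[path] = data

    def get_children(self, path):
        prefix = path.rstrip("/") + "/"
        # Direct children only: the remainder after the prefix has no "/".
        return sorted(
            p[len(prefix):]
            for p in self.nodes
            if p != path and p.startswith(prefix) and "/" not in p[len(prefix):]
        )
```

Storing nodes in a flat dictionary keyed by full path keeps the sketch short; a real service additionally keeps per-node metadata and replicates every update across the servers.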


(22)

Workflows - coordination of multiple activities

Many cloud applications require the completion of multiple interdependent tasks. The description of a complex activity involving such an ensemble of tasks is known as a workflow.

Task is the central concept in workflow modeling. A task is a unit of work to be performed on the cloud, and it is characterized by several attributes, such as:

• Name - a string of characters uniquely identifying the task.

• Description - a natural language description of the task.

• Actions - an action is a modification of the environment caused by the execution of the task.

(23)

• Preconditions - boolean expressions that must be true before the action(s) of the task can take place.

• Postconditions - boolean expressions that must be true after the action(s) of the task do take place.

• Attributes - provide indications of the type and quantity of resources necessary for the execution of the task, the actors in charge of the tasks, the security requirements, whether the task is reversible or not, and other task characteristics.

• Exceptions - provide information on how to handle abnormal events.

(24)

Replanning means restructuring a process or redefining the relationships among its tasks.

• A composite task is a structure describing a subset of tasks and the order of their execution.

• A primitive task is one that cannot be decomposed into simpler tasks.

A composite task inherits some properties from workflows: it consists of tasks, has one start symbol, and possibly several end symbols. At the same time, a composite task inherits some properties from tasks: it has a name, preconditions, and postconditions.

• A routing task is a special-purpose task connecting two tasks in a workflow description.

• The task that has just completed execution is called the predecessor task; the one to be initiated next is called the successor task. A routing task can trigger sequential, concurrent, or iterative execution.

• Several types of routing tasks exist.

• A fork routing task triggers execution of several successor tasks. Several semantics for this construct are possible:

1. All successor tasks are enabled;

2. Each successor task is associated with a condition; the conditions for all tasks are evaluated, and only the tasks with a true condition are enabled;

3. Each successor task is associated with a condition; the conditions for all tasks are evaluated, but the conditions are mutually exclusive and only one condition may be true; thus, only one task is enabled;

4. Nondeterministic: k out of n > k successors are selected at random to be enabled.

• A join routing task waits for completion of its predecessor tasks. There are several semantics for the join routing task:

1. the successor is enabled after all predecessors end;

2. the successor is enabled after k out of n > k predecessors end;
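The fork and join semantics listed above can be sketched as small functions. All names here are hypothetical, and a successor is modeled as a `(name, condition)` pair where the condition is a function of some workflow state; this is an illustration of the semantics, not a workflow engine.

```python
import random


def fork_all(successors):
    """Semantics 1: every successor task is enabled."""
    return [name for name, _ in successors]


def fork_conditional(successors, state):
    """Semantics 2: enable each successor whose condition is true."""
    return [name for name, condition in successors if condition(state)]


def fork_exclusive(successors, state):
    """Semantics 3: conditions are mutually exclusive; exactly one is true."""
    enabled = fork_conditional(successors, state)
    if len(enabled) != 1:
        raise ValueError("conditions are not mutually exclusive")
    return enabled


def fork_random(successors, k, rng=random):
    """Semantics 4: enable k of the n > k successors chosen at random."""
    return rng.sample([name for name, _ in successors], k)


def join_ready(completed, predecessors, k=None):
    """Join: enabled after all predecessors end, or after k of them if k is given."""
    needed = len(predecessors) if k is None else k
    return sum(1 for p in predecessors if p in completed) >= needed
```

For example, with successors `ship` (condition: order is paid) and `remind` (condition: order is not paid), `fork_exclusive` enables exactly one of them for any state, while `join_ready({"a"}, ["a", "b"], k=1)` is already satisfied when one predecessor has ended.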

(26)
(27)

A process description, also called a workflow schema, is a structure describing the tasks or activities to be executed and the order of their execution; a process description contains one start symbol and one end symbol. A process description can be provided in a workflow definition language (WFDL), supporting constructs for choice, concurrent execution, the classical fork and join constructs, and iterative execution.

(28)
(29)
(30)

MapReduce philosophy

1. MapReduce is a software framework that allows developers to write programs that process massive amounts of unstructured data in parallel across a distributed cluster of processors or stand-alone computers.

2. An application starts:

– A master instance;

– M worker instances for the Map phase, and later

– R worker instances for the Reduce phase.

3. The master instance partitions the input data into M segments.

4. A map instance reads its input data segment and processes the data.

5. The results of the processing are stored on the local disks of the servers where the map instances run.

6. When all map instances have finished processing their data, the R reduce instances read the results of the first phase and merge the partial results.

7. The final results are written by the reduce instances to a shared storage server.

8. The master instance monitors the reduce instances, and when all of them report task completion the application is terminated.
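The steps above can be sketched as a single-process word count. This is a minimal illustration under stated assumptions, not the framework itself: the function names are hypothetical, the map and reduce instances run sequentially instead of on separate workers, and in-memory lists stand in for the local disks and the shared storage server.

```python
from collections import defaultdict


def partition(lines, m):
    """Master: split the input into m segments."""
    return [lines[i::m] for i in range(m)]


def map_instance(segment):
    """Map: emit a (word, 1) pair for every word in the segment."""
    return [(word, 1) for line in segment for word in line.split()]


def shuffle(map_outputs, r):
    """Route every key to one of r reduce instances."""
    buckets = [defaultdict(list) for _ in range(r)]
    for output in map_outputs:
        for key, value in output:
            buckets[hash(key) % r][key].append(value)
    return buckets


def reduce_instance(bucket):
    """Reduce: merge the partial results for each key."""
    return {key: sum(values) for key, values in bucket.items()}


def run(lines, m=3, r=2):
    segments = partition(lines, m)
    map_outputs = [map_instance(s) for s in segments]  # Map phase
    buckets = shuffle(map_outputs, r)
    result = {}
    for bucket in buckets:                             # Reduce phase
        result.update(reduce_instance(bucket))
    return result
```

Because every occurrence of a key is routed to the same reduce instance, the per-instance results never overlap and can simply be combined at the end.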


(31)

Figure: The MapReduce philosophy. The master instance of the application assigns segments 1 to M to map instances 1 to M, which store their results on local disks (Map phase); reduce instances 1 to R then merge the results (Reduce phase).

(32)
(33)

Case study: GrepTheWeb

• The application illustrates the means to

– create an on-demand infrastructure;

– run it on a massively distributed system in a manner that allows it to run in parallel and scale up and down based on the number of users and the problem size.

• grep - a program which selects lines in a file that match a given pattern.

• GrepTheWeb

– Performs a search of a very large set of records to identify records that satisfy a regular expression. It is analogous to the Unix grep command.

– The source is a collection of document URLs produced by the Alexa Web Search, a software system that crawls the web every night.

– Uses message passing to trigger the activities of multiple controller threads which launch the application, initiate processing, shut down the system, and create billing records.
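The core idea, selecting the records that satisfy a regular expression and doing so in parallel over segments of the input, can be sketched as follows. This is not the GrepTheWeb implementation: the function names are hypothetical, and a local thread pool stands in for the distributed cluster.

```python
import re
from concurrent.futures import ThreadPoolExecutor


def grep_records(records, pattern):
    """Return the records that contain a match for the regular expression."""
    regex = re.compile(pattern)
    return [record for record in records if regex.search(record)]


def parallel_grep(segments, pattern):
    """Search each segment in its own worker and concatenate the matches."""
    with ThreadPoolExecutor() as pool:
        # map() yields results in the order of the input segments.
        parts = pool.map(lambda segment: grep_records(segment, pattern), segments)
        return [record for part in parts for record in part]
```

Because each segment is searched independently, the workload can be partitioned arbitrarily, which is the property that lets such an application scale up and down with the problem size.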


(34)

(a) The simplified workflow showing the inputs: the regular expression; the input records generated by the web crawler; and the user commands to report the current status and to terminate the processing.

(b) The detailed workflow; the system is based on message passing between several queues; four controller threads periodically poll their associated input queues, retrieve messages, and carry out the required actions.

A Hadoop cluster is a special type of computational cluster designed specifically for storing and analyzing huge amounts of unstructured data in a distributed computing environment.

Figure labels: (a) SQS, controller, EC2 cluster, input records, regular expression, output, status, S3, SimpleDB; (b) status DB, launch controller, monitor controller, shutdown controller, billing controller, billing service, launch queue, monitor queue, billing queue, shutdown queue, Amazon SimpleDB, output, input, Amazon S3, HDFS.

(35)

Clouds for science and engineering

• Research in virtually all areas of science and engineering shares common traits:

– Collect large volumes of experimental data.

– Manage very large volumes of data.

– Build and evaluate models of systems/processes/phenomena.

– Integrate data and literature.

– Document the experiments.

– Share the data with others; preserve the data for long periods of time.

• All these activities require “big” data storage and systems capable of delivering abundant computing cycles; computing clouds are able to provide such resources and support collaborative environments.

(36)

Online data discovery

• Phases of data discovery in large scientific data sets:

– recognition of the information problem;

– generation of search queries using one or more search engines;

– evaluation of the search results;

– evaluation of the web documents;

– comparing information from different sources.

• Large scientific data sets:

– biomedical and genomic data from the National Center for Biotechnology Information (NCBI)

– astrophysics data from NASA

– atmospheric data from the National Oceanic and Atmospheric Administration (NOAA) and the National Center for Atmospheric Research (NCAR).


(37)

High performance computing on a cloud

• Comparative benchmark of EC2 and three supercomputers at the National Energy Research Scientific Computing Center (NERSC) at Lawrence Berkeley National Laboratory. NERSC has some 3,000 researchers and involves 400 projects based on some 600 codes.

• Conclusion - communication-intensive applications are affected by the increased latency and lower bandwidth of the cloud. The low latency and high bandwidth of the interconnection network of a supercomputer cannot be matched by a cloud.
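Why latency and bandwidth matter for communication-intensive codes can be seen with a simple first-order model, in which the time to send a message is the latency plus the message size divided by the bandwidth. The model and the parameter values below are assumptions chosen only for illustration; they are not measurements from the benchmark.

```python
def transfer_time(message_bytes, latency_s, bandwidth_bytes_per_s):
    """First-order model: fixed latency plus size divided by bandwidth."""
    return latency_s + message_bytes / bandwidth_bytes_per_s


# Illustrative, made-up parameters (not measured values).
SUPERCOMPUTER = {"latency_s": 2e-6, "bandwidth_bytes_per_s": 4e9}
CLOUD = {"latency_s": 1e-4, "bandwidth_bytes_per_s": 1e8}


def slowdown(message_bytes):
    """Ratio of cloud to supercomputer transfer time for one message."""
    return (transfer_time(message_bytes, **CLOUD)
            / transfer_time(message_bytes, **SUPERCOMPUTER))
```

Under this model small messages are dominated by latency and large messages by bandwidth, so an application that exchanges many messages of either kind pays the penalty on every exchange.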


(38)

Legacy applications on the cloud

• Is it feasible to run legacy applications on a cloud?

• Cirrus - a general platform for executing legacy Windows applications on the cloud. A Cirrus job consists of a prologue, commands, and parameters. The prologue sets up the running environment; the commands are sequences of shell scripts, including Azure-storage-related commands to transfer data between Azure blob storage and the instance.

• BLAST - a biology code which finds regions of local similarity between sequences; it compares nucleotide or protein sequences to sequence databases and calculates the statistical significance of matches; used to infer functional and evolutionary relationships between sequences and identify members of gene families.

• AzureBLAST - a version of BLAST running on the Azure platform.

(39)

Cirrus

Figure: Cirrus components: web portal, web service, web role, job registration, job manager role.

(40)

Execution of loosely-coupled workloads using the Azure platform

Figure: Portal, client, queues, BigJob Manager, two worker roles each running a BigJob Agent (tasks 1 to k and tasks k+1 to n), the Service Management API (start VM, start replicas, query state), and blob storage (query, post results).

(41)

Social computing and digital content

• Networks that allow researchers to share data and provide a virtual environment supporting remote execution of workflows are domain-specific:

– MyExperiment for biology;

– nanoHub for nanoscience.

• Volunteer computing - a large population of users donate resources such as CPU cycles and storage space for a specific project:

– Mersenne Prime Search;

– SETI@Home;

– Folding@home;

– Storage@Home;

– PlanetLab.

• Berkeley Open Infrastructure for Network Computing (BOINC) - middleware for a distributed infrastructure suitable for different applications.

(42)

What is Case-Based Reasoning?

Storing information from previous experiences

Using previously gained knowledge to solve current problems

(43)

Basic Steps

Identify the problem/case

Look for a similar, previously experienced case

Predict a solution, possibly different from past experiences

Evaluate the solution

(44)

4 General Concepts

Retrieve - Find similar cases

Reuse - Use old case to solve current case

Revise - Create a solution

Retain
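The four concepts can be sketched as a small cycle over a case base. This is a minimal illustration under stated assumptions: a case is a dictionary with a numeric feature tuple and a solution, similarity is squared Euclidean distance, revision simply accepts an optional corrected solution, and all function names are hypothetical.

```python
def retrieve(case_base, problem):
    """Retrieve: find the stored case most similar to the new problem."""
    return min(
        case_base,
        key=lambda case: sum((a - b) ** 2 for a, b in zip(case["problem"], problem)),
    )


def reuse(case):
    """Reuse: start from the solution of the old case."""
    return case["solution"]


def revise(solution, feedback=None):
    """Revise: replace the proposed solution if evaluation supplies a better one."""
    return solution if feedback is None else feedback


def retain(case_base, problem, solution):
    """Retain: store the new case for future use."""
    case_base.append({"problem": problem, "solution": solution})


def solve(case_base, problem, feedback=None):
    solution = revise(reuse(retrieve(case_base, problem)), feedback)
    retain(case_base, problem, solution)
    return solution
```

Each solved problem is added back to the case base, which is how such a system improves with use.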

(45)

Representation Issues

Decide what to store

Choosing the appropriate structure

Organizing memory

(46)

How are old cases used?

Compositional: Combine several solutions to make one best solution

Transformational: Use the solution from a previous case

(47)

Why Case-Based Reasoning is Good

Low maintenance

Improves with use

Uses learned knowledge plus general knowledge

(48)

Trends in Case Based Reasoning

Integrate with other learning methods

Integrate with other reasoning components

Incorporate into massive parallel processing

(49)

Applications of CBR

Help desks

Diagnosing problems
