Amazon Web Services
Luca Silvestri
silvestri@ing.uniroma2.it
Università degli studi di Roma “Tor Vergata”
Facoltà di Ingegneria
IaaS Providers
IaaS Providers Features
E.Casalicchio, L.Silvestri "Architectures for autonomic service management in cloud-based systems," Computers and Communications (ISCC), 2011 IEEE Symposium on, pp.161-166, June 28 2011-July 1 2011, Kerkyra (Corfù), Greece
Amazon Web Services
• Compute
– Amazon Elastic Compute Cloud (EC2)
– Amazon Elastic MapReduce
– Auto Scaling
– Elastic Load Balancing
• Storage
– Simple Storage Service (S3)
– Elastic Block Store (EBS)
• Database
– Amazon SimpleDB
– Amazon Relational Database Service (RDS)
– Amazon DynamoDB
• Messaging
– Simple Queue Service (SQS)
– Simple Notification Service (SNS)
• Networking & Content Delivery
– Amazon Route 53
– Amazon Virtual Private Cloud (VPC)
– Amazon CloudFront
• Deployment & Management
– Amazon CloudWatch
– AWS Elastic Beanstalk
– AWS Identity and Access Management (IAM)
Amazon EC2
• Provides resizable compute capacity in the cloud
• Allows capacity to be increased or decreased (by starting or stopping instances) within minutes
• Pay-per-use on an hourly basis
• From one to thousands of server instances can be launched simultaneously
• Gives complete control over instances
– root SSH access, GUI, command line tools, APIs
• Offers advanced services
– Elastic Block Store
– Elastic Load Balancer
– CloudWatch + Auto Scaling
– Elastic IP
– Amazon Elastic Beanstalk
EC2 Locations
• Regions
– geographically dispersed
– consist of one or more availability zones
– Current regions: US East (Northern Virginia), US West (Oregon), US West (Northern California), EU (Ireland), Asia Pacific (Singapore), Asia Pacific (Tokyo), South America (Sao Paulo)
– Special Region: AWS GovCloud
• Availability Zones
– distinct locations in the same region, engineered to be insulated from failures in other availability zones
– used to protect applications from failure of a single location
• Load Balancing
– allowed only between different Availability Zones in the same Region
– not supported between different Regions
EC2 Instance Types
• On-Demand Instances
– billing per hour with no long-term commitments
• Reserved Instances
– one-time payment to reserve an instance for 1 or 3 years
– significant discount on the hourly usage charge
• Spot Instances
– enable users to bid for unused EC2 capacity
– the Spot Price fluctuates periodically depending on supply of and demand for Spot Instance capacity
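The Spot Instance bidding rule can be sketched in a few lines of Python. This is a simplified model, assuming an instance runs in a period only when the bid is at least the current Spot Price and that a running period is charged at the Spot Price. The price series, the bid, and the helper names `running_periods` and `spot_cost` are made up for illustration and are not AWS APIs or real prices.

```python
from typing import List


def running_periods(bid: float, spot_prices: List[float]) -> List[bool]:
    """For each period, report whether a Spot Instance with this bid runs.

    Assumption: the instance runs whenever bid >= current Spot Price and is
    interrupted otherwise. The real service has more rules than this.
    """
    return [bid >= price for price in spot_prices]


def spot_cost(bid: float, spot_prices: List[float]) -> float:
    """Total cost, assuming each running period is charged at the Spot Price."""
    return sum(price for price in spot_prices if bid >= price)


# Hypothetical hourly Spot Prices in dollars (not real AWS prices).
prices = [0.03, 0.04, 0.06, 0.05, 0.03]
print(running_periods(0.05, prices))
print(round(spot_cost(0.05, prices), 2))
```

With these made-up numbers the instance is interrupted only in the third period, when the Spot Price rises above the bid.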
EC2 Instances
• 1 EC2 Compute Unit provides the equivalent CPU capacity of a 1.0-1.2 GHz 2007 Opteron or 2007 Xeon processor
EC2 Prices (US East)
EC2 Prices (EU)
EC2 Additional Services Prices
EC2 Interface
• AWS Management Console
• Command Line Tools
• Java-based command-line client
• AWS SDKs (available for Java, PHP and .NET)
• Third Party Libraries
• Query and SOAP APIs
Amazon Simple Storage Service
• Storage for the Internet
• Features
– write, read and delete objects containing from 1 byte to 5 TB each
– each object is stored in a bucket and retrieved via a unique key
– HTTP (default) or BitTorrent download protocol
– supports multiple access control mechanisms, as well as encryption for both secure transit and secure storage on disk
– objects are redundantly stored on multiple devices across multiple facilities in an Amazon S3 Region
– PUT and COPY operations synchronously store your data across multiple facilities before returning SUCCESS
– data integrity is regularly verified using checksums
– Reduced Redundancy Storage (RRS) reduces cost by storing non-critical data at a lower level of redundancy
• Pricing
– Storage/month ($0.037-$0.140 per GB, first TB free)
– Requests ($0.1 per 1000 PUT, COPY, List requests, $0.01 per 10000 GET requests)
– Data Transfer (IN free, OUT from $0.05 to $0.12 per GB)
Amazon Elastic MapReduce
• Enables processing of vast amounts of data
– Applications: web indexing, data mining, log file analysis, data warehousing, financial analysis, scientific simulation…
• Uses a hosted Hadoop framework running on the web-scale infrastructure of Amazon EC2 and S3
– Apache Hadoop is an open source Java software framework that supports data-intensive distributed applications running on large clusters of commodity hardware
• Hadoop implements the MapReduce framework
– the data in a job flow is subdivided into smaller chunks so that they can be processed in parallel (map function)
– the processed data is recombined into the final solution (reduce function)
• Allows you to implement data processing applications in many languages, including Java, Perl, Ruby, Python, PHP, R, or C++
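The map and reduce steps can be illustrated with a word count, the usual introductory MapReduce example. This is a single-process sketch in plain Python, not Hadoop or Elastic MapReduce code. The function names and the input chunks are chosen for illustration only.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def map_chunk(chunk: str) -> List[Tuple[str, int]]:
    """Map function: emit a (word, 1) pair for every word in one chunk."""
    return [(word.lower(), 1) for word in chunk.split()]


def reduce_pairs(pairs: Iterable[Tuple[str, int]]) -> Dict[str, int]:
    """Reduce function: recombine the intermediate pairs into final counts."""
    totals: Dict[str, int] = defaultdict(int)
    for word, count in pairs:
        totals[word] += count
    return dict(totals)


def word_count(chunks: List[str]) -> Dict[str, int]:
    # On a cluster each chunk could be mapped on a different node;
    # here the chunks are simply mapped one after another.
    intermediate = [pair for chunk in chunks for pair in map_chunk(chunk)]
    return reduce_pairs(intermediate)


chunks = ["the cloud scales", "the cloud bills per hour"]
print(word_count(chunks))
```

Because `map_chunk` looks at one chunk only, the chunks are independent of each other, which is what makes the map phase parallelizable.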
Database
• Amazon RDS
– Relational DB (MySQL or Oracle DB Engine)
– Automatic management (software patching, backup) and monitoring
– For the MySQL DB Engine, you can also associate one or more Read Replicas
• Amazon SimpleDB
– highly available and flexible non-relational data store
– automatically creates multiple geographically distributed copies of each data item you store
• Amazon DynamoDB
– a fully managed NoSQL database service that provides fast and predictable performance with seamless scalability
– automatically spreads the data and traffic for the table over a sufficient number of servers to handle the request capacity specified by the customer and the amount of data stored, while maintaining consistent, fast performance
– all data items are stored on Solid State Drives (SSDs) and are automatically replicated across multiple Availability Zones in a Region
– integration with Elastic MapReduce
Amazon Simple Queue Service
• Message queuing service that enables asynchronous message-based communication between distributed components of an application
• When a message is received, it becomes “locked” while being processed
– If the message processing fails, the lock will expire and the message will be available again
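The locking behaviour can be sketched with a small in-memory queue. `ToyQueue` is a hypothetical class written for this illustration and is not the Amazon SQS API. Time is passed in explicitly so that the example is deterministic, and the 30-second lock duration is an arbitrary choice.

```python
from typing import Dict, Optional, Tuple


class ToyQueue:
    """In-memory stand-in for a queue with message locking (not the SQS API)."""

    def __init__(self, lock_seconds: float) -> None:
        self.lock_seconds = lock_seconds
        self._messages: Dict[int, str] = {}
        self._locked_until: Dict[int, float] = {}
        self._next_id = 0

    def send(self, body: str) -> int:
        msg_id = self._next_id
        self._next_id += 1
        self._messages[msg_id] = body
        self._locked_until[msg_id] = 0.0  # not locked yet
        return msg_id

    def receive(self, now: float) -> Optional[Tuple[int, str]]:
        """Return the first unlocked message and lock it, or None."""
        for msg_id, body in self._messages.items():
            if self._locked_until[msg_id] <= now:
                self._locked_until[msg_id] = now + self.lock_seconds
                return msg_id, body
        return None

    def delete(self, msg_id: int) -> None:
        """Called by the consumer once processing has succeeded."""
        self._messages.pop(msg_id, None)
        self._locked_until.pop(msg_id, None)


queue = ToyQueue(lock_seconds=30)
queue.send("resize image 42")
first = queue.receive(now=0)    # delivered and locked
second = queue.receive(now=10)  # still locked: nothing to deliver
third = queue.receive(now=31)   # lock expired: delivered again
print(first, second, third)
```

A consumer that crashes never calls `delete`, so the message reappears once the lock expires, which is the failure case described above.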
Amazon Route 53
• Highly available and scalable DNS web service
• Answers DNS queries with low latency by using a global network of DNS servers
• Queries are automatically routed to the nearest DNS server
• Designed to automatically scale to handle very large query volumes without any human intervention
• Pricing
– Hosted Zones
• $0.50 per hosted zone / month for the first 25 hosted zones
• $0.10 per hosted zone / month for additional hosted zones
– Queries
• $0.50 per million queries for the first 1 billion queries / month
• $0.25 per million queries over 1 billion queries / month
Amazon CloudFront
• Web service for content delivery
• Delivers your static and streaming content using a global network of edge locations
• Requests for your objects are automatically routed to the nearest edge location
• Objects are organized in distributions
– A distribution specifies the location of the original version of an object
– A distribution has a unique CloudFront.net domain name (e.g. abc123.cloudfront.net)
• An origin server is the location of the definitive version of an object
– This could be another Amazon Web Service (an Amazon S3 bucket or an Amazon EC2 instance) or an external origin server
• Pricing
– Regional Data Transfer ($0.020-$0.250/GB)
– Requests ($0.010-$0.022 per 10000 HTTP, HTTPS requests)
Elastic Beanstalk
• Allows you to deploy and manage applications in the AWS cloud, leveraging AWS services such as EC2, S3, SNS, Elastic Load Balancing, and Auto Scaling
• Deployment
– Users upload a WAR file containing a Java Web Application
– Elastic Beanstalk handles the provisioning of a load balancer and the deployment of the WAR file to one or more EC2 instances running the Apache Tomcat application server
• Features
– deploy new application versions to running environments
– e-mail notifications through SNS when application health changes or application servers are added or removed
• Auto Scaling and Load Balancing parameters are fully customizable through the AWS Management Console
• Pricing: no additional charge for Elastic Beanstalk; the user pays only for the underlying AWS resources that the application consumes
Amazon Elastic Block Store
• Offers persistent storage for EC2 instances
• Provides off-instance storage that persists independently of the life of an instance
• EBS volumes range from 1 GB to 1 TB
• EBS volumes can be used as an instance's boot partition or attached to running instances as standard block devices
• A volume can only be attached to one instance at a time, but many volumes can be attached to a single instance
• EBS volumes can be attached only to instances in the same Availability Zone
• EBS volumes are automatically replicated within the same Availability Zone to avoid data loss
• EBS provides the ability to create point-in-time snapshots of volumes that can be stored using S3
Elastic IP & Virtual Private Cloud
• Elastic IP
– addresses are associated not with a particular instance but with a user account
– the user controls an Elastic IP address until they explicitly release it
– allows instance or Availability Zone failures to be masked by quickly remapping the Elastic IP address to another instance or load balancer
• Virtual Private Cloud
– enables enterprises to connect their existing infrastructure to a set of isolated AWS compute resources via a Virtual Private Network (VPN) connection
Amazon CloudWatch
• Provides monitoring for AWS cloud resources and applications
• CloudWatch is a metric repository
• AWS services put metrics in the repository
• Users retrieve statistics based on those metrics
CloudWatch concepts
• Metric
– a time ordered set of data points
– PutMetricData API allows users to create custom metrics
• Statistics
– metric data aggregations over specified periods of time
– available statistics: Minimum, Maximum, Sum, Average, SampleCount
– can be retrieved with the GetMetricStatistics API
• Period
– length of time associated with a specific CloudWatch statistic
– expressed in seconds, range from 60 (one minute) to 1209600 (two weeks)
• Alarm
– watches a single metric over a specified time period
– performs one or more actions based on the value of the metric relative to a given threshold over a number of time periods
CloudWatch metrics
• EC2 metrics
– CPUUtilization
– DiskReadOps/DiskWriteOps
– DiskReadBytes/DiskWriteBytes
– NetworkIn/NetworkOut
• Elastic Load Balancing metrics
– Latency
– RequestCount
– HealthyHostCount/UnHealthyHostCount
– Count of HTTP response codes (2xx, 3xx, 4xx, 5xx) generated by the load balancer or back-end instances
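To make the Statistics and Period concepts concrete for a metric such as CPUUtilization, the sketch below groups raw data points into fixed-length periods and computes the five statistics listed earlier for each period. It is plain Python rather than a call to GetMetricStatistics. The sample values and the `statistics` helper are invented for illustration.

```python
from typing import Dict, List, Tuple

DataPoint = Tuple[int, float]  # (timestamp in seconds, value)


def statistics(points: List[DataPoint], period: int) -> Dict[int, Dict[str, float]]:
    """Group data points into windows of `period` seconds and aggregate them.

    Returns a dict keyed by the start timestamp of each period.
    """
    buckets: Dict[int, List[float]] = {}
    for timestamp, value in points:
        start = (timestamp // period) * period
        buckets.setdefault(start, []).append(value)

    return {
        start: {
            "Minimum": min(values),
            "Maximum": max(values),
            "Sum": sum(values),
            "Average": sum(values) / len(values),
            "SampleCount": len(values),
        }
        for start, values in sorted(buckets.items())
    }


# Hypothetical CPU utilization samples (percent), one every 30 seconds.
cpu = [(0, 40.0), (30, 60.0), (60, 80.0), (90, 70.0), (120, 20.0)]
for start, stats in statistics(cpu, period=60).items():
    print(start, stats)
```

Choosing a longer period smooths out short spikes, which matters once alarms are evaluated on these aggregated values.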
CloudWatch Interface
• Command Line Tools
• Libraries
– (Java, PHP, Python, Ruby, Android, iOS, Windows and .NET)
• Query API
– HTTP/HTTPS GET or POST requests
• AWS Management Console
CloudWatch Alarms & Auto Scaling
• An alarm watches a single metric over a time period and performs one or more actions based on the value of the metric relative to a given threshold over a number of time periods
• Possible states: OK, ALARM, INSUFFICIENT_DATA
• When an alarm changes state, an action is invoked
– notification through Amazon SNS
– Auto Scaling policy
Example
Threshold = 3
minimum breach = 3 periods
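The example values above (threshold 3, breach for 3 periods) can be replayed with a small state evaluator. This is a sketch under stated assumptions: the alarm enters ALARM when the metric is strictly above the threshold in each of the last three periods, and it reports INSUFFICIENT_DATA when fewer than three periods are available or one of them has no data. The metric series and function names are made up and are not part of the CloudWatch API.

```python
from typing import Iterator, List, Optional, Tuple


def alarm_state(values: List[Optional[float]], threshold: float,
                breach_periods: int) -> str:
    """Evaluate the alarm on the most recent `breach_periods` values.

    Each entry is the metric statistic for one period, or None if no data
    was reported for that period.
    """
    recent = values[-breach_periods:]
    if len(recent) < breach_periods or any(v is None for v in recent):
        return "INSUFFICIENT_DATA"
    if all(v > threshold for v in recent):
        return "ALARM"
    return "OK"


def state_changes(values: List[Optional[float]], threshold: float,
                  breach_periods: int) -> Iterator[Tuple[int, str]]:
    """Replay the series and yield (period index, new state) on each change."""
    previous = None
    for i in range(1, len(values) + 1):
        state = alarm_state(values[:i], threshold, breach_periods)
        if state != previous:
            # This is where an SNS notification or a scaling policy would run.
            yield i - 1, state
            previous = state


series = [1, 2, 4, 5, 4, 6, 2]
print(list(state_changes(series, threshold=3, breach_periods=3)))
```

Actions fire only on state changes, so a metric that stays above the threshold for ten periods triggers the alarm action once, not ten times.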
Auto Scaling
• Auto Scaling scales EC2 capacity up or down automatically according to user-defined conditions
• Enabled by Amazon CloudWatch
– uses CloudWatch alarms
Auto Scaling Policies
• Auto Scaling policies define the action to take when an alarm state changes
• For every monitored event, two policies should be defined
– a scale-up policy
– a scale-down policy
• A policy can be created using the PutScalingPolicy API with the following parameters:
– AdjustmentType: possible values are ChangeInCapacity, ExactCapacity, PercentChangeInCapacity
– Cooldown: amount of time after a scaling activity completes before any further trigger-related scaling activities can start
– PolicyName
– ScalingAdjustment: the number of instances by which to scale (positive or negative)
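The effect of the three AdjustmentType values and of the cooldown can be sketched as follows. This is not a call to PutScalingPolicy, only a local model of what the parameters mean. The group bounds `min_size` and `max_size` and the truncating rounding for percentages are assumptions made for the example, and the real service may round differently.

```python
def apply_adjustment(current: int, adjustment_type: str, scaling_adjustment: int,
                     min_size: int = 1, max_size: int = 10) -> int:
    """Return the new desired capacity after applying one policy.

    min_size and max_size are illustrative group bounds.
    """
    if adjustment_type == "ChangeInCapacity":
        desired = current + scaling_adjustment
    elif adjustment_type == "ExactCapacity":
        desired = scaling_adjustment
    elif adjustment_type == "PercentChangeInCapacity":
        # Assumption: truncate toward zero when the percentage is fractional.
        desired = current + int(current * scaling_adjustment / 100)
    else:
        raise ValueError(f"unknown AdjustmentType: {adjustment_type}")
    return max(min_size, min(max_size, desired))


def in_cooldown(now: float, last_activity_end: float, cooldown: float) -> bool:
    """True while further trigger-related scaling activities must wait."""
    return now - last_activity_end < cooldown


print(apply_adjustment(4, "ChangeInCapacity", 1))
print(apply_adjustment(4, "ChangeInCapacity", -2))
print(apply_adjustment(4, "ExactCapacity", 8))
print(apply_adjustment(4, "PercentChangeInCapacity", 50))
print(in_cooldown(now=100, last_activity_end=0, cooldown=300))
```

A scale-up policy and a scale-down policy for the same event would typically differ only in the sign of the adjustment.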
Elastic Load Balancing
• Automatically distributes incoming traffic across multiple EC2 instances
ELB features
• Detects unhealthy instances within a pool and automatically reroutes traffic to healthy instances
• Can be enabled across multiple Availability Zones within a Region
– NOT between Availability Zones in different Regions
• Uses a Least Loaded balancing policy
• Supports sticky sessions
– load balancer-generated HTTP cookies (browser-based session lifetime)
– application-generated HTTP cookies (application-specific session lifetimes)
• Supports HTTPS
• Enables the client to define an application health check for the instances through the following parameters
– HealthyThreshold, Interval, Target, Timeout, UnhealthyThreshold
• Provides APIs to add/remove instances
– RegisterInstancesWithLoadBalancer
– DeregisterInstancesWithLoadBalancer
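The interplay between health checks and the balancing policy can be sketched in plain Python. The model below assumes that an instance is marked unhealthy after a configured number of consecutive failed checks and healthy again after a configured number of consecutive successes, and that "least loaded" means the fewest outstanding requests. `HealthTracker`, `pick_least_loaded`, and the instance names are hypothetical and are not the ELB API.

```python
from typing import Dict


class HealthTracker:
    """Tracks one instance's health from a stream of check results."""

    def __init__(self, healthy_threshold: int, unhealthy_threshold: int) -> None:
        self.healthy_threshold = healthy_threshold
        self.unhealthy_threshold = unhealthy_threshold
        self.healthy = True
        self._streak = 0  # consecutive results that contradict the current state

    def record(self, check_passed: bool) -> bool:
        """Record one health check result and return the resulting state."""
        if check_passed == self.healthy:
            self._streak = 0
        else:
            self._streak += 1
            limit = self.unhealthy_threshold if self.healthy else self.healthy_threshold
            if self._streak >= limit:
                self.healthy = not self.healthy
                self._streak = 0
        return self.healthy


def pick_least_loaded(load: Dict[str, int], healthy: Dict[str, bool]) -> str:
    """Choose the healthy instance with the fewest outstanding requests."""
    candidates = [name for name in load if healthy.get(name, False)]
    if not candidates:
        raise RuntimeError("no healthy instances")
    return min(candidates, key=lambda name: load[name])


tracker = HealthTracker(healthy_threshold=3, unhealthy_threshold=2)
results = [tracker.record(ok) for ok in [True, False, False, True, True, True]]
print(results)
print(pick_least_loaded({"i-a": 5, "i-b": 2, "i-c": 1},
                        {"i-a": True, "i-b": True, "i-c": False}))
```

In the usage example the instance with the lowest load is skipped because it is unhealthy, so traffic goes to the next least loaded healthy instance.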
Autonomic Cloud Architecture
Autonomic Cloud Implementation
Example: MediaWiki on EC2
• To test the AWS Auto Scaling capabilities we deployed MediaWiki on Amazon EC2
– MediaWiki is a free, open source wiki package written in PHP
• We populated the DB with a dump from Wikipedia
• We reproduced traffic from a real Wikipedia workload trace, suitably scaled down
Example: Testbed setup
• 1-10 Amazon EC2 m1.small instances
– 32-bit Linux VMs with 1 EC2 Compute Unit and 1.7 GB memory
– Each VM replicates the front-end of the MediaWiki web application
– Apache 2.2.16 is used as application server
• 1 Amazon EC2 m1.large instance
– 64-bit Linux VM with 4 EC2 Compute Units (2 cores) and 7.5 GB memory
– MySQL 5.1.52 is used to implement the back-end tier
– The system is dimensioned to guarantee that the centralized DB never becomes the system bottleneck
• 1 Amazon Elastic Load Balancer
• 1 EC2 m1.small instance as workload generator
• All components run in the same Availability Zone
– the effects of network latency are reduced to a minimum
Example: Auto Scaling Policies
• Utilization-based, one alarm (UT-1AL)
– add 1 instance if average CPU utilization > 62%
– remove 1 instance if average CPU utilization < 50%
• Utilization-based, two alarms (UT-2AL)
– add 2 instances if utilization > 70%, 1 if utilization > 62%
– remove 1 instance if utilization < 50%, 2 if utilization < 25%
• Latency-based, one alarm (LAT-1AL)
– add 1 instance if latency (average response time seen by the ELB) is > 0.2 seconds
– remove 1 instance if average CPU utilization < 50%
• Latency-based, two alarms (LAT-2AL)
– add 2 instances if latency > 0.5 sec, 1 if latency > 0.2 sec
– remove 1 instance if utilization < 50%, 2 if utilization < 25%
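The two-alarm policies can be written down as simple decision functions. The thresholds come from the list above; the function names, the rule that scale-up conditions are checked before scale-down conditions, and the clamping to the 1-10 front-end instances of the testbed are assumptions of this sketch, not a description of how the experiments were implemented.

```python
def ut_2al(cpu_utilization: float) -> int:
    """Instance delta for the UT-2AL policy: +2, +1, 0, -1 or -2."""
    if cpu_utilization > 70:
        return 2
    if cpu_utilization > 62:
        return 1
    if cpu_utilization < 25:
        return -2
    if cpu_utilization < 50:
        return -1
    return 0


def lat_2al(latency_s: float, cpu_utilization: float) -> int:
    """Instance delta for LAT-2AL: scale up on latency, down on utilization.

    Assumption: the latency (scale-up) conditions are checked first.
    """
    if latency_s > 0.5:
        return 2
    if latency_s > 0.2:
        return 1
    if cpu_utilization < 25:
        return -2
    if cpu_utilization < 50:
        return -1
    return 0


def next_size(current: int, delta: int, low: int = 1, high: int = 10) -> int:
    """Clamp the result to the 1-10 front-end instances used in the testbed."""
    return max(low, min(high, current + delta))


print(next_size(3, ut_2al(75)))
print(next_size(3, lat_2al(0.3, 40)))
print(next_size(1, ut_2al(20)))
```

Between 50% and 62% utilization neither alarm fires, which gives the policy a dead band that helps avoid oscillating between scale-up and scale-down.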
EC2 problems
• ELB bugs
– problems with start/stop of instances; it is better to use launch/terminate
– if an instance crashes it remains forever in “unhealthy” status
– unhealthy instances are not automatically replaced
• CloudWatch problems
– the variation of a metric over a time interval is not available
– request count considers only the requests processed by the load balancer (system throughput behind the ELB)
– a metric for the number of requests arriving at the load balancer is missing
• General problems
– no real-time billing
– the performance level of a single VM is quite variable
– the load balancing policy cannot be customized