AWS Storage: Minimizing Costs While Retaining Functionality August 27, 2012
AWS Storage:
Minimizing Costs While Retaining Functionality
This whitepaper, the second in our Cost Series, discusses persistent storage with Amazon Web Services. It will focus upon Elastic Block Store (EBS) and Simple Storage Service (S3). These services represent, respectively, the hard drives and the system drives for data storage. When discussing EBS and S3, this paper provides background, discusses usage, and presents 10 individual guidelines to maximize benefits while minimizing costs.
Please be aware that all EBS Standard, EBS Provisioned IOPS, EBS Snapshots, S3 Standard, and S3 RRS storage cost calculations are, unless otherwise noted, based upon the following calculation:
Cost = # of TB * # GB in 1 TB * Monthly Storage Cost * # of Months
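The calculation above can be sketched as a small helper. This is a minimal illustration rather than an official calculator: the function name `storage_cost` is an assumption, and 1 TB is taken as 1024 GB, which is the factor used in the worked examples throughout this paper.

```python
GB_PER_TB = 1024  # the paper's worked examples use 1 TB = 1024 GB


def storage_cost(tb, price_per_gb_month, months=12):
    """Cost = # of TB * # GB in 1 TB * monthly storage cost * # of months."""
    return tb * GB_PER_TB * price_per_gb_month * months


# 1 TB at $0.10 per GB-month for one year
print(f"${storage_cost(1, 0.10):,.0f}")
```

Every per-GB storage figure in this paper can be reproduced by changing the three arguments.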
EBS – Basic Overview
Elastic Block Store (EBS), functioning as the “hard drive” for an EC2 instance, provides block-level storage volumes which range from 1 GB to 1 TB. As a differentiator, EBS functions with less latency and less durability than S3. Although devices can be mounted to existing EC2 instances, the volumes are also network-attached and can persist independently. While each EBS volume can only be attached to a single instance at a time, multiple EBS volumes can be attached to an instance. EBS performance can be controlled, enhanced, and scaled by employing an optional Provisioned IOPS volume.
EBS volumes perform like raw, unformatted block devices. Users can overlay a file system or employ EBS as if it were a traditional hard drive (or any other block device). When EBS volumes are employed as boot partitions for EC2 instances, the boot partition data persists even after terminating the instance. (This enables users to quickly restart terminated instances.)
Currently, AWS offers EBS availability in US East (N. Virginia), US West (Oregon), US West (Northern California), EU (Ireland), Asia Pacific (Singapore), Asia Pacific (Tokyo), South America (Sao Paulo), and the GovCloud (US) Region.
GovCloud will not be addressed within this paper.
Launching EBS
When creating an EBS volume, users choose a volume type and size, decide whether to build the volume from a pre-existing snapshot, and assign the EBS volume to a specific Availability Zone (AZ). The EBS volume can then be attached to instances within that particular AZ.
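The choices made at volume creation can be summarized in a short sketch. It does not call AWS and is not an AWS API: the function `describe_volume`, its parameter names, and the returned dictionary keys are all hypothetical, and the limits are simply the ones quoted in this paper (1 GB to 1 TB, up to 1000 Provisioned IOPS). The zone name is only an example value.

```python
MIN_SIZE_GB = 1
MAX_SIZE_GB = 1024           # 1 TB upper bound quoted in this paper
MAX_PROVISIONED_IOPS = 1000  # per-volume limit quoted in this paper


def describe_volume(availability_zone, size_gb, volume_type="standard",
                    iops=None, snapshot_id=None):
    """Validate the creation-time choices and return them as a plain dict."""
    if not MIN_SIZE_GB <= size_gb <= MAX_SIZE_GB:
        raise ValueError("size must be between 1 GB and 1 TB")
    if volume_type not in ("standard", "provisioned_iops"):
        raise ValueError("volume type must be 'standard' or 'provisioned_iops'")
    if volume_type == "provisioned_iops":
        if iops is None or not 1 <= iops <= MAX_PROVISIONED_IOPS:
            raise ValueError("Provisioned IOPS volumes need 1 to 1000 IOPS")
    elif iops is not None:
        raise ValueError("IOPS applies only to Provisioned IOPS volumes")
    return {
        "availability_zone": availability_zone,  # volume attaches only within this AZ
        "size_gb": size_gb,
        "volume_type": volume_type,
        "iops": iops,
        "snapshot_id": snapshot_id,              # None means a blank volume
    }


print(describe_volume("us-east-1a", 100))
```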
EBS Volume Types
There are two types of volumes available for EBS: Standard volumes and Provisioned IOPS volumes. Standard volumes are intended for applications with moderate or bursty I/O requirements. Standard volumes deliver approximately 100 IOPS on average with peak ability to burst to hundreds of IOPS. This burst capability is also well suited for boot volumes as it enables rapid instance initiation.
Provisioned IOPS volumes are intended for I/O intensive workloads such as databases.
Importantly, for Provisioned IOPS, the IOPS rate which is specified when creating the volume remains effective for the lifetime of that particular volume. EBS currently supports up to 1000 IOPS per Provisioned IOPS volume. This quantity is expected to increase. However, regardless of the timing of the increase, multiple volumes can be striped together to deliver thousands of IOPS per Amazon EC2 instance.
Finally, EC2 instances which employ Provisioned IOPS should be launched as “EBS-Optimized” instances. This maximizes the performance by delivering dedicated throughput of 500 Mbps to 1000 Mbps. Attached EBS-Optimized Provisioned IOPS volumes are designed to deliver over 90% of the provisioned IOPS performance 99.9% of the time.
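As a rough sizing aid, the figures above can be turned into two small calculations. This is a sketch under stated assumptions: the function names are hypothetical, the 1000 IOPS per-volume limit and the 90% design target are the ones quoted in this paper, and striped volumes are assumed to add their IOPS linearly, which real workloads may not achieve.

```python
import math

MAX_IOPS_PER_VOLUME = 1000  # per-volume limit quoted in this paper


def volumes_needed(target_iops, per_volume=MAX_IOPS_PER_VOLUME):
    """Number of striped volumes required, assuming IOPS add linearly."""
    return math.ceil(target_iops / per_volume)


def design_floor(provisioned_iops, fraction=0.90):
    """IOPS level the volume is designed to exceed 99.9% of the time."""
    return provisioned_iops * fraction


print(volumes_needed(4000))
print(design_floor(1000))
```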
Durability
EBS durability is increased through the use of both volume replication and snapshots.
Volume replication occurs automatically within the AZ of the EBS volume. This prevents data loss due to physical failure.
Thus, volume durability from a replication perspective depends upon two factors: the volume size, and the quantity of data modification from the prior snapshot.
For example, volumes operating with under 20 GB of data modification since the prior Snapshot should expect an annual failure rate (defined as a complete volume loss) of between 0.1% and 0.5%. This makes EBS volumes roughly 10 times more reliable than a typical commodity disk drive. This level of durability is, however, still significantly less than that of even S3 RRS storage.
Snapshots function to further increase durability where replication fails. As noted, EBS volumes are contained and mirrored within a single AZ. This means that further mirroring data across multiple EBS volumes will not significantly increase durability.
Instead, users need to schedule regular volume snapshots. These snapshots are automatically stored in S3 and replicated across multiple AZs. A single snapshot allows for the regeneration of multiple volumes and devices.
Consequently, the user incurs only the loss of the data changes since the prior snapshot.
The entire volume is not lost.
EBS Costs
AWS delineates EBS service into three different types – Standard EBS Volumes, EBS Provisioned IOPS, and EBS Snapshots.
All three may accrue charges at different rates for (sometimes) duplicate services.
Costs for Standard and Provisioned IOPS volumes:
Standard EBS Volumes, the first service category, accrue charges both from the quantity of provisioned storage as calculated on a monthly basis and from the quantity of monthly I/O requests. For example, the current rate in US East (N. Virginia) is $0.10 per GB-month of provisioned storage and $0.10 per million I/O requests. The second portion represents the only quantity-based charge within the EBS pricing scheme.
When considering EBS costs, as AWS bills based upon usage, it is critically important for users to remember that EBS volumes can persist even after an instance is terminated.
For example, volumes containing an AMI will automatically persist. This means that, when terminating an instance, users may also need to delete the EBS volume and the S3-stored EBS Snapshots. Otherwise, costs will continue to accrue.
The storage price within Standard EBS Volumes will generally be the least expensive type of storage within EBS. Also, as in EC2, the price for US East (N. Virginia) will typically be the least expensive Region and South America (Sao Paulo) will typically be the most expensive Region.
EBS Provisioned IOPS Storage, the second service category, is more expensive than storage within EBS Standard. For example, Provisioned IOPS Storage in US East (N. Virginia) costs $0.125 per GB-month vs. only $0.10 per GB-month for Standard, a difference of 25%.
Provisioned IOPS Storage users are also charged a flat rate according to the performance speed they provision, pro-rated for the portion of the month during which it is provisioned. Depending upon Region, this charge ranges from $0.10 to $0.12 per provisioned IOPS per month. The example below illustrates the additional charges incurred when employing Provisioned IOPS Storage for 1 TB. It assumes continuously used maximum Provisioned IOPS speed (1000) and Standard Storage operating at its maximum non-burst rate of 100 I/O requests per second.
1 Year EBS Cost Comparison for 1 TB Data in US East (N. Virginia):
EBS Standard Storage
1 * 1024 * $0.10 * 12 = $1229
EBS I/O Requests [1]
(100 * 86,400 * 365 / 1,000,000) * $0.10 = $315
Total Standard Cost: $1544
EBS Provisioned IOPS Storage
1 * 1024 * $0.125 * 12 = $1536
EBS Provisioned IOPS-month [2]
$0.10 * 1 * 1000 * 12 = $1200
Total EBS Provisioned IOPS Cost: $2736
Annual Additional Cost for Using Provisioned IOPS (Continuously): $1192
[1] The Standard I/O calculation = I/O Rate per Second * # Seconds per Day * # Days, billed per million requests
[2] The Provisioned IOPS speed calculation = Price * (Per Cent of Month / 100) * Speed * # of Months
Provisioned IOPS users can reduce the $100 per month cost by provisioning only for the time period necessary or by provisioning at less than maximum speed. For example, if a user only needs the heightened performance for the last quarter of each month, the user can provision 1000 IOPS for that quarter and 100 IOPS for the remainder, reducing the monthly charge to $32.50. This results in the amended annual comparison as follows:
EBS Standard Storage
1 * 1024 * $0.10 * 12 = $1229
EBS I/O Requests [3]
(100 * 86,400 * 365 / 1,000,000) * $0.10 = $315
Total Standard Cost: $1544
EBS Provisioned IOPS Storage
1 * 1024 * $0.125 * 12 = $1536
EBS Provisioned IOPS-month [4]
$0.10 * 0.25 * 1000 * 12 = $300
$0.10 * 0.75 * 100 * 12 = $90
Total EBS Provisioned IOPS Cost: $1926
Annual Additional Cost vs. Standard (25% monthly usage): $382
Annual Saving by Properly Scheduling Provisioned IOPS: $810
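The comparison can be recomputed with a short script, which makes it easy to try other schedules. This is a sketch, not an AWS billing tool: the function names and the `schedule` structure are assumptions made for illustration, and the rates are the US East (N. Virginia) figures quoted in this paper.

```python
GB_PER_TB = 1024
SECONDS_PER_DAY = 86_400


def standard_annual_cost(tb, gb_price=0.10, io_rate=100,
                         io_price_per_million=0.10, days=365):
    """Storage charge plus I/O request charge for EBS Standard."""
    storage = tb * GB_PER_TB * gb_price * 12
    requests_in_millions = io_rate * SECONDS_PER_DAY * days / 1_000_000
    return storage + requests_in_millions * io_price_per_million


def provisioned_annual_cost(tb, schedule, gb_price=0.125, iops_price=0.10):
    """schedule is a list of (fraction_of_month, provisioned_iops) pairs."""
    storage = tb * GB_PER_TB * gb_price * 12
    iops_charge = sum(iops_price * fraction * speed * 12
                      for fraction, speed in schedule)
    return storage + iops_charge


standard = standard_annual_cost(1)
continuous = provisioned_annual_cost(1, [(1.0, 1000)])
scheduled = provisioned_annual_cost(1, [(0.25, 1000), (0.75, 100)])

print(f"Standard:               ${standard:,.0f}")
print(f"Provisioned, always on: ${continuous:,.0f}")
print(f"Provisioned, scheduled: ${scheduled:,.0f}")
print(f"Saving from scheduling: ${continuous - scheduled:,.0f}")
```

Passing a different `schedule` list shows how quickly the IOPS charge shrinks as the high-performance window narrows.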
[3] The Standard I/O calculation = I/O Rate per Second * # Seconds per Day * # Days
[4] The Provisioned IOPS speed calculation = Price * (Per Cent of Month / 100) * Speed * # of Months
For perspective, these are the annual costs associated with using EC2 Linux Instances under Reserved Light Usage 1-year terms in the same Region:
Standard Small Instance [5]: $69 + ($0.039 * 24 * 365) = $411
Standard Medium Instance: $138 + ($0.078 * 24 * 365) = $821
Standard Extra Large Instance: $552 + ($0.312 * 24 * 365) = $3285
High Memory Extra Large Instance: $353 + ($0.22 * 24 * 365) = $2280
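The instance figures follow the same pattern, an upfront charge plus an hourly rate for every hour of the year, and can be checked with a few lines. The helper name `reserved_annual_cost` is an assumption for illustration; the prices are the ones listed above.

```python
def reserved_annual_cost(upfront, hourly, hours_per_day=24, days_per_year=365):
    """Upfront charge + (hourly cost * hours per day * days per year)."""
    return upfront + hourly * hours_per_day * days_per_year


# (upfront charge, hourly rate) as listed in this paper
instances = {
    "Standard Small": (69, 0.039),
    "Standard Medium": (138, 0.078),
    "Standard Extra Large": (552, 0.312),
    "High Memory Extra Large": (353, 0.22),
}

for name, (upfront, hourly) in instances.items():
    print(f"{name}: ${reserved_annual_cost(upfront, hourly):,.0f}")
```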
These examples and comparisons illustrate that:
The incremental cost associated with employing Provisioned IOPS may represent a very significant portion of total AWS computing costs.
Users should consider their total I/O Request volume and the amount of monthly time requiring greater peak performance.
Users should compare the incremental costs of Provisioned IOPS carefully to determine if the additional performance warrants the expense.
Users should properly schedule Provisioned IOPS performance as the incremental cost of continual provisioning may exceed the cost of an EC2 instance.
Users should be aware that using EBS Standard offers savings that, in some cases, exceed the total cost of the EC2 instance employed.
[5] EC2 Instance price calculation = Upfront charge + (hourly cost * hours per day * days per year)
EBS Snapshot Costs:
EBS Snapshots, the third service category, represent the “hidden” cost of using EBS storage. On its face, EBS storage is less expensive than its S3 Standard counterpart and, given EBS’s lower latency, this might tempt users to consistently choose EBS over S3. However, that choice would be shortsighted. EBS offers significantly less redundancy than even S3 Reduced Redundancy Storage (RRS) and, consequently, data is significantly less durable. The expected EBS annual failure rate for an entire volume is 0.1% to 0.5%. This is 10 to 50 times greater than that of S3 RRS.
Compared to S3 Standard, the failure rate differential is even greater. Consequently, even non-critical data requires EBS Snapshots.
Unfortunately, Snapshots can more than double the cost of all data stored on EBS. For example, 5 TB of data stored in US West (Oregon) will cost $0.10 per GB-month for EBS Standard and then cost an additional $0.125 per GB-month when it is replicated within a Snapshot.
EBS Cost (excluding I/O Requests)
EBS Standard
5 * 1024 * $0.10 * 12 = $6144
EBS Snapshot
5 * 1024 * $0.125 * 12 = $7680
EBS Total = $13,824
Incremental Snapshot Cost = $7680
This cost is particularly important when contrasted with the cost of employing S3. As the following example illustrates, even after avoiding the incremental costs of Provisioned IOPS, EBS costs dwarf S3 costs. Using the same 5 TB example:
EBS Cost (excluding I/O Requests)
EBS Standard
5 * 1024 * $0.10 * 12 = $6144
EBS Snapshot
5 * 1024 * $0.125 * 12 = $7680
EBS Total = $13,824
S3 Standard Cost
5 * 1024 * $0.125 * 12 = $7680
Annual Incremental Cost of Using EBS Standard: $6144
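The same 5 TB comparison can be expressed as a short script, which makes the snapshot charge explicit as a separate term. The helper name is an assumption for illustration, and the sketch assumes the snapshot holds a full copy of the volume, as the example above does.

```python
GB_PER_TB = 1024


def annual_storage_cost(tb, price_per_gb_month):
    """# of TB * # GB in 1 TB * monthly price * 12 months."""
    return tb * GB_PER_TB * price_per_gb_month * 12


data_tb = 5
ebs_standard = annual_storage_cost(data_tb, 0.10)
ebs_snapshot = annual_storage_cost(data_tb, 0.125)  # full copy held as a Snapshot
s3_standard = annual_storage_cost(data_tb, 0.125)

ebs_total = ebs_standard + ebs_snapshot
print(f"EBS Standard + Snapshot: ${ebs_total:,.0f}")
print(f"S3 Standard:             ${s3_standard:,.0f}")
print(f"Incremental cost of EBS: ${ebs_total - s3_standard:,.0f}")
```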
As an aside, when considering Snapshots, users should be aware that the pricing only reflects the total current Snapshot. As AWS states, Snapshots are updated, rather than recreated. This means that users are not billed for both current and historical Snapshots.
Other Cost Considerations:
As described, excluding the choice between Standard and Provisioned IOPS, EBS costs are relatively plain vanilla. The charges are based upon information volume over a monthly basis. Unlike EC2, there are not different sizes or different purchase plans available. Users do not need to be concerned with over provisioning or pre-purchasing.
The cost will not be affected by either activity.
Users should also be aware of variation in Region price. Whether this differential should determine where a user locates resources depends upon that user’s usage patterns. For example, a big data/light compute user may find the EBS Region price difference compelling while a small data/heavy compute user will defer to EC2 pricing.
At this time, however, that trade-off comparison is purely theoretical as the hierarchy of Region pricing for both EBS and EC2 is identical. US East (N. Virginia) is always the least expensive Region, South America (Sao Paulo) is always the most expensive Region, and the other Regions share the same pricing order for both services.
S3 – Basic Overview
AWS presents two options for storage service under the S3 banner: Standard and RRS. The only material differences between the options are the level of redundancy and the cost associated with usage.
To create a bucket, users give the bucket a name and select a Region to house the bucket. Once the bucket is created, objects can be uploaded to the bucket using either standard storage, or RRS. A bucket can contain objects utilizing both types of storage. The properties of the bucket, including access controls, notifications and object lifecycle rules can be established upon creation.
Currently, AWS offers S3 availability in US Standard, US West (Oregon), US West (Northern California), EU (Ireland), Asia Pacific (Singapore), Asia Pacific (Tokyo), South America (Sao Paulo), and the GovCloud (US) Region. Using network maps, the US Standard Region automatically routes requests to facilities in either Northern Virginia or the Pacific Northwest.
For the purposes of this paper, we will not address the GovCloud Region.
S3 – Security and Durability
S3 provides authentication mechanisms that help keep data secure from unauthorized access. Objects can be made private or public, and rights can be granted to specific users. To further control access, S3 offers fine-grained IAM and ACL controls, Bucket policies, and query string authentication. S3 contains features to help protect against both physical and logical data loss. Additional features also enable SSL for data in transit and SSE for data at rest. Finally, S3 also provides access logging, which records all requests made against a bucket.
S3 Standard storage provides a highly durable storage infrastructure by using built-in redundancy and storing objects on multiple devices across multiple facilities within the designated S3 Region. Most importantly, S3’s standard storage SLA promises 99.999999999% durability and 99.99% availability of objects over a given year while also sustaining the concurrent loss of data in two facilities.
S3 RRS stores data at lower levels of redundancy than Amazon S3’s Standard storage. Like Standard, RRS stores objects on multiple devices across multiple facilities. Although RRS does not replicate objects as many times as Standard, it still provides over 10 times the durability of EBS Standard. The SLA states that RRS provides 99.99% durability and 99.99% availability of objects over a given year.
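To put the durability percentages on a common scale, they can be converted into expected annual losses. This is an illustrative sketch only: the constant and function names are assumptions, the rates are the figures quoted in this paper, and treating a durability percentage as an independent per-object annual loss probability is a simplification.

```python
# Annual loss rates as fractions, taken from the figures quoted in this paper
EBS_AFR_LOW, EBS_AFR_HIGH = 0.001, 0.005  # 0.1% to 0.5% annual volume failure rate
RRS_LOSS_RATE = 1e-4                      # 99.99% durability
STANDARD_LOSS_RATE = 1e-11                # 99.999999999% durability


def expected_annual_losses(item_count, annual_loss_rate):
    """Average number of items lost per year at the given loss rate."""
    return item_count * annual_loss_rate


def relative_risk(rate, baseline):
    """How many times more likely a loss is than under the baseline rate."""
    return rate / baseline


print(f"RRS, 10,000 objects: {expected_annual_losses(10_000, RRS_LOSS_RATE):.2f} lost per year")
print(f"EBS vs RRS: {relative_risk(EBS_AFR_LOW, RRS_LOSS_RATE):.0f}x "
      f"to {relative_risk(EBS_AFR_HIGH, RRS_LOSS_RATE):.0f}x")
```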
S3 Costs
S3 Standard and RRS services and costs are divided into three categories: Storage pricing, Request pricing, and Transfer pricing. As is to be expected, all of these are determined by usage within a particular Region. For example, Storage pricing has breakpoints at 1 TB, 49 TB, 450 TB, and beyond. Each step comes with an approximately 10% price reduction. As a basic rule, users should employ S3 Bucket configurations to automatically delete unnecessary data. Do not save extra data: S3 bills according to quantity.
Most importantly, RRS is significantly less expensive than Standard storage. On average the differential is nearly 30%.
1 Year Cost Comparison of 5 TB Data in US West (Oregon):
S3 Standard
5 * 1024 * $0.125 * 12 = $7680
S3 RRS
5 * 1024 * $0.093 * 12 = $5714
Annual Savings by Using RRS: $1966
Transfer Costs and Region Differentiation
Unlike EBS, S3 users are not required to locate data within the same AZ as an instance. However, given the transmission costs, users should generally link data storage and EC2 instances within a single AZ. As our example indicates, if greater than 10-15% of stored data will be accessed and transferred, S3 storage should remain located in the same Region as the user’s EC2 instances.
The following comparison uses the current S3 Standard storage charges for 8 TB of data: $0.14 per GB-month in US West (N. California) and $0.125 per GB-month in US Standard.
1 Year Region Storage Cost Comparison:
US West (N. California)
8 * 1024 * $0.14 * 12 = $13,763
US Standard
8 * 1024 * $0.125 * 12 = $12,288
Annual Region Savings before Transfer = $1475
Maximum Monthly Data Transfer before Break-even [6]
$1475 / (1024 * $0.12 * 12) = 1 TB
As shown, whether the differential in Region pricing should determine where a user locates resources depends upon that user’s usage patterns. If transfer costs exceed the break-even point, users may still want to consider relocating resources to the S3 Region.
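The break-even point can be recomputed for other data volumes and Region prices with a short script. This is a sketch under the paper's stated figures: the function names are assumptions, and the $0.12 per GB transfer charge is the rate used in the calculation above.

```python
GB_PER_TB = 1024


def annual_storage_cost(tb, price_per_gb_month):
    """# of TB * # GB in 1 TB * monthly price * 12 months."""
    return tb * GB_PER_TB * price_per_gb_month * 12


def breakeven_monthly_transfer_tb(annual_savings, transfer_price_per_gb=0.12):
    """Monthly transfer volume (TB) at which transfer charges cancel the savings."""
    return annual_savings / (GB_PER_TB * transfer_price_per_gb * 12)


savings = annual_storage_cost(8, 0.14) - annual_storage_cost(8, 0.125)
print(f"Annual Region savings: ${savings:,.0f}")
print(f"Break-even transfer:   {breakeven_monthly_transfer_tb(savings):.2f} TB per month")
```

For the 8 TB example, the break-even of about 1 TB per month corresponds to roughly 12.5% of the stored data, in line with the 10-15% guideline stated above.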
For example, a big data/ light compute user may find the S3 Region price difference compelling while a small data/ heavy compute user will defer to EC2 pricing.
At this time, however, a trade-off comparison is purely theoretical. The hierarchy of Region pricing for both S3 and EC2 is nearly identical, with US East (N. Virginia) always the least expensive Region and South America (Sao Paulo) always the most expensive Region.
RRS Costs
6