(1)

Berlin 2015

Storage, Backup and Disaster Recovery in the Cloud

AWS Customer Case Study: HERE "Maps for Life"

(2)

©2015, Amazon Web Services, Inc. or its affiliates. All rights reserved

Storage, Backup and Disaster

Recovery in the Cloud

Robert Schmid, Storage Business Development, AWS

Ali Abbas, Principal Architect, HERE

Case Study: AWS Customer HERE

(3)

What we will cover in this session

Amazon storage options

Amazon Elastic File System

Use cases (Backup, Archive, DR)

Customer Use Case: HERE

(4)

102% year-over-year increase in

data transfer to and from S3

(Q4 2014 vs Q4 2013, not including Amazon use)

(5)

Amazon S3

(6)

$0.03

per GB-month

$360

per TB/year

99.999999999%

durability

Amazon S3
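For reference, the per-TB figure on this slide is just the per-GB-month price scaled up; a quick sanity check (assuming a decimal terabyte of 1,000 GB):

    # $0.03 per GB-month * 1,000 GB per TB * 12 months = $360 per TB per year
    print(0.03 * 1000 * 12)   # 360.0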

(7)

Amazon Glacier

(8)

$0.01

per GB-month

$120

per TB/year

99.999999999%

durability

Amazon Glacier

Low-cost archiving service

3–5 hour retrieval time

(9)

Amazon EBS

(10)

EBS

General Purpose (SSD): up to 16 TB, 10,000 IOPS, $0.10 per GB-month

Provisioned IOPS (SSD): up to 16 TB, 20,000 IOPS, $0.125 per GB-month plus $0.065 per provisioned IOPS-month
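As a quick illustration of the two SSD volume types above, either can be created with a single boto3 call; the region, availability zone, sizes, and IOPS figure below are arbitrary example values, not recommendations:

    import boto3

    ec2 = boto3.client("ec2", region_name="eu-central-1")   # example region

    # General Purpose (SSD): specify only the size; baseline IOPS scale with size
    gp2 = ec2.create_volume(AvailabilityZone="eu-central-1a",
                            Size=500, VolumeType="gp2")

    # Provisioned IOPS (SSD): specify the size plus an explicit IOPS figure
    io1 = ec2.create_volume(AvailabilityZone="eu-central-1a",
                            Size=500, VolumeType="io1", Iops=10000)

    print(gp2["VolumeId"], io1["VolumeId"])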

(11)
(12)

Storage Gateway

Your on-ramp to AWS cloud storage:

• Back up into S3

• Archive into Amazon Glacier

(13)

Summary: AWS Storage Options

• Object Storage (S3, Glacier)

• Elastic Block Storage (EBS)

• Storage Gateway (iSCSI, VTL)

(14)

Introducing

Amazon Elastic File System

for EC2 Instances

pilot availability later this summer

(15)

What is EFS?

• Fully managed file system for EC2 instances

• Provides standard file system semantics (NFSv4)

• Elastically grows and shrinks, up to petabyte scale

• Delivers performance for a wide variety of workloads

• Highly available and durable

simple

elastic

scalable

(16)

Amazon Storage Use Cases:

(17)

Diagram: Backup, Archive, and Disaster Recovery. Block and file data from the customer data center reaches S3 and Glacier in the AWS Cloud over the Internet, via AWS Direct Connect (with private storage for AWS in a colocation data center), or through AWS Storage Gateway.

(18)

AWS Customer Case Study

HERE: Maps for Life

Ali Abbas, Principal Architect

• High Resolution Satellite Imagery

• Predictive Analytics/Machine Learning

ali.abbas@here.com

(19)


(20)


Maps for Life

Web and Mobile App

(21)

Offline Map

Save the maps of your country or state on your phone

Use your phone offline. Explore anywhere without an internet connection.

(22)

Unified Route Planning, Route Alternatives, Turn-by-turn Navigation

(23)

Urban Navigation

Route Alternatives, Step-by-step transit, Turn-by-turn walk guidance

(24)

Collections, Easy location sharing

(25)

Train Schedule, Traffic incidents, 3D Maps

(26)

Slide graphic: Reality Capture, Processing, Satellite/Aerial Delivery, Enterprise Businesses.

(27)

99.99% availability, 99.999999999% durability

High throughput and good performance for most use cases

Good price ratio

(28)


(29)


(30)

Billions of tiles

• Huge storage requirements due to high-resolution content across zoom levels

• A very large number of small tiles to keep track of and deliver

• Exponential growth rate (today some billions, tomorrow some trillions; see the sketch after this list)

• Increasing data refresh rate

• Low-latency requirements and service level agreements to maintain
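To make the growth rate concrete: a full square tile pyramid has 4^z tiles at zoom level z, so counts explode by orders of magnitude every few zoom levels. A minimal sketch (the exact tile scheme and zoom levels HERE serves are not stated here):

    # Rough illustration: a full square tile pyramid has 4**z tiles at zoom level z.
    for z in (10, 15, 17, 20):
        print(f"zoom {z}: {4**z:,} tiles")
    # zoom 15 alone is ~1.07 billion tiles; zoom 20 is ~1.1 trillion.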

(31)

Behind the curtain

• Specialized spatial file system to deliver tile imagery with sub-ms lookup time over the network

• Simple architecture with CDN caches and core sites (holding the full dataset)

• Remote sites ran CDN-type caches with geospatial sharding placement algorithms

• Some cache regions occasionally suffered from inter-continental network latency due to non-optimized routing

(32)

Diagram: core sites and caches, with a Shared Store (specialized spatial blob store) and a Singleton Store (specialized adaptive spatial blob store with a Mercator-based sharding layer).

(33)

Given the success of S3 usage across HERE and the recent enhancements to the offering, we started to look at S3 to solve two main problems with one solution:

Simplify the storage handling layer by removing the storage compute from our architecture and simplifying operations.

Reduce the network latency from core data to our delivery instances by adding a core data presence in each region.

(34)

• Easy life-cycle management for recurring updates (see the sketch after this list)

• On-demand big-data storage capacity (eases capacity planning)

• Easy pipeline integration with SQS/SNS for background jobs

• Good performance out of the box, but it did not fulfill our requirements:

- Too much variation in response time, ~150–300 ms on average
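For context on the life-cycle point above, S3 life-cycle rules can be attached to a bucket with a short boto3 call; the bucket name, prefix, and expiration period below are illustrative assumptions, not HERE's actual configuration:

    import boto3

    s3 = boto3.client("s3")

    # Illustrative rule: expire superseded objects under a hypothetical "tiles/"
    # prefix 90 days after a recurring update; all names and periods are made up.
    s3.put_bucket_lifecycle_configuration(
        Bucket="example-tile-bucket",
        LifecycleConfiguration={
            "Rules": [
                {
                    "ID": "expire-old-tiles",
                    "Filter": {"Prefix": "tiles/"},
                    "Status": "Enabled",
                    "Expiration": {"Days": 90},
                }
            ]
        },
    )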

(35)

Amazon S3 maintains an index of object key names in each AWS region. Object keys are stored lexicographically across multiple partitions in the index. That is, Amazon S3 stores key names in alphabetical order. The key name dictates which partition the key is stored in. Using a sequential prefix, such as timestamp or an alphabetical sequence, increases the likelihood that Amazon S3 will target a specific partition for a large number of your keys, overwhelming the I/O capacity of the partition.

http://docs.aws.amazon.com/AmazonS3/latest/dev/request-rate-perf-considerations.html
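A common mitigation that follows from this guidance is to randomize the leading characters of the key, for example by prepending a short hash-derived prefix. This is a generic sketch of that idea, not the scheme HERE ultimately adopted:

    import hashlib

    # Generic sketch: spread otherwise-sequential tile IDs across S3 index
    # partitions by prepending a short hash-derived prefix to each key.
    def hashed_key(tile_id: str, prefix_len: int = 4) -> str:
        prefix = hashlib.md5(tile_id.encode()).hexdigest()[:prefix_len]
        return prefix + "/" + tile_id

    print(hashed_key("15/18106/11272"))   # -> "<4 hex chars>/15/18106/11272"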

(36)


(37)

S3 load constraint + satellite

Satellite example tile ID: 15/18106/11272 (z/x/y), quadkey representation 302013232331232; likewise 15/18089/11275 → 302013232321201 and 17/72409/45094 → 30201323233033003.

Stored lexicographically across S3 partitions.

(38)

S3 load constraint + satellite

Satellite example tile ID: 15/18106/11272 (z/x/y), quadkey 302013232331232; stored lexicographically across S3 partitions.

Each zoom level has 4^level_of_detail tiles, and a quadkey's length is equal to the level of detail of the corresponding tile.
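To make the tile-ID-to-quadkey mapping concrete, here is a minimal sketch of the usual bit interleaving; it assumes a tile scheme whose y origin is at the bottom (flipping y is what reproduces the 15/18106/11272 → 302013232331232 example above), which may differ from HERE's internal convention:

    # Sketch: convert a z/x/y tile ID to its quadkey (one digit per zoom level).
    def tile_to_quadkey(z: int, x: int, y: int) -> str:
        y = (1 << z) - 1 - y          # assumed bottom-origin y axis (see note above)
        digits = []
        for i in range(z, 0, -1):
            mask = 1 << (i - 1)
            digit = 0
            if x & mask:
                digit += 1
            if y & mask:
                digit += 2
            digits.append(str(digit))
        return "".join(digits)

    print(tile_to_quadkey(15, 18106, 11272))   # -> 302013232331232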

(39)

S3 load constraint + satellite

Stored lexicographically across S3 partitions.

Alternative to quadkeys: use a random hash and increase the base number.

Remaining problem: at satellite scale, the ratio of requests relative to the lexicographic overlap produced by a random hash was still significant and would not scale well.

Performance was still unacceptable in light of our requirements, and billions of PUT requests would considerably increase the cost of recurring updates.

(40)

S3 load constraint + satellite

Stored lexicographically across S3 partitions.

Better solution: reduce the number of files by packing tiles into binary blobs on S3, index the tiles inside the blobs, and use HTTP range requests for access (see the sketch after this slide).

New challenge: managing updates became more complicated, more logic is required to distribute tiles inside the blobs, and, more importantly, the predicted index size was on the order of terabytes and growing… cost and complexity overhead.
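As an illustration of the range-request access pattern described above, the following minimal boto3 sketch reads a single tile out of a packed blob, given an (offset, length) entry from some tile index; the bucket, key, and offsets are hypothetical:

    import boto3

    s3 = boto3.client("s3")

    # Fetch one tile's bytes from inside a packed blob object via an HTTP Range request.
    def read_tile(bucket: str, blob_key: str, offset: int, length: int) -> bytes:
        resp = s3.get_object(
            Bucket=bucket,
            Key=blob_key,
            Range=f"bytes={offset}-{offset + length - 1}",   # inclusive byte range
        )
        return resp["Body"].read()

    # Example call with made-up names and offsets
    tile = read_tile("example-imagery-bucket", "blobs/302013232.bin", 1048576, 24576)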

(41)


(42)

New Pseudo-Quad Index

• A new compact O(1) data structure to work around the performance constraints of S3

• It minimizes the index size needed to keep track of tiles and random hashes

• 194.605% size reduction in comparison to generic optimized hash tables

• It reduces and sets boundaries for proximity regions, causing better dispersion in the n-gram load-split algorithm used by S3

• Simplified imagery updates; geometrical consistency across all S3 buckets

• Performance:

• S3: >150–300 ms

(43)

With S3 and PQI we have simplified our architecture

Diagram: PQI backend with a tiny ref file.

(44)

Impact on Architecture

Impact on the day-to-day operation of our services

Brings us geographically closer to our customers while not compromising on design patterns to work around network latencies.

Allows us to focus only on our core business and technologies while offloading…

(45)

©2015, Amazon Web Services, Inc. or its affiliates. All rights reserved

Thank you!

Please meet our Sponsors/Partners and see us in the EXPO area.

(46)

Further information:

http://aws.amazon.com/solutions/
