Ryan Horn, Lead Software Engineer at Twilio. November 12, 2014 Las Vegas. BDT312 Using the Cloud to Scale from a Database to a Data Platform


(1)

© 2014 Amazon.com, Inc. and its affiliates. All rights reserved. May not be copied, modified, or distributed in whole or in part without the express consent of Amazon.com, Inc.

November 12, 2014 | Las Vegas

BDT312

Using the Cloud to Scale from a Database to a Data Platform

(2)

Hi, I’m Ryan

(3)

What is Twilio?

We provide a communications API that enables phones, VoIP, and messaging to be embedded into web, desktop and mobile software.

(4)

How Does it Work?

A user calls your number

(5)

What is the User Data Team?

• We scale Twilio's backend database infrastructure
• We build customer facing data APIs

(6)
(7)

Calls and Messages are Stateful

Calls: Queued → Ringing → In Progress → Completed

Messages: Queued → Sending → Sent → Delivered
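
The lifecycle on this slide can be sketched as two small state machines. This is a minimal illustration only: the strictly linear progression, the names `CallState`, `MessageState`, `next_state` and `is_in_flight`, and the idea that "in flight" means "not yet in the final state" are assumptions, not Twilio's actual model (real calls and messages can also fail or be canceled).

```python
from enum import Enum


class CallState(Enum):
    QUEUED = "queued"
    RINGING = "ringing"
    IN_PROGRESS = "in-progress"
    COMPLETED = "completed"


class MessageState(Enum):
    QUEUED = "queued"
    SENDING = "sending"
    SENT = "sent"
    DELIVERED = "delivered"


def next_state(state):
    """Return the state that follows `state`, or None if it is the last one.

    Assumes a strictly linear lifecycle in definition order (hypothetical).
    """
    members = list(type(state))  # Enum members iterate in definition order
    index = members.index(state)
    return members[index + 1] if index + 1 < len(members) else None


def is_in_flight(state):
    """A record is treated as in flight until it reaches its final state."""
    return next_state(state) is not None
```

Under these assumptions, `is_in_flight(CallState.RINGING)` is true and `is_in_flight(MessageState.DELIVERED)` is false, which is the distinction the later in-flight and post-flight services rely on.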

(8)

In the Beginning…

All data was placed in the same physical database regardless of where the call or message was in its lifecycle.

(9)

The Monolithic Database Model

[Diagram: API, Web, Billing, MySQL, Call/Message Service, Carriers]

(10)

Problems at Scale

• Many consumers of data
• Data with different performance characteristics
• Failure in the database degrades many services
• Horizontal scaling and orchestration is

(11)
(12)

What is a Service-Oriented Architecture?

An architecture in which required system behavior is decomposed into discrete units of functionality, implemented as individual services for applications to compose and consume.

(13)

Communicate Through Interfaces, Not Databases

[Diagram: API, Web, Billing, Call/Message Service, In Flight Service, In Flight MySQL, Post Flight Service, Post Flight MySQL, Carriers]

(14)

Database Can Change Without Changing Every Service

[Diagram: API, Web, Billing, Call/Message Service, In Flight Service, In Flight MySQL, Post Flight Service, Amazon DynamoDB, Carriers]
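
A rough sketch of why a service interface lets the database change underneath its consumers. Everything named here (`PostFlightStore`, `InMemoryStore`, `PostFlightService`, the record fields) is hypothetical; the in-memory backend stands in for MySQL or Amazon DynamoDB so the example runs without either.

```python
from typing import Dict, List, Protocol


class PostFlightStore(Protocol):
    """The storage contract the service depends on (hypothetical)."""

    def save(self, account_id: str, record: dict) -> None: ...

    def list_for_account(self, account_id: str) -> List[dict]: ...


class InMemoryStore:
    """Stand-in backend; a MySQL- or DynamoDB-backed class would expose the same methods."""

    def __init__(self) -> None:
        self._rows: Dict[str, List[dict]] = {}

    def save(self, account_id: str, record: dict) -> None:
        self._rows.setdefault(account_id, []).append(record)

    def list_for_account(self, account_id: str) -> List[dict]:
        return list(self._rows.get(account_id, []))


class PostFlightService:
    """Consumers (API, Web, Billing) call this service, never the database."""

    def __init__(self, store: PostFlightStore) -> None:
        self._store = store

    def record_completed_call(self, account_id: str, call_sid: str) -> None:
        self._store.save(account_id, {"sid": call_sid, "status": "completed"})

    def calls(self, account_id: str) -> List[dict]:
        return self._store.list_for_account(account_id)
```

Swapping the backend means passing a different store to `PostFlightService`; callers of `record_completed_call` and `calls` do not change.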

(15)

SOA Doesn’t Solve Everything

No matter how many services you put in front of MySQL, it’s still a single point of failure.

(16)
(17)

Implementing Sharding (the easy part)

1. Choose partitioning scheme
2. Implement routing logic
3. Send application queries through router
4. Go!
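
The first three steps above can be sketched as a small range-based router. This is an illustration under assumptions, not Twilio's router: the deck does not say which partitioning scheme was chosen, so the ten hash buckets, the use of `zlib.crc32`, and the names `bucket_for` and `ShardRouter` are all hypothetical.

```python
import bisect
import zlib


def bucket_for(account_id: str, buckets: int = 10) -> int:
    """Map an account ID to a stable bucket number in [0, buckets)."""
    # crc32 is stable across processes, unlike Python's built-in hash().
    return zlib.crc32(account_id.encode("utf-8")) % buckets


class ShardRouter:
    """Routes a bucket to the shard that owns the range containing it."""

    def __init__(self, ranges):
        # ranges: iterable of (lowest_bucket, shard_name) pairs.
        self._ranges = sorted(ranges)
        self._lows = [low for low, _ in self._ranges]

    def shard_for_bucket(self, bucket: int) -> str:
        # Find the last range whose lower bound is <= bucket.
        index = bisect.bisect_right(self._lows, bucket) - 1
        if index < 0:
            raise KeyError(f"no shard covers bucket {bucket}")
        return self._ranges[index][1]

    def shard_for_account(self, account_id: str) -> str:
        return self.shard_for_bucket(bucket_for(account_id))


# Hypothetical layout: buckets 0-4 on shard0, buckets 5-9 on shard1.
router = ShardRouter([(0, "shard0"), (5, "shard1")])
```

Keeping the range table in one place means that splitting a shard is a change to the router's data, not to application queries.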

(18)

Sharding at Twilio

[Diagram: Application, Router, Shard0, Shard1, Shard2; key ranges 0-3, 3-6]

(19)

Rolling it Out With Zero Downtime (the hard part)

• We provide a 24/7, always on service

• Communications is intolerant of inconsistency and latency

(20)

Bringing Up a New Shard

[Diagram: Application, Master1, Slave1, Master2, Slave2; key range 0-9]

(21)

Split Odds and Evens for Writes

[Diagram: Application, Master1, Slave1, Master2, Slave2; writes split into Odds and Evens; key range 0-9]
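
The slide does not say how odds and evens are assigned. One plausible reading, shown here purely as an assumption, is that each master generates IDs from an interleaved sequence so that both can accept writes during the split without colliding (MySQL offers this through its `auto_increment_increment` and `auto_increment_offset` settings). The helper `id_sequence` is hypothetical.

```python
import itertools


def id_sequence(offset: int, increment: int = 2):
    """Yield offset, offset + increment, offset + 2 * increment, ..."""
    return itertools.count(start=offset, step=increment)


# Assumed assignment: Master1 hands out odd IDs, Master2 even IDs.
master1_ids = id_sequence(offset=1)
master2_ids = id_sequence(offset=2)

first_odds = list(itertools.islice(master1_ids, 3))
first_evens = list(itertools.islice(master2_ids, 3))
```

Because the two sequences never overlap, rows written on either master can replicate to the other without primary key conflicts.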

(22)

Update Routing

[Diagram: Application, Master1, Slave1, Master2, Slave2; Odds and Evens; key ranges 0-4 and 5-9]

(23)

Cut Slave Link

[Diagram: Application, Master1, Slave1, Master2, Slave2; key ranges 0-4 and 5-9]

(24)
(25)

A Necessary Burden

In the beginning, the burden of managing our own databases was non-negotiable.

(26)

The Landscape has Changed

We now have a variety of managed database services which solve these problems for us, such as Amazon RDS, Amazon DynamoDB, Amazon SimpleDB, Amazon Redshift, etc.

(27)

Cost Is Never Optimized

Application developers do not (and should not) optimize for database cost.

(28)

Self Managed Databases are Costly

Everything Else 22%

Databases 78%

(29)

Keeping up With Growth

As growth continues to accelerate, we need to somehow keep up.

(30)

A Change in Approach

• Change our hiring practices and bring in specialists
• Remove the context switching

(31)
(32)
(33)

Thinking in Terms of Throughput

Amazon DynamoDB allows us to scale in terms of throughput, not machines. This is the future of

(34)

Operations

Management and scaling of our cluster is fully abstracted away from us.

(35)

Cost Compared to MySQL

MySQL 82%

Amazon DynamoDB 18%

(36)

Cost with MySQL Fully Replaced

Everything Else 61%

Databases 39%

(37)

A Relational Model with Amazon DynamoDB

Many of our services allow for querying data in a way that maps naturally to a relational database.

(38)

GET /Accounts/2/Events

(39)

SELECT * FROM events WHERE IpAddress='5.6.7.8' ORDER BY date DESC;

(40)

SELECT * FROM events WHERE IpAddress='5.6.7.8' AND Date<='2014-10-03' ORDER BY date DESC;

(41)

AccountId (Hash)  Date (Range)  IpAddress_Date      Type
2                 2014-10-03    5.6.7.8|2014-10-03  call
2                 2014-10-01    5.6.7.8|2014-10-01  message

GET /Accounts/2/Events

(42)

AccountId (Hash)  IpAddress_Date (Range)  Date        Type
2                 5.6.7.8|2014-10-03      2014-10-03  call
2                 5.6.7.8|2014-10-01      2014-10-01  message

GET /Accounts/2/Events?IpAddress=5.6.7.8

(43)

AccountId (Hash)  IpAddress_Date (Range)  Date        Type
2                 5.6.7.8|2014-10-03      2014-10-03  call
2                 5.6.7.8|2014-10-01      2014-10-01  message

GET /Accounts/2/Events?IpAddress=5.6.7.8&Date<=2014-10-03
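
The composite `IpAddress_Date` range key turns both SQL filters into a single range condition on one key. The sketch below simulates that in plain Python over an in-memory list, so it runs without DynamoDB; `composite_key`, `query_events` and the third sample row are hypothetical. In DynamoDB itself, the equivalent would be a query on the hash key with a `begins_with` or `BETWEEN` condition on the range key, read in descending order with `ScanIndexForward` set to false.

```python
from typing import List, Optional

EVENTS = [
    {"AccountId": 2, "IpAddress_Date": "5.6.7.8|2014-10-03", "Date": "2014-10-03", "Type": "call"},
    {"AccountId": 2, "IpAddress_Date": "5.6.7.8|2014-10-01", "Date": "2014-10-01", "Type": "message"},
    # Hypothetical extra row from another IP, to show that it is filtered out.
    {"AccountId": 2, "IpAddress_Date": "9.9.9.9|2014-10-02", "Date": "2014-10-02", "Type": "call"},
]


def composite_key(ip: str, date: str) -> str:
    """Build the range key; ISO dates sort correctly as plain strings."""
    return f"{ip}|{date}"


def query_events(items: List[dict], account_id: int, ip: str,
                 max_date: Optional[str] = None) -> List[dict]:
    """Return events for one account and IP, newest first."""
    low = composite_key(ip, "")  # every key for this IP sorts at or after this
    # Without a date bound, use a sentinel that sorts after any real date.
    high = composite_key(ip, max_date if max_date else "\uffff")
    matches = [
        item for item in items
        if item["AccountId"] == account_id
        and low <= item["IpAddress_Date"] <= high
    ]
    return sorted(matches, key=lambda item: item["IpAddress_Date"], reverse=True)
```

Calling `query_events(EVENTS, 2, "5.6.7.8")` corresponds to the first request above, and adding `max_date="2014-10-03"` corresponds to the second.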

(44)

Need to Handle Exceeded Throughput Failures

(45)

Handling Exceeded Write Throughput with Amazon SQS

Queuing events to Amazon SQS and processing them asynchronously allows us to gracefully deal with write throughput errors.

(46)

[Diagram: API, Web, Billing, Amazon SQS, Events Processor, Amazon DynamoDB]
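
A minimal simulation of the buffering idea, assuming nothing about Twilio's actual Events Processor. A `deque` stands in for Amazon SQS and `ThrottledTable` for a DynamoDB table with a fixed write budget per time slice; both classes, `ThroughputExceeded` and `drain` are hypothetical. With the real services, the failure would surface as DynamoDB's `ProvisionedThroughputExceededException`, and an SQS message that is not deleted becomes visible again after its visibility timeout.

```python
from collections import deque


class ThroughputExceeded(Exception):
    """Stand-in for a write rejected because provisioned throughput ran out."""


class ThrottledTable:
    """Fake table that accepts a fixed number of writes per tick."""

    def __init__(self, writes_per_tick: int) -> None:
        self.writes_per_tick = writes_per_tick
        self._budget = writes_per_tick
        self.items = []

    def put_item(self, item) -> None:
        if self._budget == 0:
            raise ThroughputExceeded()
        self._budget -= 1
        self.items.append(item)

    def tick(self) -> None:
        """Start a new time slice with a fresh write budget."""
        self._budget = self.writes_per_tick


def drain(queue: deque, table: ThrottledTable) -> int:
    """Write queued events until the queue is empty or the table throttles.

    Returns the number of events written. A throttled event stays at the
    head of the queue, so it is retried on the next call instead of lost.
    """
    written = 0
    while queue:
        try:
            table.put_item(queue[0])
        except ThroughputExceeded:
            break
        queue.popleft()  # remove only after a successful write
        written += 1
    return written
```

Producers keep enqueueing at their own pace; the processor calls `drain` repeatedly, and a burst above the table's write capacity shows up as queue depth rather than as errors returned to the API, Web or Billing callers.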

(47)

Maximum of 5 Global and 5 Local Indexes

You can manage your own indexes, but your

(48)

Local Index Size Limits

Local secondary indexes provide immediate consistency… and limit the data set for a given hash key to 10 GB.

(49)
(50)

Brief History

2008 - 2011

All business intelligence queries run on replicas of MySQL clusters serving production traffic.

(51)

Brief History

2011 - 2013

Data pushed to Amazon S3 and queried with Pig on Amazon EMR, improving our ability to aggregate, but with high latency.

(52)

Brief History

2013 - Present

Move to Amazon Redshift cut the time these reports took from hours to seconds allowing us to answer

(53)

Pushing Data Into Amazon Redshift

[Diagram: Post Flight Service, Kafka, SQS (DLQ), Amazon S3 Loader, S3, Warehouse Loader, Amazon Redshift]

(54)
(55)

Managed Services as a Culture

Our focus on creating an experience that unifies and simplifies communications is reflected in our adoption of managed services.

(56)

Managed Services as a Culture

Understanding and focusing on our areas of expertise and leveraging managed services for the rest accelerates the delivery of value and innovation to our customers.

(57)
(58)
