• No results found

MySQL: Cloud vs Bare Metal, Performance and Reliability

N/A
N/A
Protected

Academic year: 2021

Share "MySQL: Cloud vs Bare Metal, Performance and Reliability"

Copied!
34
0
0

Loading.... (view fulltext now)

Full text

(1)

MySQL: Cloud vs Bare Metal, Performance and Reliability

Los Angeles MySQL Meetup

(2)

Let’s meet each other

• Performance geek

– All kinds MySQL and some Sphinx

• Working for Blackbird • Twitter @vfedorkov

(3)

Credits and thank yous

• LAMySQL Meetup

– Joe Devon

– Caroline Rose

• Blackbird, AWS advances tech partner

– Laine Campbell

– http://blackbird.io

(4)

How does it start

• Startup deciding to go with MySQL • Getting successful

• That server gets slower and slower • Now what to do?

(5)

Usual ways to grow

• Buy bigger server (scale up) • Buy another server (scale out) • Tune up your application

(6)

Questions to answer

• Which server to buy?

– Where?

• How long does it take to get new server? • How long does it take to make it work? • Why it didn’t help?

(7)

Which server to get?

• How much would infrastructure will cost me? • Which performance will I get?

• How easy would be to add more servers? • How much would it cost me?

(8)
(9)

Physical servers

• Predictable performance • Raw infrastructure

• Takes long to install • Old-style operations

(10)

Amazon EC2

• Virtually provisioned boxes

– Managed infrastructure

• Boxes • Network

• Attached disks

– Web/API bound operations

• Strong scalability suport

– Full OS access

• Flexible performance

(11)

Amazon RDS

• Fully managed MySQL instances (SaaS)

– Oracle, MSSQL, RedShift, etc. – No OS access

• Same on-demand scalability as EC2

– Scripted scaling up/down

(12)

Resources

Bare metal EC2 RDS

CPU Physical Virtual, limited

Virtual, limited

RAM Physical Shared Shared

IO Predictable Network-based, limited

Network-based, limited

(13)
(14)

Day-to-day MySQL operations

• Hardware faults • Software bugs

• Reliable backup and restore • Monitoring

• Upgrading and replication setup • Networking and Security

(15)

Physical servers vs EC2 vs RDS

Bare metal EC2 RDS

Provisioning By you Ready-to-go boxes DB instances Monitoring Your own OS-level by amazon All by amazon

Backups Your own Or no backups Snapshot-based Manual on MySQL level Automated, no delayed replicas Replication Manual Manual Auto magic Networking All yours Provided by AWS Provided by AWS

Hardware faults

(16)

Amazon cloud details

• Several regions (US, Europe, Asia)

• Several availability zones inside one region • Different instance types

– General

– Memory optimized – Compute-optimized – GPU-instances

(17)

Storage

• EBS – EC2 / RDS • POIPS – EC2 / RDS • SSD-based backend • Ephemeral

– EC2 only, attached SSD

(18)

RDS details

• Parameter groups instead of my.cnf • One click Web interface

– API available

• Limited visibility and operations • Cloud-specific behavior

– Upgrade as a Surprise

• Can be disables in control panel

(19)
(20)

Know your application

• Configure MySQL properly • Tune your queries

• Monitor your system • Predict your traffic

(21)

EC2 and RDS specific

• Hardware might be different

– Even each benchmark run

• IO is highly relies on network • CPU & IO limiters are in place

(22)

First step

• MySQL configuration

– 5% settings makes 95% performance

• Which every query could kill

• What to tune at the first place

– innodb_buffer_pool_size

• Set on ¾ memory in RDS by default

– innodb_file_per_table – key_buffer for MyISAM

(23)

Second step

• Network

– skip_name_resolv

• May affect authentication

– max_connect_errors • FLUSH HOSTS – max_allowed_packet • IO – innodb_flush_log_at_trx_commit – innodb_log_file_size

(24)

Changes to avoid (usually)

• max_join_size • sort_buffer_size • tx_isolation

• Remember about 5% settings and 95% performance!

(25)

Backups

• What would you do if your MySQL has gone?

– Replicas-Replicas-Replicas

• … and binary logs

– For point-in-time recovery – Better to another host

– expire_logs_days – max_binlog_size

• RDS snapshots

• Volume snapshots on EC2

(26)

Consider worst case

• Limit InnoDB IO capacity on EBS

– Affecting replica creation time!

• Tune flush_log_at_trx_commit • Keep tmp_table_size big enough

– Temporary tables are BAD! – There are drawbacks!

(27)

Queries

• Most likely the cause of MySQL slowness

– Affect application speed

• Fastest MySQL query is the query that did not MySQL at all

– Cached

– Sent to another storage (you name it) – Just didn’t run

(28)

Bad queries: full table scans

SELECT * FROM table …

• WHERE DAY(FROM_UNIXTIME(`ts`)) = 205 • WHERE enabled != 1

• WHERE id NOT IN (1,2,3,…,10) • WHERE url LIKE ‘%something%’

– LIKE ‘something%’ is still okay • ORDER BY RAND()

(29)

Poor index usage

• SELECT * / COUNT(*)

FROM users WHERE enabled=1

– Hello half of the table

• How to fix?

– Pre-aggregation – Caches!

• Not in MySQL query cache!

(30)

Temporary tables

• Bad

– Very bad if they hit the disk

• Will hit if there are TEXT or BLOB fields

• When?

– GROUP BY

– Sub SELECTS (better in 5.6) – DISTINCT + ORDER BY

(31)

What else to do?

• Scale

• Run heavy queries on “reporting” replica • Rewrite bad queries where possible

(32)

When load is really high

• Query cache (off) • Thread cache

• Table cache

• Slow query log • wait_timeout

• connection pooling

(33)
(34)

Thank you!

http://blackbird.io,

References

Related documents

Collisions between railway vehicles and vehicles subject to insurance (road vehicles) shall be regulated by the road vehicle's insurance provider in the first instance.

Finally, for a small number of beams, steel brackets (Figs. 6 and 7) were used to increase the level of connection between and tim- ber material and the GFRP plate5. For each beam,

Role of Water Film Thickness on Rheological Characteristics of Self-Consolidating Concrete Containing Silica Fume.. Reza Saleh

For waterways with widths less than 6.0 times the length overall of the design vessel, LOA, the acceptance criterion for the annual frequency of collapse for each pier

To be able to judge absolute levels of healthy older people’s QOL evaluations with regard to some kind of standard, we additionally compared the item mean profiles of male and

connections used to communicate incoming calls from a first network to a second mobile network, us- ing signalling data comprising routing request data communicated from the

CRITERIA Design/management separation Capacity for variations Complexity Speed Cost certainty Clarity of r emedies.. Default Delay Quality failur e RISK AREAS HIGH

We focused on acculturation, familistic beliefs, perceived mutuality between daughters and mothers, and maternal affection, communication, and support as reported by the girls and