• No results found

Database Optimization for Web Developers. Little things that make a big difference

N/A
N/A
Protected

Academic year: 2021

Share "Database Optimization for Web Developers. Little things that make a big difference"

Copied!
22
0
0

Loading.... (view fulltext now)

Full text

(1)

Database Optimization

for Web Developers

(2)

About me

• 

16 years experience in web development and administration

(systems, databases, networks)

• 

Various jobs:

•  Web Developer •  Chief Architect •  Manager of IT

•  Director of Technical Operations •  Database Consultant

• 

Mostly worked with MySQL

•  Developer •  DBA

•  Administrator

(3)

What slows down your web page?

Content load time (CSS / JS / Images)

Poorly optimized code

(4)

Assumptions

Average web developer proficiency with SQL databases

• 

SQL

• 

JOINs

• 

Indexes

No specific database server expected

(5)

MariaDB  [sakila]>  EXPLAIN  SELECT  COUNT(*)  FROM  payment  WHERE  

payment_date  >=  '2005-­‐05-­‐01  00:00:00'  AND  payment_date  <  '2005-­‐06-­‐01   00:00:00';

+-­‐-­‐-­‐-­‐-­‐-­‐+-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐+-­‐-­‐-­‐-­‐-­‐-­‐+-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐+-­‐-­‐-­‐-­‐-­‐-­‐+-­‐-­‐-­‐-­‐-­‐-­‐-­‐+-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐+ |  id      |  table      |  type  |  possible_keys  |  key    |  rows    |  Extra      | +-­‐-­‐-­‐-­‐-­‐-­‐+-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐+-­‐-­‐-­‐-­‐-­‐-­‐+-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐+-­‐-­‐-­‐-­‐-­‐-­‐+-­‐-­‐-­‐-­‐-­‐-­‐-­‐+-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐+ |        1  |  payment  |  ALL    |  NULL      |  NULL  |  16451  |  Using  where  | +-­‐-­‐-­‐-­‐-­‐-­‐+-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐+-­‐-­‐-­‐-­‐-­‐-­‐+-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐+-­‐-­‐-­‐-­‐-­‐-­‐+-­‐-­‐-­‐-­‐-­‐-­‐-­‐+-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐+

Not enough indexes!

(6)

MariaDB  [sakila]>  EXPLAIN  SELECT  COUNT(*)  FROM  payment  WHERE  

payment_date  >=  '2005-­‐05-­‐01  00:00:00'  AND  payment_date  <  '2005-­‐06-­‐01   00:00:00';

+-­‐-­‐-­‐-­‐-­‐-­‐+-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐+-­‐-­‐-­‐-­‐-­‐-­‐-­‐+-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐+-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐+-­‐-­‐-­‐-­‐-­‐-­‐+ |  id      |  table      |  type    |  possible_keys  |  key      |  rows  | +-­‐-­‐-­‐-­‐-­‐-­‐+-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐+-­‐-­‐-­‐-­‐-­‐-­‐-­‐+-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐+-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐+-­‐-­‐-­‐-­‐-­‐-­‐+ |        1  |  payment  |  range  |  payment_date    |  payment_date  |  1155  | +-­‐-­‐-­‐-­‐-­‐-­‐+-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐+-­‐-­‐-­‐-­‐-­‐-­‐-­‐+-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐+-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐+-­‐-­‐-­‐-­‐-­‐-­‐+

Not enough indexes!

(7)

When are indexes used?

• 

Left-most prefix of index

• 

If multiple columns, can use from left to right until column is not specified

exactly

• 

Index cannot continue to be used after a range

(8)

Too many indexes?

MariaDB  [sakila]>  SHOW  KEYS  FROM  film;

+-­‐-­‐-­‐-­‐-­‐-­‐-­‐+-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐+-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐+-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐+ |  Table  |  Key_name      |  Seq_in_index  |  Column_name      | +-­‐-­‐-­‐-­‐-­‐-­‐-­‐+-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐+-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐+-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐+ |  film    |  PRIMARY      |      1  |  film_id      | |  film    |  idx_title      |      1  |  title      | |  film    |  idx_fk_language_id      |      1  |  language_id      | |  film    |  idx_fk_original_language_id  |      1  |  original_language_id  |

|  film    |  idx_title_year      |      1  |  title      | |  film    |  idx_title_year      |      2  |  release_year      |

(9)

Avoid full table scans

SELECT  f.FID,  f.title,  c.category_id,  c.name

FROM  film_list  f  JOIN  category  c  ON  c.name=f.category WHERE  f.price=4.99

+-­‐-­‐-­‐-­‐-­‐-­‐-­‐+-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐+-­‐-­‐-­‐-­‐-­‐-­‐+-­‐-­‐-­‐-­‐-­‐-­‐+-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐+ |  table  |  possible_keys  |  key    |  rows  |  Extra      | +-­‐-­‐-­‐-­‐-­‐-­‐-­‐+-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐+-­‐-­‐-­‐-­‐-­‐-­‐+-­‐-­‐-­‐-­‐-­‐-­‐+-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐+ |  c      |  NULL      |  NULL  |      16  |      | |  f      |  NULL      |  NULL  |    973  |  Using  where;  Using  join  buffer  | +-­‐-­‐-­‐-­‐-­‐-­‐-­‐+-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐+-­‐-­‐-­‐-­‐-­‐-­‐+-­‐-­‐-­‐-­‐-­‐-­‐+-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐+

(10)

Don’t put a query inside a loop

Get all the data in one query, and then loop through the results

Use a (well-indexed) JOIN to eliminate the need for additional queries

If you’re using an ORM, be careful not to put code in a loop that will result in

queries being run; generally, ORMs have bulk-loading methods and ways to

specify JOINs

foreach  ($film_ids  as  $film_id)  {

$result  =  $db-­‐>query("SELECT  *  FROM  film  WHERE  film_id=$film_id");

$row  =  $result-­‐>fetch_array();

echo  "<li><span>".$row[1]."</span></li>\n";

(11)

Other kinds of indexes

B-tree index

Hash index

Full-text index

• 

MyISAM only until MySQL 5.6, MariaDB 10.0.3

• 

Often better to use external full-text indexes, like Sphinx, Lucene, Solr,

ElasticSearch

Spatial index

• 

“Minimum bounding rectangle” before MySQL 5.6 or MariaDB 5.5

BYO index

• 

Hash column for long keys

• 

Grid index

(12)

If you can’t index it, cache it!

Do the expensive stuff less often

Run reports for executives / marketing once a day

Create a read-only cache table for searching

• 

Don’t propagate inactive rows to the active cache; in other

words, limit the number of rows to be searched

• 

Rotate cache tables to keep the active cache table read-only

(13)

Eliminate redundant queries

Ensure that commonly-accessed data is only

retrieved once

Store relevant user information in their session

(with expiration)

Use memcached (or an equivalent fast data

store)

If you use an ORM, it may have ways to

(14)

MyISAM’s “table lock” problem

Read Queue

SELECT (0.1s)

SELECT (3.5s)

SELECT

(121s)

SELECT (0.0s)

SELECT (1.2s)

SELECT (5.1s)

Write Queue

UPDATE

(0.0s)

(15)

Multiversion Concurrency Control

(Time T1)

1: a

2: b

3: c

4: d

5: e

6: f

7: g

8: h

9: i

10: j

(Time T2)

1: a

2: b

3: c

4: d

5: h

6: f

7: g

8: h

9: i

10: j

Query 1 at time T1:

SELECT * FROM tbl

Query 2 at time T2:

UPDATE tbl SET v=‘h’ WHERE k=5

Even if Query 1 gets to row 5

after

time T2, the

database will return ‘e’ for row 5. The data will be

consistent from when the query began.*

(16)

Batch writes

Put the data into a queue

Have a process that reads the queue and does a bulk

insert into the database

Redis has a built-in queue data type (LIST)

(17)

NoSQL interfaces

MySQL

HandlerSocket

memcached

PostgreSQL

unknown

(18)

Replication

This is by no means a cure-all

One should not use replication slaves to solve

concurrency issues without understanding the

ramifications

It’s possible that replication can

increase

the impact of

(19)

Tame your ORM

Most ORMs have a debug console where you can see

every query that is performed

For MySQL, you can use the Percona Toolkit’s

(20)
(21)

Conclusion: Be smart!

Know what your database is doing

• 

Use the readily available tools to get a better idea where

your bottlenecks are

Break your problem down into its component parts

Understand the ramifications of your proposed

(22)

Thanks for coming!

@stevecoug

http://www.stevemeyers.net/

References

Related documents