Database Internals (Overview)

(1)

Database Internals

(Overview)

Eduardo Cunha de Almeida

[email protected]

(2)

Outline of the course

Introduction

Database Systems

(E. Almeida)

Distributed Hash Tables and P2P

(C. Cassagne)

NewSQL

(D. Kim and J. Meira)

NoSQL

(D. Kim)

MapReduce

(E. Almeida and E. Lucas)

Data Streaming

(C. Cassagne)

Machine Learning

(C. Henard)

(3)

Outline of the course

Introduction

Database Systems

(E. Almeida)

Distributed Hash Tables and P2P

(C. Cassagne)

NewSQL

(D. Kim and J. Meira)

NoSQL

(D. Kim)

MapReduce

(E. Almeida and E. Lucas)

Data Streaming

(C. Cassagne)

Machine Learning

(C. Henard)

(4)

Readings

- Joseph Hellerstein, Michael Stonebraker and James Hamilton.

Architecture of a Database System.

Material at:

http://db.cs.berkeley.edu/papers/fntdb07-architecture.pdf

- Hector Garcia-Molina, Jeffrey Ullman, Jennifer Widom.

Database Systems: The Complete Book.

(5)

History

1970's

●

INGRES Project at Berkeley

●

System-R Project at IBM

-> INGRES corp, Sybase, MS SQL Server, PostgreSQL

(6)

History

1980's

●

DB2

●

Oracle

●

Informix

●

Teradata

●

Sybase

(7)

History

1990's

●

Postgres

●

MySQL

●

Illustra (from Postgres)

(8)

History

2000's

●

MonetDB (as open-source)

●

Vertica

●

Greenplum

●

EnterpriseDB

(9)

History

2010's

●

VoltDB (from H-Store)

●

Facebook Presto

●

Google F1

●

Google Megastore

(10)

Database Genealogy

(11)

General DBMS Architecture

SQL Interface, Apps DBMS Database Database Comm. manager Query processor Parser Optimizer

Auth. manager Executor

Trans. Manager Buffer Lock Access Methods Log Shared components Catalog Adm & Monitoring

Access control

Scheduling

(12)

(13)

DB Logic Tablespace Segments Extents DBMS page File OS page Physical

Database Layout

(14)

Tab customers

Tab products Index

(15)

Tab customers

Tab products

Index on prodID

(16)

slot1 N Slot2 (empty) Slot n ... N of slots 1 0 1

Fixed size pages

(rid, attr1, attr2, …, attr n)

Record

...

(17)

rid=(i,...) Data * Number of entries n 24 16 rid=(i,...) Free space

Variable size pages

Variable record size

For write intensive applications

N-ary Storage Model (NSM)

(18)

rid=(i,...) Data * n 24 16 rid=(i,...) Free space

NSM Example

SELECT *

FROM CUSTOMER

WHERE ID=2;

INSERT INTO T

VALUES(2, 'MARIA', 18, CURITIBA, F);

SELECT AVG(AGE)

FROM CUSTOMER;

(19)

rid=(i,...) Data * n 24 16 rid=(i,...) Free space

NSM Example

SELECT *

FROM CUSTOMER

WHERE ID=2;

- Few write operations

- Read the entire tuple at once

- Aggregations may read the entire DB!!

INSERT INTO T

VALUES(2, 'MARIA', 18, CURITIBA, F);

SELECT AVG(AGE)

FROM CUSTOMER;

(20)

(rid1,attr1) (rid1,attr2)

Columnar page

(rid1,attr3) (rid2,attr1) (rid2,attr2) (rid2,attr3) Page: Attr1 Page: Attr2 Page: Attr3

... ... ...

(rid_n,attr1) (rid_n,attr2) (rid_n,attr3)

Improves aggregation and compression

Decomposition Storage Model (DSM)

[Ailamaki, VLDB02]

(21)

(rid1,attr1) (rid1,attr2) (rid1,attr3) (rid2,attr1) (rid2,attr2) (rid2,attr3) Page: Attr1 Page: Attr2 Page: Attr3

... ... ...

DSM Example

SELECT *

FROM CUSTOMER

WHERE ID=2;

INSERT INTO T

VALUES(2, 'MARIA', 18, CURITIBA, F);

SELECT AVG(AGE)

FROM CUSTOMER;

(22)

(rid1,attr1) (rid1,attr2) (rid1,attr3) (rid2,attr1) (rid2,attr2) (rid2,attr3) Page: Attr1 Page: Attr2 Page: Attr3

... ... ...

DSM Example

SELECT *

FROM CUSTOMER

WHERE ID=2;

INSERT INTO T

VALUES(2, 'MARIA', 18, CURITIBA, F);

SELECT AVG(AGE)

FROM CUSTOMER;

- More write operations than NSM

- Read one attribute at a time

- Aggregations are fast do not read the entire DB!!

- Agressive compression

(23)

Quiz

What types of page layout are more likely for the following

queries? Why?

What types of page layout are more likely for the following

queries? Why?

Q1

SELECT *

_{FROM CUSTOMERS;}

Q2

Q3

SELECT SUM(BALANCE)

FROM ACCOUNT;

SELECT A.ID, A.BALANCE, C.NAME

FROM CUSTOMERS C, ACCOUNT A

WHERE A.ID=C.ID

Q4

SELECT C.CITY, SUM(A.BALANCE)

FROM CUSTOMERS C, ACCOUNT A

(24)

(25)

Adam Betty Smith

(26)

P0 k1 P1 k2 P2 ... ... km Pm

Entrada de índice

17

5 13 27

2* 3* 5* 11* 14*

Paginas não folha (só para direcionar)

Paginas folha (entradas de dados) >17

(27)

00 01 10 11 4* 12* 32* 1* 21* 10* Directory Buckets

If insert 5*, then 5=101, bucket '01' (read backwards)

(28)

Non-clustered Clustered Records

Data Index entries

Physically sorted: only one clustered index per table

(29)

Quiz

SELECT NAME, AGE

FROM CUSTOMER

WHERE AGE BETWEEN 18 AND 25;

Considering the following queries, what types of index need

to be implemented? Why?

Considering the following queries, what types of index need

to be implemented? Why?

Q1

SELECT NAME, AGE

FROM CUSTOMER

WHERE CUST_ID= 1234;

Q2

SELECT NAME, CITY, AGE, SSN

FROM CUSTOMER

WHERE AGE BETWEEN 18 AND 25

AND CITY ;

(30)

(31)

(32)

Query processing

SELECT BALANCE

FROM ACCOUNT

WHERE BALANCE > 2500;

Query

∏ BALANCE (

σ

BALANCE > 2500 (ACCOUNT))

σ

BALANCE > 2500 ( ∏ BALANCE (ACCOUNT)) Query expression

(33)

Query processing

SELECT BALANCE

FROM ACCOUNT

WHERE BALANCE > 2500;

Query

∏ BALANCE (

σ

BALANCE > 2500 (ACCOUNT))

May have n plans!!!

How does the DBMS rewrite a query?

σ

BALANCE > 2500 ( ∏ BALANCE (ACCOUNT)) Query expression

(34)

(35)

Query rewrite

Rule: Selection operations are

commutative

(36)

Query rewrite

Rule: Join operations are commutative if

the order of the attributes is not taken

(37)

Query rewrite

(38)

Query rewrite

Rule: Natural-joins are associative.

For theta-joins, this only holds if the

following join involves attributes only

from T2 and T3.

(39)

Query rewrite

Rule: It distributes when all the

attributes in the selection condition

involve only the attributes of one of the

expressions (say T1)

(40)

Query rewrite

Rule: Conjunctive selection operations

can be split into a sequence of individual

selections

(41)

Further rules can be found in the

readings.

(42)

Quiz

Please, rewrite the following expressions using the

presented rules and the following relations:

Please, rewrite the following expressions using the

presented rules and the following relations:

Q1

Q2

Q3

∏ BALANCE, C_ID (σ BALANCE > 2500 (ACCOUNT X CUSTOMER)) ∏ BALANCE, ACC_ID (CUSTOMER X ACCOUNT)

∏ BALANCE, ACC_ID (CUSTOMER X (ACCOUNT X BRANCH)) ACCOUNT(BALANCE, ACC_ID, C_ID, B_ID), CUSTOMER(C_ID), BRANCH(B_ID, B_NAME)

Q4

Q5

σ BALANCE > 2500 AND C_ID > 1000 (ACCOUNT X CUSTOMER)

(43)

Quiz

∏ BALANCE, ACC_ID (CUSTOMER X ACCOUNT)

σ BALANCE > 2500 ( ∏ BALANCE, C_ID (ACCOUNT X CUSTOMER))

∏ BALANCE, ACC_ID (BRANCH x (CUSTOMER X ACCOUNT ))

ACCOUNT(BALANCE, ACC_ID, C_ID, B_ID), CUSTOMER(C_ID), BRANCH(B_ID, B_NAME)

Please, rewrite the following expressions using the

presented rules and the following relations:

Please, rewrite the following expressions using the

presented rules and the following relations:

Q1

Q2

Q3

Q4

Q5

σ BALANCE > 2500 (σ C_ID > 1000 (ACCOUNT X CUSTOMER))

(44)

Database Catalog

DBMS and BD

Data structures

Buffer and page sizes

Indices

Views

The catalog is a meta-database that stores

informations about:

(45)

Database Catalog

Statistics

Cardinality of tables and indices

Domain values

Page sizes for tables and indices

The catalog is a meta-database that stores

informations about:

(46)

Database Catalog

(47)

(48)

Operators

I/O cost for Selection ( ∏ )

Scan

Scan on sorted data

Index scan (Clustered index)

O(M), where M = number of pages

O(log M), where M = number of pages

O(h + 1), where h = height of the index tree

Index scan (Non-Clustered index)

O(h + n), where n = number of fetched tuples.

(49)

Quiz

Compute of the cost for a query selecting 10% of a table 'R'

based on the following information from the catalog:

Compute of the cost for a query selecting 10% of a table 'R'

based on the following information from the catalog:

Relation 'R' 10,000 tuples Tuples/page ratio 100 tuples

Scan

Scan on sorted Clustered index scan

Non-clustered index scan

(50)

Quiz

Compute of the I/O cost for a query selecting 10% of a table

'R' based on the following information from the catalog:

Compute of the I/O cost for a query selecting 10% of a table

'R' based on the following information from the catalog:

Relation 'R' 10,000 tuples Tuples/page ratio 100 tuples

Scan 100 I/O Scan on sorted 6.6 I/O Clustered index scan 3 I/O

Non-clustered

(51)

(52)

Transactions

Definition: “A transaction is an atomic unit of work

that is either completed in its entirety or not done at

all.” [Navathe and Elmasri, 2005]

T1 Database r(balance) balance = 200 balance += 100 r(x)

(53)

active _commitedPartially Failed Terminated Begin _End Read, write abort Commit commited

Transaction State Transition

(54)

ACID properties

Atomicity

: a transaction is either completed in its

entirety or not done at all

Consistency

: a transaction is consistency preserving

if its complete execution takes the database from one

consistent state to another

Isolation

: The execution of a transaction should not be

interfered

Durability

: the changes applied by a commit must

persist in the database

(55)

Schedules

A schedule orders the execution of concurrent

transactions in an interleaving fashion.

T1 T2

r(x)

r(x) w(x) w(x)

(56)

Concurrency Conflicts

Concurrency problems: dirty read and lost update

T1 T2 r(balance) balance += 100 r(x) w(balance) w(x) w(x) r(balance) T1 T2 r(balance) r(balance) balance += 100 w(x) w(x) balance -= 100 w(balance) w(balance) abort balance-=100 w(balance)

(57)

Concurrency Conflicts

Concurrency problems: dirty read and lost update

T1 T2 r(balance) balance += 100 w(balance) r(balance) T1 T2 r(balance) r(balance) balance += 100 balance -= 100 w(balance) w(balance) abort balance-=100 w(balance)