• No results found

CSE 444: Database Internals. Lecture 4 Data storage and (more) buffer management

N/A
N/A
Protected

Academic year: 2021

Share "CSE 444: Database Internals. Lecture 4 Data storage and (more) buffer management"

Copied!
37
0
0

Loading.... (view fulltext now)

Full text

(1)

CSE 444: Database Internals

Lecture 4

Data storage and (more) buffer

management

(2)

Homework Logistics

•  Homework instructions are in a pdf file

•  Submit a single pdf or word file with your solution, or

•  Submit a hard copy in class

CSE 444 - Spring 2015 2

(3)

Important Note

•  Lectures show principles

•  You need to think through what you will actually implement in SimpleDB!

–  Try to implement the simplest solutions

•  If you are confused, tell us!

(4)

DBMS Architecture

Process Manager

Process Manager

Admission Control Connection Mgr

Query Processor

Parser

Query Rewrite Optimizer

Executor

Storage Manager

Access Methods Lock Manager

Buffer Manager Log Manager

Shared Utilities

Memory Mgr Disk Space Mgr Replication Services

Admin Utilities

[Anatomy of a Db System.

J. Hellerstein & M. Stonebraker.

Red Book. 4ed.] 4

(5)

Today: Starting at the Bottom

Consider a relation storing tweets:

Tweets(tid, user, time, content)

How should we store it on disk?

(6)

Design Exercise

•  Design choice: One OS file for each relation

–  This does not always have to be the case! (e.g., SQLite uses one file for whole database)

–  DBMSs can also use disk drives directly

•  An OS file provides an API of the form

–  Seek to some position (or “skip” over B bytes) –  Read/Write B bytes

CSE 444 - Spring 2015 6

(7)

First Principle: Work with Pages

•  Reading/writing to/from disk

–  Seeking takes a long time!

–  Reading sequentially is fast

•  To simplify buffer manager, want to cache a collection of same-sized objects

•  Solution: Read/write pages of data

(8)

Continuing our Design

Key questions:

•  How do we organize pages into a file?

•  How do we organize data within a page?

First, how could we store some tuples on a page?

Let’s first assume all tuples are of the same size Tweets(tid int, user char(10),

time int, content char(140))

CSE 444 - Spring 2015 8

(9)

Page Formats

Issues to consider

•  1 page = 1 disk block = fixed size (e.g. 8KB)

•  Records:

–  Fixed length –  Variable length

•  Record id = RID

–  Typically RID = (PageID, SlotNumber)

Why do we need RID’s in a relational DBMS ?

See future discussion on indexes and transactions

(10)

Page Format Approach 1

Fixed-length records: packed representation

Divide page into slots. Each slot can hold one tuple Record ID (RID) for each tuple is (PageID,SlotNb)

Slot1 Slot2 SlotN

Free space N

Number of records

10

How do we insert a new record?

CSE 444 - Spring 2015

(11)

Page Format Approach 1

Fixed-length records: packed representation

Divide page into slots. Each slot can hold one tuple Record ID (RID) for each tuple is (PageID,SlotNb)

Slot1 Slot2 SlotN SlotN+1

Free Sp. N

Number of records How do we insert a new record?

(12)

Page Format Approach 1

Fixed-length records: packed representation

Divide page into slots. Each slot can hold one tuple Record ID (RID) for each tuple is (PageID,SlotNb)

Slot1 Slot2 SlotN SlotN+1

Free Sp. N

Number of records

12

How do we insert a new record?

How do we delete a record?

CSE 444 - Spring 2015

(13)

Page Format Approach 1

Fixed-length records: packed representation

Divide page into slots. Each slot can hold one tuple Record ID (RID) for each tuple is (PageID,SlotNb)

Slot1 Slot2 SlotN SlotN+1

Free Sp. N

Number of records How do we insert a new record?

How do we delete a record? What is the problem?

(14)

Page Format Approach 1

Fixed-length records: packed representation

Divide page into slots. Each slot can hold one tuple Record ID (RID) for each tuple is (PageID,SlotNb)

Slot1 Slot2 SlotN SlotN+1

Free Sp. N

Number of records

14

How do we insert a new record?

How do we delete a record? Cannot move records! (Why?) How do we handle variable-length records?

(15)

Page Format Approach 2

Can handle variable-length records

Can move tuples inside a page without changing RIDs

Slot directory

Each slot contains

<record offset, record length>

Header contains slot directory + Need to keep track of nb of slots

+ Also need to keep track of free space (F)

Free space 4 F

(16)

CSE 444 - Spring 2015

Record Formats

Fixed-length records → Each field has a fixed length (i.e., it has the same length in all the records)

Field 1 Field 2 . . . . . . Field K

Information about field lengths and types is in the catalog

16

(17)

Record Formats

Variable length records

Remark: NULLS require no space at all (why ?)

Field 1 Field 2 . . . . . . Field K

Record header

(18)

CSE 444 - Spring 2015

Long Records Across Pages

•  When records are very large

•  Or even medium size: saves space in blocks

•  Commercial RDBMSs avoid this

page header

page header

R1 R2 R2 R3

18

(19)

LOB

•  Large objects

–  Binary large object: BLOB

–  Character large object: CLOB

•  Supported by modern database systems

•  E.g. images, sounds, texts, etc.

•  Storage: attempt to cluster blocks together

(20)

Continuing our Design

Our key questions:

•  How do we organize pages into a file?

•  How do we organize data within a page?

Now, how should we group pages into files?

CSE 444 - Spring 2015 20

(21)

Heap File Implementation 1

Data page

Data page

Data page

Data page

Data page

Data page

Data page

Data page A sequence of pages (implementation in SimpleDB)

Some pages have space and other pages are full Add pages at the end when need more space

Works well for small files

But finding free space requires scanning the file

(22)

CSE 444 - Spring 2015

Heap File Implementation 2

Header page

Data page

Data page

Data page

Data page

Data page

Data page Linked list of pages:

• Data page

• Data page Full pages

Pages with some free space

22

(23)

Heap File Implementation 3

Data page

Data page

Data page Better: directory of pages

Directory Header page

Directory contains free-space count for each page.

(24)

CSE 444 - Spring 2015

Modifications: Insertion

•  File is unsorted (= heap file)

–  add it wherever there is space (easy J) –  add more pages if out of space

•  File is sorted

–  Is there space on the right page ?

•  Yes: we are lucky, store it there

–  Is there space in a neighboring page ?

•  Look 1-2 pages to the left/right, shift records

–  If anything else fails, create overflow page

24

(25)

Overflow Pages

•  After a while the file starts being dominated by overflow pages: time to reorganize

Pagen-1 Pagen Pagen+1

Overflow

(26)

CSE 444 - Spring 2015

Modifications: Deletions

•  Free space in page, shift records

–  Be careful with slots

–  RIDs for remaining tuples must NOT change

•  May be able to eliminate an overflow page

26

(27)

Modifications: Updates

•  If new record is shorter than previous, easy J

•  If it is longer, need to shift records

–  May have to create overflow pages

(28)

Alternate Storage Manager Design: Column Store

28

Rows stored

contiguously on disk (+ tuples headers)

Columns stored contiguously on disk (no headers needed)

(29)

More Detailed Example

Row-based (4 pages)

A 1 A 2 A 2 A 2 Page

C 4 B 2 B 4

Column-based (4 pages)

A A A

1

A

2

C Page

2

C

4 4 4 B

2

B

2

C-Store also avoids large tuple headers

(30)

Continuing our Design

We know how to store tuples on disk in a heap file

How do these files interact with rest of engine?

•  Also see lecture 3

CSE 444 - Spring 2015 30

(31)

How Components Fit Together

Query Processor

Storage Manager

Access Methods: HeapFile, etc.

Buffer Manager

Operators: Sequential Scan, etc.

•  Operators: Process data

•  Access methods:

Organize data to support fast access to desired subsets of records

•  Buffer manager: Caches data in memory. Reads/

writes data to/from disk as needed

•  Disk-space manager:

Allocates space on disk

Disk Space Mgr

(32)

Access Methods

•  Operators view relations as collections of records

•  The access methods worry about how to organize these collections

CSE 444 - Spring 2015 32

(33)

Heap File Access Method API

•  Create or destroy a file

•  Insert a record

•  Delete a record with a given rid (rid)

–  rid: unique tuple identifier (more later)

•  Get a record with a given rid

–  Not necessary for sequential scan operator –  But used with indexes (more next lecture)

•  Scan all records in the file

(34)

Buffer Manager

•  Brings pages in from memory and caches them

•  Eviction policies

–  Random page (ok for SimpleDB) –  Least-recently used

–  The “clock” algorithm (see whiteboard or book)

•  Keeps track of which pages are dirty

–  A dirty page has changes not reflected on disk –  Implementation: Each page includes a dirty bit

CSE 444 - Spring 2015 34

(35)

Buffer Manager

Disk

Main memory Page requests from higher-level code

Buffer pool Disk page

Free frame

1 page corresponds Disk is a collection

of blocks

Buffer pool manager Access methods

(36)

Pushing Updates to Disk

•  When inserting a tuple, HeapFile inserts it on a page but does not write the page to disk

•  When deleting a tuple, HeapFile deletes tuple form a page but does not write the page to disk

•  The buffer manager worries when to write pages to disk (and when to read them from disk)

•  When need to add a new page to the file,

HeapFile adds page to the file on disk and then gets it again through the buffer manager

CSE 444 - Spring 2015 36

(37)

Conclusion

•  Row-store storage managers are most commonly used today

•  They offer high-performance for transactions

•  But column-stores win for analytical workloads

•  They are gaining traction in that area

•  Final discussion: OS vs DBMS

–  OS files vs DBMS files

References

Related documents

This is identified as a gap in the literature because supercritical propylene can be obtained at relatively mild temperature and pressure conditions presenting a

Pride of place in this 52 page programme must surely go to the 29 pages of advertisements; 4 pages on Weald- stone, 4 pages of stats, 2 pages of match reports covering the last two

Add any text you want to your home page, and add extra pages using the Google Pages Site Manager..

the Guanabara beaches (31 for Bica, 39 for Bananal and 46 for Coqueiros) is more or less similar to another Brazilian sandy beach whereas these values showed

Firstly, taking its cue from Eric Tucker’s mapping of multi-level ‘capital’ and ‘labour’ constitutions developed in the Canadian context, it suggests that the Greek case

This study con- ducts finite element (FE) simulations on four types of cornea-like structures as well as optical coherence elastography (OCE) experiments on contact lenses

Independent claim 1 recites in pertinent part: an imaging device that captures images of identifying information of the person or the object within the single space at