• No results found

Databases and Information Systems 1 Part 3: Storage Structures and Indices

N/A
N/A
Protected

Academic year: 2021

Share "Databases and Information Systems 1 Part 3: Storage Structures and Indices"

Copied!
12
0
0

Loading.... (view fulltext now)

Full text

(1)

Databases and Information Systems 1

Part 3: Storage Structures and Indices

Prof. Dr. Stefan Böttcher

Fakultät EIM, Institut für Informatik Universität Paderborn WS 2009 / 2010 Contents: - database buffer - storage structures - indices

(2)

Database buffer - why?

on database disk database in main memory

unsafe data safe data

load

store DB in main memory

 organized in pages

disk read or write = 100.000-1.000.000 main memory accesses  optimize / reduce disk access

 changes of tuples are collected in pages + page is stored on disk from time to time

(3)

Database buffer - own management

on database disk database in main memory

unsafe data safe data

load

log and store

DBMS has own main memory management because of: - controlled page replacement (needed for recoverability) - page replacement optimized for query optimization

 may store uncommitted data on disk  recovery uses additional Log

to be able to restore consistent data

(4)

Coupling database buffer and application program on database disk database in main memory application program in main memory Communication area load store

Application program must not directly read or write to database in main memory

 Communication area can be used to write to database in main memory

data exchange between application program and main memory part of DB

(5)

Storage structures and indices

database

(6)

Storage Structures - B-trees

Key 1 Data Ptr 1 Tree Ptr 2 Tree Ptr 1 Key q-1 Data Ptr q-1 Tree Ptr q Data Data

B-Tree 1 B-Tree 2 B-Tree q

key in sub-B-tree K < key K key in sub-B-Tree K+1

p = maximum number of sub-tree pointers q = given number of pointers, qp

q-1 = number of keys

unique tree depth

at least p/2 pointers per node (except root and leaf nodes)

Key 1 Data Ptr 1 Tree Ptr 2 Tree Ptr 1 Key q-1 Data Ptr q-1 Tree Ptr q Data Data

B-Tree 1 B-Tree 2 B-Tree q

(7)

Example of B-trees

• 64 bit–addresses for data, i.e. 8 Byte for each pointer • 4K = 4096 Byte per disk block (and main memory page) • 4 Byte for Integer-key  at most 2^32 values

==> pages can store (p-1) triples of

(pointer to left sub-tree , key , pointer to data record) plus pointer to right-most sub-tree

==> fan-out of the tree is (4096-8) div ( 8+4+8 ) + 1 = 205

each page can address at most 204 data records B-tree depth items addressable at least / at most

1 0 204

2 2*102 205*204 4*104

3 2*102*102 205*205*204 8*106

(8)

Improvement: B+-trees - inner nodes

Key 1 Data Ptr 1 Tree Ptr 2 Tree Ptr 1 Key q-1 Data Ptr q-1 Tree Ptr q Data Data

B+-Tree 1 B+-Tree 2 B+-Tree q

Key in Sub-B+-Tree K < Key K Key in Sub-B+-Tree K+1

p = maximum number of pointers

q = given number of pointers, qp

q-1 = number of keys

unique tree depth

at least p/2 pointers per node (except root and leaf nodes)

(9)

Leaf nodes of B+-Trees

Key 1 Data Ptr 1 Tree Ptr 2 Tree Ptr 1 Key q-1 Data Ptr q-1 Leaf-Ptr q Data Data

keys correspond with keys in the data.

• contain pointers to the data

• do not contain any pointer to a sub-tree • contain a pointer to the next leaf node

(10)

Root nodes of B+-trees

p = maximum number of pointers minimum number of pointers: 2

(except for trivial case of less than 2 data records, where root node is leaf node)

(11)

Example of B+-trees

• 64 bit–addresses for data, i.e. 8 Byte for each pointer • 4K = 4096 Byte per disk block (and main memory page) • 4 Byte for Integer-key  at most 2^32 values

==> leaf pages can store

one pointer to next leaf page (8 Byte) and

4096 div ( 4+8 ) = 340 (key, data pointer) - pairs ==> fan-out 340

internal pages can store

one pointer to last sub-tree (8 Byte) and

4096 div ( 4+8 ) = 340 (Key, data pointer) - pairs ==> fan-out 341

B+-tree depth items addressable at least / at most

1 0 340

2 2*170 341*340 105

3 2*171*170 341*341*340 4* 107

(12)

Storage Structures - Hashing

hash-function h : key  bucket (=data container)

insert: full ?  overflow container

search: also in  overflow container

h(key)

References

Related documents

This paper explores the recent rise in populism on both sides of the Atlantic by look- ing at three cases where populism has been successful including in the election of Donald

Third, to assess the possible impact of investment and trade liberalisation between certain countries upon FDI going to excluded countries we estimate gravity equations using data

A.C.E.s with up-to- date information on recent standard modifications Growing number of implementations Contribute to education program and documentation / training

Narrator #1: Sure enough, just like the angel had said, Zechariah and Elizabeth were soon expecting their first baby… despite their old age.. But Zechariah still

 Dozvoljavanje lokalnim vlastima izdavanje lokalnih valuta i poticanje lokalnih zajednica na formiranje shema lokalne razmjene ( eng. LETS - Local exchange trading schemes)

Review Specifications Review Test Cases Translator Developer Reviewer Tester End Client Partner Review Design Review Code. High end custom software development

The purpose of this report has been to provide a synthesis or appraisal of results from large-scale national and international assessments in Australia over the past 20 years

Even though the children of Israel also claim a common ancestor 14 like the Yorùbá, it is of interest to note other shared features in the stories of the people such as