• No results found

Chapter 5. Foundations of of Information Systems Systems (WS (WS 2008/09) Database Management Systems. Database Management Systems

N/A
N/A
Protected

Academic year: 2021

Share "Chapter 5. Foundations of of Information Systems Systems (WS (WS 2008/09) Database Management Systems. Database Management Systems"

Copied!
38
0
0

Loading.... (view fulltext now)

Full text

(1)

Database Management Systems

Database Management Systems

Database Management Systems

Database Management Systems

Database Management Systems

ChapterChapter 5 –5 –

Foundations of Information Systems (WS 2008/09)

Foundations

(2)

DBMS DBMS architecturearchitecture User Interface User Interface User Interface Schema Manager Schema Manager

Schema Manager Query ManagerQuery ManagerQuery Manager Transaction Transaction Transaction ManagerManagerManager

Storage Manager

Storage

Storage ManagerManager

Data Dictionary Data Dictionary Database Database Log Log Main

(3)

Storage manager basics

Storage manager basics

User Interface

User Interface

User Interface

Schema Manager

Schema Manager

Schema Manager Query ManagerQuery ManagerQuery Manager Transaction Transaction Transaction ManagerManagerManager

Storage Manager

Storage

Storage ManagerManager

Data Dictionary Data Dictionary Database Database Log Log

(4)

Storage hierarchy

Storage hierarchy of a of a computercomputer

Register

Register

Cache

Cache

Main

Main

Memory

Memory

Disc

Disc

Storage

Storage

Archive

Archive

Storage

Storage

1

1

-

-

10 ns

10 ns

10

10

-

-

100 ns

100 ns

100

100

-

-

1000 ns

1000 ns

10 ms

10 ms

sec

sec

Access

Access timestimes ( (approxapprox.):.): 1 ms = 1/1000 sec 1 μs = 1/1000 ms 1 ns = 1/1000 μs 1 1 msms= 1/1000 sec= 1/1000 sec 1 1 μμs = 1/1000 mss = 1/1000 ms 1 1 nsns = = 1/1000 1/1000 μμss „Access gap“: factor 105 This is where This is where queries are queries are executed executed!! This is where This is where data are data are stored stored!!

(5)

Database

Database bufferbufferand and storage hierarchystorage hierarchy

Main

Main memory memory ((processorprocessor))

Background

Background store store (disc)(disc)

( (transienttransient)) ( (persistentpersistent)) replace replace fetch fetch buffered pages buffered pages •

• OperationsOperations on on datadata are alwaysare always performedperformed in in mainmain memory onlymemory only!! •

• ConsequenceConsequence: : Data Data to to be manipulated have be manipulated have to to be read into main store beforebe read into main store before modification

modification ((„„bufferingbuffering““) and ) and have have to to be written be written back back sometimes later sometimes later to to background storage

background storage media media by the by the DBMS. DBMS. •

(6)

Storing

Storing tuplestuplesin in files files on on pagespages

General

General principle principle of of mappingmappingrelations to relations to pagespages (on disc):(on disc):

• per per relationrelation: A : A filefile subdivided into several subdivided into several subsequent

subsequent pagespages •

• per per pagepage: An : An internalinternalrecord record table table containing containing pointers

pointers to all to all tuplestuples//records records on on this pagethis page •

• AdressingAdressingindividual tuplesindividual tuples via via aa tuple identifier

tuple identifier (TID)(TID) •

• TID TID consistsconsists of of twotwopartsparts:: •

• A A page number page number (absolute (absolute addressaddress) ) •

• A A pointer pointer in in the record the record table table pointing pointing toto a

a particular position within the pageparticular position within the page where the respective tuple starts

where the respective tuple starts

(relative

(relative addressaddress)) •

• IndirectionIndirection isis usefuluseful duringduring reorganisationreorganisation of

of thethe pagepage..

• • •• •• 1 2 3 1 2 3 TID TID 4711 2 4711 2 5001 5001 Foundations Foundations . . .. . . 5041 5041 Databases Databases . . .. . . 4052 4052 Logics Logics . . .. . . page page # 4711# 4711

(7)

Access

Access pathspathsand and indexesindexes •

• DBMS DBMS normally make use normally make use of of various internal auxiliary data structuresvarious internal auxiliary data structures in orderin order to

to speed speed up up access access to to larger tableslarger tables. Such additional . Such additional data structures are calleddata structures are called access paths

access paths.. •

• The most important The most important such such access aids are calledaccess aids are called indexesindexes, , special forms special forms ofof search trees offering direct access

search trees offering direct access to to values values of of particularly important particularly important andand frequently accessed attributes

frequently accessed attributes of a table (of a table (rather than sequentially searchingrather than sequentially searching all

all fields fields of all of all rowsrows). ). OftenOften, at least , at least the primary key attributethe primary key attribute(s) of a table(s) of a table are supported by

are supported by an an index for index for rapid rapid accessaccess.. •

• TheThe notionnotion ‚‚index'index' isis motiviatedmotiviated by the concept by the concept of of index pagesindex pages of of booksbooks. . HereHere,, an additional

an additional part part at at the the end of end of the book contains keywordthe book contains keyword--page reference listspage reference lists for avoiding

for avoiding to to search for certain keywords sequentiallysearch for certain keywords sequentially.. •

• SQLSQL offersoffers specialspecial DDLDDL--commands for creating commands for creating and and deleting indexes deleting indexes to to oneone or several columns

or several columns of a table:of a table:

CREATE INDEX <index-name> ON <table-name> <list-of-column-names>

CREATE

CREATE INDEXINDEX <index<index--name> ON <tablename> ON <table--name> <listname> <list--ofof--columncolumn--names>names> DROP

(8)

• Per Per relationrelation, , one index normally supports the primary keyone index normally supports the primary key ((oftenoften created automacreated automa- -tically

tically):): •

• In In additionaddition, , severalseveral secondary indexessecondary indexes may exist for other frequently used columnsmay exist for other frequently used columns,, even if these columns

even if these columns//attributes attributes do do not possess the key propertynot possess the key property. . •

• For For implementingimplementing indexesindexes, , special data structures are used by the special data structures are used by the DBMS, DBMS, the choicethe choice of

of which can be influenced by specially authorizedwhich can be influenced by specially authorized DB DB administrators administrators in in largelarge commercial

commercial systems :systems : •

• Search treesSearch trees:: •

• ISAMISAM--filesfiles •

• BB--treestrees and and variantsvariants •

• RR--trees trees and and variantsvariants ((higherhigher--dimensional dimensional index structuresindex structures)) •

• HashHash--tables tables ((tables associated with special access functionstables associated with special access functions)) •

• When using When using an an indexindex, , there is always there is always a a „„tradeofftradeoff"" between resourcesbetween resources:: A

A reduction reduction in in access access time has to time has to be measured against be measured against an an increase increase in in administrationadministration time

time forfor indexindexmaintenancemaintenance and in and in storage used by the storage used by the extra extra index dataindex data. . Indexes:

Indexes: OverviewOverview

Primärindex

Prim

(9)

• simplestsimplest form of form of indexingindexing::

• EachEach entry entry on on the the index pagesindex pages consists consists of of values values of of the indexed attribute the indexed attribute ((searchsearch key

key S). S). Each entry Each entry on on the the data pagesdata pages is is a a value value of of another attribute another attribute ((data data D). D). •

• On On the index pagesthe index pages, , search key values alternate with search key values alternate with pointerspointers to to correspondingcorresponding data pages

data pages. . •

• SearchSearch withinwithin a a pagepage isis donedone sequentiallysequentially! ! ModificationsModifications on on index pagesindex pages maymay lead

lead to to high high administration overheadadministration overhead ((rebalancing rebalancing of of neighbourneighbour pagespages,, redirection

redirection of of pointerspointers, , creation creation of of newnew pagespages, , moving moving of of keykey valuesvalues)) ISAM

ISAM--OrganisationOrganisation

S

S11 SS22 . . . . . . SSkk . . . . . . SSll . . . S. . . Smm

D

D1 1 DD22 . . . . . . DDss DDt t . . . . . . DDss . . .. . . DDv v . . . . . . DDww Index-Sequential Access Method (ISAM)

Index

Index--SequentialSequential Access Access Method Method (ISAM)(ISAM)

≥ ≥ SS11 < < SS22 index page index page data pages data pages . . . . . . . . . . . .

(10)

pointer

pointer

search key

search key

For

For primary indexprimary index:: data entry

data entry

For

For secondary indexsecondary index:: TID

TID B

B--treetree: : IdeaIdea

Each node corresponds

Each node corresponds to to one page one page in in storestore Tree is

Tree is balancedbalanced: equally long branches: equally long branches

More than

More than 50% of 50% of each page filled with dataeach page filled with data

Generalization

Generalization of of the the ISAMISAM--ideaidea to multiple

to multiple levels levels of of indexing indexing plusplus mixing data

mixing data and and index valuesindex values::

B

(11)

Storage structures

Storage structures: : Summary Summary of of main ideasmain ideas •

• A A databasedatabase physically consists physically consists of a of a selection selection of of filesfiles accessed solely by theaccessed solely by the storage manager

storage manager of of the the DBMS.DBMS. •

• These These files reside files reside on on largelarge, , persistent storage persistent storage media, media, normally normally on on discsdiscs. . This This part

part of a of a computercomputer‘‘s s store is called store is called secondary storagesecondary storage or background storeor background store, , as opposed

as opposed to to its comparatively smallits comparatively small, , but but fast fast primaryprimary or main or main storestore.. •

• There is There is no such no such thing as rows or columns thing as rows or columns in in filesfiles, , but but just just sequences sequences of of indiviindivi- -dual

dual symbolssymbols ((actually actually just just bitsbits!). !). ThusThus, , the the DBMS has to DBMS has to •

• codecode relational relational tuplestuples//fields as bit sequences when storing awayfields as bit sequences when storing away •

• decodedecode them again when fetching them for manipulationthem again when fetching them for manipulation •

• There are There are a a few more basic principles few more basic principles of of physical structure physical structure of a of a databasedatabase:: •

• The beginning The beginning of of each tuple each tuple ((stored as stored as a a bit sequencebit sequence) ) is markedis marked by

by a a reference called reference called tuple identifiertuple identifier (TID).(TID). •

• Tuples are summarized into larger units called Tuples are summarized into larger units called pagespages –– when when accessing the database

accessing the database no no tuplestuples, , but entire pages are fetchedbut entire pages are fetched.. •

(12)

Query

Query manager basicsmanager basics

User Interface

User Interface

User Interface

Schema Manager

Schema Manager

Schema Manager Query ManagerQuery ManagerQuery Manager Transaction Transaction Transaction ManagerManagerManager

Storage Manager

Storage

Storage ManagerManager

Data Dictionary Data Dictionary Database Database Log Log

(13)

Query

Query processingprocessing: : OverviewOverview

Scanning Scanning Parsing Parsing View expansion View expansion Optimization Optimization Optimization Code

Code generationgeneration

Interpretation

Interpretation Executable program

Executable program

Execution

Execution planplan SQL SQL queryquery RA RA--expressionexpression 1. 1. 2. 2. 3. 3. •

• One of One of the most important the most important tasks

tasks of a DBMS of a DBMS isis efficient efficient processing

processing of of queriesqueries.. •

• In In principleprinciple, , queriesqueries areare treated treated like

like programsprograms writtenwritten in a in a programprogram language

language, i.e., , i.e., theythey areare givengivento ato a compiler

compiler.. •

• The most important phase The most important phase of of the compilation process is the

the compilation process is the

attempt

attempt of of improving the improving the efficiency

efficiency of of thethe final final execution phase

execution phase, , called

(14)

• The optimization phase tries The optimization phase tries to to improve efficiencyimprove efficiency of of executionexecution, , even though it noteven though it not always may reach

always may reach an an „„optimaloptimal““ solution solution ((thus the notion is misleadingthus the notion is misleading).). •

• From From an an initialinitial internalinternal representation representation of of thethe queryquery in RA, an in RA, an equivalentequivalent but morebut more efficiently executable

efficiently executable algebra expressionalgebra expression is generated by equivalence preservingis generated by equivalence preserving transformations

transformations: :

• Afterwards Afterwards a a suitable suitable implementation variantimplementation variantis chosen for eachis chosen for each RARA--operatoroperator.. This

This isis donedone byby exploiting severalexploiting several specialspecial parametersparameters of of the operandthe operand relations relations such

such asas sizesize, , selectivityselectivity, , sortingsorting, , indexingindexing etc. etc. –– oftenoften cost models are usedcost models are used, , tootoo. .

• At At the the end of end of thethe optimization phaseoptimization phase a a query executionquery execution planplan ((shortshort: QEP) : QEP) isis genegene- -rated

rated whichwhich cancan bebe furtherfurther improvedimproved individuallyindividually ((„„byby handhand““) ) byby a DB a DB specialistspecialist when

when usingusing a a commercialcommercial DBMS ("DBMS ("database tuningdatabase tuning")."). •

• StrategiesStrategies and and heuristics used by heuristics used by a a commercial query optimizercommercial query optimizer are often veryare often very Query

Query optimizationoptimization: : PhasesPhases

Logical optimization

Logical

Logical optimizationoptimization

Physical optimization

Physical

(15)

Example

Example of of logicallogicaloptimization optimization (1)(1)

In which semester are those students enrolled which attend courses read by professor Sokrates?

In

In whichwhich semestersemester areare thosethose students enrolledstudents enrolled which attend courses

which attend courses readread byby professorprofessor Sokrates?Sokrates?

SELECT DISTINCT s.Semester FROM Students s,

Attends a, Courses c, Professors p

WHERE p.Name = 'Sokrates' AND p.PersNr = c.ReadBy

AND a.CourseNr = c.CourseNr AND a.MatrNr = s.MatrNr

SELECT DISTINCT

SELECT DISTINCT s.Semesters.Semester FROM

FROM StudentsStudents s,s,

Attends Attends a,a, Courses Courses c,c, Professors p Professors p WHERE

WHERE p.Name = 'Sokrates'p.Name = 'Sokrates' AND

AND p.p.PersNrPersNr = c.= c.ReadByReadBy AND

AND a.a.CourseNrCourseNr = c.= c.CourseNrCourseNr AND

AND a.a.MatrNrMatrNr = s.= s.MatrNrMatrNr in SQL

(16)

Example

Example ofoflogical optimizationlogical optimization (2)(2)

π

π

s.Semester s.Semester p.Name = 'Sokrates' p.Name = 'Sokrates' ∧∧ p.

p.PersNrPersNr = c.= c.ReadByReadBy∧∧

a.

a.CourseNrCourseNr= c. = c. CourseNrCourseNr ∧∧

a.

a.MatrNrMatrNr= s. = s. MatrNr MatrNr

σ

σ

×

×

p p

×

×

×

×

cc s s aa Operator

Operator treetree inin canonical

canonical representationrepresentation of

(17)

Example

Example ofoflogical optimizationlogical optimization (3)(3)

π

π

s.Semester s.Semester p.Name = 'Sokrates' p.Name = 'Sokrates' ∧∧ p.

p.PersNrPersNr = c.= c.ReadByReadBy∧∧

a.

a. CourseNrCourseNr = c.= c. CourseNrCourseNr∧∧

a.

a.MatrNrMatrNr= s. = s. MatrNr MatrNr

σ

σ

×

×

p p

×

×

×

×

cc s s aa Splitting and

Splitting and pushpushííng selectionsng selections

nicht verschiebbar !

(18)

Example

Example ofoflogical optimizationlogical optimization (4)(4)

π

π

s.Semester

s.Semester

p.

p.PersNrPersNr= c.= c.ReadBy ReadBy

σ

σ

×

×

p p

×

×

×

×

c c

σ

σ

p.Name = 'Sokrates'p.Name = 'Sokrates'

σ

σ

a.

a.CourseNrCourseNr = c.= c.CourseNr CourseNr

σ

σ

a.

a.MatrNrMatrNr = s. = s. MatrNr MatrNr

Forming joins from products

Forming joins from products

and

(19)

Example

Example ofoflogical optimizationlogical optimization (5)(5)

π

π

s.Semesters.Semester

p.

p.PersNrPersNr= c.= c.ReadBy ReadBy

p p c c s s aa

σ

σ

p.Name = 'Sokrates'p.Name = 'Sokrates'

a.

a.CourseNrCourseNr = c.= c.CourseNrCourseNr

a.

a.MatrNrMatrNr = s. = s. MatrNr MatrNr

Reordering

Reordering joinsjoins

Smallest

Smallest operandoperand:: Should participate Should participate inin 1st 1st joinjoin!! 1 1 2 2 3 3

(20)

Example

Example ofoflogical optimizationlogical optimization (6)(6)

π

π

s.Semesters.Semester

p.

p.PersNrPersNr= c.= c.ReadBy ReadBy

c c s s a a

σ

σ

p.Name = 'Sokrates'p.Name = 'Sokrates'

a.

a.CourseNrCourseNr = c.= c.CourseNr CourseNr a.

a.MatrNrMatrNr = s. = s. MatrNr MatrNr

. . . all

. . . all otherotherjoinsjoins

are then ordered

are then ordered

automatically automatically 33 2 2 1 1

(21)

• JustificationJustification forfor reorderingsreorderings duringduring logicallogicaloptimizationoptimization:: In

In eacheach stepstep equivalence preservingequivalence preserving transformationstransformations are appliedare applied..

• AlwaysAlways:: •

• Answer setAnswer set is is identicalidentical overover eacheach DBDB--statestate beforebefore and and afterafter transformation transformation (

(„„equivalenceequivalence")."). •

• EvaluationEvaluation of of thethe transformedtransformed expressionexpression isis oftenoften (in (in many casesmany cases, , notnot alwaysalways) ) more

more efficientefficient than before thethan before the transformationtransformation. . •

• ButBut:: •

• Transformations Transformations do do notnot come come with any guarantee forwith any guarantee for efficiencyefficiency gainsgains!! •

• Transformation Transformation rulesrules areareheuristicsheuristics ((„„rulesrules of of thumbthumb"), "), which have which have to to be be applied

applied in a in a „„cleverclever““ way.way. •

• ThereforeTherefore: : •

• NotionNotion „„optimizationoptimization" " isis misleadingmisleading !! •

• IfIf at all, at all, only certain only certain improvementsimprovements of of efficiencyefficiency cancan bebe achievedachieved, , rarely

rarely a (a (theoreticallytheoreticallyimaginableimaginable) optimal ) optimal solutionsolution will will bebe reachedreached.. Logical

Logical optimizationoptimization: : TheoreticalTheoreticalfoundationsfoundations

σ

p

(R

1

×

R

2

)

(

σ

p

(R

1

))

×

R

2

σ

σ

pp

(R

(R

11

×

×

R

R

22

)

)

(

(

σ

σ

pp

(R

(R

11

))

))

×

×

R

R

22

if attr

if attr(p) (p) ⊆⊆ attr(Rattr(R11)) and

and attrattr(p) (p) ∩∩ attr(Rattr(R22) = ) = ∅∅

e.g.:

(22)

Physical optimization

Physical optimization : : OverviewOverview

" "logicallogical" " RA RA--expressionexpression improved improved logical logical RA RA--expressionexpression Evaluation Plan Evaluation Plan (

(usingusing physicalphysical RA RA--operatorsoperators)) Logical Logical optimization optimization Physical Physical optimization optimization Main

Main taskstasks of of physicalphysical optimizationoptimization:: •

• ChoosingChoosing suitablesuitable algorithmsalgorithms forfor physicallyphysically realizingrealizing RA

RA--operatorsoperators: : physicalphysical operatorsoperators, , physical algebraphysical algebra •

• ExploitingExploiting exististing exististing access pathsaccess paths to to storagestorage structuresstructures:: indexes

indexes, B, B--treestrees, , hashhash--tablestables •

• SortingSorting of of intermediate resultsintermediate results •

• Creating temporary Creating temporary indexesindexes and and hashhash--tablestables •

• PlanningPlanning of of intermediate storage usageintermediate storage usage •

• Estimating costsEstimating costs of different of different implementation variantsimplementation variants based

(23)

Project Project Sort Sort Select Select Physical

Physical optimizationoptimization: : ExampleExample

π

π

s.Semesters.Semester

p.

p.PersNrPersNr= c.= c.ReadBy ReadBy

p p c c s s a a

σ

σ

p.Name = 'Sokrates'p.Name = 'Sokrates' a.

a.CourseNrCourseNr = c.= c.CourseNr CourseNr a.

a.MatrNrMatrNr= s. = s. MatrNr MatrNr

s.Semester

s.Semester

p.

p.PersNrPersNr= c.= c.ReadBy ReadBy c c s s a a p.Name = 'Sokrates' p.Name = 'Sokrates' h.

h.CourseNrCourseNr= v.= v.CourseNrCourseNr h.

h.MatrNrMatrNr= s. = s. MatrNr MatrNr

Index

IndexJoinJoin Merge

MergeJoinJoin Index

IndexJoinJoin

c.

c.CourseNrCourseNr SortSorth.h.CourseNrCourseNr

Hash Hash IndexDup IndexDup Result Result ofof logical logical optimization optimization:: QEP

QEP afterafter physical

physical

optimization

(24)

Operators of

Operators of physicalphysical algebraalgebra

Project Projectaa Union Union Select Selectcc IndexSelect IndexSelect Each

Each operatoroperator of RA ("of RA ("logicallogical algebraalgebra") ") is associated withis associated with oneone oror more proceduresmore procedures realizing this

realizing this operationoperation in in physicalphysical storagestorage: Operators of : Operators of physicalphysical algebraalgebra •

• ProjectionProjection withoutwithout duplicateduplicate eliminationelimination:: •

• UnionUnion withoutwithout duplicate eliminationduplicate elimination :: •

• Duplicate elimininationDuplicate eliminination:: •

• "naive" "naive" basicbasic versionversion:: •

• usingusing sortingsorting:: •

• using using an an IndexIndex:: •

• SelectionSelection:: •

• withoutwithout usingusing an an indexindex:: •

• withwith usingusing anan indexindex ::

NestedDup NestedDup SortDup SortDup IndexDup IndexDup

(25)

Operators of

Operators of physical algebraphysical algebra (2)(2)

Sort Sortaa Bucket Bucket Hash Hashaa IndexJoin IndexJoincc

• JoinJoin ((analoguoslyanaloguosly: : DifferenceDifference, , intersectionintersection):): •

• "naive" "naive" basic versionbasic version:: •

• on on sortedsorted inputinput relations:relations: •

• using using an an indexindex:: •

• using using a a hashhash--table table in in mainmain memorymemory::

• IntermediateIntermediate storingstoring:: •

• withwith temporarytemporary materialisationmaterialisation of

of results results relations in relations in memorymemory:: •

• M. M. withwith sortingsorting of of resultresult relationrelation:: •

• M. M. withwith indexingindexing the result relationthe result relation :: •

• SortingSorting of of intermediateintermediate resultresult relations:relations:

NestedLoop NestedLoopcc MergeJoin MergeJoincc MergeSort MergeSort HashJoin HashJoincc Tree Treeaa

(26)

Discussion

Discussion of of samplesampleevaluation evaluation planplan

Project Project Sort Sort Select Select s.Semester s.Semester p.

p.PersNrPersNr= c.= c.ReadBy ReadBy

c c s s a a p.Name = 'Sokrates' p.Name = 'Sokrates' a.

a.CourseNrCourseNr= c.= c.CourseNr CourseNr a.

a.MatrNrMatrNr= s. = s. MatrNr MatrNr

Index

IndexJoinJoin Merge

MergeJoinJoin Index

IndexJoinJoin

c.

c. CourseNr CourseNr Sort

Sort a. a. CourseNr CourseNr Hash Hash Index

IndexDupDup

Temporary hash

Temporary hash--tabletable in

in main memorymain memory

Primary index

Primary index on son s

Secondary index

Secondary index on con c

Only

Only partpartof of primary key primary key on a

on a ⇒⇒ no no IndexJoin IndexJoin

No

(27)

Transcation manager basics

Transcation manager basics

User Interface

User Interface

User Interface

Schema Manager

Schema Manager

Schema Manager Query ManagerQuery ManagerQuery Manager Transaction Transaction Transaction ManagerManagerManager

Storage Manager

Storage

Storage ManagerManager

Data Dictionary Data Dictionary Database Database Log Log

(28)

Transactions

Transactions

• Every change Every change (update) of a (update) of a database is performed database is performed in in special unitsspecial units comprising comprising one or more

one or more update update statements statements ((possibly combined with queriespossibly combined with queries):):

• Transactions Transactions in in SQLSQL are marked by special are marked by special keywordskeywords:: •

• BEGIN_OF_TRANSACTION, BEGIN_OF_TRANSACTION, for starting for starting a a transactiontransaction,, •

• COMMIT, COMMIT, for successful completionfor successful completion, , oror •

• ROLLBACK, ROLLBACK, for aborting for aborting a a failed transaction explicitlyfailed transaction explicitly. . •

• Transactions are Transactions are „„packagespackages““ of of individual steps treated with individual steps treated with „„particular careparticular care““ by the

by the DBMS: DBMS: Transactions are always guaranteeing some Transactions are always guaranteeing some form of form of logicallogical and

and physical physical consistency consistency of of the databasethe database!! •

• Individual Individual update update statementsstatements are treated like are treated like „„minimini--transactionstransactions““ implicitlyimplicitly,, even

even without enclosing them into without enclosing them into BEGINBEGIN--COMMIT.COMMIT. •

• The The transaction managertransaction manager is one is one of of the most important components the most important components of a DBMS.of a DBMS. Transactions

Transactions

(29)

Transactions

Transactions: Motivation: Motivation

Transactions take place

Transactions take place in in many areas many areas of real life all of real life all the the time,time, e.g.

e.g. when using when using a a cash cash machinemachine for receiving for receiving cash cash moneymoney or or when transfering money

when transfering money byby home bankinghome banking. . e.g.

e.g..: .: Transfer 50 Transfer 50 €€ fromfrom accountaccount A to A to accountaccount B !B !

begin_of_transaction; read(A, a); a := a -50; write(A, a);

read(B, b); b := b+50; write(B, b); commit

begin

begin_of__of_transactiontransaction; ; readread(A, a); a :(A, a); a :== a -a -50; 50; writewrite(A, a); (A, a); read

read(B, b); b :(B, b); b :== b+50; writeb+50; write(B, b); (B, b); commitcommit

300 300 200 200 A A B B 250 250 200 200 A A B B 250 250 250 250 A A B B Initial

Initial statestate Intermediate stateIntermediate state Final Final statestate

(

(consistentconsistent)) ((inconsistentinconsistent)) ((consistentconsistent)) local

(30)

Transactions

Transactions: Motivation (2): Motivation (2)

Transaction management

Transaction management often takes place often takes place in a in a context context where

where severalseveral competingcompetingtransactionstransactions concurrentlyconcurrently trytry to

to accessaccess a a centralcentral, , jointly used databasejointly used database. . e.g.

e.g..: .: BookBook a a seatseat on on thethe nextnext flightflight fromfrom Cologne to Delhi !Cologne to Delhi !

Mr. A Mr. A Mr. BMr. B begin_of_transaction; read(free, a); write(occupied, a); commit begin

begin_of__of_transactiontransaction;; read

read((freefree, a);, a); write

write((occupiedoccupied, a);, a); commit commit begin_of_transaction; read(free, b); write(occupied, b); commit begin

begin_of__of_transactiontransaction;; read

read((freefree, b);, b); write

write((occupiedoccupied, b);, b); commit commit T TAA TT B B

What happens if there is only one seat left

(31)

• Sequences Sequences of of updates are reasonably handled only if they are executed updates are reasonably handled only if they are executed entirelyentirely and

and without observable interruptionwithout observable interruption. . •

• TransactionsTransactions expectexpect •

• ConsistencyConsistency of of the the final final state state (i.e., (i.e., satisfaction satisfaction of all of all integrity constraintsintegrity constraints)) •

• PersistencePersistence of of thethe resulting changesresulting changes •

• Resistence Resistence of of the the DBMS DBMS againstagainst machine errorsmachine errors and and concurrentconcurrent accessaccess •

• IfIf anyany transactiontransaction has has reachedreached a a state where any state where any of of these expectations can these expectations can no no longerlonger be satisfied

be satisfied, , it it has to has to be be interruptedinterrupted by the by the DBMS and DBMS and thethe initialinitial state from which state from which this particular

this particular transactiontransaction has has startedstarted has to has to reconstructedreconstructed::

• If If a a transactiontransaction hadhad to to bebe rolledrolled back back duedue to to externalexternal errorserrors,, thethe DBMS will DBMS will automaautoma- -tically try

tically try to to restart it restart it at at the next possible moment the next possible moment in time.in time.

• HoweverHowever, , if the transaction if the transaction has has been stopped due been stopped due to to integrity violationsintegrity violations it it has has caused itself

caused itself, , rollbackrollback isis performedperformed, , butbut nono restart is attemptedrestart is attempted forfor thisthis transactiontransaction.. Policy

Policy of of transaction handlingtransaction handling

Recovery Recovery Recovery Rollback Rollback Rollback

(32)

Characteristics

Characteristics of of transactionstransactions: : TheTheACIDACID--paradigmparadigm

Transactions thus are

Transactions thus are characterisedcharacterised byby thethe followingfollowing fourfour propertiesproperties addressedaddressed together

together byby thethe notionnotion of of „„ACIDACID--paradigmparadigm““:: • Atomicity:

Transactions are indivisible, i.e., they are either executed entirely, or not at all („all or nothing-principle“).

• Consistency:

Transactions lead from consistent initial states to consistent final states (but possibly via inconsistent intermediate states).

• Isolation:

Concurrently performed transactions on the same database (in multi-user mode) do not influence each other, even though their individual steps may be interleaved by the DBMS for impro-ving efficiency.

• Durability:

The effect of a successfully completed transaction is maintained

persistently in the DB, unless explicitly reversed later on by follow-up transactions.

• AAtomicitytomicity::

Transactions

Transactions areare indivisible, i.e., indivisible, i.e., they are either executed entirelythey are either executed entirely, , or not

or not at all (at all („all „all oror nothing-nothing-principleprinciple““).). •

• CConsistencyonsistency::

Transactions

Transactions lead fromlead from consistentconsistent initial statesinitial states to consistentto consistent finalfinal states

states (but possibly(but possibly viavia inconsistentinconsistent intermediateintermediatestatesstates).). •

• IIsolationsolation::

Concurrently

Concurrently performedperformed transactionstransactions on the sameon the same databasedatabase (in multi

(in multi--useruser mode) do mode) do notnot influenceinfluence eacheach other, other, eveneven though though their

their individualindividual stepssteps maymay bebe interleavedinterleaved byby thetheDBMS forDBMS for impro-impro -ving

ving efficiencyefficiency.. •

• DDurabilityurability:: The

The effecteffect of a successfullyof a successfully completedcompleted transaction is maintained transaction is maintained persistently

persistently in thein the DB, unless explicitly reversed later DB, unless explicitly reversed later on by followon by follow--upup transactions

(33)

Components

Components of a of a transaction managertransaction manager Resulting from the

Resulting from the ACIDACID--paradigmparadigm: : three main

three main components components of a of a transaction managertransaction manager •

• Consistency checkerConsistency checker:: •

• Checks Checks all all affected integrity constraints after each individual stepaffected integrity constraints after each individual step in a

in a transaction transaction (IMMEDIATE) (IMMEDIATE) or or at at the the end of a end of a transactiontransaction (DEFERRED).

(DEFERRED).

• Stops Stops transaction transaction on on integrity violations integrity violations and and initiates rollbackinitiates rollback.. •

• SchedulerScheduler:: •

• Determines Determines a proper (a proper („„nonnon--interferringinterferring““) ) sequence sequence of of steps steps of conof con- -current transactions

current transactions ((„„interleavinginterleaving““) ) using resources efficientlyusing resources efficiently.. •

• Maintains the Maintains the „„illusionillusion““ of a of a sequential execution sequential execution ((serializabilityserializability). ). •

• Recovery managerRecovery manager:: •

• Records Records individual steps individual steps of of each transaction each transaction in a in a special protocolspecial protocol storage area

storage area, , called the called the DB logDB log ((„„logginglogging““).). •

• Manages rollback Manages rollback and and restart after restart after system system failures according failures according to to thethe log

(34)

Lost update

Lost update--problemproblem

write(

A

,

a

2

)

a

2

:=

a

2

1.03

read(

A

,

a

2

)

T2

write(

B

,

b

)

b

1

:=

b

1

+ 300

read(

B

,

b

1

)

write(

A

,

a

1

)

a

1

:=

a

1

– 300

read(

A

,

a

1

)

T1

9.

8.

7.

6.

5.

4.

3.

2.

1.

Step

„Lost update problem"

„Lost update Lost update problemproblem""

Expl. of

Expl. of twotwo unsynchronizedunsynchronized transactionstransactions, , thethe execution

execution of of which causes changes which causes changes to to be be „„lostlost““::

T1: Transferring 300 € from account A to B T2: Paying 3% interest to account A T T11: Transferring 300 : Transferring 300 €€ from

from accountaccountA to BA to B T

T22: : PayingPaying3% 3% interestinterest to to accountaccount AA A B A B 5300 5300 €€ 4700 4700 €€ 5459 5459 €€ 5000 5000 €€ 5000 5000 €€

(35)

Lost update

Lost update--problemproblem (2) (2)

T2 a a11 := := aa11 ––300300 read( read(AA,,aa11)) T1

9.

8.

7.

6.

5.

4.

3.

2.

1.

Step

A B A B 5300 5300 €€ 4700 4700 €€ 5000 5000 €€ 5150 5150 €€ 5000 5000 €€ There would have been two intuitively correct

There would have been two intuitively correct

solutions

solutions:: 1.

1. casecase: T: T11 executed beforeexecuted before TT22

write

write(A,a(A,a11))

read read(B,b(B,b11)) b b11 := := bb11 + 300+ 300 write write(B,b(B,b11)) read

read(A,a(A,a22))

write

write(A,a(A,a22)) a a22 := := aa22 ∗∗1.031.03 5000 5000 €€ T1: Transferring 300 € from account A to B T2: Paying 3% interest to account A T T11: Transferring 300 : Transferring 300 €€ from

from accountaccountA to BA to B T

T22: : PayingPaying3% 3% interestinterest to

(36)

Lost update

Lost update--problemproblem (3) (3)

T2 T1

9.

8.

7.

6.

5.

4.

3.

2.

1.

Schritt

A B A B 5300 5300 €€ 4700 4700 €€ 5459 5459 €€ 5159 5159 €€ 5000 5000 €€ Another

Another semanticallysemantically „„clean" clean" solutionsolution:: 2.

2. casecase: T: T22 executedexecuted beforebefore TT11

write

write(A,a(A,a11))

read read(B,b(B,b11)) b b11 := := bb11 + 300+ 300 write write(B,b(B,b )) read

read(A,a(A,a22))

write

write(A,a(A,a22)) a

a22 := := aa22 ∗∗1.031.03

a

a11 := := aa11 −−300300

read

read(A,a(A,a11))

4700 4700 €€ T1: Transferring 300 € from account A to B T2: Paying 3% interest to account A T T11: Transferring 300 : Transferring 300 €€ from

from accountaccountA to BA to B T

T22: : PayingPaying3% 3% interestinterest to

(37)

Lost update

Lost update--problemproblem (4) (4)

T2 T1

9.

8.

7.

6.

5.

4.

3.

2.

1.

Step

A B A B 5300 5300 €€ 4700 4700 €€ 5459 5459 €€ 5159 5159 €€ 5000 5000 €€ An

An „„interleavedinterleaved" " execution execution of of both transactionsboth transactions producing

producing a a „„correctcorrect““ resultresult::

write

write(A,a(A,a11))

read read(B,b(B,b11)) b b11 := := bb11 + 300+ 300 write write(B,b(B,b11)) read

read(A,a(A,a22))

write

write(A,a(A,a22)) a

a22 := := aa22 ∗∗1.031.03

a

a11 := := aa11 −−300300

read

read(A,a(A,a11))

T1: Transferring 300 € from account A to B T2: Paying 3% interest to account A T T11: Transferring 300 : Transferring 300 €€ from

from accountaccountA to BA to B T

T22: : PayingPaying3% 3% interestinterest to

(38)

Resumee

Resumee

There is much

There is much, , much more much more to to be learnt be learnt about databases

about databases and and database managementdatabase management, ..., ...

...

... butbut: : This lecture cannot provide more than This lecture cannot provide more than just a just a „„tastertaster““ of of this excitingthis exciting and

and very very relevant relevant area area of of information processinginformation processing..

Time is over, folks!

Next week, we‘ll do a last round of „exam training“ –

make sure everybody attends:

I want each of you to pass straightaway!!

Time

Time is overis over, , folksfolks!! Next week

Next week, , wewe‘‘ll ll do a last do a last roundround of

of „„exam trainingexam training““ ––

make sure everybody attends

make sure everybody attends:: I

References

Related documents

4.11 Contractor whose tender is accepted shall obtain a valid licence under the Contract Labour (Regulation &amp; Abolition) Act 1970 and the Contract Labour (Regulation

First discussed by representatives of Saudi Arabia, Djibouti, Egypt, Jordan, Somalia, Sudan and Ye - men during meetings in Cairo and Riyadh in 2017 and 2018, the entity was founded

This condition appears in a “strictly confidential” letter while it is not present on a non-confidential version signed one day after (Director of Education, May

However, the analysis of the data showed that executives' concerns and the skepticism of adopting the cloud infrastructure were the lack of control of data and

The following variables were no significant predictors and were excluded from the model: Sex, Age, MRC dyspnoea scale, FEV 1 % pred., BMI, D BMI, HADS-A, D HADS-A, HADS-D, D HADS-D,

Brake Cables ABQ - available both distribution centres Brake Cleaner CRC - available both distribution centres Brake Pads &amp; Shoes PRM - available both distribution centres

Stenca Trading offers a wide range of different noise reduction solutions such as acoustic pipe insulation, valve- &amp; flange insulation, and sound ab- sorbing panel solutions

In high-quality online courses, students learn as much as in face-to-face courses (Meyer, Human Factors • Instructor/professor • Instructional Associate/Graduate Assistant