Database Management Systems
Database Management Systems
Database Management Systems
Database Management Systems
Database Management Systems
–
– ChapterChapter 5 –5 –
Foundations of Information Systems (WS 2008/09)
Foundations
DBMS DBMS architecturearchitecture User Interface User Interface User Interface Schema Manager Schema Manager
Schema Manager Query ManagerQuery ManagerQuery Manager Transaction Transaction Transaction ManagerManagerManager
Storage Manager
Storage
Storage ManagerManager
Data Dictionary Data Dictionary Database Database Log Log Main
Storage manager basics
Storage manager basics
User Interface
User Interface
User Interface
Schema Manager
Schema Manager
Schema Manager Query ManagerQuery ManagerQuery Manager Transaction Transaction Transaction ManagerManagerManager
Storage Manager
Storage
Storage ManagerManager
Data Dictionary Data Dictionary Database Database Log Log
Storage hierarchy
Storage hierarchy of a of a computercomputer
Register
Register
Cache
Cache
Main
Main
Memory
Memory
Disc
Disc
Storage
Storage
Archive
Archive
Storage
Storage
1
1
-
-
10 ns
10 ns
10
10
-
-
100 ns
100 ns
100
100
-
-
1000 ns
1000 ns
10 ms
10 ms
sec
sec
AccessAccess timestimes ( (approxapprox.):.): 1 ms = 1/1000 sec 1 μs = 1/1000 ms 1 ns = 1/1000 μs 1 1 msms= 1/1000 sec= 1/1000 sec 1 1 μμs = 1/1000 mss = 1/1000 ms 1 1 nsns = = 1/1000 1/1000 μμss „Access gap“: factor 105 This is where This is where queries are queries are executed executed!! This is where This is where data are data are stored stored!!
Database
Database bufferbufferand and storage hierarchystorage hierarchy
Main
Main memory memory ((processorprocessor))
Background
Background store store (disc)(disc)
( (transienttransient)) ( (persistentpersistent)) replace replace fetch fetch buffered pages buffered pages •
• OperationsOperations on on datadata are alwaysare always performedperformed in in mainmain memory onlymemory only!! •
• ConsequenceConsequence: : Data Data to to be manipulated have be manipulated have to to be read into main store beforebe read into main store before modification
modification ((„„bufferingbuffering““) and ) and have have to to be written be written back back sometimes later sometimes later to to background storage
background storage media media by the by the DBMS. DBMS. •
Storing
Storing tuplestuplesin in files files on on pagespages
General
General principle principle of of mappingmappingrelations to relations to pagespages (on disc):(on disc):
•
• per per relationrelation: A : A filefile subdivided into several subdivided into several subsequent
subsequent pagespages •
• per per pagepage: An : An internalinternalrecord record table table containing containing pointers
pointers to all to all tuplestuples//records records on on this pagethis page •
• AdressingAdressingindividual tuplesindividual tuples via via aa tuple identifier
tuple identifier (TID)(TID) •
• TID TID consistsconsists of of twotwopartsparts:: •
• A A page number page number (absolute (absolute addressaddress) ) •
• A A pointer pointer in in the record the record table table pointing pointing toto a
a particular position within the pageparticular position within the page where the respective tuple starts
where the respective tuple starts
(relative
(relative addressaddress)) •
• IndirectionIndirection isis usefuluseful duringduring reorganisationreorganisation of
of thethe pagepage..
• • •• •• 1 2 3 1 2 3 TID TID 4711 2 4711 2 5001 5001 Foundations Foundations . . .. . . 5041 5041 Databases Databases . . .. . . 4052 4052 Logics Logics . . .. . . page page # 4711# 4711
Access
Access pathspathsand and indexesindexes •
• DBMS DBMS normally make use normally make use of of various internal auxiliary data structuresvarious internal auxiliary data structures in orderin order to
to speed speed up up access access to to larger tableslarger tables. Such additional . Such additional data structures are calleddata structures are called access paths
access paths.. •
• The most important The most important such such access aids are calledaccess aids are called indexesindexes, , special forms special forms ofof search trees offering direct access
search trees offering direct access to to values values of of particularly important particularly important andand frequently accessed attributes
frequently accessed attributes of a table (of a table (rather than sequentially searchingrather than sequentially searching all
all fields fields of all of all rowsrows). ). OftenOften, at least , at least the primary key attributethe primary key attribute(s) of a table(s) of a table are supported by
are supported by an an index for index for rapid rapid accessaccess.. •
• TheThe notionnotion ‚‚index'index' isis motiviatedmotiviated by the concept by the concept of of index pagesindex pages of of booksbooks. . HereHere,, an additional
an additional part part at at the the end of end of the book contains keywordthe book contains keyword--page reference listspage reference lists for avoiding
for avoiding to to search for certain keywords sequentiallysearch for certain keywords sequentially.. •
• SQLSQL offersoffers specialspecial DDLDDL--commands for creating commands for creating and and deleting indexes deleting indexes to to oneone or several columns
or several columns of a table:of a table:
CREATE INDEX <index-name> ON <table-name> <list-of-column-names>
CREATE
CREATE INDEXINDEX <index<index--name> ON <tablename> ON <table--name> <listname> <list--ofof--columncolumn--names>names> DROP
•
• Per Per relationrelation, , one index normally supports the primary keyone index normally supports the primary key ((oftenoften created automacreated automa- -tically
tically):): •
• In In additionaddition, , severalseveral secondary indexessecondary indexes may exist for other frequently used columnsmay exist for other frequently used columns,, even if these columns
even if these columns//attributes attributes do do not possess the key propertynot possess the key property. . •
• For For implementingimplementing indexesindexes, , special data structures are used by the special data structures are used by the DBMS, DBMS, the choicethe choice of
of which can be influenced by specially authorizedwhich can be influenced by specially authorized DB DB administrators administrators in in largelarge commercial
commercial systems :systems : •
• Search treesSearch trees:: •
• ISAMISAM--filesfiles •
• BB--treestrees and and variantsvariants •
• RR--trees trees and and variantsvariants ((higherhigher--dimensional dimensional index structuresindex structures)) •
• HashHash--tables tables ((tables associated with special access functionstables associated with special access functions)) •
• When using When using an an indexindex, , there is always there is always a a „„tradeofftradeoff"" between resourcesbetween resources:: A
A reduction reduction in in access access time has to time has to be measured against be measured against an an increase increase in in administrationadministration time
time forfor indexindexmaintenancemaintenance and in and in storage used by the storage used by the extra extra index dataindex data. . Indexes:
Indexes: OverviewOverview
Primärindex
Prim
•
• simplestsimplest form of form of indexingindexing::
•
• EachEach entry entry on on the the index pagesindex pages consists consists of of values values of of the indexed attribute the indexed attribute ((searchsearch key
key S). S). Each entry Each entry on on the the data pagesdata pages is is a a value value of of another attribute another attribute ((data data D). D). •
• On On the index pagesthe index pages, , search key values alternate with search key values alternate with pointerspointers to to correspondingcorresponding data pages
data pages. . •
• SearchSearch withinwithin a a pagepage isis donedone sequentiallysequentially! ! ModificationsModifications on on index pagesindex pages maymay lead
lead to to high high administration overheadadministration overhead ((rebalancing rebalancing of of neighbourneighbour pagespages,, redirection
redirection of of pointerspointers, , creation creation of of newnew pagespages, , moving moving of of keykey valuesvalues)) ISAM
ISAM--OrganisationOrganisation
S
S11 SS22 . . . . . . SSkk . . . . . . SSll . . . S. . . Smm
D
D1 1 DD22 . . . . . . DDss DDt t . . . . . . DDss . . .. . . DDv v . . . . . . DDww Index-Sequential Access Method (ISAM)
Index
Index--SequentialSequential Access Access Method Method (ISAM)(ISAM)
≥ ≥ SS11 < < SS22 index page index page data pages data pages . . . . . . . . . . . .
pointer
pointer
search key
search key
For
For primary indexprimary index:: data entry
data entry
For
For secondary indexsecondary index:: TID
TID B
B--treetree: : IdeaIdea
Each node corresponds
Each node corresponds to to one page one page in in storestore Tree is
Tree is balancedbalanced: equally long branches: equally long branches
More than
More than 50% of 50% of each page filled with dataeach page filled with data
Generalization
Generalization of of the the ISAMISAM--ideaidea to multiple
to multiple levels levels of of indexing indexing plusplus mixing data
mixing data and and index valuesindex values::
B
Storage structures
Storage structures: : Summary Summary of of main ideasmain ideas •
• A A databasedatabase physically consists physically consists of a of a selection selection of of filesfiles accessed solely by theaccessed solely by the storage manager
storage manager of of the the DBMS.DBMS. •
• These These files reside files reside on on largelarge, , persistent storage persistent storage media, media, normally normally on on discsdiscs. . This This part
part of a of a computercomputer‘‘s s store is called store is called secondary storagesecondary storage or background storeor background store, , as opposed
as opposed to to its comparatively smallits comparatively small, , but but fast fast primaryprimary or main or main storestore.. •
• There is There is no such no such thing as rows or columns thing as rows or columns in in filesfiles, , but but just just sequences sequences of of indiviindivi- -dual
dual symbolssymbols ((actually actually just just bitsbits!). !). ThusThus, , the the DBMS has to DBMS has to •
• codecode relational relational tuplestuples//fields as bit sequences when storing awayfields as bit sequences when storing away •
• decodedecode them again when fetching them for manipulationthem again when fetching them for manipulation •
• There are There are a a few more basic principles few more basic principles of of physical structure physical structure of a of a databasedatabase:: •
• The beginning The beginning of of each tuple each tuple ((stored as stored as a a bit sequencebit sequence) ) is markedis marked by
by a a reference called reference called tuple identifiertuple identifier (TID).(TID). •
• Tuples are summarized into larger units called Tuples are summarized into larger units called pagespages –– when when accessing the database
accessing the database no no tuplestuples, , but entire pages are fetchedbut entire pages are fetched.. •
Query
Query manager basicsmanager basics
User Interface
User Interface
User Interface
Schema Manager
Schema Manager
Schema Manager Query ManagerQuery ManagerQuery Manager Transaction Transaction Transaction ManagerManagerManager
Storage Manager
Storage
Storage ManagerManager
Data Dictionary Data Dictionary Database Database Log Log
Query
Query processingprocessing: : OverviewOverview
Scanning Scanning Parsing Parsing View expansion View expansion Optimization Optimization Optimization Code
Code generationgeneration
Interpretation
Interpretation Executable program
Executable program
Execution
Execution planplan SQL SQL queryquery RA RA--expressionexpression 1. 1. 2. 2. 3. 3. •
• One of One of the most important the most important tasks
tasks of a DBMS of a DBMS isis efficient efficient processing
processing of of queriesqueries.. •
• In In principleprinciple, , queriesqueries areare treated treated like
like programsprograms writtenwritten in a in a programprogram language
language, i.e., , i.e., theythey areare givengivento ato a compiler
compiler.. •
• The most important phase The most important phase of of the compilation process is the
the compilation process is the
attempt
attempt of of improving the improving the efficiency
efficiency of of thethe final final execution phase
execution phase, , called
•
• The optimization phase tries The optimization phase tries to to improve efficiencyimprove efficiency of of executionexecution, , even though it noteven though it not always may reach
always may reach an an „„optimaloptimal““ solution solution ((thus the notion is misleadingthus the notion is misleading).). •
• From From an an initialinitial internalinternal representation representation of of thethe queryquery in RA, an in RA, an equivalentequivalent but morebut more efficiently executable
efficiently executable algebra expressionalgebra expression is generated by equivalence preservingis generated by equivalence preserving transformations
transformations: :
•
• Afterwards Afterwards a a suitable suitable implementation variantimplementation variantis chosen for eachis chosen for each RARA--operatoroperator.. This
This isis donedone byby exploiting severalexploiting several specialspecial parametersparameters of of the operandthe operand relations relations such
such asas sizesize, , selectivityselectivity, , sortingsorting, , indexingindexing etc. etc. –– oftenoften cost models are usedcost models are used, , tootoo. .
•
• At At the the end of end of thethe optimization phaseoptimization phase a a query executionquery execution planplan ((shortshort: QEP) : QEP) isis genegene- -rated
rated whichwhich cancan bebe furtherfurther improvedimproved individuallyindividually ((„„byby handhand““) ) byby a DB a DB specialistspecialist when
when usingusing a a commercialcommercial DBMS ("DBMS ("database tuningdatabase tuning")."). •
• StrategiesStrategies and and heuristics used by heuristics used by a a commercial query optimizercommercial query optimizer are often veryare often very Query
Query optimizationoptimization: : PhasesPhases
Logical optimization
Logical
Logical optimizationoptimization
Physical optimization
Physical
Example
Example of of logicallogicaloptimization optimization (1)(1)
In which semester are those students enrolled which attend courses read by professor Sokrates?
In
In whichwhich semestersemester areare thosethose students enrolledstudents enrolled which attend courses
which attend courses readread byby professorprofessor Sokrates?Sokrates?
SELECT DISTINCT s.Semester FROM Students s,
Attends a, Courses c, Professors p
WHERE p.Name = 'Sokrates' AND p.PersNr = c.ReadBy
AND a.CourseNr = c.CourseNr AND a.MatrNr = s.MatrNr
SELECT DISTINCT
SELECT DISTINCT s.Semesters.Semester FROM
FROM StudentsStudents s,s,
Attends Attends a,a, Courses Courses c,c, Professors p Professors p WHERE
WHERE p.Name = 'Sokrates'p.Name = 'Sokrates' AND
AND p.p.PersNrPersNr = c.= c.ReadByReadBy AND
AND a.a.CourseNrCourseNr = c.= c.CourseNrCourseNr AND
AND a.a.MatrNrMatrNr = s.= s.MatrNrMatrNr in SQL
Example
Example ofoflogical optimizationlogical optimization (2)(2)
π
π
s.Semester s.Semester p.Name = 'Sokrates' p.Name = 'Sokrates' ∧∧ p.p.PersNrPersNr = c.= c.ReadByReadBy∧∧
a.
a.CourseNrCourseNr= c. = c. CourseNrCourseNr ∧∧
a.
a.MatrNrMatrNr= s. = s. MatrNr MatrNr
σ
σ
×
×
p p×
×
×
×
cc s s aa OperatorOperator treetree inin canonical
canonical representationrepresentation of
Example
Example ofoflogical optimizationlogical optimization (3)(3)
π
π
s.Semester s.Semester p.Name = 'Sokrates' p.Name = 'Sokrates' ∧∧ p.p.PersNrPersNr = c.= c.ReadByReadBy∧∧
a.
a. CourseNrCourseNr = c.= c. CourseNrCourseNr∧∧
a.
a.MatrNrMatrNr= s. = s. MatrNr MatrNr
σ
σ
×
×
p p×
×
×
×
cc s s aa Splitting andSplitting and pushpushííng selectionsng selections
nicht verschiebbar !
Example
Example ofoflogical optimizationlogical optimization (4)(4)
π
π
s.Semester
s.Semester
p.
p.PersNrPersNr= c.= c.ReadBy ReadBy
σ
σ
×
×
p p×
×
×
×
c cσ
σ
p.Name = 'Sokrates'p.Name = 'Sokrates'σ
σ
a.
a.CourseNrCourseNr = c.= c.CourseNr CourseNr
σ
σ
a.
a.MatrNrMatrNr = s. = s. MatrNr MatrNr
Forming joins from products
Forming joins from products
and
Example
Example ofoflogical optimizationlogical optimization (5)(5)
π
π
s.Semesters.Semesterp.
p.PersNrPersNr= c.= c.ReadBy ReadBy
p p c c s s aa
σ
σ
p.Name = 'Sokrates'p.Name = 'Sokrates'a.
a.CourseNrCourseNr = c.= c.CourseNrCourseNr
a.
a.MatrNrMatrNr = s. = s. MatrNr MatrNr
Reordering
Reordering joinsjoins
Smallest
Smallest operandoperand:: Should participate Should participate inin 1st 1st joinjoin!! 1 1 2 2 3 3
Example
Example ofoflogical optimizationlogical optimization (6)(6)
π
π
s.Semesters.Semesterp.
p.PersNrPersNr= c.= c.ReadBy ReadBy
c c s s a a
σ
σ
p.Name = 'Sokrates'p.Name = 'Sokrates'a.
a.CourseNrCourseNr = c.= c.CourseNr CourseNr a.
a.MatrNrMatrNr = s. = s. MatrNr MatrNr
. . . all
. . . all otherotherjoinsjoins
are then ordered
are then ordered
automatically automatically 33 2 2 1 1
•
• JustificationJustification forfor reorderingsreorderings duringduring logicallogicaloptimizationoptimization:: In
In eacheach stepstep equivalence preservingequivalence preserving transformationstransformations are appliedare applied..
•
• AlwaysAlways:: •
• Answer setAnswer set is is identicalidentical overover eacheach DBDB--statestate beforebefore and and afterafter transformation transformation (
(„„equivalenceequivalence")."). •
• EvaluationEvaluation of of thethe transformedtransformed expressionexpression isis oftenoften (in (in many casesmany cases, , notnot alwaysalways) ) more
more efficientefficient than before thethan before the transformationtransformation. . •
• ButBut:: •
• Transformations Transformations do do notnot come come with any guarantee forwith any guarantee for efficiencyefficiency gainsgains!! •
• Transformation Transformation rulesrules areareheuristicsheuristics ((„„rulesrules of of thumbthumb"), "), which have which have to to be be applied
applied in a in a „„cleverclever““ way.way. •
• ThereforeTherefore: : •
• NotionNotion „„optimizationoptimization" " isis misleadingmisleading !! •
• IfIf at all, at all, only certain only certain improvementsimprovements of of efficiencyefficiency cancan bebe achievedachieved, , rarely
rarely a (a (theoreticallytheoreticallyimaginableimaginable) optimal ) optimal solutionsolution will will bebe reachedreached.. Logical
Logical optimizationoptimization: : TheoreticalTheoreticalfoundationsfoundations
σ
p(R
1×
R
2)
⇒
(
σ
p(R
1))
×
R
2σ
σ
pp(R
(R
11×
×
R
R
22)
)
⇒
⇒
(
(
σ
σ
pp(R
(R
11))
))
×
×
R
R
22if attr
if attr(p) (p) ⊆⊆ attr(Rattr(R11)) and
and attrattr(p) (p) ∩∩ attr(Rattr(R22) = ) = ∅∅
e.g.:
Physical optimization
Physical optimization : : OverviewOverview
" "logicallogical" " RA RA--expressionexpression improved improved logical logical RA RA--expressionexpression Evaluation Plan Evaluation Plan (
(usingusing physicalphysical RA RA--operatorsoperators)) Logical Logical optimization optimization Physical Physical optimization optimization Main
Main taskstasks of of physicalphysical optimizationoptimization:: •
• ChoosingChoosing suitablesuitable algorithmsalgorithms forfor physicallyphysically realizingrealizing RA
RA--operatorsoperators: : physicalphysical operatorsoperators, , physical algebraphysical algebra •
• ExploitingExploiting exististing exististing access pathsaccess paths to to storagestorage structuresstructures:: indexes
indexes, B, B--treestrees, , hashhash--tablestables •
• SortingSorting of of intermediate resultsintermediate results •
• Creating temporary Creating temporary indexesindexes and and hashhash--tablestables •
• PlanningPlanning of of intermediate storage usageintermediate storage usage •
• Estimating costsEstimating costs of different of different implementation variantsimplementation variants based
Project Project Sort Sort Select Select Physical
Physical optimizationoptimization: : ExampleExample
π
π
s.Semesters.Semesterp.
p.PersNrPersNr= c.= c.ReadBy ReadBy
p p c c s s a a
σ
σ
p.Name = 'Sokrates'p.Name = 'Sokrates' a.a.CourseNrCourseNr = c.= c.CourseNr CourseNr a.
a.MatrNrMatrNr= s. = s. MatrNr MatrNr
s.Semester
s.Semester
p.
p.PersNrPersNr= c.= c.ReadBy ReadBy c c s s a a p.Name = 'Sokrates' p.Name = 'Sokrates' h.
h.CourseNrCourseNr= v.= v.CourseNrCourseNr h.
h.MatrNrMatrNr= s. = s. MatrNr MatrNr
Index
IndexJoinJoin Merge
MergeJoinJoin Index
IndexJoinJoin
c.
c.CourseNrCourseNr SortSorth.h.CourseNrCourseNr
Hash Hash IndexDup IndexDup Result Result ofof logical logical optimization optimization:: QEP
QEP afterafter physical
physical
optimization
Operators of
Operators of physicalphysical algebraalgebra
Project Projectaa Union Union Select Selectcc IndexSelect IndexSelect Each
Each operatoroperator of RA ("of RA ("logicallogical algebraalgebra") ") is associated withis associated with oneone oror more proceduresmore procedures realizing this
realizing this operationoperation in in physicalphysical storagestorage: Operators of : Operators of physicalphysical algebraalgebra •
• ProjectionProjection withoutwithout duplicateduplicate eliminationelimination:: •
• UnionUnion withoutwithout duplicate eliminationduplicate elimination :: •
• Duplicate elimininationDuplicate eliminination:: •
• "naive" "naive" basicbasic versionversion:: •
• usingusing sortingsorting:: •
• using using an an IndexIndex:: •
• SelectionSelection:: •
• withoutwithout usingusing an an indexindex:: •
• withwith usingusing anan indexindex ::
NestedDup NestedDup SortDup SortDup IndexDup IndexDup
Operators of
Operators of physical algebraphysical algebra (2)(2)
Sort Sortaa Bucket Bucket Hash Hashaa IndexJoin IndexJoincc •
• JoinJoin ((analoguoslyanaloguosly: : DifferenceDifference, , intersectionintersection):): •
• "naive" "naive" basic versionbasic version:: •
• on on sortedsorted inputinput relations:relations: •
• using using an an indexindex:: •
• using using a a hashhash--table table in in mainmain memorymemory::
•
• IntermediateIntermediate storingstoring:: •
• withwith temporarytemporary materialisationmaterialisation of
of results results relations in relations in memorymemory:: •
• M. M. withwith sortingsorting of of resultresult relationrelation:: •
• M. M. withwith indexingindexing the result relationthe result relation :: •
• SortingSorting of of intermediateintermediate resultresult relations:relations:
NestedLoop NestedLoopcc MergeJoin MergeJoincc MergeSort MergeSort HashJoin HashJoincc Tree Treeaa
Discussion
Discussion of of samplesampleevaluation evaluation planplan
Project Project Sort Sort Select Select s.Semester s.Semester p.
p.PersNrPersNr= c.= c.ReadBy ReadBy
c c s s a a p.Name = 'Sokrates' p.Name = 'Sokrates' a.
a.CourseNrCourseNr= c.= c.CourseNr CourseNr a.
a.MatrNrMatrNr= s. = s. MatrNr MatrNr
Index
IndexJoinJoin Merge
MergeJoinJoin Index
IndexJoinJoin
c.
c. CourseNr CourseNr Sort
Sort a. a. CourseNr CourseNr Hash Hash Index
IndexDupDup
Temporary hash
Temporary hash--tabletable in
in main memorymain memory
Primary index
Primary index on son s
Secondary index
Secondary index on con c
Only
Only partpartof of primary key primary key on a
on a ⇒⇒ no no IndexJoin IndexJoin
No
Transcation manager basics
Transcation manager basics
User Interface
User Interface
User Interface
Schema Manager
Schema Manager
Schema Manager Query ManagerQuery ManagerQuery Manager Transaction Transaction Transaction ManagerManagerManager
Storage Manager
Storage
Storage ManagerManager
Data Dictionary Data Dictionary Database Database Log Log
Transactions
Transactions
•
• Every change Every change (update) of a (update) of a database is performed database is performed in in special unitsspecial units comprising comprising one or more
one or more update update statements statements ((possibly combined with queriespossibly combined with queries):):
•
• Transactions Transactions in in SQLSQL are marked by special are marked by special keywordskeywords:: •
• BEGIN_OF_TRANSACTION, BEGIN_OF_TRANSACTION, for starting for starting a a transactiontransaction,, •
• COMMIT, COMMIT, for successful completionfor successful completion, , oror •
• ROLLBACK, ROLLBACK, for aborting for aborting a a failed transaction explicitlyfailed transaction explicitly. . •
• Transactions are Transactions are „„packagespackages““ of of individual steps treated with individual steps treated with „„particular careparticular care““ by the
by the DBMS: DBMS: Transactions are always guaranteeing some Transactions are always guaranteeing some form of form of logicallogical and
and physical physical consistency consistency of of the databasethe database!! •
• Individual Individual update update statementsstatements are treated like are treated like „„minimini--transactionstransactions““ implicitlyimplicitly,, even
even without enclosing them into without enclosing them into BEGINBEGIN--COMMIT.COMMIT. •
• The The transaction managertransaction manager is one is one of of the most important components the most important components of a DBMS.of a DBMS. Transactions
Transactions
Transactions
Transactions: Motivation: Motivation
Transactions take place
Transactions take place in in many areas many areas of real life all of real life all the the time,time, e.g.
e.g. when using when using a a cash cash machinemachine for receiving for receiving cash cash moneymoney or or when transfering money
when transfering money byby home bankinghome banking. . e.g.
e.g..: .: Transfer 50 Transfer 50 €€ fromfrom accountaccount A to A to accountaccount B !B !
begin_of_transaction; read(A, a); a := a -50; write(A, a);
read(B, b); b := b+50; write(B, b); commit
begin
begin_of__of_transactiontransaction; ; readread(A, a); a :(A, a); a :== a -a -50; 50; writewrite(A, a); (A, a); read
read(B, b); b :(B, b); b :== b+50; writeb+50; write(B, b); (B, b); commitcommit
300 300 200 200 A A B B 250 250 200 200 A A B B 250 250 250 250 A A B B Initial
Initial statestate Intermediate stateIntermediate state Final Final statestate
(
(consistentconsistent)) ((inconsistentinconsistent)) ((consistentconsistent)) local
Transactions
Transactions: Motivation (2): Motivation (2)
Transaction management
Transaction management often takes place often takes place in a in a context context where
where severalseveral competingcompetingtransactionstransactions concurrentlyconcurrently trytry to
to accessaccess a a centralcentral, , jointly used databasejointly used database. . e.g.
e.g..: .: BookBook a a seatseat on on thethe nextnext flightflight fromfrom Cologne to Delhi !Cologne to Delhi !
Mr. A Mr. A Mr. BMr. B begin_of_transaction; read(free, a); write(occupied, a); commit begin
begin_of__of_transactiontransaction;; read
read((freefree, a);, a); write
write((occupiedoccupied, a);, a); commit commit begin_of_transaction; read(free, b); write(occupied, b); commit begin
begin_of__of_transactiontransaction;; read
read((freefree, b);, b); write
write((occupiedoccupied, b);, b); commit commit T TAA TT B B
What happens if there is only one seat left
•
• Sequences Sequences of of updates are reasonably handled only if they are executed updates are reasonably handled only if they are executed entirelyentirely and
and without observable interruptionwithout observable interruption. . •
• TransactionsTransactions expectexpect •
• ConsistencyConsistency of of the the final final state state (i.e., (i.e., satisfaction satisfaction of all of all integrity constraintsintegrity constraints)) •
• PersistencePersistence of of thethe resulting changesresulting changes •
• Resistence Resistence of of the the DBMS DBMS againstagainst machine errorsmachine errors and and concurrentconcurrent accessaccess •
• IfIf anyany transactiontransaction has has reachedreached a a state where any state where any of of these expectations can these expectations can no no longerlonger be satisfied
be satisfied, , it it has to has to be be interruptedinterrupted by the by the DBMS and DBMS and thethe initialinitial state from which state from which this particular
this particular transactiontransaction has has startedstarted has to has to reconstructedreconstructed::
•
• If If a a transactiontransaction hadhad to to bebe rolledrolled back back duedue to to externalexternal errorserrors,, thethe DBMS will DBMS will automaautoma- -tically try
tically try to to restart it restart it at at the next possible moment the next possible moment in time.in time.
•
• HoweverHowever, , if the transaction if the transaction has has been stopped due been stopped due to to integrity violationsintegrity violations it it has has caused itself
caused itself, , rollbackrollback isis performedperformed, , butbut nono restart is attemptedrestart is attempted forfor thisthis transactiontransaction.. Policy
Policy of of transaction handlingtransaction handling
Recovery Recovery Recovery Rollback Rollback Rollback
Characteristics
Characteristics of of transactionstransactions: : TheTheACIDACID--paradigmparadigm
Transactions thus are
Transactions thus are characterisedcharacterised byby thethe followingfollowing fourfour propertiesproperties addressedaddressed together
together byby thethe notionnotion of of „„ACIDACID--paradigmparadigm““:: • Atomicity:
Transactions are indivisible, i.e., they are either executed entirely, or not at all („all or nothing-principle“).
• Consistency:
Transactions lead from consistent initial states to consistent final states (but possibly via inconsistent intermediate states).
• Isolation:
Concurrently performed transactions on the same database (in multi-user mode) do not influence each other, even though their individual steps may be interleaved by the DBMS for impro-ving efficiency.
• Durability:
The effect of a successfully completed transaction is maintained
persistently in the DB, unless explicitly reversed later on by follow-up transactions.
•
• AAtomicitytomicity::
Transactions
Transactions areare indivisible, i.e., indivisible, i.e., they are either executed entirelythey are either executed entirely, , or not
or not at all (at all („all „all oror nothing-nothing-principleprinciple““).). •
• CConsistencyonsistency::
Transactions
Transactions lead fromlead from consistentconsistent initial statesinitial states to consistentto consistent finalfinal states
states (but possibly(but possibly viavia inconsistentinconsistent intermediateintermediatestatesstates).). •
• IIsolationsolation::
Concurrently
Concurrently performedperformed transactionstransactions on the sameon the same databasedatabase (in multi
(in multi--useruser mode) do mode) do notnot influenceinfluence eacheach other, other, eveneven though though their
their individualindividual stepssteps maymay bebe interleavedinterleaved byby thetheDBMS forDBMS for impro-impro -ving
ving efficiencyefficiency.. •
• DDurabilityurability:: The
The effecteffect of a successfullyof a successfully completedcompleted transaction is maintained transaction is maintained persistently
persistently in thein the DB, unless explicitly reversed later DB, unless explicitly reversed later on by followon by follow--upup transactions
Components
Components of a of a transaction managertransaction manager Resulting from the
Resulting from the ACIDACID--paradigmparadigm: : three main
three main components components of a of a transaction managertransaction manager •
• Consistency checkerConsistency checker:: •
• Checks Checks all all affected integrity constraints after each individual stepaffected integrity constraints after each individual step in a
in a transaction transaction (IMMEDIATE) (IMMEDIATE) or or at at the the end of a end of a transactiontransaction (DEFERRED).
(DEFERRED).
•
• Stops Stops transaction transaction on on integrity violations integrity violations and and initiates rollbackinitiates rollback.. •
• SchedulerScheduler:: •
• Determines Determines a proper (a proper („„nonnon--interferringinterferring““) ) sequence sequence of of steps steps of conof con- -current transactions
current transactions ((„„interleavinginterleaving““) ) using resources efficientlyusing resources efficiently.. •
• Maintains the Maintains the „„illusionillusion““ of a of a sequential execution sequential execution ((serializabilityserializability). ). •
• Recovery managerRecovery manager:: •
• Records Records individual steps individual steps of of each transaction each transaction in a in a special protocolspecial protocol storage area
storage area, , called the called the DB logDB log ((„„logginglogging““).). •
• Manages rollback Manages rollback and and restart after restart after system system failures according failures according to to thethe log
Lost update
Lost update--problemproblem
write(
A
,
a
2)
a
2:=
a
2∗
1.03
read(
A
,
a
2)
T2write(
B
,
b
)
b
1:=
b
1+ 300
read(
B
,
b
1)
write(
A
,
a
1)
a
1:=
a
1– 300
read(
A
,
a
1)
T19.
8.
7.
6.
5.
4.
3.
2.
1.
Step
„Lost update problem"
„
„Lost update Lost update problemproblem""
Expl. of
Expl. of twotwo unsynchronizedunsynchronized transactionstransactions, , thethe execution
execution of of which causes changes which causes changes to to be be „„lostlost““::
T1: Transferring 300 € from account A to B T2: Paying 3% interest to account A T T11: Transferring 300 : Transferring 300 €€ from
from accountaccountA to BA to B T
T22: : PayingPaying3% 3% interestinterest to to accountaccount AA A B A B 5300 5300 €€ 4700 4700 €€ 5459 5459 €€ 5000 5000 €€ 5000 5000 €€
Lost update
Lost update--problemproblem (2) (2)
T2 a a11 := := aa11 ––300300 read( read(AA,,aa11)) T1
9.
8.
7.
6.
5.
4.
3.
2.
1.
Step
A B A B 5300 5300 €€ 4700 4700 €€ 5000 5000 €€ 5150 5150 €€ 5000 5000 €€ There would have been two intuitively correctThere would have been two intuitively correct
solutions
solutions:: 1.
1. casecase: T: T11 executed beforeexecuted before TT22
write
write(A,a(A,a11))
read read(B,b(B,b11)) b b11 := := bb11 + 300+ 300 write write(B,b(B,b11)) read
read(A,a(A,a22))
write
write(A,a(A,a22)) a a22 := := aa22 ∗∗1.031.03 5000 5000 €€ T1: Transferring 300 € from account A to B T2: Paying 3% interest to account A T T11: Transferring 300 : Transferring 300 €€ from
from accountaccountA to BA to B T
T22: : PayingPaying3% 3% interestinterest to
Lost update
Lost update--problemproblem (3) (3)
T2 T1
9.
8.
7.
6.
5.
4.
3.
2.
1.
Schritt
A B A B 5300 5300 €€ 4700 4700 €€ 5459 5459 €€ 5159 5159 €€ 5000 5000 €€ AnotherAnother semanticallysemantically „„clean" clean" solutionsolution:: 2.
2. casecase: T: T22 executedexecuted beforebefore TT11
write
write(A,a(A,a11))
read read(B,b(B,b11)) b b11 := := bb11 + 300+ 300 write write(B,b(B,b )) read
read(A,a(A,a22))
write
write(A,a(A,a22)) a
a22 := := aa22 ∗∗1.031.03
a
a11 := := aa11 −−300300
read
read(A,a(A,a11))
4700 4700 €€ T1: Transferring 300 € from account A to B T2: Paying 3% interest to account A T T11: Transferring 300 : Transferring 300 €€ from
from accountaccountA to BA to B T
T22: : PayingPaying3% 3% interestinterest to
Lost update
Lost update--problemproblem (4) (4)
T2 T1
9.
8.
7.
6.
5.
4.
3.
2.
1.
Step
A B A B 5300 5300 €€ 4700 4700 €€ 5459 5459 €€ 5159 5159 €€ 5000 5000 €€ AnAn „„interleavedinterleaved" " execution execution of of both transactionsboth transactions producing
producing a a „„correctcorrect““ resultresult::
write
write(A,a(A,a11))
read read(B,b(B,b11)) b b11 := := bb11 + 300+ 300 write write(B,b(B,b11)) read
read(A,a(A,a22))
write
write(A,a(A,a22)) a
a22 := := aa22 ∗∗1.031.03
a
a11 := := aa11 −−300300
read
read(A,a(A,a11))
T1: Transferring 300 € from account A to B T2: Paying 3% interest to account A T T11: Transferring 300 : Transferring 300 €€ from
from accountaccountA to BA to B T
T22: : PayingPaying3% 3% interestinterest to
Resumee
Resumee
There is much
There is much, , much more much more to to be learnt be learnt about databases
about databases and and database managementdatabase management, ..., ...
...
... butbut: : This lecture cannot provide more than This lecture cannot provide more than just a just a „„tastertaster““ of of this excitingthis exciting and
and very very relevant relevant area area of of information processinginformation processing..
Time is over, folks!
Next week, we‘ll do a last round of „exam training“ –
make sure everybody attends:
I want each of you to pass straightaway!!
Time
Time is overis over, , folksfolks!! Next week
Next week, , wewe‘‘ll ll do a last do a last roundround of
of „„exam trainingexam training““ ––
make sure everybody attends
make sure everybody attends:: I