DB Administration 1 CIS
Database Administration
Transaction Processing
Why Concurrency Control?
Locking
Database Recovery
Query Optimization
DB Administration 2 CIS
Transactions
Transaction --A sequence of operations that is regarded as a single logical operation - i.e. a logical unit of work
Atomicity of Transactions -- Either all operations in a transaction are executed completely or none of them is executed.
Transaction Manager -- A component of a DBMS which is responsible for transaction processing.
DB Administration 3 CIS
Commit and Rollback of Transactions
TRANS: PROC OPTIONS (MAIN) ; /* declarations omitted */
EXEC SQL WHENEVER SQLERROR GO TO UNDO;
GET LIST (SX, SY ) /* get values from enduser */
EXEC SQL UPDATE SUPPLIER SET S# = :SY
WHERE S# = :SX;
EXEC SQL UPDATE SHIPMENT SET S# = :SY
WHERE S# = :SX;
EXEC SQL COMMIT;
GO TO FINISH
UNDO: EXEC SQL ROLLBACK;
FINISH: RETURN;
END /*TRANS */;
DB Administration 4 CIS
Transaction Processing
The execution of a transaction involves three steps:
Transaction Start
¾ Signals the beginning a transaction execution
¾ Acquire required resources ( e.g., private workspace )
Transaction Execution--Issues a sequence of reads/writes
¾ A read brings data from the DB to the private workspace
¾ A write updates data in the private workspace and the DB
Transaction Commit/Rollback
¾ Commit -- All changes become permanent in the DB
¾ Rollback -- Stops the execution and undo all changes
DB Administration 5 CIS
Synchronization Points
• A synchpoint represents a boundary between two consecutive transactions
• The point at which a database is guaranteed to be in a consistent state
• The only operations that establish a synchpoint are COMMIT, ROLLBACK and program initiation
• COMMITS and ROLLBACKS terminate the transaction, not the program
— All updates made by the program since the previous synchpoint are committed (COMMIT) or undone (ROLLBACK)
— All database positioning is lost ( all open cursors are closed)
— All record locks are released
DB Administration 6 CIS
System Failure and Media Recovery
• System failures (e.g. power failure) - affect all transactions currently in progress, but do not physically damage the d.b.
• Media failure (e.g. head crash on the disk) - cause damage to the database, or to some portion of it, and affect at least those transactions currently using that portion
• System checkpoint - Taking a checkpoint involves physically writing the contents of the database buffers out to the physical database, and b) physically writing a special checkpoint record out to the physical log
• Undo a transaction - any transaction that was in progress at the time of failure
• Redo a transaction - a transaction that was completed, but was not written to disk successfully before a system failure
DB Administration 7 CIS
System & Media Recovery (cont’d)
•A system failure has occurred at time tf
• The most recent checkpoint prior to time tf was taken at time tc
• Transactions of type T1 were completed prior to time tc
• Transaction of type T2 started prior to time tc and completed after time tc and before time tf
• Transactions of type T3 also started prior to time tc but did not complete by time tf
• Transactions of type T4 started after time tc and completed before time tf
• Transactions of type T5 also started after time tc but did not complete by time tf
Time Checkpoint Failure
T1
T2
T3
T4
T5
tc tf
DB Administration 8 CIS
System Recovery (cont’d)
T2 and T4 need to be Redone, while T3 and T5 need to be Undone
To do this:
1. Start with two lists of transactions, the UNDO-list and the REDO-list Set the UNDO-list equal to the list of all transactions given in the checkpoint record; set the REDO-list to empty.
2. Search forward through the log, starting from the checkpoint record.
3. If a “start of transaction” log entry is found for transaction T, add T to the UNDO-list.
4. If a “commit” log entry is found for transaction T, move T from the UNDO- list to the REDO-list
5. When the end of the log is reached, the UNDO- and REDO- lists identify, respectively, transactions of types T3 and T5, and transactions of type T2 and T4.
6. The system now works backward through the log, undoing transactions in the UNDO-list; then it works forward again , redoing trans. in the REDO-list.
DB Administration 9 CIS
Deferred Update
• Updates are not written to the database until after a transaction has reached its commit point.
• If transaction fails before commit, it will not have modified database and so no undoing of changes required.
• May be necessary to redo updates of committed transactions as their effect may not have reached database.
DB Administration 10 CIS
Immediate Update
• Updates are applied to database as they occur.
• When a transaction starts, write a transaction start to the log file.
• Once the log is written, write a record containing the necessary data to the log file.
• Once the log record is written, write the updates to the database buffers
• The updates to the database itself are written when the buffers are next flushed to secondary storage.
• Essential that log records are written before write to database.
Write-ahead log protocol.
• Need to redo updates of committed transactions following a failure.
• May need to undo effects of transactions that had not committed at time of failure.
DB Administration 11 CIS
Immediate Update
• If no “transaction commit” record in log, then that transaction was active at failure and must be undone.
• Undo operations are performed in reverse order in which they were written to log.
DB Administration 12 CIS
Recovery Facilities
Backup copies of the database
Write-ahead logging (system log or journal )
¾ Includes transaction IDs, time, before and after images of data record, etc.
Checkpoint (synchronization point)
¾ Synchronizes transaction execution and logs
¾ Writes a checkpoint record to the log file
Shadow copies (details on slide 14)
DB Administration 13 CIS
Database Recovery
Media Failure Recovery
¾ Restored by the last backup copy
¾ Forward recovery if the system log is intact
System Failure Recovery Using Logs
DB with changes
Before Images
Undo DB without changes
¾ Undo ( Backward Recovery ).
The uncommitted changes made by a transaction to a database is undone.
¾ Redo ( Forward Recovery ).
The committed changes made by a transaction is reapplied to an earlier copy of the database.
DB with changes After
Images
Redo DB without
changes
DB Administration 14 CIS
Shadow Paging
• Maintain two page tables during life of a transaction:
current page and shadow page table.
• When transaction starts, two pages are the same.
• Shadow page table is never changed thereafter and is used to restore database in event of failure.
• During transaction, current page table records all updates to database.
• When transaction completes, current page table becomes shadow page table.
DB Administration 15 CIS
Concurrency Problems
Deposit and withdraw funds from a bank account.
Transaction Start
Read account balance and store it in B;
B = B + Deposit-Amount;
Write B as the new balance;
If ( B < Withdraw-Amount ) then Transaction rollback;
else
B = B - Withdraw-Amount;
Write B as the new balance;
Transaction commit;
End.
DB Administration 16 CIS
Uncommitted Dependency Problem
Assume that two transactions T1 and T2 both access to a same account with $200 in balance. T1 withdraws $100. T2 deposits $50 first, and then withdraws $500.
Time start
start
read balance balance = balance - $100 write balance commit
read balance balance = balance + $50
balance - 500 rollback
T1 T2
What is the final balance?
DB Administration 17 CIS
Lost Update Problem
Assume that two transactions T1 and T2 both access to a same account with $200 in balance. T1 reads a balance of $200 and withdraws $100. T2 reads a $200 balance and deposits 50.
Time start
start read balance
balance = balance - $100 write balance
commit
read balance
balance = balance + $50
T1 T2
What is the final balance?
write balance commit
DB Administration 18 CIS
Inconsistent Analysis Problem
Assume three accounts: Acc 1 = Balance of $40, Acc 2 = Balance of $50, Acc 3 = Balance of $30
Time Fetch Acc 1 (40)
Sum = 40
T1 T2
commit Fetch Acc 2 (50)
Sum = 90
Fetch Acc 3 (20)
Sum = 110, not 120
Fetch Acc 3 (30)
Update Acc 3
30 -> 20
Update Acc 1 (40)
40 -> 50
Can T1 write back to the database and will the DB be consistent?
DB Administration 19 CIS
Concurrency Control Algorithms
Serializability -- The execution of a set of transactions in some concurrent order is equivalent to the execution of those transactions in strictly sequential order.
Pessimistic concurrency control
¾ Locking
¾ Timestamping
¾ Multiple versions
Optimistic concurrency control
What is A.C.I.D.?
DB Administration 20 CIS
Currency Control Using Locks
Two-Phase Locking
¾ Growing phase -- Acquires required locks it will ever need.
¾ Shrinking phase -- Releases acquired locks and can't request any new lock
Binary Locks or Read/Write Locks
¾ Read-lock (S) a.k.a shared-lock -- Non-exclusive
¾ Write-lock (X) – Exclusive
Intention Locks
¾ Intent Shared (IS) – Transaction intends to set S locks on individual tuples in R.
¾ Intent Exclusive (IX) – Same as IS, plus transaction might update individual tuples in R and will therefore set X locks on those tuples
¾ Shared Intent Exclusive (SIX) – Combines S and IX; i.e., a transaction can tolerate concurrent readers, but not concurrent updaters in R, thus the transaction might update individual tuples in R and will therefore set X locks on those tuples
DB Administration 21 CIS
Granularity of Data Items
Size of data items chosen as unit of protection by concurrency control protocol.
Ranging from coarse to fine:
¾ The entire database.
¾ A table.
¾ An area or database space.
¾ A tuple.
¾ A attribute value of a tuple.
DB Administration 22 CIS
Granularity of Data Items
Tradeoff:
¾ coarser, the lower the degree of concurrency;
¾ finer, more locking information that is needed to be stored.
Best item size depends on the types of transactions.
DB Administration 23 CIS
Hierarchy of Granularity
Intention lock could be used to lock all ancestors of a locked node.
Intention locks can be read or write. Applied top-down, released bottom-up.
DB Administration 24 CIS
Example Using R/W Locks
Time start
start
read balance balance = balance - $100 write balance commit
balance = balance + $50 balance < 500 rollback (release lock)
T1 T2
What is the final balance?
write-lock (balance) read balance write-lock (balance)
waiting . . .
DB Administration 25 CIS
Deadlocks
o Deadlocks may occur when
Transactions, T1 and T2, request locks in parallel:
Time
request X lock on R2
T1 T2
request X lock on R1 request X lock on R2
request X lock on R1
wait wait wait wait wait etc.
wait wait wait etc.
DB Administration 26 CIS
What Is Query Optimization?
A process that translates a query into an equivalent form which is more efficient and costs less than the original one.
Query optimization is responsible by the DBMS
Supports associative queries
Example:
select Student.id, Student.name from Student, Class where Student.id = Class.id and
Class.course# = 'CIS3210';
Assume: 10,000 students
100 student are taking CIS3210 each student takes 3 classes
DB Administration 27 CIS
No Optimization Example
Student Class 10,000 tuples 30,000 tuples
5 Result 1
Main Memory 3510 tuples8
Result 1
ID Name
Σ
Sudent.id=Class.id
Σ
course#='CIS3210'
Read: 8
3510 tuples Write:
3510 tuples8
DB Administration 28 CIS
Optimization Example
Student Class 10,000 tuples 30,000 tuples
5 Main Memory
Result 1
ID Name
Σ
Sudent.id=Class.id
Σ
course#='CIS3210'
Read: 10 tuples Write: 10 tuples
6 6
(100 tuples)
10 tuples6
DB Administration 29 CIS
Query Optimization
Gives the DBMS more control over system performance.
Two main techniques for query optimization:
¾ heuristic rules that order operations in a query;
¾ comparing different strategies based on relative costs, and selecting one that minimizes resource usage.
Disk access tends to be dominant cost in query processing for centralized DBMS.
DB Administration 30 CIS
Heuristical Processing Strategies
Perform Projection as early as possible.
¾ Keep projection attributes on same relation together.
Perform Selection operations as early as possible.
¾ Keep predicates on same relation together.
Combine Cartesian product with subsequent Selection whose predicate represents join condition into a Join operation.
Use associativity of binary operations to rearrange leaf nodes so leaf nodes with most restrictive Selection operations executed first.
Compute common expressions once.
¾ If common expression appears more than once, and result not too large, store result and reuse it when required.
¾ Useful when querying views, as same expression is used to construct view each time.
DB Administration 31 CIS
Cost Estimation for Attribute Operations
Many different ways of implementing attribute operations.
Aim of Query Optimization is to choose most efficient one.
Use formulae that estimate costs for a number of options, and select one with lowest cost.
Consider only cost of disk access, which is usually dominant cost in Query Processing.
Many estimates are based on cardinality of the relation, so need to be able to estimate this.
DB Administration 32 CIS
Query Optimization (Summary)
Translate query into an internal representation
¾ Relational algebra expressions
Find other equivalent internal representation
¾ Several equivalent algebra expressions
Determine the execution cost of each representation
Determine costs based on attribute access
Optimizers chooses the best solution for a query plan