EXEC SQL ROLLBACK; FINISH:
5.2 Recovery from System Failures
System failures (also called soft crashes) are those failures like power outage which affect all transactions in progress, but do not physically damage the database. During a system failure, the contents of the main memory are lost. Thus the
contents of the database buffers which contain the updates of transactions are lost. (Note: Transactions do not directly write on to the database. The updates are written to database buffers and, at regular intervals, transferred to the database.) At restart, the system has to ensure that the ACID properties of transactions are maintained and the database remains in a consistent state. To attain this, the strategy to be followed for recovery at restart is as follows:
• Transactions which were in progress at the time of failure have to be undone at the time of restart. This is needed because the precise state of such a transaction which was active at the time of failure is no longer known and hence cannot be successfully completed.
• Transactions which had completed prior to the crash but could not get all their updates transferred from the database buffers to the physical database have to redone at the time of restart.
This recovery procedure is carried out with the help of
• An online logfile or journal – The logfile maintains the before- and after-images of the tuples updated during a transaction. This helps in carrying out the UNDO and REDO operations as required. Typical entries made in the logfile are :
• Start of Transaction Marker
• Transaction Identifier
• Record Identifier
• Operations Performed
• Previous Values of Modified Data (Before-image or Undo Log)
• Updated Values of Modified Records (After-image or Redo Log)
• Commit / Rollback Transaction Marker
• Taking a checkpoint at specific intervals – This involves the following two operations:
a) physically writing the contents of the database buffers out to the physical database. Thus during a checkpoint the updates of all transactions, including both active and committed transactions, will be written to the physical database.
b)physically writing a special checkpoint record to the physical log. The checkpoint record has a list of all active transactions at the time of taking the checkpoint. 5.3 Recovery : An Example
At the time of restart, T3 and T5 must be undone and T2 and T4 must be redone. T1 does not enter the recovery procedure at all since it updates were all written to the database at time tc as part of the checkpoint proces
5.4 Concurrency
Concurrency refers to multiple transactions accessing the same database at the same time. In a system which allows concurrency, some kind of control mechanism has to be in place to ensure that concurrent transactions do not interfere with each other.
Three typical problems which can occur due to concurrency are explained here. a) Lost Update Problem
(To understand the above situation, assume that
• there is a record R, with a field, say Amt, having value 1000 before time t1. o Both transactions A & B fetch this value at t1 and t2 respectively. o Transaction A updates the Amt field in R to 800 at time t3.
o Transaction B updates the Amt field in R to 1200 at time t4.
Thus after time t4, the Amt value in record R has value 1200. Update by Transaction A at time t3 is over-written by the Transaction B at time t4.)
(To understand the above situation, assume that
• there is a record R, with a field, say Amt, having value 1000 before time t1. o Transaction B fetches this value and updates it to 800 at time t1. o Transaction A fetches R with Amt field value 800 at time t2.
o Transaction B rolls back and its update is undone at time t3. The Amt field takes the initial value 1000 during rollback.
Transaction A continues processing with Amt field value 800 without knowing about B's rollback.)
c) Inconsistent Analysis Problem
5.5 Locking
Locking: A solution to problems arising due to concurrency.
Locking of records can be used as a concurrency control technique to prevent the above mentioned problems. A transaction acquires a lock on a record if it does not want the record values to be changed by some other
transaction during a period of time. The transaction releases the lock after this time.
Locks are of two types 1. shared (S lock)
2. and exclusive (X Lock).
• A transaction acquires a shared (read) lock on a record when it wishes to retrieve or fetch the record.
• An exclusive (write) lock is acquired on a record when a transaction wishes to update the record. (Here update means INSERT, UPDATE or DELETE.)
The following figure shows the Lock Compatibility matrix.
Normally, locks are implicit. A FETCH request is an implicit request for a shared lock whereas an UPDATE request is an implicit request for an exclusive lock.
Explicit lock requests need to be issued if a different kind of lock is required during an operation. For example, if an X lock is to acquired before a FETCH it has to be explicitly requested for.
5.6 Deadlocks
Locking can be used to solve the problems of concurrency. However, locking can also introduce the problem of deadlock as shown in the example below.
Deadlock is a situation in which two or more transactions are in a simultaneous wait state, each of them waiting for one of the others to release a lock before it can proceed.
If a deadlock occurs, the system may detect it and break it. Detecting involves detecting a cycle in the “Wait-For Graph” (a graph which shows 'who is waiting for whom'). Breaking a deadlock implies choosing one of the deadlocked transactions as the victim and rolling it back, thereby releasing all its locks. This may allow some other transaction(s) to proceed.
Deadlock prevention can be done by not allowing any cyclic-waits. 6. Query Optimization
6.1 Overview
When compared to other database systems, query optimization is a strength of the relational systems. It can be said so since relational systems by themselves do optimization to a large extent unlike the other systems which leave optimization to the programmer. Automatic optimization done by the relational systems will be much more efficient than manual optimization due to several reasons like :
• uniformity in optimization across programs irrespective of the programmer's expertise in optimizing the programs.
• system's ability to make use of the knowledge of internal conditions (eg: volume of data at the time of querying) for optimization. For the same query, such conditions may be different at different times of querying. (In a manual system, this knowledge can be utilised only if the query is re-written each time, which is not practically possible.)
• system's ability to evaluate large number of alternatives to find the most efficient query evaluation method.
In this chapter we shall look into the process of automatic query optimization done by the relational systems.