Distributed DBMS
Syllabus
Introduction to DDBMS, features and needs, reference architecture, levels of distribution transparency, replication, and distributed database design; fragmentation and allocation criteria, and storage mechanisms; translation of global queries, global query optimization, query execution and access plans.
Concurrency control: 2-phase locks, distributed deadlocks, quorum-based and time-based protocols, comparison, reliability, non-blocking commitment protocols (3-phase), partitioned networks, checkpoints and cold starts.
Management of distributed transactions: 2-phase commit protocols, architectural aspects, node and link failure recoveries, distributed data dictionary management, distributed database administration.
Heterogeneous databases: federated data reference architecture, loosely and tightly coupled alternative architectures, development tasks, operational global task management, client-server databases, SQL Server, Open Database Connectivity, constructing an application.
Introduction
Independent or centralized systems were the norm in the earlier days of information management. There was duplication of hardware and facilities. Incompatible procedures and lack of management control were the consequences of the evolving nature of the field.
In a centralized database system the DBMS and the data reside at a single location, and control and processing are limited to this location. However, many organizations have geographically dispersed operations. The reliability of the system is compromised, since loss of messages between sites or failure of the communication links may occur. The excessive load on the system at the central site would likely cause all accesses to be delayed.
An organization located in a single building with quasi-independent operational divisions, each with its own information processing needs and using a centralized database on a local network, would have similar problems.
The current trend is toward distributed systems. A distributed system is a central system connected to intelligent remote devices, each of which can itself be a computer, or a set of interconnected, possibly heterogeneous computers. The distribution of processing power creates a feasible environment for data distribution. All distributed data must still be accessible from each site. Thus, a distributed database can be defined as a collection of data whose different parts are under the control of separate DBMSs running on independent computer systems. All the computers are interconnected, and each system has autonomous processing capability, serving local applications. Each system also participates in the execution of one or more global applications; such applications require data from more than one site.
Another typical definition of a distributed database is the following:
A distributed database is a collection of data which belong logically to the same system but are spread over the sites of a computer network. This definition emphasizes two equally important aspects of a distributed database:
1. Distribution: The fact that the data are not resident at the same site, so that we can distinguish a distributed database from a single, centralized database.
2. Logical correlation: The fact that the data have some properties which tie them together, so that we can distinguish a distributed database from a set of local databases or files which are resident at different sites of a computer network.
The problem with the above definition is that the properties of distribution and logical correlation are too vaguely defined. In order to develop a more specific definition, let us consider an example.
Fig. A distributed database on a geographically dispersed network.
Consider a bank that has 3 branches at different locations. At each branch a computer controls the teller terminals of the branch and the account database of that branch. Each computer with its local account database at one branch constitutes one site of the distributed database; computers are connected by a communication network. During normal operations the applications which are requested from the terminals of a branch need only to access the database of that branch. These applications are completely executed by the computer of the branch where they are issued, and will therefore be called local applications.
An example of a local application is a debit or credit application performed on an account stored at the same branch at which the application is requested.
Therefore, a distributed database can be defined as a collection of data which are distributed over different computers of a computer network. Each site of the network has autonomous processing capability and can perform local applications. Each site also participates in the execution of at least one global application, which requires accessing data at several sites using a communication subsystem.
Need of Distributed DBMS
There are several reasons why distributed databases are developed.

1. Organizational and economic reasons:
Many organizations are decentralized, and a distributed database approach fits the structure of such organizations more naturally. The problems of a distributed organizational structure and of the corresponding information systems are the subject of several books and papers. With recent developments in computer technology, the economy-of-scale motivation for having large centralized computer centers is becoming questionable. The organizational and economic reasons are probably the most important reasons for developing distributed databases.
2. Interconnection of existing databases :
Distributed databases are the natural solution when several databases already exist in an organization and the necessity of performing global applications arises. In this case, the distributed database is created bottom-up from the pre-existing local databases. This process may require a certain degree of local restructuring; however, the effort required by this restructuring is much less than that needed for the creation of a completely new centralized database.
3. Incremental growth:

If an organization grows by adding new, relatively autonomous organizational units (new branches, warehouses, etc.), then the distributed database approach supports smooth incremental growth with a minimum degree of impact on the already existing units. With a centralized approach it is very difficult to add new branches.
4. Reduced Communication Overhead :
In a geographically distributed database, the fact that many applications are local clearly reduces the communication overhead with respect to a centralized database.
5. Performance considerations :
The existence of several autonomous processors results in increased performance through a high degree of parallelism. This consideration applies to any multiprocessor system, and not only to distributed databases. Distributed databases have the advantage that the decomposition of data reflects application-dependent criteria which maximize application locality; in this way the mutual interference between different processors is minimized. The load is shared between the different processors, and critical bottlenecks are avoided.
Features of DDBMS
Distributed database can be defined as consisting of a collection of data with different parts under the control of separate DBMSs, running on independent computer systems. All the computers are interconnected and each system has autonomous processing capability, serving local applications.
The various features of a DDBMS are as follows:
1. Autonomy: In a distributed database it is possible to identify a hierarchical control structure based on a global database administrator, who has the central responsibility for the whole database, and on local database administrators, who have the responsibility for their respective local databases. However, it must be emphasized that local database administrators may have a high degree of autonomy, up to the point where a global database administrator is completely missing and the inter-site coordination is performed by the local administrators themselves. This characteristic is usually called site autonomy.
2. Sharing: Data sharing is also provided by a DDBMS; users at a given site are able to access data stored at other sites and at the same time retain control over the data at their own site.
3. Availability & Reliability: Even when a portion of a system (i.e. a local site) is down, the system remains available. With replicated data, the failure of one site still allows access to the replicated copy of the data from another site; the remaining sites continue to function. This greater accessibility enhances the reliability of the system.
4. Parallel Evaluation: A query involving data from several sites can be subdivided into sub queries and the parts evaluated in parallel.
5. Distributed data: Data distribution with redundant copies can be used to increase system availability and reliability. If data can be obtained from a site other than the one that has failed, availability improves, and since the system can still run, reliability does too. Data distribution can also be used to decrease communication costs: if most of the data used at a given site is available locally, the communication cost compared with that of a remote centralized system is obviously reduced.
6. Data Distribution Transparency: Data independence means that the actual organization of data is transparent to the application programmer.
7. Management of distributed data with different levels of transparency: Ideally, a DBMS should be distribution transparent in the sense of hiding the details of where each file (table, relation) is physically stored within the system.
8. Distribution or Network Transparency: this refers to freedom for the users from the operational details of the network. It may be divided into location transparency and naming transparency. Location transparency refers to the fact that the command issued to perform a task is independent of the location of data and the location of the system where the command was issued. Naming transparency implies that once a name is specified, the named object can be accessed unambiguously without additional specifications.
9. Replication Transparency: copies of data may be stored at multiple sites for better availability, performance and reliability. Replication transparency makes the user unaware of the existence of copies.
10. Fragmentation Transparency: Two types of fragmentation are possible. Horizontal fragmentation decomposes the rows of a relation into several subsets that are called fragments of that relation. Vertical fragmentation decomposes the attributes of a relation into several subsets that are referred to as fragments. A global query issued by the user must be transformed into several fragment queries. Fragmentation transparency makes the user unaware of the existence of fragments.

11. Improved Performance: A distributed DBMS fragments the database by keeping the data closer to where it is needed most. Data localization reduces the contention for CPU and I/O services and simultaneously reduces the access delays involved in a WAN. When a large database is distributed over multiple sites, smaller databases exist at each site. As a result, local queries and transactions accessing data at a single site have better performance because of the smaller local databases.

12. Easier Expansion: In a distributed environment, expansion of the system in terms of adding more data, increasing database sizes, or adding more processors is much easier.
Fig: Reference Architecture for Distributed Database.
The above figure shows a reference architecture for a distributed database. This architecture is not explicitly implemented in all distributed databases; however, its levels are conceptually relevant in order to understand the organization of any distributed database.
The major components of reference architecture are as follows:
Global Schema

At the top level of the reference architecture is the global schema. The global schema defines all the data which are contained in the distributed database as if the database were not distributed at all. For this reason, the global schema can be defined exactly in the same way as in a nondistributed database: it consists of the definition of a set of global relations.

Each global relation can be split into several non-overlapping parts which are called fragments. Fragments are indicated by a global relation name with an index. There are several different ways in which to perform the splitting operation; the mapping between global relations and fragments is defined in the fragmentation schema. This mapping is one-to-many.
Allocation schema
The allocation schema defines at which sites a fragment is located. Fragments are logical portions of a global relation which are physically located at one or several sites of the network.
All the fragments which correspond to the same global relation R and are located at the same site j constitute the physical image of global relation R at site j. This image is indicated by the global relation name and a site index.
Example: Physical image Rj
Fig: fragments and physical images for global relation.
A global relation R is split into four fragments R1, R2, R3 and R4, which are allocated redundantly at three sites of the network, thus building three physical images R1, R2 and R3.

To complete the terminology, we will refer to a copy of a fragment at a given site, denoting it by the global relation name and two indexes (the fragment index and the site index). Two physical images can be identical; in this case one physical image is a copy of another.

Example: R1 is a copy of R2.
The three top levels of the reference architecture are site independent. At a lower level, it is necessary to map the physical images to the objects which are manipulated by the local DBMSs. This mapping is called a local mapping schema and depends on the type of the local DBMS.

The three most important objectives which motivate the features of this architecture are the separation of data fragmentation and allocation, the control of redundancy, and independence from local DBMSs.
1) Separating the concept of data fragmentation from the concept of data allocation: This separation allows us to distinguish two different levels of distribution transparency, as follows:

a) Fragmentation transparency: the highest degree of transparency; it consists of the fact that the user works on global relations.

b) Location transparency: a lower degree of transparency; it requires the user to work on fragments instead of global relations.

2) Explicit control of redundancy: The reference architecture provides explicit control of redundancy at the fragment level. In the above figure, the physical images R2 and R3 contain common data. The definition of disjoint fragments as building blocks of physical images allows us to refer explicitly to the overlapping part: the replicated fragment R2.

3) Independence from local DBMSs (local mapping transparency): This allows us to study several problems of distributed DBMSs without having to take into account the specific data models of the local DBMSs.
Distribution Transparency
Distribution transparency means that programs can be written as if the database were not distributed. Thus the correctness of programs is unaffected by the movement of data from site to site; however, their speed of execution is affected.
There are three types of transparencies, which are as follows:
1) Fragmentation transparency (the user need not be aware of fragmentation),

2) Location transparency (the user does not know the location of the data), and

3) Local mapping transparency (the user refers to site-independent fragment names but must specify sites).
Fragmentation transparency is the highest degree of transparency and consists of the fact that the user or application programmer works on global relations.
Location transparency is a lower degree of transparency and requires the user or application programmer to work on fragments instead of global relations; however, he or she does not know where the fragments are located.
The separation between concept of fragmentation and allocation is very convenient in distributed database design, because the determination of relevant portions of the data is thus distinguished from the problem of optimal allocation.
To understand the concept of distribution transparency, consider a simple example application, called SUPINQUIRY, which consists of accepting a supplier number from a terminal, finding the corresponding supplier name, and displaying it at the terminal.
Level 1 Fragmentation Transparency
The way in which the application accesses the database if the DDBMS provides fragmentation transparency is shown below:

read(terminal, $SNUM);
SELECT NAME INTO $NAME
FROM SUPPLIER
WHERE SNUM = $SNUM;
write(terminal, $NAME);
First, the application accepts a supplier number from the terminal; then it accesses the database. The whole SQL statement represents a single distributed database access primitive, which receives the variable $SNUM as the input parameter and returns the variable $NAME as the output parameter. The DDBMS interprets the primitive by accessing the database at one of the three sites in a way which is completely determined by the system.

From the viewpoint of distribution transparency, notice that the application refers to the global relation name SUPPLIER, completely ignoring the fact that the database is distributed. In this way, the application is completely unaffected by any change which is applied to the schemata below the global schema in the reference architecture.
Level 2 Location Transparency
If the DDBMS provides location transparency but not fragmentation transparency, the same application can be written as shown below.
read(terminal, $SNUM);
SELECT NAME INTO $NAME
FROM SUPPLIER1
WHERE SNUM = $SNUM;
if not #FOUND then
  SELECT NAME INTO $NAME
  FROM SUPPLIER2
  WHERE SNUM = $SNUM;
write(terminal, $NAME);

The request for the supplier with the given number is first issued referring to fragment SUPPLIER1; if the DDBMS returns a negative answer in the control variable #FOUND, a similar request is issued with respect to fragment SUPPLIER2. At this point, this naive implementation assumes
that the supplier has been found and displays the result. Of course, several variations of this solution are possible, for instance issuing both requests in parallel in order to exploit the parallelism of the distributed system; however, this does not change the distribution transparency characteristics. This application is clearly independent of changes in the allocation schema, but not of changes in the fragmentation schema, because the fragmentation structure is incorporated in the application.

However, location transparency is by itself very useful, because it allows the application to ignore which copies exist of each fragment, therefore allowing copies to be moved from one site to another and new copies to be created without affecting the application. When location transparency is provided without fragmentation transparency, applications can be written that take advantage of knowing the fragmentation structure.
Level 3 Local Mapping Transparency
read(terminal, $SNUM);
SELECT NAME INTO $NAME
FROM SUPPLIER1 AT SITE1
WHERE SNUM = $SNUM;
if not #FOUND then
  SELECT NAME INTO $NAME
  FROM SUPPLIER2 AT SITE2
  WHERE SNUM = $SNUM;
write(terminal, $NAME);

At this level we assume that the application refers to objects using names which are independent from the individual local systems; however, it has to specify the sites at which the fragments are stored. Site names are indicated in the SQL statements by adding an "at" clause to the "from" clause. In this case each database access primitive is routed by the DDBMS to the specified site; however, these primitives still use site-independent fragment names. If this mapping were not provided, the application would have to incorporate directly the filenames used by the local systems.
Replication
Replication improves the performance of simple read operations in a distributed system and improves its reliability. However, updates incur greater overhead, and the requirement that all replicas of data be updated and consistent adds complexity to a distributed database system.

A database is said to be:

1. Strictly partitioned: when no replicas of the fragments are allowed.

2. Fully redundant: when complete database copies are distributed at all sites.

3. Partially redundant: when only certain fragments are replicated.
Choice 1 would lead to relatively expensive query evaluation, due to the unavailability of the data locally or at some nearby site.

Choice 2 is very expensive in terms of storage space and the overhead needed to maintain consistency; it is meaningless to replicate data at nodes where it is unlikely to be accessed.

Choice 3 allows reduced access time for frequently accessed data, and higher reliability and availability during site crashes. However, because of the replication, updates are expensive.
Updates require access to all copies of the data item. Because the copies are distributed over different sites of the network, the sites must reach a consensus on the possibility of an update; failed sites may not participate in the agreement, and sites may fail after the process has started. These issues are dealt with later, in the sections on concurrency control and recovery.
Although a major aim of a database system is to reduce, if not eliminate, redundancy, planned data redundancy can improve distributed database performance.

Ex: If a number of copies of a data item are available, a read operation can be directed to any one of these copies. A write operation, however, must update all copies; otherwise we would have inconsistent data. The system is required to ensure that any update operation is performed on all replicas; this results in increased overhead, a price to be paid in distributed databases.
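As a rough illustration of this read-one/write-all discipline, here is a minimal Python sketch (site names and values are hypothetical, and a real system would also have to handle failed copies):

import random

class ReplicatedItem:
    """Read-one/write-all over the copies of a single data item."""

    def __init__(self, copies):
        self.copies = copies                 # dict: site name -> current value

    def read(self):
        # A read can be directed to any one copy (e.g. the nearest site).
        site = random.choice(list(self.copies))
        return self.copies[site]

    def write(self, value):
        # A write must update all copies, otherwise the data become inconsistent.
        for site in self.copies:
            self.copies[site] = value

item = ReplicatedItem({"site1": 100, "site2": 100, "site3": 100})
item.write(200)                # the extra overhead: every copy is updated
assert item.read() == 200      # any copy now returns the new value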
Fragmentation
Fragment: Each global relation can be split into several non-overlapping portions which are called fragments.

Fragmentation Schema

The mapping between global relations and fragments is defined in the fragmentation schema. This mapping is one-to-many, i.e. several fragments correspond to one global relation, but only one global relation corresponds to each fragment. Fragments are indicated by a global relation name with an index (the fragment index).

Fragments are logical portions of a global relation which are physically located at one or several sites of the network. All the fragments which correspond to the same global relation R and are located at the same site j constitute the physical image of global relation R at site j (i.e. Rj).
There are some rules which must be followed when defining fragments:
(a) Completeness Condition: All the data of the global relation must be mapped into the fragments; i.e., it must not happen that a data item which belongs to a global relation does not belong to any fragment.

(b) Reconstruction Condition: It must always be possible to reconstruct each global relation from its fragments. The necessity of this condition is obvious: in fact, only fragments are stored in the distributed database, and global relations have to be built through this reconstruction operation when necessary.

(c) Disjointness Condition: It is convenient that fragments be disjoint, so that the replication of data can be controlled explicitly at the allocation level. However, this condition is useful mainly with horizontal fragmentation, while for vertical fragmentation we will sometimes allow it to be violated; the reason for this exception will be discussed when dealing with vertical fragmentation.
Types of Fragmentation
The decomposition of global relations into fragments can be performed by applying two basic types of fragmentation:

1. Horizontal fragmentation

2. Vertical fragmentation

3. Mixed fragmentation (a combination of both)

We will first consider the two basic types of fragmentation separately, and then consider the more complex fragmentations which can be obtained by applying combinations of both.
1. Horizontal fragmentation
Horizontal fragmentation consists of partitioning the tuples of a global relation into subsets; this is clearly useful in distributed databases, where each subset can contain data which have common geographical properties. It can be defined by expressing each fragment as a selection operation on the global relation.

Ex: SUPPLIER(SNUM, NAME, CITY)

Then the horizontal fragmentation can be defined in the following way:

SUPPLIER1 = SL CITY="SF" SUPPLIER
SUPPLIER2 = SL CITY="LA" SUPPLIER

The above fragmentation satisfies the completeness condition if "SF" and "LA" are the only possible values of the CITY attribute; otherwise we would not know to which fragment the tuples with other CITY values belong.
The reconstruction condition is easily verified because it is always possible to reconstruct the SUPPLIER global relation through the following operation:
SUPPLIER = SUPPLIER1 UN SUPPLIER2
The disjointness condition is clearly verified. We will call the predicate which is used in the selection operation defining a fragment its qualification. For instance:

q1: CITY = "SF"
q2: CITY = "LA"

The above example shows that, in order to satisfy the completeness condition, the set of qualifications of all fragments must be complete, at least with respect to the set of allowed values. The reconstruction condition is always satisfied through the union operation, and the disjointness condition requires that qualifications be mutually exclusive.
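The definitions and correctness conditions above can be checked mechanically. Below is a minimal Python sketch (the relation is modelled as a list of tuples with hypothetical sample data; the helper name horizontal_fragment is illustrative, not from the text):

# SUPPLIER(SNUM, NAME, CITY) as a list of tuples.
supplier = [
    (1, "Smith", "SF"),
    (2, "Jones", "SF"),
    (3, "Adams", "LA"),
]

def horizontal_fragment(relation, qualification):
    """Selection: keep the tuples that satisfy the fragment's qualification."""
    return [t for t in relation if qualification(t)]

# Qualifications q1 and q2 from the text.
supplier1 = horizontal_fragment(supplier, lambda t: t[2] == "SF")
supplier2 = horizontal_fragment(supplier, lambda t: t[2] == "LA")

# Reconstruction condition: the union of the fragments gives back SUPPLIER.
assert sorted(supplier1 + supplier2) == sorted(supplier)

# Disjointness condition: no tuple belongs to both fragments.
assert not (set(supplier1) & set(supplier2))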
Derived Horizontal Fragmentation:
In some cases the horizontal fragmentation of a relation cannot be based on a property of its own attributes but is derived from the horizontal fragmentation of another relation.
Ex: SUPPLY (SNUM, PNUM, DEPTNUM, QUAN)
SNUM is a supplier number. It is meaningful to partition this relation so that a fragment contains the tuples for suppliers which are in a given city. However, CITY is not an attribute of the SUPPLY relation; it is an attribute of the SUPPLIER relation considered in the previous example.
Therefore we need a semi-join operation in order to determine the tuples of SUPPLY which correspond to the suppliers in a given city.

The derived fragmentation of SUPPLY can therefore be defined as follows:

SUPPLY1 = SUPPLY SJ SNUM=SNUM SUPPLIER1
SUPPLY2 = SUPPLY SJ SNUM=SNUM SUPPLIER2

The effect of the semi-join operation is to select from SUPPLY the tuples which satisfy the join condition between SUPPLIER1 or SUPPLIER2 and SUPPLY.

The reconstruction of the global relation SUPPLY can be performed through the union, as shown for SUPPLIER. The qualifications of the resulting fragments are:

q1: SUPPLY.SNUM = SUPPLIER.SNUM AND SUPPLIER.CITY = "SF"
q2: SUPPLY.SNUM = SUPPLIER.SNUM AND SUPPLIER.CITY = "LA"
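As a rough illustration of the semi-join (continuing the Python sketches, with hypothetical sample data), the operation keeps exactly those SUPPLY tuples whose SNUM appears in the corresponding SUPPLIER fragment:

# SUPPLY(SNUM, PNUM, DEPTNUM, QUAN) and the SUPPLIER fragments by city.
supply = [(1, "P1", 10, 100), (2, "P2", 20, 200), (3, "P1", 10, 50)]
supplier1 = [(1, "Smith", "SF"), (2, "Jones", "SF")]   # CITY = "SF"
supplier2 = [(3, "Adams", "LA")]                       # CITY = "LA"

def semijoin(left, right, key_left, key_right):
    """Keep the tuples of `left` that match some tuple of `right` on the join key."""
    keys = {t[key_right] for t in right}
    return [t for t in left if t[key_left] in keys]

# Derived fragmentation: SUPPLY follows the fragmentation of SUPPLIER.
supply1 = semijoin(supply, supplier1, 0, 0)
supply2 = semijoin(supply, supplier2, 0, 0)

assert sorted(supply1 + supply2) == sorted(supply)   # reconstruction by union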
2. VERTICAL FRAGMENTATION:
The vertical fragmentation of a global relation is the subdivision of its attributes into groups; fragments are obtained by projecting the global relation over each group. This can be useful in distributed databases where each group of attributes can contain data which have common geographical properties. The fragmentation is correct if each attribute is mapped into at least one attribute of the fragments; moreover, it must be possible to reconstruct the original relation by joining the fragments together.

Ex: EMP(EMPNUM, NAME, SAL, TAX, MGRNUM, DEPTNUM)

A vertical fragmentation of this relation can be defined as:

EMP1 = PJ EMPNUM, NAME, MGRNUM, DEPTNUM EMP
EMP2 = PJ EMPNUM, SAL, TAX EMP

This fragmentation could, for instance, reflect an organization in which salaries and taxes are managed separately. The reconstruction of relation EMP can be obtained as:

EMP = EMP1 JN EMPNUM=EMPNUM EMP2

Notice that the above formula for the reconstruction of global relation EMP is not complete, because the result of joining EMP1 and EMP2 contains the column EMPNUM twice. This undesired replication can be eliminated by a projection operation which we omit to indicate.

Ex: Consider the following vertical fragmentation of relation EMP:

EMP1 = PJ EMPNUM, NAME, MGRNUM, DEPTNUM EMP
EMP2 = PJ EMPNUM, NAME, SAL, TAX EMP

The attribute NAME is replicated in both fragments. We can explicitly eliminate this attribute when reconstructing EMP through an additional projection operation:

EMP = EMP1 JN EMPNUM=EMPNUM PJ EMPNUM, SAL, TAX EMP2
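A minimal Python sketch of the same idea (hypothetical sample data): projection produces the fragments, and a join on the key EMPNUM reconstructs the relation:

# EMP(EMPNUM, NAME, SAL, TAX, MGRNUM, DEPTNUM) as a list of dictionaries.
emp = [
    {"EMPNUM": 1, "NAME": "Ann", "SAL": 50, "TAX": 10, "MGRNUM": 9, "DEPTNUM": 10},
    {"EMPNUM": 2, "NAME": "Bob", "SAL": 60, "TAX": 12, "MGRNUM": 9, "DEPTNUM": 20},
]

def project(relation, attrs):
    """Projection (PJ): keep only the listed attributes."""
    return [{a: t[a] for a in attrs} for t in relation]

def join_on_key(left, right, key):
    """Join (JN) on a common key attribute."""
    index = {t[key]: t for t in right}
    return [{**l, **index[l[key]]} for l in left if l[key] in index]

emp1 = project(emp, ["EMPNUM", "NAME", "MGRNUM", "DEPTNUM"])
emp2 = project(emp, ["EMPNUM", "SAL", "TAX"])

# Reconstruction: EMP = EMP1 JN EMPNUM=EMPNUM EMP2
assert join_on_key(emp1, emp2, "EMPNUM") == emp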
3. MIXED FRAGMENTATION

The fragments which are obtained by the above fragmentation operations are relations themselves, so that it is possible to apply the fragmentation operations recursively, provided that the correctness conditions are satisfied each time.
The reconstruction can be obtained by applying the reconstruction rules in inverse order.
Ex: EMP(EMPNUM,NAME,SAL,TAX,MGRNUM,DEPTNUM)
The following is a mixed fragmentation which is obtained by applying vertical fragmentation, followed by a horizontal fragmentation on DEPTNUM:

EMP1 = SL DEPTNUM <= 10 (PJ EMPNUM, NAME, MGRNUM, DEPTNUM EMP)
EMP2 = SL 10 < DEPTNUM <= 20 (PJ EMPNUM, NAME, MGRNUM, DEPTNUM EMP)
EMP3 = SL DEPTNUM > 20 (PJ EMPNUM, NAME, MGRNUM, DEPTNUM EMP)
EMP4 = PJ EMPNUM, NAME, SAL, TAX EMP

The reconstruction of relation EMP is defined by the following expression:

EMP = UN(EMP1, EMP2, EMP3) JN EMPNUM=EMPNUM PJ EMPNUM, SAL, TAX EMP4
Mixed fragmentation can be conveniently represented by a fragmentation tree
In a fragmentation tree, the root corresponds to a global relation, the leaves correspond to the fragments, and the intermediate nodes correspond to the intermediate results of the fragment-defining expressions. The set of nodes which are sons of a given node represents the decomposition of this node by a fragmentation operation (vertical or horizontal).
Ex: Consider the fragmentation tree of relation EMP. The root (relation EMP) is vertically fragmented into two portions; one portion corresponds to a leaf node of the tree (EMP4), while the other portion is horizontally partitioned, thus generating the other three leaves, corresponding to fragments EMP1, EMP2 and EMP3.
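Continuing the Python sketches (hypothetical sample data), the mixed fragmentation applies a selection on top of a projection, and the reconstruction applies the inverse operations in reverse order, projecting EMP4 so that NAME is not duplicated:

emp = [
    {"EMPNUM": 1, "NAME": "Ann", "SAL": 50, "TAX": 10, "MGRNUM": 9, "DEPTNUM": 5},
    {"EMPNUM": 2, "NAME": "Bob", "SAL": 60, "TAX": 12, "MGRNUM": 9, "DEPTNUM": 15},
    {"EMPNUM": 3, "NAME": "Cal", "SAL": 70, "TAX": 14, "MGRNUM": 9, "DEPTNUM": 25},
]

def project(relation, attrs):
    return [{a: t[a] for a in attrs} for t in relation]

def select(relation, predicate):
    return [t for t in relation if predicate(t)]

def join_on_key(left, right, key):
    index = {t[key]: t for t in right}
    return [{**l, **index[l[key]]} for l in left if l[key] in index]

h_attrs = ["EMPNUM", "NAME", "MGRNUM", "DEPTNUM"]
emp1 = select(project(emp, h_attrs), lambda t: t["DEPTNUM"] <= 10)
emp2 = select(project(emp, h_attrs), lambda t: 10 < t["DEPTNUM"] <= 20)
emp3 = select(project(emp, h_attrs), lambda t: t["DEPTNUM"] > 20)
emp4 = project(emp, ["EMPNUM", "NAME", "SAL", "TAX"])

# EMP = UN(EMP1, EMP2, EMP3) JN EMPNUM=EMPNUM PJ EMPNUM, SAL, TAX EMP4
rebuilt = join_on_key(emp1 + emp2 + emp3,
                      project(emp4, ["EMPNUM", "SAL", "TAX"]), "EMPNUM")
assert rebuilt == emp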
Distributed Database Design
Designing a distributed database involves many technical and organizational issues that become more difficult in a multiple-site system. The technical problem is the interconnection of sites by a computer network and the optimal distribution of data and applications to the sites, meeting the requirements of applications and optimizing performance. From the
organizational viewpoint, the issue of decentralization is crucial, since distributed systems typically substitute for large centralized systems. Distributing an application has a major impact on the organization.
The mathematical problem of optimally distributing data over a computer network has been widely analyzed in the context of distributed file systems and distributed databases. The major outcomes of this research are:

1. Several design criteria have been established about how data can be conveniently distributed.

2. A mathematical foundation has been given to "design aids" that, in the near future, will help the designer in determining data distribution.
Framework for Distributed Database Design
The term “Distributed Database Design” has a very broad and imprecise meaning. The design of a centralized database amounts to:
1. Designing the “conceptual schema” which describes the integrated database (all the data which are used by the database applications).
2. Designing the “physical database”, i.e. mapping the conceptual schema to storage areas and determining appropriate access methods.
In a distributed database these two problems become the design of the global schema and the design of the local physical databases at each site. The techniques which can be applied to these problems are the same as in centralized databases. The distribution of the database adds to the above problems two new ones:
3. Designing the fragmentation i.e. determining how global relations are subdivided into horizontal, vertical or mixed fragments.
4. Designing the allocation of fragments, i.e. determining how fragments are mapped to physical images; in this way, the replication of fragments is also determined.
These two problems fully characterize the design of data distribution, since fragmentation has been established as a distinguishing feature of distributed databases.
The distinction between these two problems is conceptually relevant, since the first one deals with the "logical criteria" which motivate the fragmentation of global relations, while the second one deals with the "physical" placement of data at the various sites.
Although the design of application programs is made after the design of schemata, the knowledge of application requirements influences schema design, since schemata must be able to support applications efficiently. Thus, in the design of a distributed database, sufficiently precise knowledge of application requirements is needed; clearly, this knowledge is required only for the more “important” applications, i.e. those which will be executed frequently or whose performances are critical.
In the application requirements we include:
1. The site from which the application is issued (also called the site of origin of the application).

2. The frequency of activation of the application (i.e. the number of activation requests in the unit time).

3. The number, type and statistical distribution of accesses made by each application to each required data "object".
Objectives of the design of Data Distribution
In the design of data distribution, the following objectives should be taken in to account:
1. Processing Locality: Distributing data to maximize processing locality corresponds to the simple principle of placing data as close as possible to the applications which use them. The simplest way of characterizing processing locality is to consider two types of references to data: "local" references and "remote" references. Clearly, once the sites of origin of applications are known, locality and remoteness of references depend only on data distribution.
Designing data distribution for maximizing processing locality can be done by adding the number of local and remote references corresponding to each candidate fragmentation and fragment allocation, and selecting the best solution among them.
The term complete locality designates those applications which can be completely executed at their sites of origin. The advantage of complete locality is not only the reduction of remote accesses, but also the increased simplicity in controlling the execution of the applications.
2. Availability and Reliability of Distributed Data: A high degree of availability is achieved by storing multiple copies of the same information; the system must be able to switch to an alternative copy when the one that should be accessed under normal conditions is not available.
Reliability is also achieved by storing multiple copies of the same information, since it is possible to recover from crashes or from the physical destruction of one of the copies by using the other, still available, copies. Since physical destruction can be caused by events such as fire, earthquake, or sabotage, it is relevant to store replicated copies in geographically dispersed locations.
3. Workload Distribution: Distributing the workload over the sites is an important feature of distributed computer systems. Workload distribution is done in order to take advantage of the utilization of the computers at each site, and to maximize the degree of parallelism of execution of applications.
4. Storage Cost and Availability: Database distribution should reflect the cost and availability of storage at the different sites. Typically, the cost of data storage is not relevant compared with the CPU, I/O, and transmission costs of applications, but the limitation of available storage at each site must be considered.
Using all the criteria at the same time is extremely difficult, since this leads to complex optimization models. It is possible to consider some of the above features as constraints, rather than objectives.
Top-Down and Bottom-Up Approaches to the Design of Data Distribution
There are two alternative approaches to the design of data distribution:
1. The top down and
2. The bottom up approach
In the top-down approach, we start by designing the global schema, and we proceed by designing the fragmentation of the database, and then by allocating the fragments to the sites, creating the physical images. The approach is completed by performing, at each site, the “physical design” of the data which are allocated to it.
This approach is the most attractive for systems which are developed from scratch, since it allows performing the design rationally.
When the distributed database is developed as the aggregation of existing databases, it is not easy to follow the top-down approach. In this case the global schema is often produced as a compromise between existing data descriptions. It is even possible that each pair of existing databases is independently interfaced using a different translation schema, without the notion of a global schema. This leads to systems which are different in conception from distributed database architecture.
When existing databases are aggregated, a bottom-up approach to the design of data distribution can be used. This approach is based on the integration of existing schemata into a single, global schema. By integration, we mean the merging of common data definitions and the resolution of conflicts among different representations given to the same data.
It should be noticed that the bottom-up approach lends itself less easily to the development of horizontally fragmented relations. Horizontal fragments of the same global relation must have the same relation schema; this feature is easily enforced in a top-down design, while it is difficult to "discover" in a bottom-up design. Since horizontal fragments are a relevant and useful feature of a distributed database, the integration process should attempt to modify the definitions of local relations so that they can be regarded as horizontal fragments of common, global relations.
When existing databases are aggregated into a distributed database, it is possible that they use different DBMSs. A heterogeneous system adds to the complexity of data integration the need for a translation between the different representations. In this case, it is possible to make a one-to-one translation between each pair of different DBMSs. However, the approach mostly used in prototypes of heterogeneous systems is to select a common data model and then to translate into this unique representation all the different schemata of the involved DBMSs.
In summary, the bottom-up design of a distributed database requires:
1. The selection of a common database model for describing the global schema of the database.
2. The translation of each local schema into the common data model.

3. The integration of the local schemata into a common global schema.
A distributed database (DDB) is a collection of multiple, logically interrelated databases distributed over a computer network (e.g. a LAN, WAN, or MAN).
A distributed database management system (D–DBMS) is the software that manages the DDB and provides an access mechanism that makes this distribution transparent to the users where transparency is the separation of the higher level semantics of a system from the lower level implementation issues.
Concurrency control deals with situations where several transactions execute concurrently in the database, with the possibility of conflicts and deadlocks; the schemes or protocols which regulate such concurrent execution and prevent deadlocks are known as concurrency control methods.
Example
1) Two-Phase Locking-based (2PL)

2) Timestamp Ordering (TO)
Concurrency control also deals with ensuring transaction atomicity in the presence of the concurrent execution of transactions; this is a synchronization problem, and in distributed systems the synchronization problem is harder than in a centralized system.

Atomicity is the property of a transaction whereby either all the operations of the transaction are reflected properly in the database, or none of them is performed.
A transaction is a collection of actions that make consistent transformations of system states while preserving system consistency.
A concurrency control method is responsible for maintaining consistency among the multiple copies of the data items placed in the DDBMS.
2-Phase Locking
Two-phase locking ensures serializability. An execution is serializable if it produces the same result as some serial execution of the transactions. In a serial execution, each transaction runs to completion before any statements from any other transaction are executed.
A lock is a variable associated with each data item; the manipulation of this variable is called locking.
Two-phase locking is so called because it has the following two phases:

1) Growing phase: A transaction may obtain locks but may not release any lock.

2) Shrinking phase: A transaction may not obtain any new locks.

Both phases are monotonic: the number of locks is only increasing in the first phase and only decreasing in the second. All locks requested by a transaction or its subtransactions should be acquired in the growing phase and released in the shrinking phase. When a transaction issues its first unlock instruction, the shrinking phase starts, indicating that all required locks have been obtained. Where data is replicated, all subtransactions of a transaction which modify a replicated data item must observe the two-phase locking protocol. Therefore, before any subtransaction releases a lock (after which no subtransaction may request another lock), each subtransaction must notify all the others that it has acquired all its locks; the shrinking phase starts once all subtransactions have acquired their locks.

Determining when all subtransactions have finished their growing phase involves a high number of messages; the exchange of messages between the sites is done using the two-phase-commit protocol.
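As a rough illustration (a minimal Python sketch, not from the text), the two phases can be enforced per transaction by refusing any lock request made after the first unlock. Note that transactions T1 and T2 in fig. (a) below unlock A before locking B, so they do not obey this discipline (releasing A before locking B is exactly what two-phase locking forbids):

class TwoPhaseLockError(Exception):
    pass

class Transaction2PL:
    """Enforces two-phase locking: no lock may be acquired after the first unlock."""

    def __init__(self, name):
        self.name = name
        self.locks = set()
        self.shrinking = False          # becomes True at the first unlock

    def lock(self, item):
        if self.shrinking:
            raise TwoPhaseLockError(f"{self.name}: lock({item}) requested in shrinking phase")
        self.locks.add(item)            # growing phase: acquire

    def unlock(self, item):
        self.shrinking = True           # growing phase is over
        self.locks.discard(item)        # shrinking phase: release

t = Transaction2PL("T1")
t.lock("A")
t.lock("B")        # allowed: still in the growing phase
t.unlock("A")      # the shrinking phase starts here
# t.lock("C")      # would raise TwoPhaseLockError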
Ex. Consider two transactions T1 and T2 and sites S1 and S2. Suppose a distributed data item A is stored at site S1 and B at site S2. For the execution of these two transactions, each transaction generates two local subtransactions, T1s1, T1s2 and T2s1, T2s2, executed at sites S1 and S2 respectively.
Transaction T1          Transaction T2
lock-x(A)               lock-x(A)
A := 100                A := 200
write(A)                write(A)
unlock(A)               unlock(A)
lock-x(B)               lock-x(B)
B := 1000               B := 2000
write(B)                write(B)
unlock(B)               unlock(B)
Fig. (a) Two modifying transactions.

The execution schedule is shown in the following fig.
Time   T1s1        T2s1        T1s2        T2s2
t1     lock-x(A)                           lock-x(B)
t2     A := 100                            B := 2000
t3     write(A)                            write(B)
t4     unlock(A)                           unlock(B)
t5                 lock-x(A)   lock-x(B)
t6                 A := 200    B := 1000
t7                 write(A)    write(B)
t8                 unlock(A)   unlock(B)
Fig.(b) A schedule for transactions.
Distributed Deadlocks
A system is in a deadlock state if there exists a set of transactions such that every transaction in the set is waiting for another transaction in the set.

Consider a set of transactions {T0, T1, ..., Tn} such that T0 is waiting for a data item held by T1, T1 is waiting for a data item held by T2, ..., Tn-1 is waiting for a data item held by Tn, and Tn is waiting for a data item held by T0. None of the transactions can make progress in such a situation, which is called a deadlock.
A deadlock occurs in a distributed database system when a cycle is produced in the Distributed Wait-For Graph (DWFG) built from the execution of the transactions, and these transactions are aborted or restarted repeatedly; this situation is known as a distributed deadlock, as shown in the following fig.

A deadlock occurs when each transaction T in a set of two or more transactions is waiting for some item that is locked by some other transaction T' in that set; hence each transaction in the set is waiting for one of the other transactions to release the lock.

Deadlock detection amounts to finding cycles in the wait-for graph, which arise when several transactions compete for the same resources. Distributed deadlock detection is more difficult than detection in a centralized database.
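To make the wait-for-graph idea concrete, here is a minimal Python sketch (the graph representation and data are hypothetical) that detects a deadlock as a cycle by depth-first search:

def has_cycle(wfg):
    """Detect a cycle in a wait-for graph given as {transaction: set of transactions it waits for}."""
    visited, on_path = set(), set()

    def dfs(node):
        visited.add(node)
        on_path.add(node)
        for succ in wfg.get(node, ()):
            if succ in on_path:                  # back edge: a cycle, i.e. a deadlock
                return True
            if succ not in visited and dfs(succ):
                return True
        on_path.discard(node)
        return False

    return any(dfs(t) for t in wfg if t not in visited)

# T1 waits for T2 and T2 waits for T1: a deadlock.
assert has_cycle({"T1": {"T2"}, "T2": {"T1"}})
assert not has_cycle({"T1": {"T2"}, "T2": set()})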
Fig a. A distributed wait-for graph showing a distributed deadlock. Fig b. A local wait-for graph.

Labeling: T - transaction; A - agent.
The distributed wait-for graph shown in fig. (a) contains a cycle, which corresponds to a deadlock. The figure shows two sites, two transactions T1 and T2, and two agents, A1 and A2.
Assume that each transaction has only one agent at each site where it is executed. A directed edge from an agent Ti Aj to an agent Tr As means that Ti Aj is blocked and waiting for Tr As.
There are two reasons for an agent to wait for another one:
1. Agent TiAj waits for agent TrAj to release a resource which it needs. Here Ti and Tr are different transactions, and the two agents are at the same site, because agents request only local resources. Such a wait is shown in fig. (a), where T1A1 waits for T2A1 to release a resource at site 1.

2. Agent TiAj waits for agent TiAs to perform some required function. Here the two agents belong to the same transaction: agent TiAj has requested that agent TiAs perform some function at a different site. Such waits are shown in fig. (a) with dashed lines.
Fig. (b) shows a local wait-for graph (LWFG). The square nodes are called input ports if they have an edge entering the LWFG, and output ports if they have an edge exiting the LWFG. A deadlock local to one site is revealed by a cycle in its LWFG.

In a distributed system, a deadlock occurs when a cycle is produced in the DWFG; detecting it requires an exchange of information between the sites the cycle spans, and is therefore a harder task than detecting local deadlocks.
Resolving a deadlock requires one or more transactions to be aborted or restarted, so that their resources are released for the other transactions.
The following are criteria for choosing the transactions to abort:

a) Abort the youngest transaction.

b) Abort the transaction which owns the fewest resources.

c) Abort the transaction with the smallest abort cost.

d) Abort the transaction with the longest expected time to complete.

Redundancy increases the probability of deadlocks in a distributed DBMS.
Consider two transactions T1 and T2, both of which must lock the same data item x exclusively.

If x is replicated, i.e. x1 and x2 are two copies at sites 1 and 2, and both transactions want to obtain the locks, then a deadlock can occur:

At site 1: T1 locks x1; T2 waits.
At site 2: T2 locks x2; T1 waits.

Hence, deadlock occurs.
Methods of deadlock detection
1) Deadlock detection using centralized or hierarchical control.
2) Distributed deadlock detection.

3) Deadlock prevention.
1) Deadlock detection using centralized or hierarchical control:
In the centralized control method, a deadlock detector is designated which has the responsibility of discovering cycles in the DWFG. The deadlock detector receives information from all the sites of the distributed database.
At each local site there is a local deadlock detector which has responsibility to find all potential deadlocks at that particular site.
Fig. (a) Local wait-for graph at site 1. Fig. (b) Potential global deadlock cycle at site 1.
The above fig. shows the derivation of a potential global deadlock cycle from a local wait-for graph.

To determine potential deadlock cycles, the local deadlock detector starts from an input port and searches backward along the local graph until it reaches an output port; such a path is a potential deadlock cycle. Only the initial and final agents of each potential cycle need be transmitted, as shown in fig. (b).

The centralized deadlock detector collects these messages, forms the DWFG, finds cycles, and selects the transactions to abort.
Drawbacks of centralized deadlock detection:

1. It is vulnerable to failures of the site where the centralized detector runs.

2. It requires a large communication cost.

To reduce the communication cost, hierarchical controllers are used.
Hierarchical controllers
Fig. A tree of deadlock detectors.

Labeling: NLDD - non-local deadlock detector; LDD - local deadlock detector.
The local deadlock detectors determine local deadlocks and transmit the information on potential global cycles to the non-local deadlock detectors.

The performance of the hierarchical deadlock detection mechanism depends on the choice of the hierarchy.

Here, sites 1 and 2 have a common detector, NLDD 1, while sites 3, 4 and 5 have a common detector, NLDD 2; finally, both are controlled by NLDD 0.
(2) Distributed Deadlock Detection

Here there is no distinction between local and non-local deadlock detectors; each site has the same responsibility and exchanges information to determine global deadlocks.

The potential deadlock cycles detected at the local sites are transmitted according to an algorithm. In the LWFG, all the input and output ports are collected into a single node called the external node (EX). The main difference from centralized detection is that in the centralized case all the potential deadlock cycles are sent to one designated site, while distributed deadlock detectors need a rule for determining to which site each potential deadlock cycle is transmitted; this rule must attempt to minimize the amount of transmitted information.
The distributed deadlock detection algorithm works as follows.
Fig. Distributed deadlock detection algorithm (figs. (a)-(c) show the LWFGs at sites 1 and 2, extended with the EX node).
A local deadlock detector at each site performs the following actions:

1) It forms the LWFG, including the EX node.

2) For each received message it performs the following modifications of the LWFG:

a) For each transaction in the message, it adds the transaction to the LWFG if it does not already exist.

b) For each pair of transactions in the message, starting with EX, it creates an edge to the next transaction in the message.

3) Cycles not involving the EX node in the LWFG indicate the existence of a deadlock.

4) Cycles involving the EX node are potential deadlock cycles, which must be transmitted to a different site.
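A rough Python sketch of steps 1-4 (the message format, a simple path list such as ["EX", "T2", "T1", "EX"], and the traversal are simplifying assumptions, not the text's exact specification):

def merge_message(lwfg, path):
    """Step 2: add the transactions and edges of a received potential-cycle
    message (e.g. ["EX", "T2", "T1", "EX"]) into the LWFG."""
    for a, b in zip(path, path[1:]):
        lwfg.setdefault(a, set()).add(b)

def analyse(lwfg):
    """Steps 3-4: cycles not passing through EX are real deadlocks; paths
    from EX back to EX are potential cycles to transmit to another site.
    (Duplicate cycles may be reported; a real detector would filter them.)"""
    deadlocks, potentials = [], []

    def walk(node, path, seen):
        for succ in lwfg.get(node, ()):
            if succ == "EX":
                potentials.append(path + ["EX"])
            elif succ in seen:
                deadlocks.append(path[path.index(succ):] + [succ])
            else:
                walk(succ, path + [succ], seen | {succ})

    walk("EX", ["EX"], set())
    return deadlocks, potentials

lwfg = {"EX": {"T1"}, "T1": {"T2"}, "T2": {"EX"}}       # local graph at site 1
merge_message(lwfg, ["EX", "T2", "T1", "EX"])           # message from site 2
deadlocks, potentials = analyse(lwfg)
print(deadlocks)    # T1 and T2 now form a cycle: a deadlock is reported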
(3) Distributed Deadlock Prevention

Deadlock prevention eliminates the need for deadlock detection and resolution.
It works in the following way: if a transaction T1 requests a resource which is held by T2, a "prevention test" is applied. If the test indicates a risk of deadlock, T1 is not allowed to enter the wait state; instead, either T1 is aborted and restarted, or T2 is aborted and restarted.

The prevention test must ensure that if T1 is allowed to wait for T2, then deadlock can never occur. This can be obtained by ordering transactions, for example using lexicographical ordering.
Let Ti be allowed to wait for Tj only if i < j. Then it is impossible to build a closed chain

i1 -> i2 -> ... -> in -> i1

such that ij < ik whenever ij waits for ik.
The following are methods of deadlock prevention based on timestamps:

1) Non-preemptive method: A transaction is never allowed to wait for a younger transaction; if a younger transaction requests a resource held by an older one, the younger transaction is aborted and restarted.

2) Preemptive method: An older transaction preempts (rolls back) a younger transaction which holds the resource; a younger transaction is instead allowed to wait for an older one.
Reliability

"Reliability is a measure of the success with which a system conforms to some authoritative specification of its behavior."

"A deviation from this specified behavior is known as a failure."

The reliability of a system is inversely related to the frequency of failures. Reliability also means the consistently good quality of service delivered by the DDBMS.
The reliability of a system is measured in several ways, all based on the incidence of failures. These measures include:

1) Mean time between failures (MTBF)

2) Mean time to repair (MTTR)

3) Availability (the fraction of time that the system meets its specification)
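These measures are related. For steady-state availability, the standard textbook relationship (not stated explicitly in these notes) is Availability = MTBF / (MTBF + MTTR); for example, with MTBF = 1000 hours and MTTR = 10 hours, availability is 1000/1010, i.e. about 99%.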
In a database system application, the reliability problem is divided into two parts:

1) Application-dependent: It requires that the transactions fulfill the general system specification.

2) Application-independent: It requires that the transactions maintain their atomicity, durability, serializability and isolation properties.
Types of Failures
There are the following types of failures affecting transactions:

1. Transaction failures

• Transaction aborts (unilaterally or due to deadlock)

• On average about 3% of transactions abort abnormally

2. System (site) failures

• Failure of processor, main memory, power supply, ...

• Main memory contents are lost, but secondary storage contents are safe

3. Media failures

• Failure of secondary storage devices such that the stored data is lost

• Head crash / controller failure

4. Communication failures

• Lost/undeliverable messages

• Network partitioning
Components of Distributed Reliability Protocols
• Commit protocols: How to commit a transaction properly when more than one site is involved in the commitment. This differs from a centralized DB: the question is how to execute the commit command for distributed transactions. Issue: how to ensure atomicity and durability?

• Termination protocols: Designed for the sites that are affected by a failed site; they tell a site how to commit or abort properly when another site fails. If a failure occurs, how can the remaining operational sites deal with it? Non-blocking: the occurrence of failures should not force the sites to wait until the failure is repaired in order to terminate the transaction.

• Recovery protocols: Address the recovery procedure for a failed site once it restarts; they are the opposite of the termination protocols. When a failure occurs, how do the sites where the failure occurred deal with it? Independent: a failed site can determine the outcome of a transaction without having to obtain remote information.
Non-blocking commitment protocols
A commitment protocol is called blocking if the occurrence of failures forces some of the participating sites to wait until the failure is repaired before terminating the transaction; a transaction which is not terminated at a site is called pending at that site.

The basic 2-phase-commitment protocol assures that transactions are correctly committed or aborted at all sites, even in the presence of failures. However, in some cases transactions are blocked until the failure is repaired, which blocks resources and reduces the system's availability.

The following fig. shows the state diagram for the 2-phase-commitment protocol without ACK messages. The input and output messages are indicated for each transition: a transition occurs when the input message arrives, and it sends the output message. The state information of the transaction is recorded in stable storage for recovery purposes.
Fig. State diagram for the 2-phase-commit protocol (coordinator and participants). Plain transitions are due to an exchange of messages; unilateral transitions occur on timeout.

States: I = initial state; U = uncertain (waiting for some information); R = ready to commit; A = abort (transaction); C = commit (transaction).

Messages: PM = prepare message; RM = ready answer message; AAM = abort answer message; ACM = abort command message; CM = commit command.

Local conditions: ua = local unilateral abort; tm = timeout.

The problems due to 2-phase commit are as follows:
• Blocking

o "Ready" implies that the participant waits for the coordinator

o If the coordinator fails, the site is blocked until recovery

o Blocking reduces availability

• Independent recovery is not possible

These problems are solved by the 3-phase protocol.
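To make the two phases and the blocking problem concrete before turning to the 3-phase protocol, here is a minimal single-process Python sketch of the coordinator's decision logic (message passing is simulated with function calls; participant names and vote values are illustrative assumptions):

def two_phase_commit(participants):
    """participants: dict mapping a site name to a vote function that
    returns 'RM' (ready answer) or 'AAM' (abort answer)."""
    # Phase 1: the coordinator sends PM (prepare) and collects the answers.
    votes = {site: vote() for site, vote in participants.items()}

    # Phase 2: commit only if every participant answered ready.
    decision = "CM" if all(v == "RM" for v in votes.values()) else "ACM"

    # In a real system the decision, like every state change, is first
    # recorded in stable storage and then sent to every participant.
    return decision

assert two_phase_commit({"S1": lambda: "RM", "S2": lambda: "RM"}) == "CM"
assert two_phase_commit({"S1": lambda: "RM", "S2": lambda: "AAM"}) == "ACM"

# The blocking problem: if the coordinator crashes after a participant has
# answered RM but before the decision arrives, that participant must wait.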
3-Phase Protocol: It is a non-blocking protocol which overcomes the limitations of the 2-phase protocol and solves the problems of failures at the sites in the DDBMS.
When a failure occurs at the coordinator site, there are two possible cases:

1) At least one of the participants has received the command; it can give the message to the other participants so that they terminate the transaction.

2) None of the participants has received the command, while only the coordinator site has crashed; all the participant sites are operational and can elect another coordinator.
In both the above cases the transaction is terminated correctly. But consider the situation where none of the operational participants has received any command and one of them has failed; such a problem can also arise at the coordinator site, since the coordinator is also a participant of another coordinator site. If the coordinator site fails in this way, a situation is created where termination is impossible.
This blocking of the 2-phase protocol, shown in the state diagram above, arises because in the second phase of commitment the participants pass from the R state directly to abort (A) or commit (C). If not all the operational participants have received the command, a failed participant could already have performed the abort or commit action, which the operational participants must not contradict. This problem can be eliminated by modifying the 2-phase protocol into the 3-phase protocol, shown in the following fig.

Fig. State diagram for the 3-phase-commit protocol (coordinator and participants).
The 3-phase protocol consists of three phases; it is an extension of the 2-phase protocol which avoids the blocking problem. In the third phase, multiple sites are involved in the decision to commit.

The protocol introduces a new prepared-to-commit (PC) state. In the second phase the coordinator issues either the abort command (ACM) or the prepare-to-commit message (PCM). When it sends the PCM, the coordinator enters the new before-commitment (BC) state; a participant receiving the PCM enters the PC state, records it in stable storage, and sends an OK message. Finally, when the coordinator has received the OK messages, it enters the final commit state (C) and sends the final commit command (CM). This eliminates the blocking problem.
Network Partition
When a network partition is created, the following two possibilities arise:

1) The coordinator and all its participants remain in one partition; here the failure has no effect on the commit protocol.

2) The coordinator and its participants belong to several partitions; here, from within a partition, the sites in the other partitions appear to have failed. Sites that are not in the partition containing the coordinator simply execute the protocol that deals with the failure of the coordinator. The coordinator and the sites that are in the same partition as the coordinator follow the usual commit protocol, assuming that the sites in the other partitions have failed.
New state: PC = prepared-to-commit.

New messages: PCM = enter the PC state; OK = entered the PC state.
DEADLOCK HANDLING
A system is said to be in a deadlock state if there exists a set of transactions such that every transaction in the set is waiting for another transaction in the set.

Suppose there exists a set of waiting transactions {T0, T1, ..., Tn} such that T0 is waiting for a data item that T1 holds, T1 is waiting for a data item that T2 holds, ..., Tn-1 is waiting for a data item that Tn holds, and Tn is waiting for a data item that T0 holds. None of the transactions can make progress in such a situation.
The only remedy to this undesirable situation is for the system to invoke some drastic action, such as rolling back some of the transactions involved in the deadlock. Rollback of a transaction may be partial: i.e., a transaction may be rolled back to the point where it obtained a lock whose release resolves the deadlock.
There are two principal methods for dealing with the problem of deadlock:

1) We can use a DEADLOCK PREVENTION protocol to ensure that the system will never enter a deadlock state.

2) We can allow the system to enter a deadlock state, and then try to recover by using a deadlock detection and deadlock recovery scheme.

Both methods may result in transaction rollback. Prevention is commonly used if the probability that the system would enter a deadlock state is relatively high; otherwise, detection and recovery are more efficient.
DEADLOCK PREVENTION
There are two approaches to deadlock prevention:
1. One approach ensures that no cyclic waits can occur by ordering the requests for locks, or by requiring all locks to be acquired together.
2. The other approach is closer to deadlock recovery, and performs transaction rollback instead of waiting for a lock, whenever the wait could potentially result in deadlock.
The simplest scheme under the first approach requires that each transaction locks all its data items before it begins execution. Moreover, either all are locked in one step or none are locked. There are 2 main disadvantages of this protocol:
1. It is often hard to predict, before the transaction begins, what data items need to be locked;
2. Data-item utilization may be very low, since many of the data items may be locked but unused for a long time.
Another approach for preventing deadlocks is to impose an ordering of all data items, and to require that a transaction lock data items only in a sequence consistent with that ordering.
The second approach for preventing deadlock is to use PREEMPTION and TRANSACTION ROLLBACKS.
In preemption, when a transaction T2 requests a lock that transaction T1 holds, the lock granted to T1 may be preempted by rolling back T1 and granting the lock to T2. To control the preemption, we assign a unique timestamp to each transaction. The system uses these timestamps only to decide whether a transaction should wait or roll back. If a transaction is rolled back, it retains its old timestamp when restarted. Two different deadlock-prevention schemes using timestamps have been proposed:
1) Wait-die scheme: This is a non-preemptive technique. When transaction Ti requests a data item currently held by Tj, Ti is allowed to wait only if it has a timestamp smaller than that of Tj (i.e., Ti is older than Tj). Otherwise, Ti is rolled back (dies).

Example: suppose transactions T22, T23 and T24 exist, with T22 the oldest and T24 the youngest. If T22 requests a data item held by T23, then T22 will wait. If T24 requests a data item held by T23, then T24 will be rolled back.
2) Wound-wait scheme: This scheme is a preemptive technique; it is a counterpart to the wait-die scheme. When transaction Ti requests a data item currently held by Tj, Ti is allowed to wait only if it has a timestamp larger than that of Tj (i.e., Ti is younger than Tj). Otherwise, Tj is rolled back (Tj is wounded by Ti).
Returning to the example with transactions T22, T23 and T24: if T22 requests a data item held by T23, then the data item will be preempted from T23, and T23 will be rolled back. If T24 requests a data item held by T23, then T24 will wait.
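The two rules differ only in who waits and who is rolled back. Here is a minimal Python sketch of both decision rules (a smaller timestamp means an older transaction; the function names and sample timestamps are illustrative):

def wait_die(ts_requester, ts_holder):
    """Non-preemptive: an older requester waits; a younger requester dies."""
    return "wait" if ts_requester < ts_holder else "rollback requester"

def wound_wait(ts_requester, ts_holder):
    """Preemptive: an older requester wounds (rolls back) the holder;
    a younger requester waits."""
    return "rollback holder" if ts_requester < ts_holder else "wait"

# T22, T23, T24 with hypothetical timestamps 5, 10, 15:
assert wait_die(5, 10) == "wait"                  # T22 requests an item held by T23
assert wait_die(15, 10) == "rollback requester"   # T24 requests an item held by T23
assert wound_wait(5, 10) == "rollback holder"     # T23 is wounded by T22
assert wound_wait(15, 10) == "wait"               # T24 waits for T23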
Whenever the system rolls back transactions, it is important to ensure that there is no STARVATION, i.e., that no transaction gets rolled back repeatedly and is never allowed to make progress.
Both the wound-wait and the wait-die schemes avoid starvation: at any time, there is a transaction with the smallest timestamp, and this transaction cannot be required to roll back in either scheme. Since timestamps always increase, and since transactions are not assigned new timestamps when they are rolled back, a transaction that is rolled back repeatedly will eventually have the smallest timestamp, at which point it will not be rolled back again.
There are some differences in the way that the two schemes operate.
1) In the wait-die scheme, an older transaction must wait for the younger one to release its data items. Thus, the older the transaction gets, the more it tends to wait. On the other hand, in the wound-wait scheme, an older transaction never waits for younger transactions.
2) In the wait-die scheme, if a transaction Ti dies and is rolled back because it requested a data item held by transaction Tj, then Ti may reissue the same sequence of requests when it is restarted. If the data item is still held by Tj, then Ti will die again. Thus, Ti may die several times before acquiring the needed data item.
On the other hand, in the wound-wait scheme, transaction Ti is wounded and rolled back because Tj requested a data item that it holds. When Ti is restarted and requests the data item now being held by Tj, Ti waits. Thus, there may be fewer rollbacks in the wound-wait scheme.