Hace7epe Üniversitesi
Bilgisayar Mühendisliği Bölümü
BBM467
Data Intensive ApplicaAons
Dr. Fuat Akal
FoundaAons of Data[base] Clusters
•
Database Clusters
•
Hardware Architectures
•
Data Design Schemes
•
ReplicaAon Schemes
•
Query Parallelism
•
Logical Cluster OrganizaAon
Database Clusters
•
A cluster of computers can be thought as a single
compuAng resource.
–
It uAlizes mulAple machines to provide a more powerful
compuAng environment through a single system image.
•
There are two types clusters
–
high availability clusters (HA)
Hardware Architectures: Shared Memory
• All processors have access to the main memory and the disk, respecAvely.
• The processors are Aghtly coupled inside the
same box and interconnected with a special switch. • The interprocess communicaAon is done by using
a shared memory.
• The shared-‐memory approach presents simplicity and allows for load balancing as well as
inter-‐query parallelism which comes for free. • However, it is too expensive since it requires a
special interconnect among the processors.
• Its performance and scalability are limited with the available memory and communicaAon bandwidths.
P P P
Hardware Architectures: Shared Disk
• In the shared-‐disk approach, all processors have their own memory, but they share disks.
• The interprocess communicaAon occurs over a common high-‐speed bus.
• Provides high availability. All data is sAll accessible even when a node fails.
• Since each node has its own data cache, cache coherency must be maintained, e.g. by means of a lock manager, which results in reduced performance. • Shared-‐disk systems have limited scalability due to
bandwidth of the high-‐speed bus and potenAal bo7lenecks of shared hardware.
D D D
M M
Hardware Architectures: Shared Nothing
• In a shared-‐nothing architecture, each node is a complete stand-‐alone computer with its own memory and disk.
• The nodes are connected via switch or LAN. But, they do not share anything.
• The main advantages of such systems are very good scalability and high availability.
• However, the management of data is complicated and the programming with this model is harder due to importance of data parAAoning and allocaAon.
D D M M P P D M P
ParAAoning Schemes
• Ver$cal Par$$oning: VerAcal parAAoning divides the columns of a table into separate tables.
– VerAcal parAAoning makes projecAons and joins easier and helps opAmizing access to the cache by reducing size of the tuples. However, access to the whole table may be required anyway, when execuAng queries.
• Horizontal Par$$oning: Horizontal parAAoning divides a table along its tuples. Its basic
advantage is to allow parallel scans or projects.
– The hash par55oning is based on a hash funcAon that distributes the tuples according to a hashing key.
• useful for parallel exact match queries and hash-‐join operaAons.
• not appropriate for range queries and operaAons on other than parAAoning keys. – The range par55oning is made based on value intervals of parAAoning keys.
• uAlizes evaluaAons of range queries.
• the performance of the range parAAoning depends on the interval size.
– The round robin parAAoning technique distributes the tuples on each of the parAAons. This approach is also called striping. The number of logically con-‐secuAve tuples forms a striping unit.
• The relaAve size of the striping unit directly affects the performance.
• Small striping units result in more I/O parallelism for scans and long range queries. • Larger striping units, on the other hand, may cause latency to complete scans.
ParAAoning Schemes
A B C 1 2 3 4 5 6 7 8 9 10 A B 1 2 3 4 5 6 7 8 9 10 A C 1 2 3 4 5 6 7 8 9 10 A B C 1 2 3 4 5 A B C 6 7 8 9 10 A B C 1 4 5 A B C 7 8 10 A B C 2 3 6 9 A B C 1 4 7 10 A B C 2 5 8 A B C 3 6 9 Original Table a) Vertical Partitioning b) Hash Partitioning c) Range Partitioning d) Round-Robin PartitioningVirtual ParAAoning
•
Virtual parAAoning, also called query parAAoning,
assumes that all tables are fully replicated on each
cluster node.
•
In this approach, a query is decomposed into
subqueries which access small pieces of data by
appending range predicates to the where clause of
that query.
•
Each subquery then deals with only a small part of
the data.
Virtual ParAAoning (Example)
LineItem LineItem
node A node B
SELECT Sum(L_ExtendedPrice*L_Discount) AS Revenue
FROM LineItem
WHERE L_Discount BETWEEN 0.03 AND 0.05
AND L_OrderKey BETWEEN 0 AND 3000000
SELECT Sum(L_ExtendedPrice*L_Discount) AS Revenue
FROM LineItem
WHERE L_Discount BETWEEN 0.03 AND 0.05
AND L_OrderKey BETWEEN 300001 AND 6000000
SELECT Sum(L_ExtendedPrice*L_Discount) AS Revenue
FROM LineItem
WHERE L_Discount BETWEEN 0.03 AND 0.05
original query
subquery2 subquery1
ReplicaAon Schemes
•
Full Replica$on: Tables are duplicated on each cluster node.
That is, each node holds an exact copy of the original
database.
•
Par$al Replica$on: ParAal replicaAon means that only parts
of original database are replicated on the different cluster
nodes.
•
Mixed Replica$on: Both full and parAal replicaAon at the
ReplicaAon Schemes
a) Full Replica$on b) Par$al Replica$on
c) Mixed Replica$on Original Database
Global Database Scheme
Node 1 Node 2 Node 3 Node 4 Node 5
Node Group 1 NG 2 NG 3 Database Cluster Co-existing Design Schemes 1 2 3
Mixed Data Design
-‐ Organize as node groups (NG) -‐ Freely design every NG
Query Parallelism in a Cluster
•
inter-‐query parallelism: The capability of the
database management system to accept queries
from mulAple users simultaneously. Each query is
executed independently of the others.
•
intra-‐query parallelism: Achieved by decomposing
queries into subqueries and evaluaAng them
simultaneously.
Data Q1 Q2 Database (Partition) Data Q3 Database Partition
b) intra-query & intra-partition a) inter-query
Data
Q4
Database Partition
c) intra-query & inter-partition
Data
Database Partition
Data
Q5
Database Partition
c) intra-query & intra-partition & inter-partition
Data
Logical Cluster OrganizaAon
• Flat Cluster Architecture: Allows any cluster node to be accessible by
clients.
– Forms a federated database of disAnct databases running on independent servers.
• Connected by a LAN, no resource sharing, such as disks.
– Provides high availability and simple design.
• ReplicaAon is difficult to implement with this model.
• Middleware Based Cluster Architecture: A client can only interact with
the cluster through a coordinaAon middleware.
– The middleware is responsible for scheduling and rouAng of the clients requests. – The middleware has the knowledge about underlying cluster.
• It can be used to ensure correct execuAons of concurrent updates and reads.
• It also allows to improve overall throughput by choosing be7er components, e.g. with less load to perform client requests.
– It is subject to single point of failure.
• If the middleware fails, the cluster will become useless.
Logical Cluster OrganizaAon
Coordination Middleware Clients Database ClusterReplicaAon Management
•
ReplicaAon is an essenAal technique to improve availability
and scalability by fully or parAally duplicaAng data objects
among the nodes of a distributed system.
•
ReplicaAon management is responsible for the maintenance
of replicas and ensures consistency of mulAple copies of the
same data object residing on different nodes.
•
That is, replicaAon management is not simply copying data
objects onto different nodes of a distributed system.
SynchronizaAon of Updates
•
There are two possibiliAes for the locaAon of updates:
– Updates can either be centralized on one primary copy
– Or, be distributed on (a subset of) all replicas (update everywhere).
•
SynchronizaAon of updates can be done in two ways: eager
and lazy
a) Primary Copy b) Update Everywhere
: update
: updatable object : propagation : read-only object
SynchronizaAon of Updates
•
Eager (or synchronous) replicaAon.
– All copies of an object are synchronized within the same database
transacAon.
– Allows early detecAon of conflicts and presents a simple soluAon to
provide consistency.
– Has drawbacks regarding performance and due to the high
communicaAon overhead among the replicas and the high probability of deadlocks.
•
Lazy (or asynchronous) replicaAon.
– Replica maintenance is decoupled from the original database transacAon.
– The transacAons keeping the replicas up-‐to-‐date and consistent run as
separate and independent database transacAons aler the original transacAon has commi7ed.
– Compared to eager replicaAon approaches, lazy approaches require