Distributed System
Reference Texts:
Andrew S. Tanenbaum and Marten van Steen:
– “Distributed Systems: Principles and Paradigms”
Pearson Prentice-Hall
George Coulouris, Jean Dollimore and Tim Kindberg:
– “Distributed Systems: Concepts and Design”
4th Edition, Pearson Education
Andrew S. Tanenbaum:
– “Distributed Operating Systems”
Course Outline
o Introduction to Distributed System (DS)
o Client – Server Architecture
o Process & Inter-Process Communication
o Mobile Agent
o Synchronization
o Consistency & Replication
o Fault Tolerance
o Security
o Distributed System Paradigms
1. Introduction
o Definition and characteristics of distributed computing (DC) and distributed systems (DS)
o Major goals in building DS
o Hardware and software concepts in DS
What Is Distributed Computing?
Distributed computing is a field of computer science that studies distributed systems.
Distributed computing also refers to the use of distributed systems to solve computational problems. In distributed computing, a problem is divided into many tasks, each of which is solved by one computer.
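The idea of dividing a problem into tasks can be sketched on a single machine. In this minimal sketch, worker threads stand in for separate computers; the function names (`count_primes`, `split_range`, `distributed_count`) and the prime-counting problem are illustrative assumptions, not part of the course material.

```python
from concurrent.futures import ThreadPoolExecutor


def count_primes(lo, hi):
    """One task: count the primes in the half-open range [lo, hi)."""
    def is_prime(n):
        if n < 2:
            return False
        i = 2
        while i * i <= n:
            if n % i == 0:
                return False
            i += 1
        return True
    return sum(1 for n in range(lo, hi) if is_prime(n))


def split_range(lo, hi, parts):
    """Divide [lo, hi) into at most `parts` contiguous sub-ranges (the tasks)."""
    step = (hi - lo + parts - 1) // parts
    return [(s, min(s + step, hi)) for s in range(lo, hi, step)]


def distributed_count(lo, hi, workers=4):
    """Hand each task to one worker (a stand-in for one computer) and combine."""
    tasks = split_range(lo, hi, workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partial_counts = pool.map(lambda t: count_primes(*t), tasks)
        return sum(partial_counts)
```

In a real distributed system the tasks would be shipped to other machines over the network, and the combining step would have to cope with slow or failed workers.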
What Is A Distributed System?
A distributed system is a collection of independent computers that appears to its users as a single coherent system.
- A. S. Tanenbaum et al.
A distributed system is defined as one in which hardware or software components located at networked computers communicate and coordinate their actions only by passing messages.
- G. Coulouris et al.
A distributed system is one on which I cannot get any work done because some machine I have never heard of has crashed.
- L. Lamport
Types of Distributed Systems
Cluster - “A type of parallel or distributed processing system, which consists of a collection of interconnected stand-alone computers cooperatively working together as a single, integrated computing resource” [Buyya].
Grid - “A type of parallel and distributed system that enables the sharing, selection, and aggregation of geographically distributed autonomous resources dynamically at runtime depending on their availability, capability, performance, cost, and users’ quality-of-service requirements” [Buyya].
Reasons for Distributed Systems
Can we live without them?
– Functional separation (computers with different capabilities and purposes):
   Clients and servers
   Data collection and data processing
– Inherent distribution of:
   Information: Different information is created and maintained by different persons (e.g., Web pages)
   People: Computer-supported collaborative work (virtual teams, engineering, virtual surgery)
   Infrastructure: Retail store and inventory systems for a supermarket chain (e.g., Coles, Safeway)
– Power imbalance and load variation:
   Distribute computational load among different computers.
– Reliability:
   Long-term preservation and data backup (replication) at different locations.
– Economies:
   Sharing a printer among many users reduces the cost of ownership.
Distributed Systems VS. Centralized Systems
Advantages:
– Economics: Microprocessors offer a better price / performance than mainframes
– Speed: A distributed system may have more total computing power than a mainframe.
– Inherent distribution: Some applications, such as banking and inventory systems, involve spatially separated machines.
– Reliability: If 5% of the machines are down, the system as a whole can still survive with a 5% degradation of performance.
– Incremental growth: Computing power can be added in small increments.
– Data sharing: Allow many users to access a common database.
– Device sharing: Allow many users to share expensive peripherals.
– Communication: Make human-to-human communication easier (e.g., email).
– Flexibility: Spread the workload over the available machines in the most cost-effective way.
Disadvantages:
Networks vs. Distributed Systems
Are they the same things?
Networks:
Media for interconnecting local and wide area computers, which exchange messages based on protocols. Network entities are visible and are explicitly addressed (e.g., by IP address).
Distributed System:
The existence of multiple autonomous computers is transparent.
However,
the two have many problems in common (e.g., openness, reliability), but at different levels.
Networks focus on packets, routing, etc., whereas distributed systems focus on applications.
Every distributed system relies on
Consequences of Distributed Systems
Computers in distributed systems may be on separate continents, in the same building, or in the same room.
DS have the following consequences:
– Concurrency: each system is autonomous and carries out tasks independently.
   Tasks coordinate their actions by exchanging messages.
– Heterogeneity
– No global clock
Characteristics of distributed systems
Parallel activities
– Autonomous components executing concurrent tasks
Communication via message passing
– No shared memory
Resource sharing
– Printer, database, other services
No global state
– No single process can have knowledge of the current global state of the system
No global clock
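Communication by message passing, with no shared memory between components, can be imitated on one machine. This is a minimal sketch in which two queues stand in for network channels; the `worker` component, the message format, and the use of `None` as a shutdown message are assumptions made for illustration.

```python
import queue
import threading


def worker(inbox, outbox):
    """An autonomous component: it sees only the messages in its inbox."""
    while True:
        msg = inbox.get()
        if msg is None:          # assumed convention: None means "shut down"
            break
        outbox.put(("result", msg * msg))


def run_demo(values):
    """Send each value as a message and collect the replies in order."""
    inbox, outbox = queue.Queue(), queue.Queue()
    t = threading.Thread(target=worker, args=(inbox, outbox))
    t.start()
    for v in values:
        inbox.put(v)             # the sender never touches the worker's state
    inbox.put(None)
    t.join()
    return [outbox.get() for _ in values]
```

The sender and the worker coordinate only through the two queues, which is the property the list above describes. Real channels can also lose, delay, or reorder messages, which this sketch does not model.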
Goals of Distributed Systems
Why do we use DS?
– Connecting users and resources
Desired properties:
– Transparency
– Openness
– Scalability
Goal 1: Connecting users and resources
Computer resources include hardware resources (e.g., disks, printers), software resources (e.g., compilers), and data resources (e.g., files).
The main goal of a DS is to make it easy for users to access remote resources, and share them with other users in a controlled way.
Connecting users and resources also makes it easier to collaborate and exchange information.
Goal 2: Distribution Transparency
Why Transparency?
– To hide the distribution of components from users and applications, so that the system is perceived as a single system rather than a collection of independent components.
Transparency in a Distributed System
Different forms of transparency in a distributed system.
Transparency   Description
Access         Hide differences in data representation and how a resource is accessed
Location       Hide where a resource is located
Migration      Hide that a resource may move to another location
Relocation     Hide that a resource may be moved to another location while in use
Replication    Hide that a resource is replicated
Concurrency    Hide that a resource may be shared by several competitive users
Failure        Hide the failure and recovery of a resource
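Three of these forms (location, replication, and failure transparency) can be illustrated with a small sketch. The classes `Replica` and `TransparentStore`, the host names, and the failover-in-order policy are hypothetical choices made for this example only.

```python
class Replica:
    """One copy of the data on one (hypothetical) host."""

    def __init__(self, host, data, up=True):
        self.host = host
        self.data = data
        self.up = up

    def read(self, key):
        if not self.up:
            raise ConnectionError(f"{self.host} is down")
        return self.data[key]


class TransparentStore:
    """Clients name a key only; hosts, copies, and failover stay hidden."""

    def __init__(self, replicas):
        self._replicas = list(replicas)

    def read(self, key):
        # Try each replica in turn; the caller never learns which one answered.
        for replica in self._replicas:
            try:
                return replica.read(key)
            except ConnectionError:
                continue
        raise ConnectionError("no replica available")
```

A client calls `store.read("report")` without knowing where the data lives, how many copies exist, or that the first copy may have failed. When every replica is down the failure can no longer be hidden, which is one reason full transparency is not always achievable.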
Degree of Transparency
Distribution transparency is generally preferable, but blindly hiding all distribution aspects from users is not always a good idea.
For example:
– eNewsPaper
– Process interaction across different countries
– Remote printing job
Goal 3: Openness
Openness is the characteristic that determines whether the system can be extended in various ways.
In DSs, services are generally specified through interfaces, which are often described in an Interface Definition Language (IDL).
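Python has no IDL of its own, so the sketch below uses `typing.Protocol` as a rough stand-in for an interface specification. The `FileService` interface, the `InMemoryFileService` implementation, and the `copy_file` client are all hypothetical names chosen for illustration.

```python
from typing import Protocol


class FileService(Protocol):
    """Published interface: names, parameters, and result types only."""

    def read(self, name: str) -> bytes: ...

    def write(self, name: str, data: bytes) -> None: ...


class InMemoryFileService:
    """One possible implementation; others could be substituted freely."""

    def __init__(self) -> None:
        self._files = {}

    def read(self, name: str) -> bytes:
        return self._files[name]

    def write(self, name: str, data: bytes) -> None:
        self._files[name] = data


def copy_file(service: FileService, src: str, dst: str) -> None:
    """A client written against the interface, not against an implementation."""
    service.write(dst, service.read(src))
```

Because `copy_file` depends only on the interface, any service that offers the same operations can replace `InMemoryFileService`, which is the kind of extensibility that openness refers to.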
Openness: Summary
Open systems are characterized by the fact that their key interfaces are published.
Open DSs are based on the provision of a uniform communication mechanism and published interfaces for access to shared resources.
Goal 4: Scalability
The number of computers in the world keeps growing, with no fixed upper limit. Therefore, access to shared resources should be nearly independent of the size of the network.
A system is described as scalable if it will remain effective when there is a significant increase (or decrease) in the number of resources and the number of users.
Scalability
Figure. Examples of scalability limitations.
• Scalability of a system can be measured along three different dimensions:
– A system is scalable in size if more users and resources can be easily added.
– A system is geographically scalable if it allows users and resources to lie far apart.
– A system is administratively scalable if it remains easy to manage even if it spans many independent administrative organizations.
Scalability Problems
Characteristics of decentralized algorithms:
• No machine has complete information about the system state.
• Machines make decisions based only on local information.
• Failure of one machine does not ruin the algorithm.
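The first two characteristics can be seen in a small gossip-style averaging sketch. This is one well-known kind of decentralized algorithm, chosen here as an assumption for illustration; the ring topology and the function names are not from the course material, and the sketch does not model machine failures.

```python
def gossip_round(values, pairs):
    """Each pair of neighbours replaces its two local values by their average."""
    values = list(values)
    for i, j in pairs:
        avg = (values[i] + values[j]) / 2
        values[i] = values[j] = avg
    return values


def run_gossip(values, rounds):
    """Run several rounds on a ring, where node i only ever talks to node i+1."""
    n = len(values)
    ring = [(i, (i + 1) % n) for i in range(n)]
    for _ in range(rounds):
        values = gossip_round(values, ring)
    return values
```

No node ever sees the whole list, yet every local value drifts toward the global average because each exchange preserves the sum of the two values involved.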
Scalability: Scaling Techniques
The scalability problems in DSs appear as performance problems caused by the limited capacity of servers and the network. There are basically three techniques for scaling:
– hiding communication latencies: avoid waiting for responses from remote services as much as possible (e.g., by using asynchronous communication), or reduce the overall communication.
– distribution: take a component, split it into smaller parts, and spread those parts across the system. Example: the Internet Domain Name System (DNS).
– replication: keep copies of a component at several places in the system, which increases availability and helps balance the load.
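Hiding communication latency through asynchronous communication can be sketched with `asyncio`. The service names and the use of `asyncio.sleep` to model network delay are assumptions for illustration; no real remote calls are made.

```python
import asyncio
import time


async def remote_call(name, delay):
    """Stand-in for a remote service call; the sleep models network latency."""
    await asyncio.sleep(delay)
    return f"{name}: ok"


async def call_all(delay=0.05):
    """Issue all requests at once instead of waiting for each reply in turn."""
    names = ["auth", "catalog", "billing"]
    return await asyncio.gather(*(remote_call(n, delay) for n in names))


def timed_demo():
    """Return the replies and the wall-clock time the whole batch took."""
    start = time.perf_counter()
    results = asyncio.run(call_all())
    return results, time.perf_counter() - start
```

Because the three calls overlap, the elapsed time is expected to be close to one delay rather than the sum of all three, although the exact figure depends on the machine.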
Scaling Techniques: reducing the overall communication
Figure 1.4. The difference between letting:
a) a server or
Scaling Techniques: distribution
Figure 1.5.
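Distribution in the style of DNS, where a name space is split into zones and each server handles only its own zone, can be sketched as follows. The `ZoneServer` class is hypothetical and far simpler than real DNS resolution; the names use the reserved `.test` domain and the address comes from the documentation range 192.0.2.0/24.

```python
class ZoneServer:
    """Holds the records of one zone and knows which server owns each sub-zone."""

    def __init__(self, records=None, delegations=None):
        self.records = records or {}          # full name -> address
        self.delegations = delegations or {}  # zone suffix -> ZoneServer

    def resolve(self, name):
        if name in self.records:
            return self.records[name]
        # Not ours: hand the query to the server responsible for a matching suffix.
        for suffix, server in self.delegations.items():
            if name == suffix or name.endswith("." + suffix):
                return server.resolve(name)
        raise KeyError(name)


example_zone = ZoneServer(records={"www.example.test": "192.0.2.10"})
test_zone = ZoneServer(delegations={"example.test": example_zone})
root = ZoneServer(delegations={"test": test_zone})
```

A call such as `root.resolve("www.example.test")` passes through three servers, none of which stores the whole name space, so no single server becomes the bottleneck for all names.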
Pitfalls when Developing Distributed Systems
False assumptions made by first-time developers:
– The network is reliable.
– The network is secure.
– The topology does not change.
– Latency is zero.
– Bandwidth is infinite.
– Transport cost is zero.
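Code that does not assume a reliable network usually bounds how long and how often it tries. The sketch below is one simple way to do that; `FlakyNetwork` is a hypothetical test double that fails a fixed number of times, and the retry limit of three is an arbitrary assumption.

```python
class FlakyNetwork:
    """Test double: the first `failures_before_success` sends time out."""

    def __init__(self, failures_before_success):
        self.remaining = failures_before_success

    def send(self, payload):
        if self.remaining > 0:
            self.remaining -= 1
            raise TimeoutError("no reply")
        return f"ack:{payload}"


def send_with_retry(network, payload, max_attempts=3):
    """Retry a bounded number of times, then report failure to the caller."""
    last_error = None
    for attempt in range(1, max_attempts + 1):
        try:
            return network.send(payload), attempt
        except TimeoutError as err:
            last_error = err
    raise ConnectionError(f"gave up after {max_attempts} attempts") from last_error
```

Retrying is safe only when repeating the request has no unwanted side effects, so a real system would also need to consider duplicate delivery.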
Hardware Concepts
Flynn’s Classification:
– SISD: Single Instruction stream, Single Data stream (all traditional single-CPU machines).
– SIMD: Single Instruction stream, Multiple Data streams (array processors, parallel supercomputers).
– MISD: Multiple Instruction streams, Single Data stream (no known computer).
Hardware Concepts
MIMD (Multiple-Instruction Multiple-Data)
Tightly Coupled versus Loosely Coupled
Tightly coupled systems (multiprocessors)
o shared memory
o Inter machine delay short, data rate high
Loosely coupled systems (multicomputers)
o private memory
Hardware Concepts
There are several ways to organize hardware in multiple-CPU systems, especially in terms of how they are interconnected and how they communicate.
Based on memory organization:
-Multiprocessor: using shared memory
-communication: through shared variables
-Multicomputer: no shared memory, using private memory
-communication: message-passing
Based on the architecture of interconnection network:
-Bus-based systems (multiprocessor or multicomputer)
Hardware Concepts
Figure 1.6.
Hardware Concepts
Bus-based multiprocessors
cache memory hit rate
cache coherence
write-through cache: propagate write immediately
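A write-through cache and the coherence problem it leaves open can be sketched in a few lines. The `Memory` and `WriteThroughCache` classes are simplified, hypothetical models (no cache size limit, no bus), intended only to show the behaviour named above.

```python
class Memory:
    """Shared main memory; unwritten addresses read as 0."""

    def __init__(self):
        self.cells = {}

    def read(self, addr):
        return self.cells.get(addr, 0)

    def write(self, addr, value):
        self.cells[addr] = value


class WriteThroughCache:
    """Per-CPU cache that propagates every write to memory immediately."""

    def __init__(self, memory):
        self.memory = memory
        self.lines = {}
        self.hits = 0
        self.misses = 0

    def read(self, addr):
        if addr in self.lines:
            self.hits += 1
        else:
            self.misses += 1
            self.lines[addr] = self.memory.read(addr)
        return self.lines[addr]

    def write(self, addr, value):
        self.lines[addr] = value
        self.memory.write(addr, value)   # write-through: no delayed update

    def hit_rate(self):
        total = self.hits + self.misses
        return self.hits / total if total else 0.0
```

If CPU B has already cached an address and CPU A then writes to it, memory is up to date but B still reads its old copy. Write-through alone therefore does not give cache coherence; the other caches must also be updated or invalidated, for example by watching the bus.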
Hardware Concepts
a) A crossbar switch b) An omega switching network
Multicomputer Systems
a) 2D-Mesh (Grid) b) Hypercube
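The two multicomputer topologies differ in which nodes are directly connected. The sketch below computes the neighbours of a node in each; the numbering schemes (binary node IDs for the hypercube, row and column coordinates for the mesh) are the usual conventions but are assumptions here, since the figure itself is not reproduced.

```python
def hypercube_neighbors(node, dim):
    """Neighbours in a dim-dimensional hypercube differ in exactly one bit."""
    return [node ^ (1 << bit) for bit in range(dim)]


def mesh_neighbors(row, col, rows, cols):
    """Neighbours in a 2D mesh are the adjacent cells that lie inside the grid."""
    candidates = [(row - 1, col), (row + 1, col), (row, col - 1), (row, col + 1)]
    return [(r, c) for r, c in candidates if 0 <= r < rows and 0 <= c < cols]
```

In a hypercube with 2^d nodes every node has d links, whereas in a mesh a node has at most four links regardless of the size of the grid.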
Software Concepts
• Software is more important for users
• Two kinds of OS for multiple-CPU systems:
   • Loosely coupled software
   • Tightly coupled software
• Three types of DS:
   1. Network Operating Systems
   2. (True) Distributed Systems
   3. Multiprocessor Operating Systems
Network Operating Systems
loosely-coupled software on loosely-coupled hardware
A network of workstations connected by LAN
each machine has a high degree of autonomy
o rlogin machine
o rcp machine1:file1 machine2:file2
File servers: client-server model
Clients mount directories on file servers
Best known network OS:
Network Operating System
General structure of a network operating system.
(True) Distributed Systems
tightly-coupled software on loosely-coupled hardware
provide a single-system image or a virtual uniprocessor
a single, global inter-process communication mechanism, process management, and file system; same system call interface everywhere
Ideal definition:
“ A distributed system runs on a collection of
Multiprocessor Operating Systems
tightly-coupled software on tightly-coupled hardware
Examples: high-performance servers
shared memory
single run queue
Network OS, Distributed OS, and Multiprocessor OS
Item                                            Network OS    Distributed OS  Multiprocessor OS
Does it look like a virtual uniprocessor?       No            Yes             Yes
Do all have to run the same OS?                 No            Yes             Yes
How many copies of the OS are there?            N             N               1
How is communication achieved?                  Shared files  Messages        Shared memory
Are agreed-upon network protocols required?     Yes           Yes             No
Is there a single run queue?                    No            No              Yes
Does file sharing have well-defined semantics?
Middleware
Middleware:
Positioning Middleware
General structure of a distributed system as middleware.
Middleware and Openness
In an open middleware-based distributed system, the protocols used by each middleware layer should be the same, as well as the interfaces they offer to applications.