Distributed (Operating) Systems
Introduction
Fernando Pérez José María Peña María S. Pérez
Distributed Operating Systems 1
Schedule
Sessions
1. Introduction: Distributed systems (Hardware/Software issues)
2. Process management in clusters: Load balancing and job scheduling
3. Distributed communications
4. Distributed services
Scenarios
• High-performance solutions for scientific applications (process management)
• Distributed systems for transactional services
Timetable (Mon and Tue):
8:00 – 11:00: 1-Intro, 4-serv
12:00 – 13:00: LUNCH
14:00 – 15:00: 2-proc, Scenario 2
16:00 – 17:00: Scenario 1, 3-comm
Bibliography
• Distributed Systems: Concepts and Design
G. Coulouris, J. Dollimore, T. Kindberg; Addison-Wesley, 2001
• Distributed Systems: Principles and Paradigms
A. S. Tanenbaum, M. Van Steen; Prentice-Hall, 2007
• Distributed Operating Systems: Concepts & Practice
D. L. Galli; Prentice-Hall, 2000
• Distributed Operating Systems & Algorithms
R. Chow, T. Johnson; Addison-Wesley, 1997
• Distributed Computing: Principles and Applications
M. L. Liu; Addison-Wesley, 2004
Distributed (Operating) Systems
Introduction and Concepts
Distributed System (DS)
• Hardware: Network-connected processors without shared physical memory:
– Loosely-coupled system
– No common clock
– Processor-dependent I/O systems
– Independent failures of system components
– Heterogeneous system
• Goal of this seminar: Distributed System Software
– Distributed Operating Systems (classical view)
– Software interfaces that hide distributed system complexity
Advantages and Drawbacks
• Advantages:
– Cost/performance ratio
– Parallel processing: high performance
– Fault tolerance: high availability
– Scalable, open and heterogeneous
– Most appropriate for inherently distributed applications (e.g., a geographically distributed enterprise)
• Drawbacks:
– More complex software development
– Network connection problems: latency, bandwidth and availability
– Security
New Paradigms for DS
• Cluster Computing:
– Dedicated systems:
• High performance.
• High availability.
– Homogeneous system:
• Nodes.
• LAN (general-purpose or specific).
– Open issues: Coupling degree, distributed services.
• Grid Computing:
– Resource sharing and idle processor usage.
– Restricted to some specific tasks.
– Different scopes:
• Inter-departmental grids.
• Inter-organization grids.
Operating System Support
1. OS for Distributed Systems:
• Requirements
• Characteristics
2. Distributed Systems
3. Parallel/Distributed OS:
• Operating Systems Parallelisation
• Distributed System Services
Distributed Architectures
A distributed system is a collection of independent computers
presented to the user as a single computer.
Distributed Computer Architectures:
– Flynn’72: SISD, SIMD, MISD, MIMD
– Johnson’88: UMA, NUMA, NORMA
Distributed System Application
• Internet Services: e-mail, news, web, ...
• Corporate networks or intranets.
• Parallel processing:
– Massive processing (+efficiency).
– Distributed topology (distributed-nature problems)
• Distributed massive data management.
• High performance multimedia.
• Industrial and control systems.
• Real-time systems.
Distributed System Profile
Distributed systems have:
1. No common clock: Message and co-ordination aspects.
2. Global concurrency: Real parallel execution.
3. Independent failures: Partial failures.
Distributed system usage:
1. Collaborative processing: combined features and services.
2. Parallel processing: massive or high-performance calculation.
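Without a common clock, nodes can only order events through the messages they exchange. The sketch below illustrates one well-known way to do this, Lamport logical clocks; the slide does not name this technique, and the class and method names are illustrative assumptions.

```python
class LamportClock:
    """Logical clock for one node: it counts events instead of reading wall time."""

    def __init__(self):
        self.time = 0

    def local_event(self):
        # Every local event advances the counter by one.
        self.time += 1
        return self.time

    def send(self):
        # Sending is itself an event; the returned stamp travels with the message.
        return self.local_event()

    def receive(self, msg_time):
        # Jump past the sender's stamp so the receive is ordered after the send.
        self.time = max(self.time, msg_time) + 1
        return self.time


# Two nodes with independent counters (hypothetical scenario).
a, b = LamportClock(), LamportClock()
a.local_event()
stamp = a.send()
b.local_event()
received = b.receive(stamp)
print(stamp, received)
```

The receive stamp is always larger than the send stamp, so causally related events are ordered consistently even though the two nodes never compare physical clocks.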
System Requirements
Parallel systems:
– Performance
– Scalability
– Reliability
– Transparency
– Security
Collaborative systems:
– Openness
– Scalability
– Reliability
– Transparency
– Security
Common characteristics but different hardware platforms and applications.
Operating System Distribution
• Operating systems for shared-memory multiprocessors (SMP):
– Software tightly coupled
– Hardware tightly coupled
• Distributed operating systems (DOS):
– Software tightly coupled
– Hardware loosely coupled
• Network operating system:
– Software loosely coupled
– Hardware loosely coupled
Operating Systems for SMPs
Architectures with multiple processors (2 to 8) and uniform-access shared memory (SMP: Symmetric Multiprocessors).
Characteristics:
– “Small” variations of the traditional OS versions.
– There is only one copy of the OS.
– Concurrency with real parallelism (≠ time sharing).
– Commercial versions (Linux, WinNT, Solaris, AIX, ...).
– Different problems: kernel code running on multiple processors (concurrent system calls), synchronisation mechanisms (spin-locks), optimisation and scheduling (processor affinity), ...
Distributed Operating Systems (DOS)
A distributed operating system runs on a group of processors interconnected by a communication network and hides that complexity, presenting the user with a “virtual uniprocessor”.
Characteristics:
– It runs on a distributed system, making it appear as a centralised system.
– Transparency: it must hide the complexity of distribution.
– This is easier said than done.
– Experimental systems reach this goal only partially.
– Failures make the users comply.
Distributed Operating Systems (DOS)
Problems:
– Each node has a copy of the OS: Which tasks are performed locally and which globally?
– How is mutual exclusion achieved without shared memory?
– How are deadlocks detected without a global state?
– Process scheduling: Each operating system copy has its own task queue (process migration).
– How is a single directory tree defined?
– Problems due to the lack of a common clock, partial failures and heterogeneity.
Main result:
– New concepts have been developed that are also useful in other domains.
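As an illustration of the mutual exclusion question, the sketch below shows one simple answer: a central coordinator that grants a lock purely by exchanging request/release messages. This is only one of several known approaches and is not taken from the slides; the `Coordinator` class, the node names and the reply strings are hypothetical.

```python
from collections import deque


class Coordinator:
    """Grants a single lock by exchanging messages; no shared memory is needed."""

    def __init__(self):
        self.holder = None        # node currently inside the critical section
        self.waiting = deque()    # nodes queued in arrival order

    def request(self, node):
        # Reply "GRANT" if the lock is free, otherwise queue the requester.
        if self.holder is None:
            self.holder = node
            return "GRANT"
        self.waiting.append(node)
        return "QUEUED"

    def release(self, node):
        # Only the holder may release; the oldest waiting node (if any) is granted next.
        if node != self.holder:
            raise ValueError(f"{node} does not hold the lock")
        self.holder = self.waiting.popleft() if self.waiting else None
        return self.holder


coord = Coordinator()
replies = [coord.request("n1"), coord.request("n2"), coord.request("n3")]
next_holder = coord.release("n1")
print(replies, next_holder)
```

The design is easy to reason about, but the coordinator is a single point of failure, which is exactly the kind of partial-failure problem listed above.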
DOS Evolution
• First network operating systems:
– New network services in a conventional OS
– E.g.: UNIX 4BSD (≈1980)
• New network functionalities:
– Sun’s ONC (≈1985): includes NFS, RPC, NIS
• First DOS:
– New OS based on conventional (monolithic) versions.
– E.g.: Sprite, University of California, Berkeley (≈1988)
• DOS based on μ-kernel. E.g.:
– Mach, CMU (≈1986)
– Amoeba, designed by Tanenbaum (≈1984)
– Chorus, INRIA, France (≈1988)
Network Operating Systems
A network of loosely coupled computers that share resources, with no external control over the hardware/software of each node.
Characteristics:
– No virtual uniprocessor view is presented (independent nodes).
– Each node runs its own copy of an OS (possibly a different one).
– Conventional OS + network utilities.
– Communication protocols for resource sharing and high-level service access.
Cooperative Systems
Software systems oriented to high-level services that require communication mechanisms to build upper-level services.
Characteristics:
– A degree of transparency is provided, but no single-system view is presented: the systems remain autonomous and independent.
– They are built on middleware (CORBA, DCE, COM+, ...).
– These systems are designed as a combination of multiple services offered by different network elements.
Middleware
Middleware:
– Software layer over the operating system that provides standard distributed services.
– Open systems independent of the vendor.
– Hardware and OS independent.
Examples:
– DCE (Open Group).
– CORBA (OMG).
– ...
[Figure: a middleware layer on top of three nodes, each with its own hardware and OS.]
Single System Image (SSI)
The illusion, created by hardware/software, that presents a collection of resources as one.
– Hardware SSI: DEC Memory Channel or SMPs
– Operating System: DOS or gluing layer
– Application and Services: Middleware (many levels).
Why is SSI useful?
• It is easy to program/use:
– Traditional programming, known interfaces.
– Low-level issues hidden.
• Allows centralized and distributed management depending on task requirements.
• (Potentially) provides:
– Fault tolerance.
– Scalability.
Operating System Layers
A simplified view of an operating system has the following layers:
• Hardware.
• Kernel.
• System services.
• Application programs.
• Users.
Kernel Responsibilities
[Figure: kernel and services layout for a monolithic kernel and for μ-kernels running on several computers.]
Monolithic Kernels:
Many OS functionalities inside the kernel: scheduler, memory manager, drivers, file systems...
μ-Kernels:
Many OS tasks are performed outside the kernel. Remaining inside: (i) process communication, (ii) memory management, (iii) low-level management and scheduling and (iv) low-level I/O.
Distributed Services:
Distributed system structure. Depending on the level: distributed operating systems, network operating systems or cooperative systems.
Operating System on Distributed Systems
           | MPPs             | SMPs                             | Clusters                            | Distributed
Size       | 100s – 1000s     | 10s                              | 100s or less                        | 10s – 1000s
OS         | N x kernels      | Single OS kernel                 | N x OS platforms                    | N x OS platforms
OS type    | Specific purpose | Special variants of standard OSs | Standard OS plus tools (not always) | Standard OS and special tools
Communic.  | Message / DSM    | Shared memory                    | Message passing (e.g.: MPI)         | Message passing or middleware
Scheduling | Single queue     | Single queue                     | Multiple coordinated queues         | Independent queues
Tools for Distributed/Cluster Systems
• Operating system:
– Modular/layered monolithic
– Based on μ-kernels
• Runtime systems:
– Parallel file systems or I/O libraries
– Distributed shared memory software
• Resource management:
– Process scheduling tools
– Load balancing
• Applications:
– Management and administration tools.
– Processing tasks and jobs
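To make the load-balancing item concrete, here is a minimal sketch of a greedy policy that sends each job to the node with the least accumulated load. The policy, the function name `assign_jobs`, the node names and the job costs are all assumptions made for illustration; real cluster tools use richer load metrics.

```python
import heapq


def assign_jobs(job_costs, node_names):
    """Greedy balancing: each job goes to the node with the least accumulated load."""
    heap = [(0, name) for name in node_names]   # (current load, node)
    heapq.heapify(heap)
    placement = {name: [] for name in node_names}
    for job, cost in job_costs:
        load, name = heapq.heappop(heap)        # least-loaded node so far
        placement[name].append(job)
        heapq.heappush(heap, (load + cost, name))
    loads = {name: load for load, name in heap}
    return placement, loads


# Hypothetical jobs with estimated costs (arbitrary units).
jobs = [("j1", 4), ("j2", 2), ("j3", 3), ("j4", 1)]
placement, loads = assign_jobs(jobs, ["node-a", "node-b"])
print(placement, loads)
```

The sketch assumes job costs are known in advance; when they are not, schedulers typically fall back on measured load or on process migration.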
Distributed Operating Systems
Hardware and Software Overview
Concept of Cluster
• Alternative to traditional supercomputing facilities.
• Instead of traditional systems:
– Specific hardware.
– High cost.
– Slow hardware development.
– Painful software development.
• The use of general-purpose systems provides:
– Commodity hardware (commercial off-the-shelf: COTS).
– Moderate cost.
– Fast hardware development.
Concept of Cluster
Cluster: Hardware system based on commodity hardware connected by a dedicated (high-performance) network.
– Nodes: PCs or workstations (SMPs).
– Network: From high-speed networks to specific hardware.
Mysterious acronyms:
– PoPCs: Pile of PCs
– COWs: Clusters of workstations
– CLUMPS: Clusters of multiprocessors
– NOWs: Networks of workstations
Hardware Characteristics
• Nodes:
– Processor: Intel Pentium, AMD Athlon, Compaq Alpha, IBM PowerPC, Sun SuperSPARC (3-4 GHz)
– Memory: SDRAM, DDR or similar (2-8 GB)
– Storage: SCSI or RAID
• Network:
– Key element.
– It can account for 50% or more of the system cost.
Cluster Networks (I)
• General purpose network technologies:
– Improvement in network bandwidth.
– Only small improvements in latency, so not well suited.
• Low-latency protocols:
– Active Messages (Berkeley): “Zero-copy” synchronous model. GAM.
– Fast Messages (Illinois): Reliable, in-order AM.
– VMMC (Princeton): Distributed shared memory pages (DSM).
– U-Net (Cornell): Virtual interfaces for memory pages.
Cluster Networks (II)
• Cluster communication standards:
– VIA: Hardware interface (native/emulated) for communications. Maps physical memory regions and virtual network interfaces. MPI versions over VIA.
– InfiniBand: I/O hardware standard (2.5 Gbps) using one-way connections. 6 communication models. Uses RDMA and IPv6.
• Network hardware:
– Ethernet, Fast Ethernet, Gigabit Ethernet: Cheap but limited. Collision problems. VIA emulations.
– Giganet (cLAN): Implementation over VIA (1.26 Gbps)
– Myrinet: Low-latency programmable networks. Cut-through routing and failure detection. GM protocol.
Technology Comparison
                                | Gigabit Ethernet       | Giganet       | Myrinet       | QsNet              | SCI           | ServerNet2
MPI bandwidth – stable (MB/sec) | 35-50                  | 105           | 140           | 208                | 80            | 65
MPI latency (μs)                | 100-200                | 20-40         | ~18           | 5                  | 6             | 20.2
Maximum number of nodes         | 1000’s                 | 1000’s        | 1000’s        | 1000’s             | 1000’s        | 64k
VIA support                     | Win/Linux              | Win/Linux     | Over GM       | None               | Software      | Hardware
MPI support type                | MPICH over MVIA or TCP | Third parties | Third parties | Quadrics or Compaq | Third parties | Compaq or third parties
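The MPI latency and bandwidth figures above can be combined into a rough estimate of message transfer time. The sketch below uses the common first-order model t = latency + size / bandwidth, which is an assumption brought in for illustration and not a claim made on the slide; the function name is hypothetical, and only the three networks with single-valued entries are used.

```python
def transfer_time_us(size_bytes, latency_us, bandwidth_mb_s):
    """First-order cost model: fixed latency plus size divided by bandwidth."""
    # 1 MB/s equals 1 byte per microsecond (taking 1 MB = 10**6 bytes).
    return latency_us + size_bytes / bandwidth_mb_s


# (latency in microseconds, bandwidth in MB/s), taken from the comparison table.
networks = {"Myrinet": (18, 140), "QsNet": (5, 208), "SCI": (6, 80)}

for name, (lat, bw) in networks.items():
    small = transfer_time_us(64, lat, bw)
    large = transfer_time_us(1_000_000, lat, bw)
    print(f"{name}: 64 B in {small:.1f} us, 1 MB in {large:.0f} us")
```

Under this model latency dominates short messages and bandwidth dominates long ones: SCI beats Myrinet for a 64-byte message but loses for a 1 MB transfer.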
Software Development (I)
• Operating Systems:
– Linux:
• Free, cheap, fast and fast-evolving.
• e.g., Beowulf
– Solaris:
• Good parallelism support and good network services.
• e.g., Solaris MC
– AIX:
• Powerful and well-optimized software development tools.
• e.g., SP2
– Windows:
• Why not?
Software Development (II)
• Middleware and SSI:
– SSI (Single System Image): The whole cluster is presented as a single uniprocessor.
– Layered development:
• Hardware (Local).
• Operating system (μ-kernel) or gluing level: GLUnix or MOSIX
• Application, services and middleware: CODINE
– Common services (desirable):
• Single access point.
• Single file hierarchy.
• Single management point.
• Single network connection.
• Single work-management service.
• Single user interface.
• Single I/O space.
• Single process space.
• Checkpointing.
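Checkpointing, the last service in the list, can be illustrated with a small sketch: a job saves its state periodically so that, after a node failure, a restart resumes from the last saved point. The function `run`, the `crash_at` parameter and the file name are hypothetical, and the state is saved with Python's `pickle` purely for brevity.

```python
import os
import pickle
import tempfile


def run(total_steps, ckpt_path, crash_at=None):
    """Sum 0..total_steps-1, saving progress so a restart resumes mid-way."""
    state = {"step": 0, "acc": 0}
    if os.path.exists(ckpt_path):
        with open(ckpt_path, "rb") as f:
            state = pickle.load(f)              # resume from the last checkpoint
    while state["step"] < total_steps:
        if crash_at is not None and state["step"] == crash_at:
            raise RuntimeError("simulated node failure")
        state["acc"] += state["step"]
        state["step"] += 1
        with open(ckpt_path, "wb") as f:
            pickle.dump(state, f)               # checkpoint after every step
    return state["acc"]


path = os.path.join(tempfile.mkdtemp(), "job.ckpt")
try:
    run(10, path, crash_at=6)                   # first attempt fails at step 6
except RuntimeError:
    pass
result = run(10, path)                          # restart continues from step 6
print(result)
```

A production checkpoint would be written atomically (for example to a temporary file that is then renamed) and far less often than once per step; both simplifications are deliberate here.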
Software Development (III)
• Programming tools:
– Thread support: Pthreads or OpenMP
– Message passing in clusters:
• MPI: MPICH or LAM/MPI.
• PVM: Worse performance but more features.
– DSM: Distributed shared memory:
• Software: TreadMarks, Linda or Nanos
• Hardware: DASH or Merlin
– Parallel debuggers
– Instrumentation tools.
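The message-passing style used by tools such as MPI and PVM can be sketched with the standard library alone. In the example below, threads stand in for cluster nodes and `queue.Queue` objects stand in for the network, which is an assumption made so that the sketch runs anywhere; it does not use or imitate the actual MPI API.

```python
import queue
import threading


def worker(inbox, outbox):
    # Each worker only sees the data that arrives in its own mailbox.
    chunk = inbox.get()
    outbox.put(sum(chunk))


def parallel_sum(data, n_workers):
    inboxes = [queue.Queue() for _ in range(n_workers)]
    results = queue.Queue()
    threads = [threading.Thread(target=worker, args=(box, results)) for box in inboxes]
    for t in threads:
        t.start()
    for i, box in enumerate(inboxes):
        box.put(data[i::n_workers])             # "scatter": one slice per worker
    for t in threads:
        t.join()
    # "gather" the partial sums and reduce them to a single value.
    return sum(results.get() for _ in range(n_workers))


total = parallel_sum(list(range(1, 101)), 4)
print(total)
```

The scatter, compute and gather steps mirror the structure of a typical message-passing program, where data is exchanged explicitly rather than through shared variables.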
Software Development (IV)
• Administration tools:
– Remote management:
• Administrative commands: install software, copy files.
• Process-level resource management.
• User list and other system information: NIS.
• e.g., SP2 tools, Cluster Command & Control (C3)
– Scheduling systems:
• Work queues and workload management.
• Resource supervision.
Input/Output System
• I/O Crisis:
– Exponential growth of CPU power (Moore’s law).
– Much slower growth of I/O systems.
– The I/O phase is the actual bottleneck of high-performance systems.
• Solution based on I/O parallelism:
– Parallel I/O systems: MPI-IO
– Parallel filesystems: ParFiSys, GPFS
– Intelligent I/O: Armada, Panda
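One basic idea behind I/O parallelism is striping: data is split into blocks that are spread over several disks so they can be read or written at the same time. The sketch below shows round-robin striping over in-memory lists; the function names and parameters are illustrative assumptions and do not describe how ParFiSys or GPFS are actually implemented.

```python
def stripe(data, n_disks, block_size):
    """Split data into blocks and deal them round-robin across n_disks."""
    disks = [[] for _ in range(n_disks)]
    for i in range(0, len(data), block_size):
        disks[(i // block_size) % n_disks].append(data[i:i + block_size])
    return disks


def unstripe(disks):
    """Rebuild the original bytes by reading one block per disk in turn."""
    out = []
    depth = max(len(d) for d in disks)
    for row in range(depth):
        for d in disks:
            if row < len(d):
                out.append(d[row])
    return b"".join(out)


payload = bytes(range(20))
disks = stripe(payload, n_disks=3, block_size=4)
restored = unstripe(disks)
print([len(d) for d in disks], restored == payload)
```

With the blocks spread this way, each disk holds roughly a third of the data, so in the ideal case the three disks can serve a large sequential request in parallel.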