IBM System Storage DS8000
implementation for open systems
Hardware components and architecture
© Copyright IBM Corporation 2007
Unit objectives
After completing this unit, you should be able to:
● Discuss the hardware and architecture of the DS
● Learn virtualization terminology used for configur
DS8000 subsystem
● Describe the physical hardware components and
● Describe the models and features provided by e
● Describe the types of disk arrays that can be con
DS8000 subsystem
Agenda
● DS8000 highlights
● DS8000 hardware components
● DS8000 architecture
● DS8000 cache management
● DS8000 RAS features
DS8000 highlights
● New processor family - POWER5+ RISC (DS800
– DS8100 model 931
– DS8300 model 932/9B2
● Significant extensions to enable scalability
– 64K logical volumes (CKD, FB, or mixed)
– Expanded volume sizes, dynamic volume add/delete
● I/O adapters
– Fibre Channel/FICON host adapter (4 Ports, 2 or 4Gb/
– ESCON host adapter (2 Ports, 18 MB/s)
● FC-AL disks
– 146 GB and 300 GB at 10K rpm, or 73 GB, 146 GB, and 300 GB at 15K rpm
– FATA disk drives of 500 GB / 7200 rpm
DS8000 series models (2107)
● Are:
– High-performance
– High-capacity series of disk storage
– Designed to support continuous operations
• Redundancy
• Hot replacement / updates
– Use IBM POWER5™ server technology
• That is integrated with the IBM Virtualization Engine™ t
● Consist of:
– Storage unit
– One or two (recommended) Management Consoles (M
● Graphic User Interface (GUI) or Command Line I allows:
● For high availability, hardware components are re
DS8000 models
● DS8100 (Model 921 or 931)
– Processor complex (server):
• Dual
• Two-way
– Up to one expansion frame
● DS8300 (Models 922, 9A2, 932 or 9B2)
– Processor complex (server):
• Dual
• Four-way
– Up to two expansion frames
– Models 9A2 and 9B2 support two IBM System Storage System (LPAR) in one storage unit
– Expansion frame Model 92E attaches to 921, 922, 931 & 932
– Expansion frame Model 9AE attaches to 9A2 & 9B2
● Model conversions
– 2-way to 4-way LPAR (931 to 9B2) two-step process
– 4-way to 4-way LPAR (932 to 9B2)
– 4-way LPAR to 4-way (9B2 to 932)
– LPAR expansion to expansion frame (9AE to 92E)
– Expansion frame to LPAR expansion frame (92E to 9AE)
DS8000 R2 and R2.4 highlights
R2: Announcing new features for ALL models:
● IBM POWER5+ processor: new DS8000 Turbo (
● Processor memory for POWER5+ processor
● 4Gb FCP / FICON adapter (available on all models 92x/9Ax and 93x/9Bx)
● 500GB 7,200 rpm FATA drives (available on all models 92x/9Ax and 93x/9Bx)
● 3-site Metro / Global Mirror
● Earthquake resistance kit
● Ethernet adapter pair (for TPC RM support)
● Performance Accelerator (Models 932, and 92E
R2.4: Announcing new features for ALL models:
● 300GB 15,000 rpm Fibre Channel drives
● HyperPAV (System z)
DS8000 hardware overview (old models)
● 2-Way (Model 8100 - 2107-921)
– Two dual processor servers (POWER5)
• Up to 128 GB cache (16, 32, 64 or 128 GB)
– 8 to 64 2Gb FC/FICON
– 4 to 32 ESCON Ports
– 16 to 384 HDD
• Intermixable 73/146 GB 15Krpm, 146/300 GB 10Krpm
– Physical capacity from 1.1TB up to 115TB
• (384 x 300 GB DDMs)
● 4-Way (Model 8300 - 2107-922/9A2)
– Two four processor servers (POWER5)
• Up to 256 GB cache (32, 64, 128 or 256 GB)
– 8 to 128 2Gb FC/FICON
– 4 to 64 ESCON Ports
– 16 to 640 HDD
• Intermixable 73/146 GB 15Krpm, 146/300 GB 10Krpm
– Physical capacity from 1.1TB up to 192TB
• (640 x 300 GB DDMs)
Expansion frame Model 92E at
Expansion frame Model 9AE a
DS8000 Turbo hardware overview
● 2-Way (Model 8100 - 2107-931)
– Two dual processor servers (POWER5+)
• Up to 128 GB cache (16, 32, 64 or 128 GB)
– 8 to 64 2Gb FC/FICON
– 4 to 32 ESCON ports
– 16 to 384 HDD
• Intermixable 73/146/300 GB 15Krpm, 146/300 GB 10Krpm
– Physical capacity from 1.1TB up to 192TB
• (384 x 500 GB FATA DDMs)
● 4-Way (Model 8300 - 2107-932/9B2)
– Two four-processor servers (POWER5+)
• Up to 256 GB cache (32, 64, 128 or 256 GB)
– 8 to 128 2Gb FC/FICON
– 4 to 64 ESCON ports
– 16 to 640 HDD
• Intermixable 73/146/300 GB 15Krpm, 146/300 GB 10Krpm
– Physical capacity from 1.1TB up to 320TB
• (640 x 500 GB FATA DDMs)
Expansion frame Model 92E
Expansion frame Model 9AE
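The maximum capacities quoted on the two hardware overview slides follow from DDM count times DDM size. A minimal sketch of that arithmetic, assuming decimal units (1 TB = 1000 GB); the name `raw_capacity_tb` is illustrative and not part of any IBM tool:

```python
def raw_capacity_tb(ddm_count: int, ddm_size_gb: int) -> float:
    """Raw physical capacity in TB, assuming 1 TB = 1000 GB."""
    return ddm_count * ddm_size_gb / 1000


# (DDM count, DDM size in GB) as quoted on the overview slides.
maximum_configs = {
    "DS8100, 300 GB DDMs": (384, 300),
    "DS8300, 300 GB DDMs": (640, 300),
    "DS8100 Turbo, 500 GB FATA DDMs": (384, 500),
    "DS8300 Turbo, 500 GB FATA DDMs": (640, 500),
}

for label, (count, size_gb) in maximum_configs.items():
    print(f"{label}: {raw_capacity_tb(count, size_gb)} TB")
```

The results (115.2, 192, 192, and 320 TB) line up with the slide figures. The quoted 1.1 TB minimum is consistent with 16 DDMs of 73 GB, which is about 1.17 TB raw.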
Interfaces to DS8000 (1 of 2)
● IBM System Storage DS Storage Manager GUI (DS-SM: Web-based GUI)
– Program interface to perform logical configurations and Copy Services management functions
– Installed via GUI (graphical mode) or unattended (silent mode)
– Accessed through Web browser
– Offers:
• Simulated configuration (offline)
– Create, modify, save logical configuration when disconnected
– Apply to a network-attached storage unit
• Real-time configuration (online)
– Logical configuration and Copy Services to a network-attached storag
• Both
● DS command-line interface (CLI: script-based)
– OPEN hosts invoke and manage FlashCopy, Metro and Global
• Batch processes and scripts
• Check storage unit configuration and perform specific application f
• For example:
– Check current copy services configuration used by storage unit
– Create new logical storage and Copy Services configuration settings
– Modify or delete logical storage and Copy Services configuration settings
Interfaces to DS8000 (2 of 2)
● DS Open application programming interface (API)
– Non-proprietary storage management client applicatio
• Routine LUN management activities (creation, mapping
• Creation or deletion of RAID5 and RAID10 volume spac
• Copy Services functions: FlashCopy, PPRC
– Helps to integrate configuration management support storage resource management (SRM) applications
– Enables automation of configuration management thro written applications
– Complements the use of Web-based DS-SM and scrip
– Implemented through IBM System Storage Common I Model (CIM) agent
• Middleware application providing CIM-compliant interfac
– DS Open API uses CIM technology to manage proprie open system devices through storage management ap
– DS API used by TPC for disk
IBM System Storage management console
● Focal point for
– Configuration, Copy Services, Maintenance
● Dedicated workstation installed inside DS8000
– Is the eServer Power5 HMC and can also be called S-DS8000
– Automatically monitors the state of the system
– Notifies user and IBM when service is required (Call Home)
– Can also be connected to network
• Enabling centralized management through GUI, CLI, or
– Called SMC on DS6000
● External management console (optional)
DS8000 management console overview
● System Storage Management Console: MC
– Other possible names: HMC or S-HMC
– On DS6000: SMC
● Storage Management Console is the focal point for conf Copy Services management, and maintenance activities
– Dedicated workstation physically located (installed) inside your DS8300 and can automatically monitor the state of your system IBM when service is required.
– The Management Console can also be connected to your netw centralized management of your system using the IBM System Command Line Interface or storage management software that System Storage DS Open API.
– An external Management Console is available as an optional fea used as a redundant management console for environments wi requirements.
● Internal Management Console feature code: 1100
● External Management Console feature code: 1110
DS8000 management console features
● DS8000 management console multiple functions:
– Local service
• Interface for local service personnel
– Remote service
• Call home and call back
– Storage facility configuration
• LPAR management (HMC)
• Supports logical storage configuration via preinstalled s DS Storage Manager in online mode only
– Network Interface Server for logical configuration and advanced Copy Services functions
● Service appliance (closed system)
● Connection to Storage Facility (DS8000) through private Ethernet networks only
Hardware management console
[Figure: HMC attached to a POWER5 server. Labels: AIX partition 1, AIX partition 2, unassigned resources, Ethernet, status, command/response, virtual consoles, POWER5 Hypervisor, non-volatile RAM, service processor, LPAR allocation tables (processors, memory regions, I/O slots), permanent and temporary.]
P5 HMC featu
• Logical partition
• Dynamic logical
• Capacity and re management
• System status
• HMC managem
• Service function update, …)
• Remote HMC in
DS8000 MC and a pair of Ethernet switches
● Every DS8000 base frame comes with a pair of Ethernet switch cabled to the processor complex.
● The MC has
• Two built-in Ethernet ports
> The MC private Ethernet ports shown are configured int Ethernet switch to form the private DS8000 networks.
• One dual-port Ethernet PCI adapter
• One PCI modem for asynchronous Call home support.
● The customer Ethernet port indicated is the primary port to be u the customer network.
● The empty Ethernet port is normally not used.
● Corresponding private Ethernet ports of the external MC (FC111 plugged into port 2 of the switches as shown in next foil.
● To interconnect two DS8000 base frames, FC1190 would provi Ethernet cables to connect from port 16 of each switch in the sec into port 15 of the first frame.
• If the second MC is installed in the second DS8000, it would remain plug
Ethernet” switches.
DS8000 MC and Ethernet switches plug
PCI Modem
DS8000 MC – network configuration
● Storage Management Console network consists of:
– Redundant private Ethernet networks for connection to the Stor
– Customer network configured to allow access from the HMC to secure Virtual Private Network (VPN)
● Call home to IBM Services is possible through Dial-up ( the MC) or Internet connection VPNs
● Dial-up or Internet connection VPNs is also available for provide Remote Service and Support
● Recommended configuration is to connect MC to custom network for support
– Support will use WebSM GUI for all service actions
● Network connectivity and remote support is managed by
DS8000 and DS6000 remote access features
● Call Home (Outbound connectivity): Automatic P Reporting
– IBM DS6000 and DS8000 are designed with a “Call H
• In the event of a failure, the Call Home function generat with the IBM support organization
• IBM support determines the failing component and disp customer engineer with the replacement part
● Remote Service and Support (Inbound connectiv
– With remote support enabled, IBM technical support c management console to troubleshoot a problem, view and traces interactively
– This can reduce lag time to send such information to I shorten problem determination time
– In the case of complex problems, IBM technical suppo engage a specialist quickly to resolve the problems as possible
DS8000 MC network topology
[Figure: MC network topology. Labels: DS8000 subsystems, redundant MC, Ethernet fabric, eth ports, modem, optional firewall provided by customer, customer network, DMZ, Internet, integrated VPN, firewall, proxy, IBM remote infrastructure.]
MC: Hardware Management Console
DMZ: Demilitarized Zone
VPN: Virtual Private Network
How virtual private network (VPN) opera
● The VPN server is located behind IBM firewall, which is desi secure
● The VPN client is located behind the customer firewall
– The customer has control over opening a connection to access the cl
– Neither IBM technical support nor non-authorized personnel can acc customer’s permission
● The VPN Server security complies with IBM corporate secur ITCS104
– This is an IBM internal security measure for all IBM secure data.
Remote support (Inbound)
DS8000 MC – remote service security
● Server authentication via private/public key:
– Each MC generates a certificate based on the private will use for Secure Sockets Layer (SSL) based encryp decryption.
– The IBM SSR transmits the certificate for the installed database maintained within the IBM secure network.
– IBM personnel then will retrieve the MC specific certifi database and use this key (public key) to establish the session with the MC needed service.
● SSH over VPN for command line access:
– Secure Shell (SSH) is used for command line access f IBM location (for example: putty ssh session with publ
Product Engineer is currently logged
– SSH client authentication is done through private/publ
DS8000 Data flow
● The normal flow of data for a write is the following:
– 1. Data is written to cache memory in the owning server.
– 2. Data is written to NVS memory of the alternate server.
– 3. The write is reported to the attached host as having been completed.
– 4. The write is destaged from the cache memory to disk.
– 5. The write is then discarded from the NVS memory of the alternate server.
Under normal operation, both DS8000 servers are actively processing I/O requests.
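The five-step write flow above can be sketched as a small simulation. This is an illustration under stated assumptions, not DS8000 microcode: the `Server` class, the dictionary-based cache, NVS, and disk, and the function names are all hypothetical, and keeping the track in cache after destage is an assumption the slide does not address.

```python
from dataclasses import dataclass, field


@dataclass
class Server:
    name: str
    cache: dict = field(default_factory=dict)  # volatile cache: track -> data
    nvs: dict = field(default_factory=dict)    # NVS copies held for the partner server


def host_write(owner: Server, alternate: Server, track: int, data: bytes) -> str:
    """Steps 1-3: two in-memory copies exist before the host sees completion."""
    owner.cache[track] = data      # 1. data written to cache in the owning server
    alternate.nvs[track] = data    # 2. data written to NVS of the alternate server
    return "complete"              # 3. write reported to the host as completed


def destage(owner: Server, alternate: Server, disk: dict, track: int) -> None:
    """Steps 4-5: happen later, asynchronously to the host write."""
    disk[track] = owner.cache[track]  # 4. destage from cache memory to disk
    alternate.nvs.pop(track, None)    # 5. discard the NVS copy on the alternate server


server0, server1, disk = Server("server 0"), Server("server 1"), {}
status = host_write(server0, server1, track=42, data=b"payload")
destage(server0, server1, disk, track=42)
```

The point of the ordering is that the host is acknowledged as soon as two independent copies exist, while the slower disk write is deferred.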
DS8000 supported operating systems
● IBM:
– System i: OS/400, i5/OS, Linux, and AIX
– System p: AIX and Linux
– System z: z/OS, z/VM, and Linux
● Intel servers:
– Windows, Linux, VMware, and NetWare
● Hewlett-Packard:
– HP-UX
– AlphaServer: Tru64 UNIX (April 2005) and OpenVMS (April 20
● Sun:
– Solaris
– OSX
● SGI Origin Servers:
– IRIX (April 2005)
● Fujitsu Primepower
Check the DS8000 series Interoperability and updated Information on this subject.
Host connectivity: IBM SDD and MPIO
● SDD provides the following functions
– Enhanced data availability
– Automatic path failover
– Dynamic I/O load-balancing acros
– Path selection policies for the hos
– Concurrent download of licensed
● With DS6000 and DS8000, SD on the following operating syst
– Windows
– NetWare
– AIX
– HP-UX
– Sun Solaris
– Linux
● Can coexist with RDAC (DS40 driver) on most operating syst manage separate HBAs.
● Cannot be used with most oth drivers (in other
Powerpath)
MPIO 2.1.0 PCM (Path Control Module) is also supported for AIX 5.2 ML5 (or later) and AIX 5.3 ML1 (or later).
Warning: Default MPIO is not supported: sddpcm file sets are required.
DS8000 enhancements – at a glance
● Hardware – new “everything”
– Processors, adapters, internal paths, frames…
● Increased management flexibility via storage system LPARs
● Enhanced performance
– Faster or more of almost “everything”
– New patent pending cache algorithms
● Extended logical device addressing
– Up to 256 logical subsystems (LSS) with virtualized assignment of physical capacity to LSSs
– Up to 64K logical volumes
● Extended connectivity
– Up to 128 host ports (FC or FICON) – or 64 ESCON host ports
– Up to 510 FCP logins per port and 8,192 per Storage LPAR
– Up to 512 FICON logical paths per logical control unit image and 128,000 per storage facility image
– Up to 256 FICON logical path groups per control unit image
● Improved volume man
– Nondisruptive volume
– Up to 64K volumes as Logical Subsystems ( contain volumes for m
– Larger LUNs (over 2 T
– 65,520 cylinder (55.6
● Improved administratio
– Online and offline con using a Web-based gr (GUI)
– Ease-of-use improvem ESS Specialist)
– Command line interfa control of copy service dependencies on GUI
● Even more attractive T Ownership
– More flexible feature li
– Four year standard w
– Larger capacity volum
– Increased opportunitie
DS8000 overview
2-way 2-way 4-way or LPAR 4-way or
(Base Frame (Base Frane +
+
Only) Only) Expan
Expansion Frame
Fra
Processor (DS8000 turbo) 2-way 2-way 4-way 4-w
- System p Squadron Power5+ 2.2GHz 2.2GHz 2.2GHz 2.2G
Cache 16 to 128 GB 32 to 256 GB 32 to 2
Expansion Rack Yes (1) -- Yes (1 or 2)
--Host Adapters 2 to 16
- per HA:4-port FC / FICON(4Gb) (e.g., 8 to 64 FC / 2 to 16 2 to
- per HA: 2-port ESCON FICON ports)
Device Adapters 2 to 8 2 to 8 2 to 8 2 to
(e.g., 1-4 FCALs)
Drives - 73 GB (15K rpm) 16 to 128
16 to 384 16 to 128 16 to
- 146 GB (10K & 15K rpm) (Increments of
- 300 GB (10K & 15K rpm) (increments of 16) (Increments of 16) (incremen 16)
- 500 GB FATA (7200 rpm)
Power Three-Phase Three-Phase Three-phase Three-P
Dimensions
76 x 33.25 x 43 in 76 x 66.5 x 43 in 76 x 33.25 x 43 in 76 x 66.5
- Height x Width x Depth
9.93 sq. ft. 19.86 sq. ft. 9.93 sq. ft. 19.86 s - Footprint
DS8000: primary frame topology
[Figure: primary frame topology. Labels: redundant power, BBU (Battery Backup Units), dual FC-AL loop switches, host adapter ports (or 2 ESCON ports), front, device adapter with 4 FC-AL ports.]
DS8000 terminology
● Storage complex
• A group of DS8000s managed by a single Management Console
● Storage unit
• A single DS8000 including expansion frames.
● Processor complex
• One P5-570 p-Series server
– Two processor complexes form a redundant pair
• Divided into one LPAR (models 931 or 932) or two LPARs (model 9B2)
● Storage server
• The software that uses an LPAR
– Has access to a percentage of resources available on the Processor Complex for the LPAR
– At GA, this percentage is 50% (model 9B2) or 100% (models 931 or 932)
• Union of 2 LPARs, one from each processor complex
– Each LPAR hosts one storage server.
DS8000 hardware components detail
DS8000 processor complex
DS8000 processor complex: POWER5 s
CEC Enclosures in the Model 921/931 each have one processor
CEC Enclosures in the Model 922/932 & 9A2/9B2 each have two cards (4 Way)
CEC: Central Electronics Complex
CEC Enclosures contain components such as the processor cards, cache CEC hard drives.
IBM Eserver p570
● Scales from a 1-way to a 16-way SMP using 4U building blocks
● Dynamic LPAR and Micro Partitioning
● Simultaneous Multi-threading (SMT)
● Self Healing features
– Bit-Steering (bit sparing)
– Chipkill™ ECC (8-bit packet correct)
– ECC on processor cache memories
– L3 cache line deletes
– Memory scrubbing
– Dynamic processor deallocation
● Other RAS attributes
– N+1 power and cooling
– Hot-plug PCI
– In-place service
– First fault data capture
● Optimized for storage
– High I/O bandwidth RIO-2
– Large robust memories
– 4K memory allocation
[Figure: p570 building blocks (2-way, 4-way, 8-way, 12-way, 16-way) and execution units utilization under SMT (FX0, FX1, LS0, LS1, FP0, FP1, BRX, CRL; thread 0 active, thread 1 active, no thread active).]
DS8000 processor complex PC
● IBM eServer System p POWER5 servers (921, 922, and 9A2)
– 2-way 1.5 GHz (3X on ESS 800)
– 4-way 1.9 GHz (6X on ESS 800)
● New DS8000 Turbo (931, 932, and 9B2) are using POWER5+ processors
– 15% performance improvement.
– 2.2 GHz for POWER5+ 2- and 4-way
● The POWER5 processor supports Logical Partitioning
– The p5 hardware and Hypervisor manage the real to virtual me provide robust isolation between LPARs.
– IBM has been doing LPARs for 20 years in mainframes and 3 y
● At GA, LPARs are split 50-50, so:
– A 4-way has two processors to one LPAR and two processors t
• Post GA, 25-75 possible.
– LPARs only possible in the 4-way P5s (RIO-G cannot be share
● Cache memory from 16 GB - 256 GB
– Battery backed for backup to internal disk (4 GB per server)
Server LPAR concept overview
● An LPAR:
– Uses hardware and firmware to logically partition resources
– Is a subset of logical resources that are capable of supporting a system
– Consists of CPUs, memory, and I/O slots that are a subset of t resources within a system
• Very flexible granularity according to AIX level (5.2, 5.3, and so on
• No need to conform to physical boundaries of building blocks
● In an LPAR:
– An operating system instance runs with dedicated (AIX 5.2) or resources: processors, memory, and I/O slots
– These resources are assigned to the logical partition
– The total amount of assignable resources is limited by the phys resources in the system
● LPARs provide:
– Isolation between LPARs to prevent unauthorized access betw boundaries
– Fault isolation such that one LPAR's operation does not interfer operation of other LPARs
– Support for multiple independent workloads, different operating operating system levels, applications, and so on
LPAR applied to Storage Facility Images
[Figure: processor complex 0 (LPAR01, LPAR02, LPARxy) and processor complex 1 (LPAR11, LPAR12); Storage Facility Image 1 and Storage Facility Image 2 each span both processor complexes.]
Delivered AS IS, no need using the MC to configure
DS8000 processor complex
DS8000 persistent memory
●The 2107 does not use NVS cards, NVS batteries, or N chargers
●Data that would have been stored in the 2105 NVS card 2107 CEC cache memory
– A part of the system cache is configured to function as NVS sto
●In case of power failure, if the 2107 has pinned data in c written to an extra set of 4 disk drives located in each of enclosures
● Six disk drives total in each CEC:
– 2 for LIC (LVM Mirrored AIX 5.3 + DS8000 code)
– 4 for pinned data and other CEC functions
●During the recovery process the pinned data can be rest extra set of CEC disk drives just as it would have been f cards on the ESS 800
DS8000 I/O enclosure
RIO-G and I/O enclosures
● Also called I/O drawers
● 6 PCI-X slots: 3.3V, 133 MHz blind swap Hot-plug:
– 4 Host Adapter cards with 4 ports each:
• FCP or FICON adapter ports
– 2 Device Adapter cards with 4 ports each:
• 4 FC-AL ports per card
• 2 FC-AL loops per card
● Access to cache via RIO-G internal bus
● Each adapter has its own PowerPC processor
● Owned by processors in LPAR
● SPCN: System Power Control Network
– Used to control and monitor the status of the power and cooling within the I/O enclosure
– Cabled as a loop between the different I/O enclosures
DS8000 RIO-G port: layout example
Up to in sam
Up to to
P5-– Max
– 2000 loop
Each RIO-G port can operate at 1 GHz in bidirectional mode and is capable of pa direction on each cycle of the port. Maximum data rate per I/O Enclosure: 4 GB/s
It is designed as a high performance self-healing interconnect.
The p5-570 provides two external RIO-G ports, and an adapter card adds two more
Two ports on each processor complex form a loop.
Figure shows an illustration of how the RIO-G cabling is laid out in a DS8000 that h This would only occur if an expansion frame were installed.
The DS8000 RIO-G cabling will vary based on the model.
DS8000 host adapters HA
Host adapter with four fibre channel por
● Configured as FCP
– More FICON logical • ESS (1024) versu
– One FICON channe devices
– One HA card covers devices that a DS80
• (64k -256)
– Up to 16 HA into a into a DS8300
• 16 FICON chann single device
• Current System z subsystems limite
paths per device
– Front end of
• 128 ports for DS8
DS8000 FCP/FICON host adapters: HA
● Four LC 2Gb or 4Gb FC ports (2 Host Adapter m
● Auto-negotiates to 1Gbps, 2Gbps, or 4 Gbps
– Each port independently auto-negotiates to either 1/2 on 2 Gb Host Adapter models or 2/4 Gbps link speed Adapter models.
● Can be independently configured to FCP or FICON
– The personality of the port is changeable via the DS S Management tools (GUI or CLI).
● Ports cannot operate as FCP and FICON simultaneously
● FCP port can be Long Wave or Short Wave
– Short wave ports support a distance of 300m (non-rep
– Long wave ports support a distance of 10Km (non-rep
– Switched point-to-point for fabric topology
– FC-AL for point-to-point topology
DS8000 FICON / FCP host adapter
[Figure: host adapter block diagram. Labels: 1 GHz PPC 750GX processor, flash, QDR buffers, two Fibre Channel protocol engines, data protection / data mover ASIC, protocol chipset, PCI-X 64-bit 133 MHz.]
• Four 2 or 4 Gbps Fibre Cha
• Metadata Creation/Checki
• Configured at port level
• SW or LW
Performance evolution – from the model 800 to the DS8000
– 2 Gb host adapters
DS8000 4 Gb host adapter performance
New 4 Gb Host adapters are designed to improve by 50% single port throu
DS8000 device adapter DA
Fibre channel device adapters with 2 Gb
●DA perform RAI
– Offload servers o
– Each port has up throughput of pre based DA ports
– DS8000 AAL (Ar Loops):
DS8000 device adapters
● Device adapters support RAID-5 or RAID-10
● FC-AL switched fabric topology
● FC-AL dual ported drives are connected to FC switch in enclosure backplane
● Two FC-AL loops connect disk enclosures to device ada
● Array across loops is standard configuration option in D
– Two simultaneous I/O ops per FC-AL connection possible
– Switched FC-AL or SBOD (switched bunch of disks) used for b
● 4 paths to each drive: 2 FC-AL loops X dual port access
– (Detailed later with Storage Enclosures cabling)
DS8000 RAID device adapter
[Figure: RAID device adapter block diagram. Labels: 500 MHz PPC 750FX processor, NVRAM, bridge, SDRAM, two Fibre Channel protocol engines, RAID / data protection / data mover ASIC, buffer, protocol chipset, PCI-X 64-bit 133 MHz.]
• Four 2Gbps Fibre C
• Metadata checking
Performance evolution – from the model 800 to the DS8000
DS8000: Disk enclosures installed in pairs: one in front an
DS8000 storage enclosures
● Enclosures hold 16 DDMs
– Dual-ported FC-AL DDMs
– 73, 146, or 300 GB DDMs
• 10 or 15K RPM
– New FATA Disk drives of 500 GB / 7200 rpm are also supported in the same enclosures.
●Drives can be added in groups of
8 drives by DS8000 storage enclosure
● Enclosures act as a FC switch connecting drive using point to point connections
The picture ab simultaneous a
switched conn each device ad
DS8000 / DS6000 switched FC-AL / FC-A
● FC-AL
– Loop supports only time
• Arbitration of com
– Intermittent failure is
– Increasing time as n grows
● Switched FC-AL
– Drives attached in p connection
• Faster arbitration processing
• 200 MB/sec exter
– Improved RAS
• Switch detects in
Switched FC-AL advantages
● DS6000 and DS8000 use switched FC-AL technology to link the (DA) pairs and the DDMs.
● Switched FC-AL uses the standard FC-AL protocol, but the physi is different.
● The key features of switched FC-AL technology are:
– Standard FC-AL communication protocol from DA to DDMs
– Direct point to point links are established between DA and DDM:
• No arbitration and no performance degradation
– Isolation capabilities in case of DDM failures provide easy problem determi
– Predictive failure statistics
– Simplified expansion: no cable rerouting required when adding another disk
● The DS8000 architecture employs dual redundant switched FC-A of the disk enclosures.
● The key benefits of doing this are:
– Two independent switched networks to access the disk enclosures
– Four access paths to each DDM in DS8000 architecture (dual switches)
– Each device adapter port operates independently
– Double the bandwidth over traditional FC-AL loop implementations
● Each DDM is attached to two separate Fibre Channel switches.
– This means that with two device adapters, we have four 2Gb/sec effective d
connection, that uses arbitrated loop protocol.
– This means that a mini-loop is created between the device adapter and the
– Four simultaneous and independent connections, one from each device ada
DS8000 frames
● Base frame:
– The base frame contains two processor complexes: eServer p5
• Each of them contains the processor and memory that drive all fun DS8000.
– The base frame can contain up to 8 disk enclosures; each can hold 16 disk drives.
• In a maximum configuration, the base frame can hold 128 disk drives.
– The base frame contains 4 I/O enclosures.
• I/O enclosures provide connectivity between the adapters and the
• The adapters contained in the I/O enclosures can be either device or host adapters (DAs or HAs)
– The communication path used for adapter to processor comple the RIO-G loop.
● Expansion frames:
– Each expansion frame can hold up to 16 disk enclosures which drives.
• In a maximum configuration, an expansion frame can hold 256 disk drives.
– Expansion frames can contain 4 I/O enclosures and adapters if expansion frame that is attached to either a model 932 or a mo
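The per-frame drive limits above combine with the 16 DDMs per disk enclosure stated on the storage enclosures slide. A short sketch of the arithmetic; the constant and function names are illustrative only:

```python
DDMS_PER_ENCLOSURE = 16          # each disk enclosure holds 16 DDMs
BASE_FRAME_ENCLOSURES = 8        # base frame: up to 8 disk enclosures
EXPANSION_FRAME_ENCLOSURES = 16  # expansion frame: up to 16 disk enclosures


def max_ddms(expansion_frames: int) -> int:
    """Maximum DDM count for one base frame plus the given number of expansion frames."""
    base = BASE_FRAME_ENCLOSURES * DDMS_PER_ENCLOSURE
    per_expansion = EXPANSION_FRAME_ENCLOSURES * DDMS_PER_ENCLOSURE
    return base + expansion_frames * per_expansion


print(max_ddms(0))  # base frame only
print(max_ddms(1))  # DS8100: up to one expansion frame
print(max_ddms(2))  # DS8300: up to two expansion frames
```

This gives 128, 384, and 640 DDMs, which agrees with the HDD maximums quoted for the DS8100 and DS8300 elsewhere in this unit.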
IBM System Storage DS8100 (2-way)
[Figure: DS8100 frame. Labels: power supplies, HMC, IBM eServer p5 (POWER5) servers, batteries, I/O drawers.]
DS8300 (4-way with two expansion frames)
[Figure: DS8300 frames. Labels: p5 (POWER5) servers, batteries, I/O drawers.]
DS8100 (model 921/931) - 2-way
● Up to 16 Host Adapters (HA)
– FCP/FICON HA: 4 independent ports
– ESCON HA: 2 ports
● Up to 4 Device Adapter (DA) pairs
– DA pairs 0 / 1 / 2 / 3
– Automatically configured from DDMs
● Maximum configuration (384 DDMs)
– DA pair 0 = 128 DDMs
– DA pair 1 = 64 DDMs
– DA pair 2 = 128 DDMs
– DA pair 3 = 64 DDMs
– Balanced configuration at 256 DDMs: in other words, 64 DDMs per DA pair
– DA (card) plugging order: 2 / 0 / 3 / 1
DS8300 (Models 922/932 and 9A2/9B2)
● Up to 32 Host Adapters
– FCP/FICON HA: 4 independent ports
– ESCON HA: 2 ports
● Up to 8 DA pairs
– DA pairs 0 to 7
– Automatically configured from DDMs
● Maximum configuration (640 DDMs)
– DA pairs 1, 3-7 = 64 DDMs
– DA pairs 2, 0 = 128 DDMs
– Balanced configuration at 512 DDMs: in other words, 64 DDMs per DA pair
– DA (card) pair plugging order:
2 6 2 6 0 4 0 4 C0 7 C1 7 5 5 0/1 1/0 4/5 5/4 2/3 3/2 6/7 7/6
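The adapter and DDM figures on the DS8100 and DS8300 slides can be cross-checked with a little arithmetic. A minimal sketch that restates the slide numbers as data; the dictionary layout and function names are assumptions made for illustration:

```python
PORTS_PER_HA = {"FCP/FICON": 4, "ESCON": 2}

MODELS = {
    "DS8100": {
        "max_host_adapters": 16,
        # DDMs behind each DA pair in the maximum configuration
        "ddms_per_da_pair": {0: 128, 1: 64, 2: 128, 3: 64},
    },
    "DS8300": {
        "max_host_adapters": 32,
        "ddms_per_da_pair": {0: 128, 1: 64, 2: 128, 3: 64, 4: 64, 5: 64, 6: 64, 7: 64},
    },
}


def max_ddms(model: str) -> int:
    """Total DDMs in the maximum configuration."""
    return sum(MODELS[model]["ddms_per_da_pair"].values())


def balanced_ddms(model: str, ddms_per_pair: int = 64) -> int:
    """DDM count at which every DA pair carries the same load."""
    return len(MODELS[model]["ddms_per_da_pair"]) * ddms_per_pair


def max_host_ports(model: str, adapter_type: str) -> int:
    """Host ports if every host adapter slot holds the given adapter type."""
    return MODELS[model]["max_host_adapters"] * PORTS_PER_HA[adapter_type]


for name in MODELS:
    print(name, max_ddms(name), balanced_ddms(name),
          max_host_ports(name, "FCP/FICON"), max_host_ports(name, "ESCON"))
```

The totals reproduce the 384 and 640 DDM maximums, the 256 and 512 DDM balanced configurations, and the 64/128 FC or FICON and 32/64 ESCON port limits quoted earlier.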
DS8000 cache management
SARC: Simplified adaptive replacement cache
Sequential prefetching in
adaptive replacement cache (SARC)
● SARC basically attempts to determine four things
– When data is copied into the cache
– Which data is copied into the cache
– Which data is evicted when the cache becomes full
– How the algorithm dynamically adapts to different wor
● SARC uses:
– Demand paging for all standard disk I/O
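The four questions above (when and what to stage, what to evict, how to adapt) can be made concrete with a toy cache. The sketch below is not SARC: it swaps in plain LRU eviction and a fixed prefetch depth, and it does not adapt to the workload. The class name `TinyCache` and all of its parameters are hypothetical.

```python
from collections import OrderedDict


class TinyCache:
    """Toy track cache: demand paging, LRU eviction, fixed sequential prefetch."""

    def __init__(self, capacity: int, prefetch_depth: int = 2):
        self.capacity = capacity
        self.prefetch_depth = prefetch_depth
        self.tracks = OrderedDict()  # track number -> True, least recently used first
        self.last_track = None
        self.hits = 0
        self.misses = 0

    def _stage(self, track: int) -> None:
        """Bring a track into cache, evicting the least recently used one if full."""
        if track in self.tracks:
            self.tracks.move_to_end(track)
            return
        if len(self.tracks) >= self.capacity:
            self.tracks.popitem(last=False)
        self.tracks[track] = True

    def read(self, track: int) -> bool:
        """Return True on a cache hit, False on a miss."""
        hit = track in self.tracks
        if hit:
            self.hits += 1
        else:
            self.misses += 1
        self._stage(track)  # demand paging: stage the track when it is requested
        if self.last_track is not None and track == self.last_track + 1:
            # Sequential pattern detected: stage the following tracks in advance.
            for nxt in range(track + 1, track + 1 + self.prefetch_depth):
                self._stage(nxt)
        self.last_track = track
        return hit


cache = TinyCache(capacity=8)
results = [cache.read(t) for t in range(8)]  # one sequential scan
print(results, cache.hits, cache.misses)
```

In this toy, a sequential scan misses only on the first two tracks, after which prefetching keeps ahead of the reader. A real adaptive algorithm would also tune how much cache is given to sequential versus random data.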
DS8000 caching
● Best caching algorithms in industry
● Over 20 years’ experience
● Simplified Adaptive Replacement Cache (SARC)
– Self-Learning algorithms
• Adaptively and dynamically learn what data should be stored in cache based upon the recent access and frequency needs of the hosts
– Adaptive Replacement Cache
• Most advanced and sophisticated algorithms to determine what data in cache is removed to accommodate newer data
– Pre-fetching
• Predictive algorithm to anticipate data prior to a host request and loads it into cache
● Benefits
– Leading performance
• Been proven to improve cache hit by up to 100% over previous IBM caching algorithms and improve I/O response time by 25%
– More efficient use of cache
• patterns to determine what data is stored
• Need less cache than competitors
[Figure: Benefits of adaptiv; cache hit ratio versus cache size. Lower cache-to-backstore ratios with outstanding service tim]
Nimrod Megiddo and Dharmendra S. Modha, "Outperforming LRU with an Adaptive Replacement Cache Algorithm," 2004.
DS8000 RAS features (Reliability, availability, and serviceability)
Processor complex RAS
● Processor complex has the same RAS features o which is an integral part of the DS8000 architecture
● IBM Server p5 system main RAS features:
– First Failure Data Capture
– Boot process and operating system monitoring
– Environmental monitoring
– Self-healing
– Memory reliability, fault tolerance and integrity
• Error Checking and Correction (ECC)
• Memory scrubbing and thresholding
– N+1 redundancy
– Concurrent maintenance
Server RAS (1 of 2)
● The DS8000 employs similar methodology to the provide data integrity when performing write oper server failover.
– Metadata check: The metadata is checked by various components to validate the integrity of the data as it m the disk system or sent back to the host.
– Server failover and failback:
• LSS and server affinity:
– LSSs with even numbers have an affinity with server 0
– LSSs with odd numbers have an affinity with server 1
• When a host operating system issues a write to a logica DS8000 host adapter directs that write to the server tha which that logical volume is a member.
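The even/odd affinity rule above reduces to a parity check on the LSS number. A minimal sketch, assuming LSSs are numbered 0 to 255 (the unit states a limit of 256 LSSs; zero-based numbering is an assumption here) and using a hypothetical function name:

```python
def owning_server(lss_number: int) -> int:
    """Return 0 or 1: the server with affinity for this LSS (even -> 0, odd -> 1)."""
    if not 0 <= lss_number <= 255:
        # Range is an assumption based on the 256-LSS limit quoted in this unit.
        raise ValueError(f"LSS number out of range: {lss_number}")
    return lss_number % 2


for lss in (0, 1, 16, 17, 255):
    print(f"LSS {lss} -> server {owning_server(lss)}")
```

A host adapter applying this rule sends each write to the server that owns the LSS containing the target volume, so spreading volumes over even and odd LSSs spreads work over both servers.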
Server RAS (2 of 2)
●Under normal operation, both DS8000 servers are activ I/O requests – Each write is placed into the cache memory of the server ownin also into the NVS
memory of the alternate server.
●Failover: In case of one server failure, the remaining ser take over all of its functions
– RAID arrays which are connected to both servers can be acces device adapters of the remaining server.
– Since the DS8000 has only one copy of data in cache of remai now take the following mechanism:
• It de-stages the contents of its NVS to the disk subsystem.
• The NVS and cache of remaining server are divided in two, half fo half for the even LSSs.
• Remaining server now begins processing the writes (and reads) fo
– It completes in less than 8 seconds and is invisible to the attac
Hypervisor – storage image independence
[Figure: logical view, in which the Storage Facility images run as virtual LPARs on the LPAR Hypervisor, each with its own LIC, RIO-G, I/O, memory and processor resources; physical view, in which each image takes part of the RIO-G, I/O, memory and processor resources of the physical storage unit]
Server failover
● Normal flow of data for a write:
1. Data is written to cache memory in the owning server.
2. Data is written to NVS memory of the alternate server.
3. The write is reported to the attached host as having been completed.
4. The write is destaged from the cache memory to disk.
5. The write is then discarded from the NVS memory of the alternate server.
● After a failover, the remaining server processes all I/Os with its cache and NVS each divided in two: one half for the odd LSSs and one half for the even LSSs.
[Figure: Server 0, holding cache memory for even LSSs and NVS for odd LSSs]
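The five-step write flow and the first failover action can be modeled with a small Python sketch. It is a toy model under stated assumptions, not DS8000 microcode: the `Server` class, the function names and the `(lss, track)` key are all hypothetical, and cache and NVS are reduced to plain dictionaries.

```python
from dataclasses import dataclass, field


@dataclass
class Server:
    number: int
    cache: dict = field(default_factory=dict)  # volatile cache memory
    nvs: dict = field(default_factory=dict)    # NVS copy kept for the partner


def host_write(servers, lss, track, data):
    """Steps 1-3: cache in the owning server, NVS in the alternate, then ack."""
    owner, alternate = servers[lss % 2], servers[1 - lss % 2]
    owner.cache[(lss, track)] = data      # step 1
    alternate.nvs[(lss, track)] = data    # step 2
    return "write complete"               # step 3


def destage(servers, disk, lss, track):
    """Steps 4-5: destage from cache to disk, then discard the NVS copy."""
    owner, alternate = servers[lss % 2], servers[1 - lss % 2]
    disk[(lss, track)] = owner.cache[(lss, track)]  # step 4
    alternate.nvs.pop((lss, track))                 # step 5


def failover(survivor, disk):
    """After a failure, the survivor first destages its NVS contents to disk."""
    disk.update(survivor.nvs)
    survivor.nvs.clear()


servers, disk = [Server(0), Server(1)], {}
host_write(servers, lss=4, track=10, data=b"A")  # LSS 4 is owned by server 0
failover(servers[1], disk)                       # server 0 fails before destage
print(disk)                                      # the write survives on disk
```

The sketch stops at the destage step of the failover; dividing the survivor's cache and NVS into even and odd halves is not modeled.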
NVS recovery after complete power loss
● DS8000
● Battery Backup Units (BBUs)
● Both power supplies stopped
– Batteries are not used to keep the disks spinning
– Scenario at power-off
• All HA I/O blocked
• Each server copies NVS data to internal disk
• Two copies per server
• When copy process complete, each server shuts down AIX
• When AIX shutdown complete for both servers (or time out expires powered down
– Scenario at power-on
• Processor complexes power-on and perform power-on self test
• Each server boots up
• During boot-up, server detects NVS data on its disks and destage
• When battery units reach a certain level of charge, the servers com
● Note: the servers will not come online until the batteries are charged.
Host connection availability
● On DS8000 host adapters are shared between th
– Unlike the DS6000 which uses the concept of preferre
● It is preferable for hosts to have at least 2 connections to separate host adapters in separate I/O enclosures
– This configuration allows the host to survive a hardwa component on either path.
– This is also important because during a microcode up enclosure may need to be taken offline.
● Multi-pathing software:
– Subsystem Device Driver (SDD) is able to manage bo and preferred path determination.
– MPIO PCM is also supported with AIX 5.2 ML5 (or late ML1 (or later).
Disk subsystem (1 of 2)
● RAID5 and RAID10
– RAID5 (7+P or 6+P+S) or RAID10 (2x4 or 2x3 + 2S)
– DS8000 does not support non-RAID configurations
● Spare disk creation:
– A minimum of one spare is created for each array site following conditions are met:
• A minimum of 4 spares per DA pair
• A minimum of 4 spares of the largest capacity array site
• A minimum of 2 spares of capacity and RPM greater tha fastest array site of any given capacity on the DA pair
● Floating spare:
position to better balance the spares acr pairs, the loops, and the enclosures.
• (Useful after a drive replacement that became a spare d
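The array formats listed on this slide each account for eight DDMs, which can be checked with a short calculation. The Python sketch below is illustrative arithmetic only: the table and function names are hypothetical, and the result is raw data capacity that ignores any formatting or metadata overhead.

```python
# DDM roles per array format: (data, parity, mirror, spare)
ARRAY_FORMATS = {
    "RAID5 7+P": (7, 1, 0, 0),
    "RAID5 6+P+S": (6, 1, 0, 1),
    "RAID10 2x4": (4, 0, 4, 0),
    "RAID10 2x3+2S": (3, 0, 3, 2),
}


def ddm_count(fmt: str) -> int:
    """Total number of DDMs consumed by the array format."""
    return sum(ARRAY_FORMATS[fmt])


def raw_data_capacity_gb(fmt: str, ddm_gb: int) -> int:
    """Raw capacity of the data DDMs only (no overhead subtracted)."""
    data_ddms = ARRAY_FORMATS[fmt][0]
    return data_ddms * ddm_gb


for fmt in ARRAY_FORMATS:
    print(f"{fmt:14s} DDMs={ddm_count(fmt)} "
          f"raw data capacity with 146 GB DDMs={raw_data_capacity_gb(fmt, 146)} GB")
```

With 146 GB DDMs, for example, a 6+P+S array offers 6 x 146 = 876 GB of raw data capacity, while a 2x3+2S RAID10 array offers 3 x 146 = 438 GB.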
Disk subsystem (2 of 2)
● Each DDM attached to two FC switches
– Each disk has two separate connections on the backplane
● Each DA has a connection to the two switches
● Hot pluggable DDMs
● Predictive Failure Analysis (PFA)
– Anticipates failures
● Disk scrubbing
Power and cooling
● Completely redundant power and cooling in N+1
● Battery Backup Units (BBU)
– Used for NVS (part of the server’s memory)
– Can be replaced concurrently
● Rack Power Control cards (RPC)
– 2 RPC cards for redundancy
– Each card can control power of an entire DS8000
● Power fluctuation protections
– DS8000s tolerate a momentary power interruption fo 30ms.
– After that time, servers start copying content of NVS to disks.
Microcode update
● Concurrent code update (since Bundle level 324
– Management console can hold 6 different versions of
– Each server can hold 3 different versions of code
● Installation process:
– Internal Management Console (MC) code update
– New DS8000 LIC downloaded on the Internal MC
– LIC uploaded from MC to each DS8000 server interna
– New firmware can be loaded from MC directly into eac
• May require server reboot with failover of its Logical Sub other server
– Update of servers operating system and LIC
• Each server updated one at a time with failover of its Lo to the other server
– Host adapters firmware update
• Longer interruption managed by host’s multi-pathing sof
Management console
● Redundant Ethernet switches
– Each switch is used in a separate Ethernet network, with private IP addresses assigned in one of these network pairs:
• 172.16/16 and 172.17/16
• 192.168.16.x and 192.168.17.x
• 10.0.16.x and 10.0.17.x
● Redundant Management Console
– Each DS8000 can be connected via the redundant Eth to both Management Consoles.
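The requirement that each switch sits in its own private network can be checked with Python's standard `ipaddress` module. The sketch below is an assumption-laden illustration: the slide writes two of the pairs with an "x" in the last octet, which is read here as a /24 prefix, and the names `SWITCH_NETWORK_PAIRS` and `is_valid_pair` are hypothetical.

```python
import ipaddress

# Network pairs from the slide; the /24 prefixes are an assumption
# derived from the "x" notation, only the /16 pair is stated explicitly.
SWITCH_NETWORK_PAIRS = [
    ("172.16.0.0/16", "172.17.0.0/16"),
    ("192.168.16.0/24", "192.168.17.0/24"),
    ("10.0.16.0/24", "10.0.17.0/24"),
]


def is_valid_pair(first: str, second: str) -> bool:
    """True when both networks are private and do not overlap."""
    net_a = ipaddress.ip_network(first)
    net_b = ipaddress.ip_network(second)
    return net_a.is_private and net_b.is_private and not net_a.overlaps(net_b)


for first, second in SWITCH_NETWORK_PAIRS:
    print(first, second, is_valid_pair(first, second))
```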
DS8000 I/O enclosure layout and cabling rules
Model 921/931 – two I/O enclosures
[Figure: RIO-G cabling diagram. Legible labels: I/O enclosures 2 and 3; Server 0 (even LSSs); RIO-G ports; Loop 0]
This configuration will not be available at GA.
Model 921/931 – four I/O enclosures
[Figure: RIO-G cabling diagram. Legible labels: I/O enclosures 0, 1, 2 and 3; Server 0 (even LSSs); RIO-G ports; Loop 0]
Model 922/932 – four I/O enclosures
[Figure: RIO-G cabling diagram. Legible labels: I/O enclosures 0, 1, 2 and 3; Server 0 (even LSSs); RIO-G ports; Loop 0; Loop 1; "On Loop 0 on 921"]
A model 922 has extra hardware to support a second RIO-In this configuration.
Model 922/932 – eight I/O enclosures
[Figure: RIO-G cabling diagram. Legible labels: I/O enclosures 0, 1, 3, 4, 6 and 7; Server 0 (even LSSs); RIO-G ports; Loop 0; Loop 1]
Eight enclosures is the maximum number for a model 922. More enclosures need more RIO-G ports.
To get more RIO-G ports we need more processor complexes
Model 9A2/9B2 – four I/O enclosures
[Figure: RIO-G cabling diagram. Legible labels: I/O enclosures 0, 1, 2 and 3; Server 0 (even LSSs); RIO-G ports; Loop 0; Loop 1; "(two instances)"; "!!! Opposite / 922"]
Loop 0 belongs to SFI 1. Loop 1 belongs to SFI 2.
The 9A2 is split into two storage facility images (SFIs).
Each SFI controls one RIO-G loop and all the enclosures and adapters
Resources cannot be shared between SFIs.
Model 9A2/9B2 – eight I/O enclosures
[Figure: RIO-G cabling diagram. Legible labels: I/O enclosures 0 to 7; Server 0 (even LSSs); RIO-G ports; Loop 0; "(two instances)"; "!!! Opposite / 92"]
Loop 0 belongs to SFI 1.
The 9A2 is split into two storage facility images (SFIs).
Each SFI controls one RIO-G loop and all the enclosures and adapters
Resources cannot be shared between SFIs.
Device adapter pair layouts and server front view
DA pairs represent two device adapters.
One DA in each pair is owned by server 0 and the other DA in that pair is owned by server 1.
DAs in even numbered I/O enclosures belong to server 0. DAs in odd numbered I/O enclosures belong to server 1.
The 'outside' slots always get populated first
Even numbered I/O enclosures are always cabled closer to server 0. Odd numbered I/O enclosures are always cabled closer to server 1.
[Figure: base frame (rack 1), front view, showing complex 0 and complex 1 with the I/O enclosure numbers and the DA pair numbers (0 1 / 1 0 and 2 3 / 3 2)]
(the numbers don't cha which they are in
"0 1" means that one card from DA pair 0 and one card from this enclosure.
Warning: Because this is a front view, the left-hand card is in enclosure and the right-
hand card is in slot 3 of the enclosure
DA plug order and affinity – model 921/931 –
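[Figure: base frame (rack 1) and expansion frame (rack 2), front view, with complex 0 and complex 1. In the I/O enclosures, DA pairs 0 and 1 are plugged 2nd and 4th, and DA pairs 2 and 3 are plugged 1st and 3rd. The storage enclosures are labeled with the DA pair they attach to.]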
The numbers in the storage which storage enclosure pai which DA pairs.
"2" in the storage enclosure front and rear storage enclos is attached to DA pair 2.
The plug order shows the opairs are added to the
DS80
"1st 3rd" means that the left-first and the right hand DA is
So using DA pair numbers, t DA pair 2 then 0 then 3 then
The numbers in the I/O enclo pairs in those enclosures. "2 3" means that one card fr card from DA pair 3 are in thi
Because this is a front view, t in slot 6 of the enclosure and is in slot 3 of the enclosure.
The DA pairs are added in t
DA affinity – models 922/932, 9A2 and 9B2 –
[Figure: base frame (rack 1) and expansion frames (racks 2 and 3), front view, with complex 0 and complex 1. In the I/O enclosures, DA pairs 0, 1, 4 and 5 are plugged 2nd, 8th, 4th and 6th, and DA pairs 2, 3, 6 and 7 are plugged 1st, 7th, 3rd and 5th. The storage enclosures are labeled with the DA pair they attach to.]
Looking at this chart explains why the DA plug order is different machines because the 1st expansion frame cables to the I/O en that frame. This also means a model upgrade requires re-cablin
The DA pairs are added in this order: 2, 0, 6, 4, 7, 5, 3, 1
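The plug orders can be captured as simple lookup lists, which makes it easy to see which DA pairs are present for a given number of installed pairs. This is a hedged sketch: the names `PLUG_ORDER` and `installed_da_pairs` are hypothetical, the eight-pair order is the one stated above, and the last entry of the 921/931 order is inferred from the plug-order chart because the slide text is cut off after "2 then 0 then 3 then".

```python
PLUG_ORDER = {
    # Last entry (1) inferred from the 921/931 chart: it is the only pair left.
    "921/931": [2, 0, 3, 1],
    # Order stated on the slide for models 922/932, 9A2 and 9B2.
    "922/932/9A2/9B2": [2, 0, 6, 4, 7, 5, 3, 1],
}


def installed_da_pairs(model_group: str, count: int) -> list:
    """Return the DA pair numbers present when `count` pairs are installed."""
    order = PLUG_ORDER[model_group]
    if not 0 <= count <= len(order):
        raise ValueError(f"{model_group} supports 0 to {len(order)} DA pairs")
    return order[:count]


print(installed_da_pairs("922/932/9A2/9B2", 3))  # first three pairs plugged
print(installed_da_pairs("921/931", 2))
```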
Resource division – model 9A2/9B2 – fr
[Figure: base frame and expansion frames, front view. Complex 0 and complex 1 are each divided into SFI1 and SFI2; the I/O enclosures are grouped by Loop 0 and Loop 1 and show the DA pair numbers (0 to 7) with the same plug order as the previous chart.]
Each Storage Facility Image (SFI) is composed of an LPAR on e processor complex
derived from I/O enclosu ownership.
SFI resources are divided by color, blue for SFI1 and purple for SF Green and yellow are used to distinguish server 0 and server 1 on following pages. Don't confuse servers and SFIs.
I/O enclosure slot numbering – rear view
[Figure: rear view of an I/O enclosure. Slots 1 and 2: host adapters; slot 3: device adapter; position 7: RIO-G ports; slots 4 and 5: host adapters; slot 6: device adapter]
This is a view o the rear of the
The ports on an numbered 0 to
The location c
is in the format:
Rack - enclosu
e.g. the top por 1 of I/O enclosu
R1- I1-C1-T0
R1 - Rack 1
I1 - Enclosure 1
C1 - Card 1
Warning: port R1-I1-C1-T0 is displayed as I0000 under dscli (lsioport comman (Full details in next
slides)
I/O ports numbering – DS CLI lsioport display –
I0000 I0010 I0030 I0040
I0001 I0011 I0031 I0041
I0002 I0012 I0032 I0042
I0003 I0013 I0033 I0043
Slot 1 Slot 2 Slot 4 Slot 5
Enclosure 1
I0200 I0210 I0230 I0240
I0201 I0211 I0231 I0241
I0202 I0212 I0232 I0242
I0203 I0213 I0233 I0243
Slot 1 Slot 2 Slot 4 Slot 5
I0100 I0110 I0
I0101 I0111 I0
I0102 I0112 I0
I0103 I0113 I0
Slot 1 Slot 2 Sl
Enclosure
I0300 I0310 I0
I0301 I0311 I0
I0302 I0312 I0
I0303 I0313 I0
Slot 1 Slot 2 Sl
Enclosure
I/O ports numbering – DS CLI lsioport display – expansion frame
I0400 I0410 I0430 I0440
I0401 I0411 I0431 I0441
I0402 I0412 I0432 I0442
I0403 I0413 I0433 I0443
Slot 1 Slot 2 Slot 4 Slot 5
Enclosure 5
I0600 I0610 I0630 I0640
I0601 I0611 I0631 I0641
I0602 I0612 I0632 I0642
I0603 I0613 I0633 I0643
Slot 1 Slot 2 Slot 4 Slot 5
I0500 I0510 I0
I0501 I0511 I0
I0502 I0512 I0
I0503 I0513 I0
Slot 1 Slot 2 Sl
Enclosure
I0700 I0710 I0
I0701 I0711 I0
I0702 I0712 I0
I0703 I0713 I0
Slot 1 Slot 2 Sl
Enclosure
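The port IDs in these tables follow a regular pattern that a short Python sketch can reproduce. The rule used here is inferred from the table entries, not taken from DS CLI documentation: the ID is "I0" followed by the enclosure number minus one, the slot number minus one, and the port number. The function name `ioport_id` is hypothetical, and enclosures are numbered 1 to 8 as in the tables.

```python
HOST_ADAPTER_SLOTS = (1, 2, 4, 5)  # slots shown in the lsioport tables


def ioport_id(enclosure: int, slot: int, port: int) -> str:
    """Build the port ID shown by lsioport from enclosure, slot and port.

    Inferred pattern: I0<enclosure-1><slot-1><port>, for example
    enclosure 1, card (slot) 1, port 0 gives I0000.
    """
    if not 1 <= enclosure <= 8:
        raise ValueError("enclosure must be between 1 and 8")
    if slot not in HOST_ADAPTER_SLOTS:
        raise ValueError("host adapters are shown in slots 1, 2, 4 and 5")
    if not 0 <= port <= 3:
        raise ValueError("port must be between 0 and 3")
    return f"I0{enclosure - 1}{slot - 1}{port}"


print(ioport_id(1, 1, 0))  # location R1-I1-C1-T0 from the earlier slide
print(ioport_id(5, 4, 2))  # a port in the first expansion frame enclosure
```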