Hitachi Path Management &
Load Balancing with Hitachi
Dynamic Link Manager and
Global Link Availability Manager
The HDS WebTech Series
• Dynamic Load Balancing
• Who should attend:
– Systems and Storage Administrators
– Storage Specialists & Consultants
– IT Team Lead
– System and Network Architects
– IT Staff
– Operations and IT Managers
Learning Objectives
Upon completion of this seminar, you should be able to:
– Describe the primary purposes of path management software
– Diagram the critical path of data storage input and output with Hitachi enterprise storage
– Explain the complementary roles of Dynamic Link Manager & Global Link Availability Manager
– Describe the load balancing algorithms available
– Use Dynamic Link Manager and / or Global Link Availability Manager to select the most appropriate algorithm
– Describe path management recommended practices
FC SAN I/O Path
[Figure: I/O path between host and storage system. Labels: Application, I/O Request, HBA, CHA, LU, DEV]
• Defining the path of the data I/O
• Critical path management issues
– Fail-over – fall-back
– Optimizing (balancing) the I/O load
• I/O characteristics (random, sequential, reads, writes)
Path Failover and Failback
• Dynamic Link Manager software provides continuous storage access and high availability by distributing I/O over multiple paths
• Failover and failback in either manual or automatic modes
• Automated path health checks
• Allows dynamic LUN addition and deletion without a server reboot *
* O/S and array dependent, check system requirements for details
[Figure: two panels, "Simple Failover" and "With Load Balancing". Labels: Applications, Volume, HDLM, Server, Standby, Failure, Reduction of Balancing Paths, Storage]
Load Balancing
• Dynamic Link Manager software distributes storage access across multiple paths to improve I/O performance with load balancing
• Bandwidth control at the HBA level and, in conjunction with Global Link Availability Manager, at the LUN level
[Figure: two panels. "Without Load Balancing": Applications, Regular Driver, Server, I/O Bottleneck, Storage, Volumes. "With Load Balancing": Applications, HDLM, Server, Load Balancing, Storage]
Example transaction response time: 350 ms (0.35 sec)*
* 350 ms is an example only; this does not include network time (LAN or WAN)
CPU Time: 25%
Queue Time: 15%
I/O Time: 60%
• Review of an application transaction
• This is typical for OLTP (Online Transaction Processing)
• Email, DSS, and Rich Media have a higher % of I/O content
• OLAP, CRM, and ERP have a higher % of CPU content
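As a worked example, the 350 ms figure can be split into its components. This is a minimal sketch that assumes the slide's percentages map in order to CPU, queue, and I/O time; the function name `breakdown` is illustrative and not part of any Hitachi tool.

```python
def breakdown(total_ms, shares):
    """Split a total response time into components by percentage share."""
    if sum(shares.values()) != 100:
        raise ValueError("shares must sum to 100 percent")
    return {name: total_ms * pct / 100 for name, pct in shares.items()}


# Assumed mapping of the slide's percentages to components.
components = breakdown(350, {"cpu": 25, "queue": 15, "io": 60})
for name, ms in components.items():
    print(f"{name}: {ms:.1f} ms")
```

Under these assumptions the I/O share alone is 210 ms, which is why the I/O subsystem gets so much attention for OLTP-style transactions.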
Life of an I/O Operation
• The Role of the I/O
– The I/O component plays an important role in overall transaction response time in many transaction types.
• I/O response time must be monitored on an ongoing basis to ensure customer satisfaction.
• When the overall transactional response time shows degradation, the first element to be blamed is the I/O subsystem.
Life of an I/O Operation
• I/O Journey
[Figure: stages of the I/O journey. Labels: Application, Host Bus Adapter, Edge Switch, Core Director, Storage System Port, Cache, Storage System Back-End]
• Cache Management Nomenclature
– Multiple queues are used in Hitachi storage systems
• We will look at the general LRU queue
– The frames on an LRU queue:
• MRU position – Most Recently Used
• LRU position – Least Recently Used
– Cache Demotion Time:
• Time to travel from the MRU to the LRU position of the queue
• This can be improved by:
– Increasing cache size
– Using intelligent caching algorithms
– Cache Residency Time:
• Total time a track resides in cache
• Residency time ranges from the demotion time upward, with no upper bound
[Figure: LRU queue with its LRU and MRU positions]
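The relationship between cache size and demotion time can be illustrated with a toy LRU queue. This is a simplified sketch, not the Hitachi cache implementation: the class `LruQueue`, its logical clock, and the helper `mean_demotion` are all assumptions made for illustration.

```python
from collections import OrderedDict


class LruQueue:
    """Toy LRU queue: the last item is the MRU position, the first is the LRU position."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.frames = OrderedDict()  # track id -> tick when it was placed at MRU
        self.tick = 0                # logical clock, one tick per access
        self.demotion_times = []     # ticks each evicted frame took to go MRU -> LRU

    def access(self, track):
        """Return True on a hit. A hit moves the frame back to the MRU position."""
        self.tick += 1
        hit = track in self.frames
        if hit:
            del self.frames[track]
        elif len(self.frames) >= self.capacity:
            # Evict the frame sitting at the LRU position.
            _, entered = self.frames.popitem(last=False)
            self.demotion_times.append(self.tick - entered)
        self.frames[track] = self.tick  # (re)insert at the MRU position
        return hit


def mean_demotion(capacity, n_tracks=10):
    """Average demotion time when n_tracks distinct tracks stream through the queue."""
    q = LruQueue(capacity)
    for t in range(n_tracks):
        q.access(t)
    return sum(q.demotion_times) / len(q.demotion_times)


print(mean_demotion(2), mean_demotion(4))
```

With a stream of distinct tracks, doubling the capacity doubles the time a frame takes to reach the LRU position, which gives later requests a longer window in which to hit.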
Random Read Processing
1. When a random read request is received from the host, the Front-End Director (FED) accesses the cache directory (Control Memory/Cache Meta Data) and performs a directory search to see if the requested record is in cache.
2. If the record is in cache, it is sent to the host from cache. The read hit ratio is important for OLTP overall response time.
3. If the record is not present, the Back-End
Sequential Read Processing
1. The first steps are the same as a random read
2. When the FED receives requests for sequential records, a sequential pattern is detected (sequential detect)
3. NSRs from the next tracks are pre-loaded into cache. Following requests will be satisfied through cache
4. In this type of I/O request, the Read Hit Ratio should always be over 85%
5. Performance is typically very good on
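The sequential read steps above can be sketched as a toy cache that detects a run of consecutive tracks and pre-loads the next ones. The detection threshold, the prefetch depth, and the class name `ReadCache` are assumptions chosen for illustration; the actual values and logic in a Hitachi storage system are not given in the source.

```python
class ReadCache:
    """Toy read cache with sequential detection and prefetch (illustrative only)."""

    def __init__(self, detect_after=3, prefetch=4):
        self.cached = set()
        self.detect_after = detect_after  # consecutive tracks needed to detect a sequence
        self.prefetch = prefetch          # tracks pre-loaded once a sequence is detected
        self.last = None
        self.run = 0
        self.hits = 0
        self.reads = 0

    def read(self, track):
        self.reads += 1
        hit = track in self.cached
        if hit:
            self.hits += 1
        else:
            # Assumption: on a miss the track is staged into cache from the back end.
            self.cached.add(track)
        # Track the length of the current run of consecutive track numbers.
        if self.last is not None and track == self.last + 1:
            self.run += 1
        else:
            self.run = 1
        self.last = track
        if self.run >= self.detect_after:
            # Sequential detect: pre-load the next tracks into cache.
            self.cached.update(range(track + 1, track + 1 + self.prefetch))
        return hit

    def hit_ratio(self):
        return self.hits / self.reads


cache = ReadCache()
for track in range(100):
    cache.read(track)
print(f"sequential read hit ratio: {cache.hit_ratio():.2f}")
```

In this sketch only the first few reads miss, before the pattern is detected, so a long sequential stream ends up well above the 85% hit ratio mentioned in step 4.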
Random Write Processing
1. The port receives a write operation from the host
2. Two copies of the updated record are written into cache onto two different memory boards under the NVS line
3. The frame is queued on a specialized queue (dirty queue) for destage scheduling. At low NVS line value, back-end activity is reduced by keeping frames on the dirty queue for a longer period (more writes per destage operation)
4. When the frame arrives at the LRU position, it gets written to the disk
5. The other copy of the updated record is kept in cache and placed on the General LRU queue. This is to increase the Read Hit Ratio in case of a subsequent read or write request to the same record
Sequential Write Processing
1. The port receives a write operation from the host
2. Two copies of the updated record are written into cache onto two different memory boards under the NVS line
3. When sequential records are received, a sequential pattern is detected
4. All following records are kept in cache
5. When a full stripe (or more) has been received, the parity track is created in cache and the entire stripe is written to the disks in one logical revolution
– All frames are sent to the free queue
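Step 5 above, building parity in cache once a full stripe has arrived, can be sketched as follows. This assumes byte-wise XOR parity (as in RAID-5) and a hypothetical stripe width of three data tracks; the source does not state the RAID level or stripe geometry, and `StripeBuffer` is an illustrative name.

```python
from functools import reduce


def xor_parity(tracks):
    """Byte-wise XOR across equally sized tracks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*tracks))


class StripeBuffer:
    """Holds sequential writes in cache until a full stripe can be destaged."""

    def __init__(self, data_tracks_per_stripe):
        self.width = data_tracks_per_stripe
        self.pending = []   # tracks kept in cache while the stripe fills
        self.destaged = []  # (data_tracks, parity) tuples written to the back end

    def write(self, track):
        self.pending.append(track)
        if len(self.pending) == self.width:
            parity = xor_parity(self.pending)  # parity track created in cache
            self.destaged.append((list(self.pending), parity))
            self.pending.clear()               # frames go back to the free queue


buf = StripeBuffer(data_tracks_per_stripe=3)
for data in (b"\x01\x02", b"\x04\x08", b"\x10\x20"):
    buf.write(data)
data_tracks, parity = buf.destaged[0]
print(parity.hex())
```

Because the whole stripe is present in cache, parity is computed without first reading old data or old parity from disk, which is the main reason full-stripe sequential writes are cheap on the back end.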
Round Robin – Extended Round Robin
• Impact of ExRR Versus RR on Sequential Detect
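One way to picture the impact is to look at what each path sees when a purely sequential stream is spread across paths. The sketch below rests on two assumptions not stated in the source: that extended round robin keeps a fixed batch of consecutive I/Os on one path before switching, and that sequential detection looks at the I/O stream arriving on each path. All function names are illustrative.

```python
def round_robin(n_ios, n_paths):
    """Path index for each I/O: switch path on every I/O."""
    return [i % n_paths for i in range(n_ios)]


def extended_round_robin(n_ios, n_paths, batch):
    """Path index for each I/O: stay on a path for `batch` I/Os, then switch."""
    return [(i // batch) % n_paths for i in range(n_ios)]


def longest_run_per_path(assignment, n_paths):
    """Longest run of consecutive track numbers seen by each path,
    assuming I/O number i addresses track i (a purely sequential stream)."""
    longest = [0] * n_paths
    current = [0] * n_paths
    last = [None] * n_paths
    for track, path in enumerate(assignment):
        if last[path] == track - 1:
            current[path] += 1
        else:
            current[path] = 1
        last[path] = track
        longest[path] = max(longest[path], current[path])
    return longest


rr = longest_run_per_path(round_robin(16, 4), 4)
exrr = longest_run_per_path(extended_round_robin(16, 4, batch=4), 4)
print("RR  :", rr)
print("ExRR:", exrr)
```

Under these assumptions, plain round robin breaks the stream so that no path ever sees two consecutive tracks, while extended round robin preserves runs long enough for a sequential pattern to be recognized.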
Dynamic Link Manager
• Enables fault-tolerant access to data on Hitachi and EMC storage systems for direct-attached storage (DAS) and storage area network (SAN) environments.
• Path failover and I/O balancing over multiple host bus adapter (HBA) cards
– Improves performance by distributing and balancing loads across multiple paths
– Improves application availability by automatically switching the path in the event of failure
[Figure labels: Windows, UNIX/Linux]
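The combination of load distribution and automatic path switching can be sketched with a small path selector. This is a conceptual illustration only: `MultipathDevice` and its methods are hypothetical names and do not correspond to the Dynamic Link Manager interface.

```python
class MultipathDevice:
    """Toy multipath selector: round robin over online paths, skipping failed ones."""

    def __init__(self, paths):
        self.order = list(paths)
        self.online = {p: True for p in self.order}
        self.next_index = 0

    def fail(self, path):
        """Mark a path offline, for example after a failed health check."""
        self.online[path] = False

    def restore(self, path):
        """Failback: bring a repaired path online again."""
        self.online[path] = True

    def pick_path(self):
        """Return the next online path, or raise if every path is offline."""
        for _ in range(len(self.order)):
            path = self.order[self.next_index]
            self.next_index = (self.next_index + 1) % len(self.order)
            if self.online[path]:
                return path
        raise RuntimeError("no online path to the LU")


dev = MultipathDevice(["hba0", "hba1"])
print([dev.pick_path() for _ in range(4)])  # I/O alternates across both paths
dev.fail("hba1")
print([dev.pick_path() for _ in range(3)])  # I/O continues on the surviving path
```

The application keeps issuing I/O throughout; only the selector's view of which paths are usable changes, and calling `restore` puts the repaired path back into rotation.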
Global Link Availability Manager
• HiCommand Global Link Availability Manager add-on provides single point management of all Dynamic Link Manager connections in the SAN
[Figure labels: LAN, Servers, SAN, Hitachi storage system]
Load Balancing in a Clustered Environment
• Windows
– Microsoft Cluster Server
– Oracle RAC
– Veritas Cluster Server
• Sun Solaris
– Sun Cluster
– Veritas Cluster Server
– Oracle RAC
• HP-UX
– MC/Serviceguard
– Oracle RAC
• AIX
– HACMP
– Veritas Cluster Server
– Oracle RAC
• Linux
– Redhat AS Bundle Cluster
– SuSE Linux Bundle Cluster
– Veritas Cluster Server
– Oracle RAC
[Figure: clustered hosts sharing storage. Labels: Active Host, Standby Host, Cluster, HDLM, Load Balance, HBA, CHA, LUN, Storage]
HiCommand Global Link Availability Manager Features
• Path management
• Event notification
• Group management
Global Link Availability Manager Features
• Manage the entire Dynamic Link Manager multi-pathing environment from a single console
– For each Dynamic Link Manager instance, list path information for all paths or for each host, HBA port, storage system, and storage port.
– Use aggregated path views by path status (online or offline) to check the health of the entire multi-pathing environment.
– Adjust the online / offline path status for single or multiple hosts
– Adjust load balancing for individual LUNs
• Group Management
– Control access to a specific “group” of hosts (subset of Dynamic Link Manager instances).
– Allows managing a subset of hosts as a single unit
– “Resource Groups” enable System Administrators to securely manage their own set of hosts.
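The idea of an aggregated path view can be illustrated by rolling individual path records up into per-host status counts. The record layout and host names below are hypothetical sample data, and `summarize` is an illustrative helper, not a Global Link Availability Manager function.

```python
from collections import Counter, defaultdict

# Hypothetical path records: (host, hba_port, storage_port, status)
paths = [
    ("hostA", "hba0", "CL1-A", "online"),
    ("hostA", "hba1", "CL2-A", "online"),
    ("hostB", "hba0", "CL1-A", "online"),
    ("hostB", "hba1", "CL2-A", "offline"),
]


def summarize(records):
    """Count paths per host by status (online or offline)."""
    summary = defaultdict(Counter)
    for host, _hba_port, _storage_port, status in records:
        summary[host][status] += 1
    return {host: dict(counts) for host, counts in summary.items()}


for host, counts in summarize(paths).items():
    print(host, counts)
```

A view like this makes a host that has lost one of its paths stand out immediately, without inspecting each path entry in turn.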
Recommended Practices
• Understand the I/O characteristics of your applications
– Applications generating sequential I/O: Extended Round Robin
– Applications generating random I/O: Round Robin
• If managing 10 or more hosts, use Global Link Availability Manager
• Update to the latest release of HDLM
• Always review the latest release notes & user guides
• Use 4 adapters – typically the best balance of performance & availability (deeper queue depths, greater scalability)
• Zone HBA to storage port
• Use switches from the same vendor in the same SAN fabric
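The first practice in the list, matching the algorithm to the application's I/O pattern, can be sketched as a simple classifier over a trace of starting block addresses. The 0.5 threshold, the fixed I/O size, and both function names are assumptions for illustration; a real assessment would rely on measured workload data.

```python
def sequential_fraction(lbas, io_blocks=1):
    """Fraction of I/Os that start exactly where the previous one ended."""
    if len(lbas) < 2:
        return 0.0
    sequential = sum(1 for prev, cur in zip(lbas, lbas[1:]) if cur == prev + io_blocks)
    return sequential / (len(lbas) - 1)


def recommend_algorithm(lbas, threshold=0.5):
    """Map a trace to the algorithm suggested in the recommended practices."""
    if sequential_fraction(lbas) >= threshold:
        return "Extended Round Robin"
    return "Round Robin"


print(recommend_algorithm(list(range(100))))        # backup-style sequential stream
print(recommend_algorithm([5, 900, 12, 4000, 77]))  # scattered OLTP-style accesses
```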
Next Steps
• Training: http://www.hds.com/education
– CCI0110 – Basic Storage Concepts
– TCC0260 – Introduction to Dynamic Link Manager & Global Link Availability Manager (computer-based training)
– TSI0595 – Dynamic Link Manager / Global Link Availability Manager
– TSI0596 – Dynamic Link Manager
– TSI0590 – Global Link Availability Manager
– TSI0945 – Managing Storage Performance with Hitachi Tuning Manager
• HDS Professional Certification: http://www.hds.com/certification
– HDS Certified Professional (Foundations)
– HDS Certified Storage Manager
• White Papers: http://www.hds.com/corporate/webfeeds/wp/
– Use the RSS feed to automatically update with latest technical white