Hitachi Path Management & Load Balancing with Hitachi Dynamic Link Manager and Global Link Availability Manager

(1)


(2)

The HDS WebTech Series

• Dynamic Load Balancing

• Who should attend:

– Systems and Storage Administrators

– Storage Specialists & Consultants

– IT Team Lead

– System and Network Architects

– IT Staff

– Operations and IT Managers

(3)

Learning Objectives

Upon completion of this seminar, you should be able to:

– Describe the primary purposes of path management software

– Diagram the critical path of data storage input and output with Hitachi enterprise storage

– Explain the complementary roles of Dynamic Link Manager & Global Link Availability Manager

– Describe the load balancing algorithms available

– Use Dynamic Link Manager and/or Global Link Availability Manager to select the most appropriate algorithm

– Describe path management recommended practices

(4)

(5)

(6)

FC SAN I/O Path

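[Figure: FC SAN I/O path. On the host, the application issues an I/O request through the HBA; on the storage system, the request passes through the CHA to the LU and DEV.]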

(7)

• Defining the path of the data I/O

(8)

• Critical path management issues

– Failover and failback

– Optimizing (balancing) the I/O load

• I/O characteristics (random, sequential, reads, writes)

(9)

Path Failover and Failback

• Dynamic Link Manager software provides continuous storage access and high availability by distributing I/O over multiple paths

• Failover and failback in either manual or automatic modes

• Automated path health checks

• Allows dynamic LUN addition and deletion without a server reboot *

* O/S and array dependent, check system requirements for details

[Figure: two panels, "Simple Failover" and "With Load Balancing". Each shows applications, a volume, and HDLM on a server connected to storage; labels mark a standby path, a path failure, and a reduction of balancing paths.]
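The failover and failback behavior described above can be illustrated with a small model. This is a minimal sketch, not HDLM's implementation: the `PathSet` class, its method names, and the path names `hba0` and `hba1` are all hypothetical, and real path health checking is replaced by explicit calls.

```python
class PathSet:
    """Toy multipath device: rotates I/O over online paths and skips failed ones."""

    def __init__(self, path_names):
        self.online = {name: True for name in path_names}
        self._next = 0

    def mark_failed(self, name):
        self.online[name] = False       # failover: stop using this path

    def mark_recovered(self, name):
        self.online[name] = True        # failback: path rejoins the rotation

    def pick_path(self):
        """Return the next online path, or raise if every path is down."""
        names = list(self.online)
        for _ in range(len(names)):
            candidate = names[self._next % len(names)]
            self._next += 1
            if self.online[candidate]:
                return candidate
        raise RuntimeError("no online path to the LU")


paths = PathSet(["hba0", "hba1"])       # hypothetical path names
print(paths.pick_path(), paths.pick_path())
paths.mark_failed("hba0")
print(paths.pick_path())                # only hba1 is eligible now
```

In this model a failed path simply drops out of the rotation, so I/O continues on the remaining paths, and a recovered path is picked up again on the next rotation.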

(10)

Load Balancing

• Dynamic Link Manager software distributes storage access across multiple paths to improve I/O performance with load balancing

• Bandwidth control at the HBA level and, in conjunction with Global Link Availability Manager, at the LUN level

[Figure: two panels. "Without Load Balancing": applications and volumes on a server use a regular driver, and an I/O bottleneck forms on the path to storage. "With Load Balancing": HDLM on the server balances the load across the paths to storage.]

(11)

(12)

• Review of an application transaction

[Figure: breakdown of an example transaction of 350 ms (0.35 sec): CPU time 25%, queue time 15%, I/O time 60%. The 350 ms value is an example only and does not include network time (LAN or WAN).]

• This is typical for OLTP (Online Transaction Processing)

• Email, DSS, and Rich Media have a higher % of I/O content

• OLAP, CRM, and ERP have a higher % of CPU content
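The example breakdown can be turned into milliseconds with a few lines of arithmetic. The sketch below uses only the slide's example numbers (350 ms split 25% / 15% / 60%); the `total_after_io_speedup` helper is a hypothetical addition that assumes CPU and queue time stay fixed while only the I/O component changes.

```python
# Split the slide's example transaction time into its three components.
TOTAL_MS = 350  # example value from the slide; excludes LAN/WAN time

shares = {"cpu": 0.25, "queue": 0.15, "io": 0.60}

breakdown_ms = {name: TOTAL_MS * share for name, share in shares.items()}


def total_after_io_speedup(factor):
    """Transaction time if only the I/O component becomes `factor` times faster."""
    return breakdown_ms["cpu"] + breakdown_ms["queue"] + breakdown_ms["io"] / factor


for name, ms in breakdown_ms.items():
    print(f"{name:>5}: {ms:.1f} ms")
print(f"total with 2x faster I/O: {total_after_io_speedup(2):.1f} ms")
```

With I/O at 60% of the transaction, halving I/O time in this example takes the total from 350 ms to about 245 ms, which is why the I/O path gets so much attention for OLTP workloads.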

(13)

Life of an I/O Operation

• The Role of the I/O

– The I/O component plays an important role in overall transaction response time in many transaction types.

• I/O response time must be monitored on an ongoing basis to ensure customer satisfaction.

• When the overall transactional response time shows degradation, the first element to be blamed is the I/O subsystem.

(14)

Life of an I/O Operation

• I/O Journey

[Figure: stages of the I/O journey: Application, Host Bus Adapter, Edge Switch, Core Director, Storage System Port, Cache, Storage System Back-End]

(15)

• Cache Management Nomenclature

– Multiple queues are used in Hitachi storage systems

  • We will look at the general LRU queue

– The frames on an LRU queue:

  • MRU position – Most Recently Used

  • LRU position – Least Recently Used

– Cache Demotion Time:

  • Time to travel from the MRU to the LRU position of the queue

  • This can be improved by:

    – Increasing cache size

    – Using intelligent caching algorithms

– Cache Residency Time:

  • Total time a track resides in cache

  • Residency time ranges from the “Demotion Time” value to infinity

[Figure: LRU queue with the LRU position at one end and the MRU position at the other]
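The demotion-time idea can be made concrete with a toy LRU queue. This is a minimal sketch under simplifying assumptions, not the storage system's cache code: `LRUQueue` and `demotion_ticks` are hypothetical names, time is counted in abstract ticks with one new track loaded per tick, and only a single general queue is modeled.

```python
from collections import OrderedDict


class LRUQueue:
    """Toy general LRU queue: newest frames at the MRU end, evictions from the LRU end."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.frames = OrderedDict()  # track -> tick at which it entered the cache

    def access(self, track, tick):
        """Return True on a cache hit; on a miss, load the track and evict if full."""
        if track in self.frames:
            self.frames.move_to_end(track)   # a hit moves the frame back to MRU
            return True
        if len(self.frames) >= self.capacity:
            self.frames.popitem(last=False)  # drop the frame at the LRU position
        self.frames[track] = tick
        return False


def demotion_ticks(capacity):
    """Ticks for an untouched frame to travel from MRU to eviction under all-miss load."""
    q = LRUQueue(capacity)
    q.access("probe", 0)
    tick = 0
    while "probe" in q.frames:
        tick += 1
        q.access(f"t{tick}", tick)
    return tick


print(demotion_ticks(4), demotion_ticks(8))
```

In this model the demotion time grows in step with the queue size, matching the first improvement listed above. A frame that is hit again returns to the MRU position, which is why residency time can exceed the demotion time without an upper bound.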

(16)

Random Read Processing

1. When a random read request is received from the host, the Front-End Director (FED) accesses the cache directory (Control Memory/Cache Meta Data) and performs a directory search to see if the requested record is in cache.

2. If the record is in cache, it is sent to the host from cache. The read hit ratio is important for OLTP overall response time.

3. If the record is not present, the Back-End
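To see why the read hit ratio matters for response time, the expected read service time can be written as a weighted average of the cache-hit and cache-miss cases. The sketch below is illustrative only: the function name and both latency values are hypothetical and are not measurements of any Hitachi system.

```python
def mean_read_response_ms(hit_ratio, cache_ms, backend_ms):
    """Expected read service time for a given cache hit ratio."""
    if not 0.0 <= hit_ratio <= 1.0:
        raise ValueError("hit_ratio must be between 0 and 1")
    return hit_ratio * cache_ms + (1.0 - hit_ratio) * backend_ms


# Hypothetical latencies, chosen only to make the arithmetic visible.
CACHE_MS = 0.5
BACKEND_MS = 8.0

for ratio in (0.5, 0.75, 1.0):
    print(ratio, mean_read_response_ms(ratio, CACHE_MS, BACKEND_MS))
```

With these assumed latencies, moving from a 50% to a 75% hit ratio cuts the average read time from 4.25 ms to 2.375 ms.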

(17)

Sequential Read Processing

1. The first steps are the same as a random read

2. When the FED receives requests for sequential records, a sequential pattern is detected (sequential detect)

3. NSRs from the next tracks are pre-loaded into cache. Following requests will be satisfied through cache

4. In this type of I/O request, the Read Hit Ratio should always be over 85%

5. Performance is typically very good on
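Sequential detect and pre-loading can be sketched as a run counter over incoming track numbers. This is an assumption-laden illustration: the `SequentialDetector` class, the trigger `threshold`, and the `prefetch_depth` are hypothetical, and the actual detection rules of the storage system are not given in the slides.

```python
class SequentialDetector:
    """Flags a stream as sequential after `threshold` consecutive track numbers."""

    def __init__(self, threshold=3, prefetch_depth=4):
        self.threshold = threshold            # hypothetical trigger length
        self.prefetch_depth = prefetch_depth  # hypothetical read-ahead size
        self.last_track = None
        self.run_length = 0

    def on_read(self, track):
        """Record one read; return the list of tracks to pre-load (possibly empty)."""
        if self.last_track is not None and track == self.last_track + 1:
            self.run_length += 1
        else:
            self.run_length = 1               # pattern broken, start a new run
        self.last_track = track
        if self.run_length >= self.threshold:
            return [track + i for i in range(1, self.prefetch_depth + 1)]
        return []


detector = SequentialDetector()
for track in (10, 11, 12, 40):
    print(track, detector.on_read(track))
```

Once the run is long enough, the following tracks are staged ahead of the host's requests, which is what keeps the read hit ratio high for sequential workloads. A jump to an unrelated track resets the run.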

(18)

Random Write Processing

1. The port receives a write operation from the host

2. Two copies of the updated record are written into cache onto two different memory boards under the NVS line

3. The frame is queued on a specialized queue (dirty queue) for destage scheduling. At low NVS line value, back-end activity is reduced by keeping frames on the dirty queue (more writes per de-stage operation) for a longer period

4. When the frame arrives at the LRU position, it gets written to the disk

5. The other copy of the updated record is kept in cache and placed on the General LRU queue. This is to increase the Read Hit Ratio in case of subsequent read or write request to the same record
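The benefit of keeping frames on the dirty queue longer (more writes per destage operation) can be shown with a small count. The sketch below is a toy model under stated assumptions: `count_destages` is a hypothetical function, a rewrite of a queued track is assumed to be absorbed in cache and to move the frame back to the MRU end, and the duplicate copy on a second memory board is not modeled.

```python
from collections import OrderedDict


def count_destages(writes, dirty_capacity):
    """Count back-end destage operations for a stream of track writes.

    A write to a track already on the dirty queue is absorbed in cache;
    a frame is destaged only when it is pushed out of the LRU position.
    """
    dirty = OrderedDict()
    destages = 0
    for track in writes:
        if track in dirty:
            dirty.move_to_end(track)       # rewrite absorbed, no extra destage
            continue
        if len(dirty) >= dirty_capacity:
            dirty.popitem(last=False)      # frame at LRU position goes to disk
            destages += 1
        dirty[track] = True
    return destages + len(dirty)           # flush what is still queued


writes = [1, 2, 1, 3, 1, 2, 3, 1]
print(count_destages(writes, dirty_capacity=1))
print(count_destages(writes, dirty_capacity=3))
```

For this write stream, a queue that holds one frame destages eight times, while a queue that holds three frames destages only three times, because repeated writes to the same tracks are merged before reaching the back end.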

(19)

Sequential Write Processing

1. The port receives a write operation from the host

2. Two copies of the updated record are written into cache onto two different memory boards under the NVS line

3. When sequential records are received, a sequential pattern is detected

4. All following records are kept in cache

5. When a full stripe (or more) has been received, the parity track is created in cache and the entire stripe is written to the disks in one logical revolution

– All frames are sent to the free queue
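Step 5 relies on computing parity once the whole stripe is in cache. The slides do not name the parity scheme, so the sketch below assumes byte-wise XOR parity as used in RAID-5 style layouts; the function names and the three tiny sample tracks are hypothetical.

```python
from functools import reduce


def xor_parity(data_tracks):
    """Byte-wise XOR of equally sized data tracks, as used by RAID-5 style parity."""
    if len({len(t) for t in data_tracks}) != 1:
        raise ValueError("all tracks in a stripe must have the same length")
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*data_tracks))


def rebuild_missing(surviving_tracks, parity):
    """Recover one lost data track from the survivors plus the parity track."""
    return xor_parity(list(surviving_tracks) + [parity])


# A full stripe of three (very short) data tracks held in cache.
stripe = [b"\x01\x02\x03", b"\x10\x20\x30", b"\xff\x00\x0f"]
parity = xor_parity(stripe)
print(parity.hex())
print(rebuild_missing([stripe[0], stripe[2]], parity) == stripe[1])
```

Because every data track of the stripe is already in cache, parity can be computed without reading old data or old parity from disk, which is what makes full-stripe sequential writes efficient under this assumption.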

(20)

(21)

Round Robin – Extended Round Robin

• Impact of ExRR Versus RR on Sequential Detect
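The effect on sequential detect can be seen by looking at what each path carries under the two algorithms. This is a simplified sketch, not HDLM's code: it assumes extended round robin keeps a run of consecutive I/Os on one path before rotating, and the `run_length` parameter and all function names are hypothetical.

```python
def round_robin(blocks, n_paths):
    """Send each I/O to the next path in turn."""
    paths = [[] for _ in range(n_paths)]
    for i, block in enumerate(blocks):
        paths[i % n_paths].append(block)
    return paths


def extended_round_robin(blocks, n_paths, run_length):
    """Send `run_length` consecutive I/Os down one path before moving to the next."""
    paths = [[] for _ in range(n_paths)]
    for i, block in enumerate(blocks):
        paths[(i // run_length) % n_paths].append(block)
    return paths


def longest_sequential_run(blocks):
    """Length of the longest run of consecutive block numbers seen on one path."""
    best = run = 1 if blocks else 0
    for prev, cur in zip(blocks, blocks[1:]):
        run = run + 1 if cur == prev + 1 else 1
        best = max(best, run)
    return best


sequential_io = list(range(8))
print(round_robin(sequential_io, 2))
print(extended_round_robin(sequential_io, 2, run_length=4))
```

Under plain round robin each path sees every other block, so no path carries a consecutive run and a per-port sequential detector has nothing to trigger on. With the extended variant each path carries runs of consecutive blocks, so the pattern remains visible to the storage system.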

(22)

(23)

Dynamic Link Manager

• Enables fault-tolerant access to data on Hitachi and EMC storage systems for direct-attached storage (DAS) and storage area network (SAN) environments.

• Path failover and I/O balancing over multiple host bus adapter (HBA) cards

– Improves performance by distributing and balancing loads across multiple paths

– Improves application availability by automatically switching the path in the event of failure

[Figure: supported host platforms: Windows, UNIX/Linux]

(24)

Global Link Availability Manager

• HiCommand Global Link Availability Manager add-on provides single-point management of all Dynamic Link Manager connections in the SAN

[Figure: servers connected over a LAN and a SAN to a Hitachi storage system]

(25)

Load Balancing in a Clustered Environment

• Windows

– Microsoft Cluster Server

– Oracle RAC

– Veritas Cluster Server

• Sun Solaris

– Sun Cluster

– Veritas Cluster Server

– Oracle RAC

• HP-UX

– MC/Serviceguard

– Oracle RAC

• AIX

– HACMP

– Veritas Cluster Server

– Oracle RAC

• Linux

– Red Hat AS Bundle Cluster

– SuSE Linux Bundle Cluster

– Veritas Cluster Server

– Oracle RAC

[Figure: a clustered active host and standby host, each running HDLM with multiple HBAs, load balancing across CHA ports to a LUN on the storage system]

(26)

HiCommand Global Link Availability Manager Features

• Path management

• Event notification

• Group management

(27)

Global Link Availability Manager Features

• Manage the entire Dynamic Link Manager multipathing environment from a single console

– For each Dynamic Link Manager instance, list path information for all paths or for each host, HBA port, storage system, and storage port.

– Aggregated path views by path status (online or offline) let you check the health of the entire multipathing environment.

– Adjust the online/offline path status for single or multiple hosts

– Adjust load balancing for individual LUNs

• Group Management

– Control access to a specific “group” of hosts (a subset of Dynamic Link Manager instances).

– Allows managing a subset of hosts as a single unit

– “Resource Groups” enable System Administrators to securely manage their own set of hosts.
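An aggregated path view of the kind described above amounts to grouping path records by host and status. The sketch below is purely illustrative and does not use any Global Link Availability Manager interface: the record layout, host names, port names, and both function names are hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical inventory rows: (host, hba_port, storage_port, lun, status).
paths = [
    ("hostA", "hba0", "CL1-A", 0, "online"),
    ("hostA", "hba1", "CL2-A", 0, "offline"),
    ("hostB", "hba0", "CL1-A", 1, "online"),
    ("hostB", "hba1", "CL2-A", 1, "online"),
]


def status_by_host(rows):
    """Count online and offline paths per host."""
    summary = defaultdict(Counter)
    for host, _hba, _port, _lun, status in rows:
        summary[host][status] += 1
    return summary


def hosts_needing_attention(rows):
    """Hosts with at least one offline path, sorted by name."""
    return sorted(h for h, c in status_by_host(rows).items() if c["offline"] > 0)


for host, counts in sorted(status_by_host(paths).items()):
    print(host, dict(counts))
print(hosts_needing_attention(paths))
```

Grouping by status first makes it easy to spot hosts that have lost redundancy even though their applications are still running on the remaining paths.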

(28)

(29)

(30)

(31)

(32)

(33)

Recommended Practices

• Understand the I/O characteristics of your applications

– Applications generating sequential I/O: Extended Round Robin

– Applications generating random I/O: Round Robin

• If managing 10 or more hosts, use Global Link Availability Manager

• Update to the latest release of HDLM

• Always review the latest release notes & user guides

• Use 4 adapters – typically the best balance of performance & availability (deeper queue depths…greater scalability)

• Zone HBA to storage port

• Use switches from same vendor in same SAN fabric

(34)

Next Steps

• Training:

http://www.hds.com/education

– CCI0110 – Basic Storage Concepts

– TCC0260 – Introduction to Dynamic Link Manager & Global Link Availability Manager (computer-based training)

– TSI0595 – Dynamic Link Manager / Global Link Availability Manager

– TSI0596 – Dynamic Link Manager

– TSI0590 – Global Link Availability Manager

– TSI0945 – Managing Storage Performance with Hitachi Tuning Manager

• HDS Professional Certification –

http://www.hds.com/certification

– HDS Certified Professional (Foundations)

– HDS Certified Storage Manager

• White Papers:

http://www.hds.com/corporate/webfeeds/wp/

– Use the RSS feed to automatically update with latest technical white

(35)

Upcoming WebTech Sessions:

• 22 August

- Optimal Storage Performance for Microsoft Exchange

• 12 September

- RAID Concepts

• 19 September

- Enterprise Data Replication Architectures that Work: Overview and Perspectives

• …

(36)