Chapter 2. IP storage networking technical details
2.13 Data and network management
2.13.4 Storage virtualization
As we have seen, the development of NAS and SANs is being driven by the continued and unprecedented growth in storage. The greatest challenge posed by this growth, in the long term, is the cost-effective management of the storage resources and the data residing on them. This has given rise to the concept of virtualization of storage. This refers to the abstraction of storage so that the logical representation of the storage to the operating system and the host applications is completely separated from the complexities of the physical devices where the data is stored. The benefits of virtualization are primarily in the ease of managing the resources, and the ability to set enterprise-wide management policies related to different pools of storage hardware.
A methodology to translate between the logical view and the physical view is required in order to implement storage virtualization. The question arises: “Where and how should this be done?”
Virtualization techniques have been applied in several key areas of computing, including virtual memory in processors, and in individual disk and tape systems, like IBM’s RAMAC Virtual Array (RVA) and IBM’s Virtual Tape Server (VTS). This individualized virtualization delivers some benefits, but does not address the overall enterprise-wide management requirements.
Other approaches involve the introduction of specialized devices or storage manager servers through which all systems route I/O via the storage network. The network storage manager would handle the logical mapping of the storage to the physical attached devices, rather like a powerful disk controller. This is known as a “symmetrical” design.
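The symmetrical design described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class names, the extent size, and the striping rule used to map logical blocks to devices are all hypothetical and are not taken from any product. What the sketch shows is that every byte read or written crosses the storage manager, which is why it is compared to a powerful disk controller.

```python
from dataclasses import dataclass, field


@dataclass
class PhysicalDevice:
    """Stand-in for one physical disk: a name and a block store."""
    name: str
    blocks: dict = field(default_factory=dict)


class InBandStorageManager:
    """Hypothetical symmetrical virtualizer: all I/O passes through it."""

    def __init__(self, devices, extent_size=4):
        self.devices = devices
        self.extent_size = extent_size   # logical blocks per extent (assumed)
        self.bytes_relayed = 0           # data that crossed the manager

    def _locate(self, logical_block):
        """Map a logical block to (device, physical block) by striping extents."""
        extent, offset = divmod(logical_block, self.extent_size)
        device = self.devices[extent % len(self.devices)]
        physical_block = (extent // len(self.devices)) * self.extent_size + offset
        return device, physical_block

    def write(self, logical_block, data):
        device, physical_block = self._locate(logical_block)
        device.blocks[physical_block] = data
        self.bytes_relayed += len(data)

    def read(self, logical_block):
        device, physical_block = self._locate(logical_block)
        data = device.blocks[physical_block]
        self.bytes_relayed += len(data)
        return data


manager = InBandStorageManager([PhysicalDevice("disk0"), PhysicalDevice("disk1")])
for block in range(8):
    manager.write(block, f"data-{block}".encode())

print(manager.read(5))               # the host sees only logical block 5
print(manager._locate(5)[0].name)    # the manager knows which device holds it
print(manager.bytes_relayed)         # every byte was relayed by the manager
```

The host never learns which device holds a block, so devices can be rearranged behind the manager. The cost is that the manager sits in the data path for every request.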
IBM has announced that it is developing an approach to storage network virtualization based on a development project referred to as “Storage Tank.” This will provide central control of virtualization, but still allow all I/O to be done directly from servers to storage. A metadata server provides the virtual mapping functions as well as storage management processes. This is known as an “asymmetrical” design.
The Storage Tank ultimately will deliver the promise of heterogeneous storage networking. It will provide a universal storage system capable of sharing data across any storage hardware, platform, or operating system. Storage Tank is a software management technology that unleashes the flow of information across a storage area network, providing universal access to storage devices in a seamless, transparent, and dynamic manner.
Policy-based management of data storage will be enabled by the Storage Tank, providing:
Heterogeneous, open system platforms with the ability to plug in to the universal storage system and to share both data assets and data storage resources, regardless of physical location.
Management, placement, access and usage of data controlled by policies determined by the administrator.
Host systems that no longer need to configure storage subsystems as individual devices. Instead, they oversee and acquire storage capacity needed, with bytes allocated accordingly. This alleviates fragmentation and inefficient usage of storage resources due to pre-allocation of storage devices to specific host systems, logical volumes, or file systems.
Virtualized data storage resources, called storage groups, can be created. These enable new storage devices to be added and old ones removed without affecting access to data by applications. This provides for transparent scaling and ensures uptime by allowing new storage devices to be added dynamically, without manually cross-mounting volumes to a specific server.
The illustration in Figure 2-14 shows that Storage Tank clients communicate with Storage Tank servers over an enterprise's existing IP network using the Storage Tank protocol. It also shows that Storage Tank clients, servers, and storage devices are all connected to a Storage Area Network (SAN) on a high-speed Fibre Channel network.
Figure 2-14 The IBM Storage Tank concept
[Figure: heterogeneous clients (NT, AIX, Linux, Solaris; workstations or servers), each with an installable file system, reach the metadata servers over the existing IP network used for client/server communications, and reach the shared storage devices over the Fibre Channel SAN. The metadata servers handle authentication, access control, locking, data placement, and file-level outboard services. The shared storage devices hold active data, backups, and migrated data, with device-to-device copy for backup and migration.]
An installable file system (IFS) is installed on each of the heterogeneous clients supported by Storage Tank. The IFS directs requests for metadata and locks to a Storage Tank server, and requests for data to storage devices on the SAN. Storage Tank clients can access data directly from any storage device attached to the SAN.
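The split between the metadata path and the data path can be made concrete with a small Python sketch. It is not the Storage Tank protocol, whose details are not given here. The class names, method names, file path, and the single-holder locking rule are assumptions chosen for illustration. The point is only that the server hands out locations and locks, while the file data itself is read straight from the SAN device.

```python
class MetadataServer:
    """Hypothetical server: holds file locations and locks, never file data."""

    def __init__(self):
        self.locations = {}   # path -> (device name, block number)
        self.locks = {}       # path -> id of the client holding the lock
        self.requests = 0     # metadata and lock requests received

    def lock_and_locate(self, client_id, path):
        self.requests += 1
        holder = self.locks.get(path)
        if holder not in (None, client_id):
            raise RuntimeError(f"{path} is locked by {holder}")
        self.locks[path] = client_id
        return self.locations[path]

    def unlock(self, client_id, path):
        self.requests += 1
        if self.locks.get(path) == client_id:
            del self.locks[path]


class InstallableFileSystem:
    """Hypothetical client-side IFS: metadata over IP, data over the SAN."""

    def __init__(self, client_id, server, san_devices):
        self.client_id = client_id
        self.server = server
        self.san = san_devices   # device name -> {block number: bytes}

    def read_file(self, path):
        device, block = self.server.lock_and_locate(self.client_id, path)
        try:
            return self.san[device][block]   # direct SAN read, server bypassed
        finally:
            self.server.unlock(self.client_id, path)


san = {"disk0": {7: b"payroll records"}}
server = MetadataServer()
server.locations["/finance/payroll.dat"] = ("disk0", 7)

aix_client = InstallableFileSystem("aix-1", server, san)
print(aix_client.read_file("/finance/payroll.dat"))
print(server.requests)   # lock and unlock requests only, no data transfer
```

Because the server only answers small lock and location requests, the volume of data a client moves does not add load to the server in this sketch.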
An enterprise can use one Storage Tank server, a cluster of Storage Tank servers, or multiple clusters of Storage Tank servers. Clustered servers provide load balancing, fail-over processing, and increased scalability. A cluster of Storage Tank servers is interconnected on its own high-speed network or on the same IP network the servers use to communicate with Storage Tank clients. The private server storage that contains the metadata managed by Storage Tank servers can be attached to a private network connected only to the cluster of servers, or it can be attached to the Storage Tank SAN.
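How a cluster might combine load balancing with fail-over can be sketched as follows. The text does not say how Storage Tank distributes work, so the least-loaded selection rule, the class names, and the server names below are assumptions made purely for illustration.

```python
class ClusterNode:
    """One server in the cluster, with a health flag and a request counter."""

    def __init__(self, name):
        self.name = name
        self.healthy = True
        self.handled = 0


class ServerCluster:
    """Hypothetical cluster front end; the selection rule is an assumption."""

    def __init__(self, names):
        self.nodes = [ClusterNode(name) for name in names]

    def route(self, request):
        """Send the request to the least-loaded healthy server."""
        candidates = [node for node in self.nodes if node.healthy]
        if not candidates:
            raise RuntimeError("no metadata server available")
        node = min(candidates, key=lambda n: n.handled)
        node.handled += 1
        return node.name


cluster = ServerCluster(["st-server-a", "st-server-b", "st-server-c"])
before = [cluster.route(f"req-{i}") for i in range(6)]

cluster.nodes[1].healthy = False   # simulate the failure of one server
after = [cluster.route(f"req-{i}") for i in range(6, 10)]

print(before)   # requests spread across all three servers
print(after)    # the surviving servers absorb the load
```

Clients keep issuing requests in the same way before and after the failure; only the set of servers that answer them changes.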
Within a server cluster is a storage management server. This is a logical server that issues commands to back up and migrate files directly over the Fibre Channel network from one storage device to another. No client involvement is required to perform these tasks.
The Storage Tank architecture makes it possible to bring the benefits of system-managed storage (SMS) to an open distributed environment. Features such as policy-based allocation, volume management, and file management have long been available on mainframe systems via IBM’s DFSMS software. However, the infrastructure for such centralized, automated management has been lacking in workstation operating environments. The centralized storage management architecture of the Storage Tank system makes it possible to realize the advantages of open system-managed storage for all of the data the enterprise stores and manages.
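Policy-based allocation into storage groups can be illustrated with a short Python sketch. The rule format, the group and device names, and the capacity figures are all hypothetical. Neither DFSMS nor Storage Tank policy syntax is shown here. The sketch assumes an administrator defines ordered rules, the first matching rule selects a storage group, and the group chooses a device, so hosts never name a device themselves.

```python
from dataclasses import dataclass, field


@dataclass
class StorageGroup:
    """A named pool of devices; capacities are illustrative only."""
    name: str
    free_bytes: dict = field(default_factory=dict)   # device name -> free bytes

    def add_device(self, device, capacity):
        self.free_bytes[device] = capacity

    def allocate(self, size):
        """Take space from the device with the most free capacity."""
        device = max(self.free_bytes, key=self.free_bytes.get)
        if self.free_bytes[device] < size:
            raise RuntimeError(f"storage group {self.name} is full")
        self.free_bytes[device] -= size
        return device


# Administrator-defined rules, checked in order; the first match wins.
POLICIES = [
    (lambda path, size: path.endswith(".db"), "fast"),
    (lambda path, size: size > 1_000_000, "bulk"),
    (lambda path, size: True, "general"),
]


def place(path, size, groups):
    """Return the (group name, device) chosen for a new file."""
    for matches, group_name in POLICIES:
        if matches(path, size):
            return group_name, groups[group_name].allocate(size)
    raise RuntimeError("no policy matched")


groups = {name: StorageGroup(name) for name in ("fast", "bulk", "general")}
groups["fast"].add_device("array-1", 10_000_000)
groups["bulk"].add_device("pool-1", 500_000_000)
groups["general"].add_device("disk-1", 50_000_000)

print(place("/erp/orders.db", 2_000_000, groups))
groups["fast"].add_device("array-2", 40_000_000)   # capacity added, hosts unchanged
print(place("/erp/orders.db", 2_000_000, groups))
print(place("/video/training.mpg", 5_000_000, groups))
```

Adding the second device to the fast group changes where new data lands without any change to the request the host makes, which is the transparent scaling that storage groups are meant to provide.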
For more details on storage network virtualization, refer to the IBM Redbook Storage Networking Virtualization - What’s it all about?, SG24-6211-00.