Pivot3 Serverless Computing. Technology Overview


© 2009, Pivot3, Inc. All rights reserved.

www.pivot3.com

Table of Contents

Introduction

Serverless Computing Architecture

1. Server Applications

2. The RAIGE® Operating System

3. Hosted Server Software

4. Cloudbank Appliances


Introduction

Industry transitions are often marked by inflection points where elegant software makes it practical to use x86-based commodity hardware for tasks that previously relied on expensive proprietary hardware. These inflection points drive widespread deployment of new technology into more cost-sensitive markets.

Examples of companies that used software to drive technology transitions include Microsoft, VMware and Google. Microsoft, for example, introduced x86 servers to business-critical computing once the Windows NT™ operating system became a viable application platform, an event that effectively ended the era of minicomputers. VMware similarly shook up the mainframe business by developing virtualization software for x86 servers. Meanwhile, Google developed the Google File System to provide massive server farms using tens of thousands of inexpensive x86-based motherboards. In each case, the disruptive software had to anticipate that commodity hardware components would fail and that the performance of general-purpose hardware would be more difficult to manage than application-specific hardware. Offsetting these challenges was the disruptive cost base of the x86 platforms and the relentless performance improvements from Moore’s Law.

Pivot3® Serverless Computing™ software delivers an inflection point where low-cost x86-based server hardware is repurposed to deliver highly available on-demand infrastructure for capacity-intensive applications.

This document provides a technical overview of the workings of the Serverless Computing architecture and innovations inherent in the design.


Serverless Computing Architecture

The Pivot3 Serverless Computing Architecture is the first and only storage area network (SAN) storage system that simultaneously hosts server applications on shared x86 hardware. The combined server/storage platform is both a high-availability SAN storage solution with no single point of failure and a high-performance high-availability server solution that exploits commodity hardware components.

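There are four major components in Serverless Computing Arrays:

1. Server Applications
2. The RAIGE® Operating System
3. Hosted Server Software
4. Cloudbank Appliances

The following chapters describe each of these four architectural components in more detail.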

For specific product information such as capacity calculations, power specifications and performance by array size, please see the Pivot3 product specification sheets and architectural and engineering specifications which are available on the Pivot3 web site at www.pivot3.com.


1. Server Applications

The Pivot3 Serverless Computing Array is an open systems server and storage platform for server applications. The architecture is ideal for environments that are CPU-intensive and I/O-loaded and that require high storage capacity.

Supported Applications

The Serverless Computing Architecture supports server applications running on Microsoft Server products that support the iSCSI storage standard. Because of the standards-based approach, there is no application integration required and no certification hurdles for new applications.

Remote management and monitoring of server applications and operating systems is completely supported since the platform is based on standard Ethernet networks.

Pivot3 platforms are also Microsoft Windows Hardware Quality Lab (WHQL) certified for compatibility with Microsoft operating systems.

The Pivot3 application lab does test certain applications that are commonly deployed in the field to speed field support resolution. An active list of open system ISV partners can be found on the Pivot3 web site at http://pivot3.com/partners/solution.

Windows Storage Server NAS (network-attached storage)

Users can also use part or all of a Serverless Computing Array as a NAS share by running the Windows Storage Server on one or more Cloudbank Appliances in the Serverless Computing Array. All Windows Storage Server features are supported including:

• Distributed File System (DFS) for simplified access and high availability across locations
• Centralized management with File Server Resource Manager (FSRM) interface
• SIS file-level de-duplication for up to 128 volumes
• Integration with Microsoft ecosystem standards including Active Directory™


2. The RAIGE® Operating System

The Pivot3 RAIGE Operating System (RAIGE OS) runs on each Cloudbank Appliance. The RAIGE OS provides logical volume management, distributed data protection and automatic load balancing across appliances for ease of management and high performance.

Logical Volume Management

The RAIGE OS virtualizes physical disks and appliances in a Serverless Computing Array so that capacity can be managed logically beyond the physical limits of each appliance. Reliability also improves because volume access is not disrupted by physical hardware component failures.

Pivot3 appliances discovered by the RAIGE Director Software are presented as the RAIGE Domain. Physical appliances on the same local subnet can be selected and assigned to one or more Serverless Computing Arrays. Multiple Pivot3 Arrays can be managed with one instance of the RAIGE Director Software.

Appliances are assigned to an array using the RAIGE Director Software


Capacity Management

The aggregate capacity of the underlying appliances can then be parceled into logical volumes using the RAIGE Director Software. Attributes for each logical volume, such as RAID protection, name, rebuild priority, and access control are set by volume without requiring knowledge of the underlying physical hardware. Capacity can be physically and logically added to both the array and to existing logical volumes. Capacity expansion is dynamic and does not interfere with data storing or retrieval. Technically, the aggregated capacity of all the appliances is recognized by application servers as a multi-ported iSCSI target.

Initiator Management

Access to volumes is managed by a list of iSCSI initiators that are allowed to login to specific volumes. iSCSI initiator logins can either be un-authenticated or authenticated via MD5 based CHAP and an initiator shared-secret. For initiators that support mutual CHAP logins, the array can be configured to authenticate volumes to the initiator via MD5 based CHAP and an array-wide shared-secret.

Initiator names/logins can have an Access Control List (ACL) defined that allows Write, Read-Only or no access to volumes.
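
The login check and access control list described above can be sketched in a few lines. This is an illustration only: RFC 1994 defines the MD5 CHAP response as the MD5 digest of the one-octet identifier, the shared secret and the challenge, concatenated in that order. The function names, the initiator name, the volume name and the ACL table are hypothetical and are not taken from the Pivot3 software.

```python
import hashlib
import hmac


def chap_response(identifier: int, secret: bytes, challenge: bytes) -> bytes:
    """MD5 CHAP response per RFC 1994: MD5(identifier || secret || challenge)."""
    return hashlib.md5(bytes([identifier]) + secret + challenge).digest()


def verify_login(identifier: int, secret: bytes, challenge: bytes,
                 response: bytes) -> bool:
    """Recompute the expected response and compare in constant time."""
    expected = chap_response(identifier, secret, challenge)
    return hmac.compare_digest(expected, response)


# Hypothetical ACL: (initiator name, volume) -> "write", "read-only" or "none".
ACL = {
    ("iqn.2009-06.com.example:host1", "vol-a"): "write",
    ("iqn.2009-06.com.example:host1", "vol-b"): "read-only",
}


def access_for(initiator: str, volume: str) -> str:
    """Initiators that are not listed for a volume get no access."""
    return ACL.get((initiator, volume), "none")
```

For mutual CHAP the same computation runs in the other direction, with the array proving knowledge of the array-wide secret to the initiator.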

Array Management

Pivot3 provides a software utility called the RAIGE Connection Manager to automatically configure and maintain network connections between servers and volumes since there can be many connections in a large Serverless Computing Array.

Distributed Data Protection

The key elements of Pivot3 distributed data protection are RAID across Gigabit Ethernet (RAIGE) algorithms, disk write cache, virtual global sparing, parallel rebuilding of failed drives, priority rebuilding of volumes and continuous background verification.

Logical volumes are created from the Serverless Computing Array


RAID Algorithms

The RAIGE OS distributes data and parity across Cloudbank appliances so that data is efficiently protected against component failures. There is no need to create physical disk or RAID sets as you would with a traditional RAID system. Rather, disks are treated as raw capacity and the RAID function is implemented at the volume level. The normally burdensome management tasks of defining RAID groups and partitioning volumes, which are associated with traditional RAID devices, are not necessary with a Pivot3 Serverless Computing Array.

RAID Protection

Four RAID protection levels are provided to meet the data protection goals of each application:

RAID 0

Striping with no parity. Data is not protected against failures. There is no protection capacity required for RAID 0.

RAID 1e

Enhanced network mirroring. Data is protected by striping an exact copy of the primary data across drives in each of the other appliances in the array.

RAID 1e protects against the failure of any drive in the array. The “e” indicates enhanced RAID 1 since data is also protected if an entire appliance with all twelve drives fails. The protection capacity required for RAID 1e is 100% of the primary data.

RAID 5e

Enhanced network parity. Data is striped across each appliance in the array and protected by network parity. Network parity is also striped across each appliance in the array so that data stored in each appliance is protected by parity in another appliance.

RAID 5e protects against one failure which can be either a drive failure or an appliance failure. The “e” indicates the enhanced RAID 5 protection for all twelve drives in the appliance.

Capacity required for RAID 5e protection is calculated using the number of appliances in the Serverless Computing Array. For example, in an array with twelve Cloudbanks, a volume designated as RAID 5e effectively has an 11+1 data + parity scheme.
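
To make the 11+1 scheme concrete, the following sketch builds one stripe with eleven data blocks and one parity block and then rebuilds a lost block from the survivors. It assumes plain XOR parity, as in conventional RAID 5; the source does not state which parity function the RAIGE OS uses, and the rotation of parity across appliances is not modeled. All names are hypothetical.

```python
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def make_stripe(data_blocks):
    """Return the data blocks followed by one parity block (N data + 1 parity)."""
    return list(data_blocks) + [xor_blocks(data_blocks)]


def recover(stripe, lost_index):
    """Rebuild the block at lost_index by XOR-ing all surviving blocks."""
    survivors = [block for i, block in enumerate(stripe) if i != lost_index]
    return xor_blocks(survivors)


# Twelve Cloudbanks: each stripe places one block on each appliance,
# eleven of them data and one of them parity.
data = [bytes([i]) * 4 for i in range(11)]
stripe = make_stripe(data)
rebuilt = recover(stripe, 3)  # simulate losing the block held by appliance 3
```

Because any single block equals the XOR of the other eleven, the loss of one drive or one whole appliance leaves every stripe recoverable.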


RAID 6e

Enhanced network and disk parity. Data is striped across each appliance in the array and protected by two levels of parity. The first level of parity, network parity, is striped across each appliance in the array, much like the parity in RAID 5e. Network parity protects data against either a drive or appliance failure. The second level of parity, disk parity is striped across the disks within an appliance. Disk parity protects against any disk failure within each appliance.

RAID 6e protects against three simultaneous disk failures. The “e” indicates enhanced RAID 6 since data is also protected against a simultaneous failure of one drive and an entire appliance with all twelve drives.

The capacity required for RAID 6e protection is roughly that of two appliances per array, double that of RAID 5e, since parity is distributed to two locations. Precise usable capacity is included in the product specification sheets.
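
The protection overheads stated for the four RAID levels can be summarized as a rough estimate of the usable fraction of raw capacity. This is a back-of-the-envelope sketch with hypothetical names; it ignores the virtual global spare and any metadata, and the product specification sheets remain the reference for precise figures.

```python
def usable_fraction(raid_level: str, appliances: int) -> float:
    """Approximate usable share of raw capacity for a volume's RAID level."""
    if raid_level == "0":
        if appliances < 1:
            raise ValueError("RAID 0 needs at least one appliance")
        return 1.0                              # no protection capacity
    if raid_level == "1e":
        if appliances < 2:
            raise ValueError("RAID 1e needs at least two appliances")
        return 0.5                              # protection equals 100% of primary data
    if raid_level == "5e":
        if appliances < 2:
            raise ValueError("RAID 5e needs at least two appliances")
        return (appliances - 1) / appliances    # one appliance's worth of parity
    if raid_level == "6e":
        if appliances < 3:
            raise ValueError("RAID 6e needs at least three appliances")
        return (appliances - 2) / appliances    # roughly two appliances' worth
    raise ValueError(f"unknown RAID level: {raid_level}")
```

For a twelve-Cloudbank array this gives 11/12 of raw capacity for RAID 5e and about 10/12 for RAID 6e.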

Drive Groups

Drive Groups further minimize the effect of drive failures on the overall array. Drive Groups consist of one drive per appliance and are automatically created and maintained by the RAIGE OS. By organizing the placement of parity and mirror data within one Drive Group, the impact of a drive failure is limited to its Drive Group.

Drive Groups effectively increase the number of simultaneous drive failures that each Serverless Computing Array can sustain without data loss since drive failures in one Drive Group do not affect other Drive Groups.
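
The effect of Drive Groups on fault tolerance can be illustrated with a small check. The sketch assumes that Drive Group g consists of drive slot g in every appliance and that a RAID 5e volume tolerates one failed drive per Drive Group; both are assumptions made for illustration, since the source does not describe how the RAIGE OS chooses group members.

```python
from collections import Counter


def drive_group(slot: int) -> int:
    """Hypothetical mapping: Drive Group g holds slot g of each appliance."""
    return slot


def survives_raid5e(failed_drives) -> bool:
    """failed_drives is an iterable of (appliance, slot) pairs.

    Returns True when no Drive Group contains more than one failed drive.
    """
    per_group = Counter(drive_group(slot) for _appliance, slot in failed_drives)
    return all(count <= 1 for count in per_group.values())
```

Under these assumptions, three failed drives in three different groups are survivable, while two failures inside the same group are not.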

Changing and Mixing RAID Protection in an Array

Since RAID protection is set by volume, different RAID levels may co-exist on an array and RAID levels can be changed while the system is running, without disruption to data reads or writes. For example, a RAID 6e volume can be changed to a RAID 5e volume to free up space while access to the volume continues uninterrupted.

Adding Physical Capacity to an Existing Array

Capacity can be physically added to an existing Serverless Computing Array by connecting new physical appliances to the SAN and then configuring them into the existing Serverless Computing Array using the RAIGE Director Software. The additional physical capacity is dynamically added to the Pivot3 Array and data is restriped across the appliances so that capacity is automatically provisioned. The complexity normally associated with managing meta-LUNs for volumes larger than the domain of one physical conventional RAID controller is consequently eliminated.

Allocate-on-write

The RAIGE OS uses an allocate-on-write method so that a configured volume can be written to immediately and does not require disk formatting time, which for large conventional arrays may take over 24 hours.
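
A minimal sketch of the allocate-on-write idea follows: a volume is usable as soon as it is defined, and physical blocks are assigned only when a logical block is first written. The class, its fields and the choice to return zeros for unwritten blocks are assumptions for illustration, not a description of the RAIGE OS internals.

```python
BLOCK_SIZE = 512  # bytes per block, chosen for the example


class ThinVolume:
    """Toy volume that allocates a physical block on first write."""

    def __init__(self, logical_blocks: int):
        self.logical_blocks = logical_blocks
        self.block_map = {}   # logical block address -> physical block number
        self.store = {}       # physical block number -> data
        self.next_free = 0    # next unallocated physical block

    def _check(self, lba: int) -> None:
        if not 0 <= lba < self.logical_blocks:
            raise IndexError("logical block address out of range")

    def write(self, lba: int, data: bytes) -> None:
        self._check(lba)
        if lba not in self.block_map:          # first write: allocate now
            self.block_map[lba] = self.next_free
            self.next_free += 1
        self.store[self.block_map[lba]] = data

    def read(self, lba: int) -> bytes:
        self._check(lba)
        physical = self.block_map.get(lba)
        if physical is None:                   # never written: nothing allocated
            return b"\x00" * BLOCK_SIZE
        return self.store[physical]

    def allocated(self) -> int:
        return len(self.block_map)
```

Creating the volume costs no formatting time because no physical block is touched until the first write arrives.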

Disk Write Cache

Pivot3 RAIGE OS uses a patented “Disk Write Cache” to protect in-flight data against power loss and to eliminate unreliable and expensive battery-backed RAM write cache memory. This patented approach takes advantage of the massive network bandwidth available in a Serverless Computing Array and delivers excellent performance by using parallel disks as caching elements.

On each physical disk, “cache zones” spread across the sectors of each disk are used for the intermediate caching of write data. Depending on the position of the disk head on the platter when a write request occurs, the cached data is saved in the nearest cache zone, greatly reducing head seek latencies. Host acknowledgements allow the host server to move on to the next activity and the cached data is moved to its final placement on the media as a background task.
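
The zone selection step can be pictured as a nearest-neighbor lookup. The sketch below treats head position and zone locations as simple track numbers; the zone layout and the function name are hypothetical and stand in for whatever geometry the patented implementation uses.

```python
from typing import Sequence


def nearest_cache_zone(head_position: int, zone_starts: Sequence[int]) -> int:
    """Return the start of the cache zone closest to the current head position."""
    return min(zone_starts, key=lambda start: abs(start - head_position))


# Hypothetical layout: four cache zones spread evenly across the disk.
zones = [0, 1000, 2000, 3000]
chosen = nearest_cache_zone(1200, zones)
seek_to_zone = abs(chosen - 1200)   # short seek to the cache zone
seek_to_final = abs(2900 - 1200)    # longer seek if written straight to a final location at 2900
```

The write is acknowledged after the short seek, and the longer move to the final location happens later in the background.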

Virtual Global Sparing

Virtual drive sparing is used to automate and speed drive rebuilding if a drive fails in a Serverless Computing Array. The capacity of one spare drive is reserved across all of the drives in the array and removed from usable capacity. In the event of a drive failure, the rebuild process begins immediately using the previously reserved capacity.

Unlike spare drives in conventional systems that standby during normal operation, virtual global spare drives in a Serverless Computing Array contribute to the overall performance of the RAID system during normal operation.

Virtual RAID controller sparing is effectively supported since data is protected in the case of an appliance failure.

Disk Write Cache Zones


Parallel Rebuild Innovation

Conventional RAID systems are constrained by the physical relationship between RAID groups and their member disk drives. As a result, sparing and rebuilds are similarly constrained to physical drives. This becomes an important limitation as drive capacities grow to beyond 1 TB and rebuilding times for single drives increase.

Serverless Computing Arrays provide extremely fast parallel rebuilds of failed drives because of the distributed nature of data allocation and sparing. Many drives contribute to the rebuild process and the recovered data is written to all drives resulting in a massively parallel activity. Only sectors of a failed disk that actually have data allocated and written need to be rebuilt which further speeds rebuild times in lesser utilized arrays.
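
A rough arithmetic sketch shows why spreading the rebuild across many drives and skipping unallocated sectors shortens rebuild time. The drive size, write rate, drive count and utilization are hypothetical inputs, and the model is idealized: it ignores read traffic, network limits and competing host I/O.

```python
def rebuild_hours(data_gb: float, writers: int, mb_per_s_per_drive: float) -> float:
    """Hours needed to write data_gb when `writers` drives share the work."""
    seconds = (data_gb * 1000) / (writers * mb_per_s_per_drive)
    return seconds / 3600


# Conventional rebuild: the whole 1 TB drive is rewritten onto one spare.
conventional = rebuild_hours(data_gb=1000, writers=1, mb_per_s_per_drive=50)

# Distributed rebuild: only the 40% that holds data is rebuilt, and the
# recovered data is written to the 47 surviving drives of a 48-drive array.
parallel = rebuild_hours(data_gb=400, writers=47, mb_per_s_per_drive=50)
```

Under these assumptions the conventional rebuild takes several hours while the distributed rebuild finishes in minutes.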

Priority Rebuilds by Volume

Rebuilds are performed by volume. All volumes in the array are allocated to all of the drives in the array. An added benefit is the ability to designate a priority level for each volume so that higher priority volumes are rebuilt first. Rebuilding any specific volume may require rebuilding only a small portion of a drive.
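
Priority rebuilding by volume amounts to ordering the rebuild queue by each volume's priority. In the sketch below the `Volume` type, the priority scale (lower number is rebuilt first) and the volume names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Volume:
    name: str
    priority: int        # hypothetical scale: 1 is rebuilt before 2, and so on
    allocated_gb: float  # data on the failed drive that belongs to this volume


def rebuild_order(volumes):
    """Return volumes in the order they would be rebuilt.

    sorted() is stable, so volumes with equal priority keep their given order.
    """
    return sorted(volumes, key=lambda v: v.priority)


queue = rebuild_order([
    Volume("archive", priority=3, allocated_gb=40.0),
    Volume("database", priority=1, allocated_gb=5.0),
    Volume("mail", priority=2, allocated_gb=12.0),
])
```

Here the small, high-priority "database" volume regains full protection first, long before the entire failed drive has been reconstructed.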

With conventional systems, rebuilding happens at the disk level which generally means volumes are only protected once the entire disk is fully rebuilt.

Background Verification

The Serverless Computing Array continuously performs background disk verification. Each disk is completely scanned to identify disks that are beginning to fail and to detect and repair bad blocks on the media. This is another process that benefits from the massive available bandwidth of the array and the processing power available in the Cloudbank Appliances.


Automatic Load Balancing

Load balancing of bandwidth and capacity across network ports, appliance controllers and disk drives is managed by the RAIGE OS with no administrative intervention. Since data is equally distributed across the Serverless Computing Array, changes to either the physical infrastructure or logical entities can be quickly accommodated to eliminate disk, controller and network hot spots.

For write operations, the dedicated x86 processors in each appliance have ample processing power for both RAID operations and TCP offload processing. For read operations, Cloudbank Appliances return data to the application servers in parallel, providing load balanced performance across all appliances and all drives.

Figure: Physical Cloudbank Connections, four Cloudbank array example. Each Cloudbank connects to the iSCSI Storage Area Network (2 ports per Cloudbank) and to the Server Local Area Network (4 ports per Cloudbank); the RAIGE Director Software is also shown.

The parallel architecture and load balancing of the RAIGE OS allow Serverless Computing Arrays to effectively aggregate many 1Gbps Ethernet ports and quickly surpass the bandwidth in proprietary 4Gbps Fibre Channel systems.
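
The bandwidth comparison is simple arithmetic on nominal line rates. The sketch uses the two 1 Gbps iSCSI SAN ports per Cloudbank stated in this document; it compares raw port speeds only and says nothing about achieved throughput. The function names are hypothetical.

```python
import math

SAN_PORTS_PER_CLOUDBANK = 2   # iSCSI SAN ports per appliance, per this document
PORT_GBPS = 1                 # nominal speed of each Gigabit Ethernet port


def aggregate_san_gbps(cloudbanks: int) -> int:
    """Nominal aggregate SAN bandwidth of an array, in Gbps."""
    return cloudbanks * SAN_PORTS_PER_CLOUDBANK * PORT_GBPS


def cloudbanks_to_exceed(target_gbps: float) -> int:
    """Smallest appliance count whose aggregate bandwidth exceeds target_gbps."""
    per_cloudbank = SAN_PORTS_PER_CLOUDBANK * PORT_GBPS
    return math.floor(target_gbps / per_cloudbank) + 1
```

On nominal rates, three Cloudbanks (6 Gbps) already exceed a single 4 Gbps Fibre Channel link, and a twelve-Cloudbank array aggregates 24 Gbps.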

Load balancing of capacity and performance extends to physical reconfiguration of each Serverless Computing Array. Following additions or removals of physical appliances to an existing array, the RAIGE OS restripes data across the new physical appliance count and automatically optimizes the load across the new physical network connections.


3. Hosted Server Software

Each Cloudbank Appliance runs a virtualization software layer that allows storage and server operating systems to run simultaneously on the same appliance.

Virtualization Software

The virtualization software in the Serverless Computing Array is open source software provided by Xen.org. The RAIGE OS runs in the first guest operating system. One additional server operating system can then be added as a guest.

Cloudbank Appliances are hardware provisioned for storage and server functions as follows:

• Four x86 cores, four gigabytes RAM and two gigabit NIC ports for the RAIGE OS
• Four x86 cores, four gigabytes RAM and two gigabit NIC ports for the server operating system and applications

Virtual Machine I/O

iSCSI I/O in a Serverless Computing virtual machine (VM) begins with a single iSCSI session between the iSCSI initiator in the host and each Pivot3 logical volume accessible by the host. The iSCSI sessions utilize a purely virtual network interface card (NIC) that interfaces directly to RAIGE OS running on the Pivot3 Cloudbank Appliance hosting the VM. This network path is immune to the failure points common in physical networks (cables, switches).

A host I/O destined for a Pivot3 logical volume is sent by the iSCSI initiator to the iSCSI target for that logical volume through the virtual NIC. The RAIGE OS examines the logical block address associated with the command to determine which Cloudbank Appliance the data should be written to or read from. If the I/O can be processed on the local appliance, the request is serviced immediately and the data never traverses a physical network cable. If the data is associated with another Cloudbank in the array, the RAIGE OS on the local Cloudbank will read or write the data to the appropriate Cloudbank. This I/O takes place utilizing the fault-tolerant and load balanced storage networks forming the backbone of the Pivot3 Array. Once the data transfer is complete, status for the I/O is returned to the host over the virtual NIC.
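
The routing decision described above can be sketched as a mapping from logical block address to owning appliance. The round-robin striping rule, the stripe size of 128 blocks and the function names are assumptions for illustration; the source does not specify how the RAIGE OS lays out blocks.

```python
def owning_appliance(lba: int, stripe_blocks: int, appliances: int) -> int:
    """Hypothetical layout: consecutive stripes rotate across the appliances."""
    return (lba // stripe_blocks) % appliances


def route_io(lba: int, local_appliance: int,
             stripe_blocks: int = 128, appliances: int = 4):
    """Decide whether an I/O is served locally or forwarded over the SAN."""
    owner = owning_appliance(lba, stripe_blocks, appliances)
    if owner == local_appliance:
        return ("local", owner)    # never leaves the appliance hosting the VM
    return ("remote", owner)       # forwarded over the storage network
```

For a VM hosted on appliance 0 in a four-Cloudbank array, block 0 is served locally while block 128 is forwarded to appliance 1 under this layout.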

Managing Hosted Virtual Servers

Hosted virtual servers running on a Cloudbank Appliance have access to the entire shared capacity and bandwidth of the underlying iSCSI storage array. Server instances can be started, stopped and managed as would any remote server.


Server VM Recovery

Pivot3 provides added server application reliability with a software recovery feature that protects applications against server hardware failures. The VM recovery feature can be selected using the RAIGE Director Software for each virtual machine.

In the case of a Cloudbank hardware failure, the failover appliance automatically reloads the virtual machine on an available Cloudbank Appliance in the Serverless Computing Array and dynamically re-establishes network, camera, and storage connections. This speeds the restoration of the application and access to the storage volumes which are protected against appliance failures by the RAIGE OS.
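
The recovery flow can be pictured as picking a healthy appliance with a free guest slot and moving the virtual machine there. The sketch below is a toy model: the dictionary layout, the appliance and VM names and the first-fit selection rule are hypothetical, and the reconnection of network and storage sessions is not modeled.

```python
def pick_failover_target(appliances: dict, failed: str):
    """Return the first healthy appliance that is not already hosting a VM."""
    for name, state in appliances.items():
        if name != failed and state["healthy"] and state["vm"] is None:
            return name
    return None


def recover_vm(appliances: dict, failed: str):
    """Mark `failed` as down and reload its VM elsewhere if a slot is free.

    Returns the name of the appliance now hosting the VM, or None when the
    failed appliance hosted no VM or no healthy appliance has a free slot.
    """
    appliances[failed]["healthy"] = False
    vm = appliances[failed]["vm"]
    if vm is None:
        return None
    target = pick_failover_target(appliances, failed)
    if target is not None:
        appliances[failed]["vm"] = None
        appliances[target]["vm"] = vm
    return target
```

Because each appliance hosts one guest server operating system alongside the RAIGE OS, the model treats an appliance as eligible only when its guest slot is empty.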

Unlike hardware-based failover techniques, Pivot3 VM recovery does not require dedicated physical server, storage or network hardware and is a standard feature included with Pivot3 Serverless Computing Arrays.


4. Cloudbank Appliances

Cloudbank Appliances are the hardware building block of the Serverless Computing Architecture.

a. Dual CPU enterprise server motherboard. Each appliance contains an enterprise server motherboard with dual quad-core Xeon x86 CPUs and 8 Gigabytes of ECC DIMM RAM. Of the eight available cores, four cores are dedicated to the server operating system and applications and four cores are dedicated to storage operations and TCP offload processing.

b. Four 1 Gigabit Ethernet LAN ports. Four 1 Gigabit Ethernet Network Interface Card (NIC) ports are dedicated to server connectivity.

c. Two 1 Gigabit Ethernet LAN ports. Two 1 Gigabit Ethernet NIC ports are dedicated to the iSCSI SAN.

d. Redundant hot-swappable power supplies and fans. Redundant power supplies eliminate power circuits as a single point of failure and are easily replaced without interrupting appliance operation for fast field service. Three redundant fans are also hot-swap devices.

e. Audible alarms. An audible alarm is activated on physical component failures to alert support personnel. This alarm is a requirement in some regulated environments and is helpful for environments where less-trained operators are managing the appliances.

f. Environmental monitoring and diagnostics. Each appliance self-monitors key environmental conditions of major components. Changes in state to any environmental condition are displayed in the Pivot3 user interface and can be transmitted as an SNMP event.


g. State-indicating LEDs. Drive bay LEDs assist field support and maintenance. LEDs flash blue to indicate read and write operations to the drives under normal conditions. Failed drives are quickly identified by a corresponding red LED. A strobe function allows users to identify specific drives or appliances for diagnostic purposes.

h. Enterprise SATA drives. The 2U twelve drive form factor is the densest possible storage configuration that maintains front access to hot-swappable drives. Front accessibility is a key element in simplifying field support so that replacement of failed drives can be quickly accomplished by users in the field. Appliances are delivered with fully populated drive bays. SATA drives are delivered in the appliance although the backplane and controller infrastructure supports both SAS and SATA interfaces.

The following environmental states are monitored and reported:

• CPU failure
• Disk failure
• Power supply failure
• Network failure
• iSCSI port failure
• Thermal temperature threshold exceeded
• Fan failure

Diagnostic logs are also kept for each appliance and can be remotely accessed and analyzed.


SNMP Support

Pivot3 appliances can be monitored using the Simple Network Management Protocol (SNMP). Community strings for SNMP are configured through the RAIGE Director Software and the SNMP MIB (management information base) is provided with the Pivot3 software. Because appliances cooperate within an array, SNMP agents can be set once at an array level and do not need to be set for each appliance.

Cloudbank Appliances run the Pivot3 SNMP agent and send out SNMP events to third-party software applications that receive, or trap, the SNMP notifications. Since SNMP traffic is on the storage network, the server running the third-party application or SNMP trap receiver needs to have access to both the storage network and the server network. Many network-management applications use SMTP to forward traps as email. The network-management application will push an SMTP message to the corporate email server, which will forward the message as an email to a specified email address.
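
The trap-to-email step performed by a network-management application can be sketched with the Python standard library. The trap fields (`appliance`, `event`, `severity`), the addresses and the function name are hypothetical and do not come from the Pivot3 MIB; the sketch only builds the message and leaves delivery to the corporate mail server.

```python
from email.message import EmailMessage


def trap_to_email(trap: dict, sender: str, recipient: str) -> EmailMessage:
    """Format an already-received trap (as a dict) into an email message."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = f"[{trap['severity']}] {trap['appliance']}: {trap['event']}"
    # One "field: value" line per trap field, in a stable order.
    msg.set_content("\n".join(f"{key}: {value}" for key, value in sorted(trap.items())))
    return msg


example = trap_to_email(
    {"appliance": "cloudbank-03", "event": "Fan failure", "severity": "critical"},
    sender="nms@example.com",
    recipient="storage-admins@example.com",
)
# Delivery would use smtplib, for example:
#   smtplib.SMTP("mail.example.com").send_message(example)
```

The receiving host must sit on both the storage network, where the traps arrive, and a network that can reach the mail server.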


Summary

While intense scrutiny has been placed on server-centric and switch-centric virtualization approaches, Pivot3 Serverless Computing has quietly developed a third approach to on-demand infrastructure that integrates server virtualization, for the first time, in a storage platform.

For application environments characterized by high-capacity needs and I/O-intensive workloads, a storage-centric approach offers higher performance, lower cost and higher availability than either switch or server alternatives. By virtualizing RAID block storage and then collapsing the storage and server hardware into a single platform, Pivot3 is uniquely positioned to deliver Storage Centric Computing.

Newer scale-out storage systems spill over with x86 resources and can easily integrate server applications to reduce cost, power, cooling, and rack space while improving availability and simplifying management. By contrast, conventional storage arrays are a poor platform for integrating server virtualization technology for the simple reason that there is not enough compute horsepower available. Customers should look carefully at the predominant workloads of their environments and select virtualization platforms that best meet their requirements. For capacity-rich and I/O-intensive workloads, the benefits of storage-centric platforms based on newer scale-out architectures can be dramatic and should be central to data center consolidation planning.


Copyright © 2009 Pivot3, Inc. All rights reserved. Specifications are subject to change without notice. Pivot3, RAIGE, Pivot3 Serverless Computing, Cloudbank, Databank, NVR Recovery, and High-Definition Storage are trademarks or registered trademarks of Pivot3. All other trademarks are owned by their respective companies. Techover4.1 • June 2009

www.pivot3.com Tel: 877.574.8683 Fax: 281.516.6099
