
Whitewater Cloud Storage Gateway

Best Practices Guide for Backup Applications

Riverbed Technical Marketing

October 2011


TABLE OF CONTENTS

Introduction

Audience

Whitewater Best Practices

Virtual Whitewater Best Practices

Whitewater Best Practices for CA ARCserve

Whitewater Best Practices for Commvault Simpana

Whitewater Best Practices for EMC NetWorker

Whitewater Best Practices for IBM Tivoli Storage Manager

Whitewater Best Practices for Oracle Database Server

Whitewater Best Practices for Quest vRanger

Whitewater Best Practices for Symantec NetBackup

Whitewater Best Practices for Symantec Backup Exec

Whitewater Best Practices for Veeam Backup & Replication

Conclusion

About Riverbed


Introduction

Data protection is a mature practice, and existing solutions are well optimized for fast, reliable backup, typically to local disk, tape, or virtual tape libraries (VTLs). However, traditional backup strategies are limited by an IT organization's ability to accurately size, trend, source, budget for, and manage multiple disks, tapes, and libraries in different locations around the globe, often on different platforms. Moreover, backup data often needs to be restored to a new or distant location, straining corporate WAN links.

Many innovative enterprises are turning to cloud storage to meet both demands. Cloud storage promises an elastic pay-as-you-go storage pool for backups and archives, which not only reduces expensive up-front backup disk and tape costs, but eliminates the need to maintain secondary sites for off-site disk or tape storage.

Riverbed® Whitewater® cloud storage gateways can dramatically improve the capabilities of an existing backup infrastructure by leveraging cost-effective cloud-based storage and eliminating tape infrastructure, reducing the capital and operational costs of managing DR datasets by 30-50% and improving disaster recovery RTO/RPO reliability and efficiency. Whitewater is a disk-to-disk data storage optimization system with unique cloud storage integration. It integrates easily with existing backup applications to securely protect critical production data offsite, without the complexity of tape management solutions or the cost of in-house disaster recovery sites and services.

Add Whitewater as a target for your existing backup infrastructure. The backup server simply connects to Whitewater using the CIFS or NFS protocols. When you back up to Whitewater, it performs in-line, byte-level deduplication of the backup data and replicates the data into the cloud. Whitewater uses its local disk to store enough data for recovery of recent information, providing LAN performance for the most likely restores. Whitewater then writes the backup data to your cloud storage, such as Amazon S3, AT&T Synaptic Storage as a Service, Microsoft Azure cloud storage, Nirvanix Storage Delivery Network, Rackspace Cloud Files, and general instances of EMC Atmos and OpenStack (Swift) Object Storage. Whitewater also optimizes restores from the cloud, as it moves only deduplicated data over the WAN.
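For example, attaching a Whitewater share from a backup server is an ordinary network file system mount. A minimal sketch, assuming a hypothetical gateway hostname whitewater01, an NFS export named /backups, and a CIFS user wwbackup:

# Linux/Unix backup server: mount the Whitewater NFS export
mount -t nfs whitewater01:/backups /mnt/whitewater

# Windows backup server: map the Whitewater CIFS share (the trailing * prompts for the password)
net use W: \\whitewater01\backups /user:wwbackup *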

The following best practices will help you get the most out of the Whitewater cloud storage gateway with your existing backup application. Note that every user environment is different; some of the parameters mentioned in this guide may need further tuning to provide optimal performance.

Audience

This best practices guide is intended for Riverbed customers, partners, and professional services engineers interested in backup application best practices when using Riverbed Whitewater gateways. Previous experience with backup application configuration is highly recommended. Refer to the backup vendor documentation and the Whitewater Deployment Guides for details on terms and options used throughout this document.


Whitewater Best Practices

Prior to configuring Whitewater to a backup application, review the best practices for Whitewater sizing and configuration:

 Size a Whitewater solution based on a clear understanding of the amount of source data that will be backed up, the backup strategy used, the daily change rate, the annual data growth rate, the makeup of the source dataset, and the WAN bandwidth available for replication. Correct sizing will help ensure data is processed and replicated to the cloud within an acceptable time to provide offsite protection. This sizing exercise is best performed in consultation with Riverbed.

For example:

Source Dataset Size: 20TB

Backup Strategy: Using Symantec NetBackup, implementing a Saturday full backup plus daily incremental backup, keeping 4 full backups and 2 weeks of incremental backups in local Whitewater disk cache storage

Daily Change Rate: 5%

Annual Data Growth Rate: 10%

Dataset Makeup: File server

WAN Speed: OC3 (155 Mbps)

Given the requirements above, and assuming estimated deduplication rates of 2x for the first full backup, 20x for subsequent fulls, and 7x for incrementals, this could translate to a requirement for two Whitewater 2010 gateways and one Whitewater 710 gateway, which would collectively hold an estimated 15TB of full backups and 2,000GB of incremental backups in order to store the necessary versions of data for the time frame requested. This result can vary depending on dataset analysis, which can significantly alter the overall backup and deduplication rates achieved with Whitewater for full and incremental backups. A back-of-the-envelope version of this arithmetic is sketched below.
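The sketch below reproduces only the raw arithmetic of this example under the stated deduplication assumptions. It is illustrative; the 15TB figure above includes additional headroom (such as annual growth) beyond this calculation, and actual sizing should be done with Riverbed.

#!/bin/sh
# Back-of-the-envelope Whitewater local cache sizing for the example above.
# Assumed deduplication: 2x first full, 20x subsequent fulls, 7x incrementals.
SOURCE_GB=20480     # 20TB source dataset
FULLS_KEPT=4        # full backups retained in local cache
CHANGE_PCT=5        # daily change rate, percent
INCR_DAYS=12        # roughly 2 weeks of daily incrementals

FIRST_FULL=$((SOURCE_GB / 2))                              # first full at 2x
LATER_FULLS=$(( (FULLS_KEPT - 1) * SOURCE_GB / 20 ))       # later fulls at 20x
INCRS=$(( INCR_DAYS * SOURCE_GB * CHANGE_PCT / 100 / 7 ))  # incrementals at 7x

echo "Full backups:        $((FIRST_FULL + LATER_FULLS)) GB"   # about 13,300 GB
echo "Incremental backups: ${INCRS} GB"                        # about 1,755 GB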

 A Whitewater gateway CIFS folder target should receive backups from only one backup server. This maximizes the potential benefit of deduplication, eliminates potential locking scenarios, and prevents overwrite conflicts from two backup servers attempting to write to the same folder.

 Whitewater is best suited to large, sequential backup workloads, which are typical of backup applications and large database backups that stream a consistent, steady flow of data. Datasets comprising many small files may cause performance bottlenecks in the overall backup solution, including at the Whitewater gateway.

 Backup data from a backup application or database must be provided in an uncompressed, non-deduplicated format in order to achieve maximum storage savings with Whitewater. Since most backup applications do not deduplicate data at as fine a granularity as Whitewater, savings will be lower if a backup application compresses or deduplicates data before sending the stream to Whitewater.

 Backups must be unencrypted, as Whitewater will encrypt those backups using its own encryption mechanism during data ingest from the backup server. The encryption key provided by Whitewater can be backed up as an individual file, or as a part of the Whitewater configuration backup for safe keeping by a system or backup administrator.

 Whitewater gateways should be deployed to maximize all available data paths. As an example, Whitewater 2010 provides four gigabit connections, which should all be configured and accessed by the backup application.

 When possible, reduce the number of simultaneous backup streams written to each data path available on Whitewater.

While multi-streaming backup data from several sessions to the same Whitewater data path will improve overall aggregate traffic on that path, it also reduces per-stream throughput, as well as the efficiency of Whitewater's deduplication and compression, because the data does not arrive in the sequential order expected of a typical single backup stream. In most environments, it is recommended to stream no more than 8 to 10 backup streams to Whitewater at the same time.

 Whitewater folder shares can be configured to help describe a policy target. For example, critical system backups may be directed by a backup application to point to a critical folder on one Whitewater data connection, while non-critical backups may be directed by a backup application to point to a non-critical folder on the remaining Whitewater data connections. This methodology can help balance priorities of data over the network, as well as organize data for recovery in case of a disaster.

 When possible, attempt to organize backup policies such that the most similar backup data sets arrive to the same Whitewater unit. For example, if backing up a Windows server farm to multiple Whitewater gateways, operating system backups will likely have the best deduplication rates when grouped together to the same Whitewater gateway. File and application server backups may see better deduplication when grouped together, due to the likelihood that similar data is stored in each location.


Virtual Whitewater Best Practices

 Requirements for Virtual Whitewater are as described in the following table:

Component       Virtual Whitewater
Virtual CPUs    2 minimum; 4 or more recommended
Physical CPUs   2.3 GHz+ Xeon (or similar)
Memory          6 GB minimum; 8 GB or more recommended
Networking      Adaptor type: Intel E1000
Disk            2 TB; RAID-1 or a high-throughput disk subsystem, separate from the disk subsystem used by the servers being backed up

 Virtual Whitewater must be configured on a 64-bit capable CPU that has virtualization enabled. On supported CPUs, virtualization is enabled in the BIOS of the motherboard. Please visit Intel for information about VT-enabled CPUs, and AMD for information about AMD-V enabled CPUs.

 The largest volume that can be created for the virtual Whitewater datastore is 2 TB minus 512 bytes. This limit is enforced by VMware ESX and ESXi, as described in the VMware knowledge base.

 It is recommended to use a dedicated physical drive if possible for the virtual Whitewater datastore. As this is the device which will deduplicate and store segments from virtual Whitewater activity, sharing this drive with other VMs can impact the overall performance of operations performed to or from a virtual Whitewater.

 Use at least a Gigabit link for interfaces - For optimal performance, connect the virtual interfaces to physical interfaces that are capable of at least 1 Gbps.

 Do not share physical NICs - For optimal performance, assign a physical NIC to a single interface. Do not share physical NICs destined for virtual interfaces with other VMs running on the ESX host. Doing so might create performance bottlenecks.

 Always reserve virtual CPUs - To ensure Virtual Whitewater performance, it is important that the Virtual Whitewater receives a fair share of CPU cycles. To allocate CPU cycles, reserve the number of virtual CPUs for the Virtual Whitewater and also reserve the number of clock cycles (in terms of CPU MHz).

 Do not over-provision the physical CPUs - Do not run more vCPUs than there are physical CPU cores. For example, if an ESX host has a 4-core CPU, all the VMs on the host should use no more than 4 vCPUs in total.

 Use a server-grade CPU for the ESX host - For example, use a Xeon or Opteron CPU as opposed to an Intel Atom.

 Always reserve RAM - Memory is another very important factor in determining Virtual Whitewater performance. Reserve the RAM that is needed by the Virtual Whitewater model, plus 5% more for VMware overhead; this provides a significant performance boost.

 Do not over-provision physical RAM - The total virtual RAM needed by all running VMs should not be greater than the physical RAM on the system.

 Do not use low-quality storage for the datastore disk - Make sure that the Virtual Whitewater disk used for the datastore VMDK uses a disk medium that supports a high number of Input/Output Operations Per Second (IOPS). For example, use NAS, SAN, or dedicated SATA disks.

 Do not share host physical disks - VMware recommends that to achieve near-native disk I/O performance, you do not share host physical disks (such as SCSI or SATA disks) between VMs. While deploying a Virtual Whitewater, allocate an unshared disk for the datastore disk.


Whitewater Best Practices for CA ARCserve

CA ARCserve uses disk-based devices, associated with clients via File System Device (FSD) groups, to store data from client systems. Devices and groups can be viewed and modified from the Quick Start menu by selecting Administration > Device from the drop-down.

Figure 1 Manager Console Device Drop Down

Figure 2 Devices and Groups

 Currently, CA ARCserve allows only one backup operation to use a device at a time. To configure multiple simultaneous backups to Whitewater, configure multiple devices that point to separate folders on the Whitewater gateway and assign each device to a separate FSD group. Then configure each backup job to use one of the configured FSD groups.


Whitewater Best Practices for Commvault Simpana

Commvault Simpana uses disk libraries to store data received from backup client systems. These can be viewed in the CommCell console by browsing the left-pane tree to Storage Resources > Libraries. Library configuration can be done globally and at a per-library level.

Figure 3 CommCell Console Library Properties View

Figure 4 Library Properties > Mount Path Panel

It is recommended to initially set the Allocated Number of Writers value to no greater than 5.


Figure 5 Mount Path Properties > Device Paths Panel

It is recommended to initially set the Allocated Number of Writers value to no greater than 5.


Whitewater Best Practices for EMC NetWorker

EMC NetWorker uses Advanced File Type Device (AFTD) targets to store backup data from client systems. Advanced File Type Device targets can be viewed and modified from the Devices view of the EMC NetWorker Administration screen.

Figure 6 EMC NetWorker Administration Screen

Figure 7 Advanced File Type Device Properties

Target sessions should be set to 1 (the default).

Max sessions should initially be set to no greater than 5. The equivalent change via nsradmin is sketched below.
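The same session limits can also be applied from NetWorker's nsradmin command-line interface. A minimal sketch, run from within nsradmin; the device name is hypothetical, and attribute names should be verified against your NetWorker release:

. type: NSR device; name: /whitewater/aftd1
update target sessions: 1; max sessions: 5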


Whitewater Best Practices for IBM Tivoli Storage Manager

IBM Tivoli Storage Manager uses Device Classes and Storage Pools to store backup data from client systems. Tivoli Storage Manager Device Classes and Storage Pools can be viewed and modified from the Storage Devices Panel of the Tivoli Storage Manager Administration Console.

Figure 8 TSM Device Class Properties

Tivoli Storage Manager uses Whitewater CIFS shares as a FILE device class.

Each FILE device class can point to one or more Whitewater CIFS share(s) via the directory option. It is not recommended to use Whitewater CIFS shares as DISK device class volumes.

Mount Limit should not be set to a value higher than 5 per Whitewater-based FILE device class.

Maximum Volume Capacity should not be set to a value greater than 20GB (20480 MB).

Figure 9 TSM Storage Pool Properties


Reclamation Threshold should be set to a value no lower than 80. Reclaiming volumes with less than 80% reclaimable space can cause unnecessary disk read and write activity.

Maximum number of scratch volumes can be set to any value allowed by Tivoli Storage Manager. To limit the total amount of data stored to the cloud by Tivoli Storage Manager, set a maximum scratch volume limit that accomplishes this size requirement. For example, to use at most 40TB of space in the cloud with a maximum volume capacity of 20GB, set the maximum number of scratch volumes to 40,000GB / 20GB = 2,000.

Number of days before an empty volume is reused should be set to 0 (default). This will immediately allow a volume to be reused for holding new backup data.

Figure 10 TSM Storage Pool Identify Duplicates

When creating a Whitewater based storage pool, leave the Identify the duplicate data in this storage pool checkbox unchecked as shown in Figure 10, and click Next and Finish to complete the wizard. A command-line equivalent of these settings is sketched below.
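For administrators who prefer the command line, the same recommendations can be expressed through the dsmadmc administrative client. A minimal sketch; the device class name, storage pool name, and directory path are hypothetical, and parameters should be verified against your Tivoli Storage Manager release:

define devclass WW_FILE devtype=file mountlimit=5 maxcapacity=20G directory=/whitewater/tsm

define stgpool WW_POOL WW_FILE maxscratch=2000 reusedelay=0 reclaim=80 deduplicate=no

Here maxscratch=2000 caps cloud consumption at 2,000 volumes x 20GB = 40TB, matching the example above, and deduplicate=no corresponds to leaving the Identify the duplicate data in this storage pool checkbox unchecked.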


Whitewater Best Practices for Oracle Database Server

Oracle Recovery Manager (RMAN) is the backup utility provided with Oracle database server; it provides online and offline backup capabilities for Oracle databases. It can be run from the command line, or through Enterprise Manager, which provides a graphical front-end for backup and restore operations. When configuring Oracle backups via Enterprise Manager, configure the target disk and allocated channels appropriately.

Figure 11 Oracle Backup Settings

Parallelism

 Defines the number of channels allocated to stream data to the backup location. It is recommended to set an initial value of 5 for this option.

Disk Backup Location

 Specify the Whitewater CIFS or NFS target that the backup will be written to. This overrides the fast recovery area disk target for backups. It is not recommended to configure a Whitewater CIFS or NFS share as the fast recovery area disk target.

Disk Backup Type

 Specify Backup Set or Image Copy as the backup type.

Test Disk Backup

 Use this button to test the backup configuration with the selected disk backup target.
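These settings can also be made persistent from the RMAN command line. A minimal sketch, assuming a hypothetical NFS mount path for the Whitewater share:

rman target / <<'EOF'
CONFIGURE DEVICE TYPE DISK PARALLELISM 5 BACKUP TYPE TO BACKUPSET;
CONFIGURE CHANNEL DEVICE TYPE DISK FORMAT '/mnt/whitewater/ora/%U';
BACKUP DATABASE;
EOF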

On Unix-based Oracle database systems, it is recommended to use Oracle Direct NFS to improve performance between Oracle and Whitewater. This NFS client is built directly into the Oracle database kernel, and is recommended when using NAS-based disk targets. Further information about Oracle Direct NFS is available here: http://www.orafaq.com/wiki/Direct_NFS. If the Oracle Direct NFS client is being used, the Whitewater NFS share must be configured with the insecure option to allow connections from Oracle Direct NFS. This step is not required when using the default NFS client with Linux or Unix. The insecure option is only available via the Whitewater command-line interface and cannot be set via the Whitewater GUI. Configure the NFS share via the Whitewater GUI, then log in to the command-line interface and issue the following commands:

en
conf t
nfs export modify name <existing-Whitewater-NFS-share-name> insecure
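When Direct NFS is enabled, the Whitewater export is typically also described in oranfstab. A minimal sketch with hypothetical server, path, export, and mount values; consult Oracle documentation for the exact procedure for your release:

# append a Direct NFS entry for the Whitewater export (values are hypothetical)
cat >> /etc/oranfstab <<'EOF'
server: whitewater01
path: 192.168.1.50
export: /backups mount: /mnt/whitewater
EOF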

It is not recommended to configure a Whitewater CIFS or NFS share as the Fast Recovery Area destination target. Because the Fast Recovery Area is intended for localized recovery of Oracle database data (such as online redo logs), Oracle suggests placing it on locally attached disk rather than on a network share.


Whitewater Best Practices for Quest vRanger

Quest vRanger repositories must be configured properly for use with Whitewater. Repositories are accessed via the My Repositories view of the vRanger User Interface.

Figure 12 Add Repository Properties

vRanger repositories must be configured with an individual user account; they cannot access Whitewater shares using the default Whitewater admin account. Create an individual user account on Whitewater, then use that account name when establishing the vRanger repository.

vRanger repositories should not be enabled for encryption. Leave the Encrypt all backups to this repository checkbox unchecked.

Figure 13 Backup Job Options

Compress Backed up Files should be left unchecked.

Enable Active Block Mapping (ABM) should be left checked.


Whitewater Best Practices for Symantec NetBackup

Symantec NetBackup uses Storage Units to store backups from client systems, and Client Policies to dictate how backups are performed by client systems. In addition, media and master servers can set individual Client Attributes for clients backing up to these storage targets. All options below can be accessed via the NetBackup Administration Console.

Figure 14 Storage Unit Properties

 Client backups should always be performed to a dedicated media server.

A media server that also serves as the master server will incur a performance penalty from having to manage the disk activity for incoming backup data as well as metadata activity for the backup job itself.

NetBackup Storage Units use Whitewater CIFS shares as a Disk storage unit type and BasicDisk disk type.

Each Storage Unit will point to one Whitewater CIFS share.

 Configure NetBackup policies to use specific Storage Units, rather than Storage Unit Groups.

Use of Storage Unit Groups can lead to uneven usage of the Whitewater CIFS shares, depending on how policies and the related backup jobs use the storage units.


Maximum concurrent jobs should initially be set to a value of 5 or less within NetBackup Storage Units.

Reduce fragment size to should be set to a value of 20GB (20480 MB) or less within NetBackup Storage Units. A command-line sketch of both settings follows.
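Both limits can also be set when creating the storage unit from the NetBackup command line. A minimal sketch; the label, path, and media server name are hypothetical, and the flag spellings are an assumption to verify against the bpstuadd documentation for your NetBackup version:

# BasicDisk storage unit with 5 concurrent jobs and a 20GB (20480 MB) fragment size
bpstuadd -label WW_STU1 -path /whitewater/stu1 -host mediaserver01 -cj 5 -mfs 20480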

 To back up data via NetBackup 6.5 to a Whitewater CIFS share in a Windows environment, you must first configure the "NetBackup Remote Manager and Monitor Service" and the "NetBackup Client Service". Failure to perform this configuration could result in NetBackup failure status 800, "resource request failed". Please refer to the following link and/or the Symantec NetBackup Administrator's Guide: http://www.symantec.com/business/support/index?page=content&id=TECH56362

1. Open Windows Control Panel.

2. Select Administrative Tools.

3. Select Services.

4. Double-click NetBackup Remote Manager and Monitor Service.

5. Select Stop to stop the service.

6. Select the Log On tab.

7. Select the This Account radio button and enter valid credentials which match the credentials for a Whitewater CIFS user. Refer to the Whitewater User's Guide > Chapter 3 > Configuring CIFS for further information on how to configure a CIFS user account.

8. Select the General tab and select Start to start the service.

9. Repeat steps 4-8 for the NetBackup Client Service.

10. Use Windows Explorer to map a network drive from the NetBackup media server to the Whitewater CIFS share, using the Whitewater CIFS user credentials, to verify that access is available for NetBackup.
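Step 10 can also be performed from a command prompt. A minimal sketch, with hypothetical host, share, and user names:

rem map a drive to the Whitewater CIFS share; the trailing * prompts for the password
net use Z: \\whitewater01\backups /user:wwbackup *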

Figure 15 NetBackup Policy Properties

Compression and Encryption should not be enabled within NetBackup policies.


Allow multiple data streams should be enabled within NetBackup policies.

Disable client-side deduplication should be enabled within NetBackup policies. Performing NetBackup deduplication followed by Whitewater deduplication will impact overall backup performance, due to the resources required to deduplicate the data twice, and may also increase the disk consumed to store that data, resulting in reduced deduplication efficiency on the Whitewater gateway.

Figure 16 Client Attributes Configuration

For each client to be backed up, the Client Attribute Maximum Data Streams parameter should be set to a value up to the number of physical disk volumes that the client will back up.


Whitewater Best Practices for Symantec Backup Exec

Symantec Backup Exec uses Backup-To-Disk Folders to send backups to Whitewater. Backup jobs and backup policies should also be tuned for use with Whitewater. Configuration of both areas can be performed within the Backup Exec GUI.

Figure 17 Backup-To-Disk Properties

Backup Exec uses Whitewater CIFS shares as a Backup-To-Disk Folder device type. Each Backup-To-Disk Folder device will point to one Whitewater CIFS share.

Maximum size for backup-to-disk files should be set to a value that is 20GB (20480 MB) or less within Backup Exec Backup-To-Disk folders.

Allocate the maximum size for backup-to-disk files should be disabled within Backup Exec Backup-To-Disk folders.

Concurrent Operations should be set to an initial value of 5 or less within Backup Exec Backup-To-Disk folders that are duplication targets of local Backup Exec staging volumes. Backups of Exchange servers must be directed to local staging volumes first.

 If the Backup-To-Disk folder is the direct target of a backup job or backup policy, use a separate share on Whitewater for each concurrent backup job. Backup Exec by default sets the number of concurrent jobs on a Backup-To-Disk Folder to 1.


Figure 18 General Settings for Backup Jobs and Backup Policies

It is suggested that Verify after backup completes be disabled for backup jobs and backup policies. If enabled, it consumes additional Whitewater time and resources to validate data for Backup Exec, which can impact performance if Whitewater is processing multiple backups.

Figure 19 Network and Security Settings for Backup Jobs and Backup Policies

Encryption type should be set to none for backup jobs and backup policies.


Figure 20 Exchange GRT Options

 Backup Exec GRT for Exchange uses multiple technologies to back up Exchange data, such as mailboxes, databases, and transaction logs. These backups may not succeed if written directly to a Whitewater share, as backing up Exchange transaction logs uses multiple open file handles, which Whitewater was not designed to handle. To perform backups of Exchange, set up your Exchange backup job to first back up to a local staging area, and then duplicate the data from the local staging area to the Whitewater share. This approach can also increase network throughput and deduplication factors.


Whitewater Best Practices for Veeam Backup & Replication

Veeam Backup & Replication performs backup jobs directly to CIFS-based targets, and must be configured through the backup wizard to use Whitewater CIFS shares. The backup wizard icon is located in the main GUI interface (Figure 21).

Figure 21 Backup Wizard Icon

Figure 22 Configure Backup Destination

 Configure the backup target to the Whitewater CIFS share name, using Universal Naming Convention (UNC) format (\\hostname\sharename).


Figure 23 Advanced Job Settings Storage Tab

Via the Advanced job settings button, configure the Storage options to disable deduplication and compression.

 Configure the storage optimization for a LAN target.

Figure 24 Advanced Job Settings vSphere Tab

Via the Advanced job settings button, configure the vSphere options to enable Change Block Tracking to improve incremental backup performance to Whitewater.

When using Veeam for vPower recovery operations (such as Instant VM Recovery) with a Virtual Whitewater, it is recommended to configure the Virtual Whitewater appliance to its recommended resource settings (4 vCPUs or higher, 8GB RAM or higher) and follow the Virtual Whitewater best practices, since Whitewater will be handling a significantly higher workload.


Conclusion

Riverbed is extending its industry-proven deduplication to cloud storage. Riverbed deduplication makes replication more efficient than in the past by reducing the amount of data that is transferred. Now the Whitewater gateway extends those industry-leading deduplication capabilities to cloud storage, cutting the cost of data backup and disaster recovery by significantly reducing the capital and operational costs of the storage consumed, as well as the bandwidth requirements for moving backup data into and out of the cloud. Backup policies can easily be reconfigured to take advantage of Riverbed's core technology strengths, resulting in improved backup and recovery and increased availability of data to users.

About Riverbed

Riverbed delivers performance for the globally connected enterprise. With Riverbed, enterprises can successfully and intelligently implement strategic initiatives such as virtualization, consolidation, cloud computing, and disaster recovery without fear of compromising performance. By giving enterprises the platform they need to understand, optimize and consolidate their IT, Riverbed helps enterprises to build a fast, fluid and dynamic IT architecture that aligns with the business needs of the organization. Additional information about Riverbed (NASDAQ: RVBD) is available at www.riverbed.com.

Riverbed Technology, Inc.
199 Fremont Street
San Francisco, CA 94105
Tel: (415) 247-8800
www.riverbed.com

Riverbed Technology Ltd.
Farley Hall, London Road, Level 2
Binfield, Bracknell
Berks RG42 4EU
Tel: +44 1344 401900

Riverbed Technology Pte. Ltd.
391A Orchard Road #22-06/10
Ngee Ann City Tower A
Singapore 238873
Tel: +65 6508-7400

Riverbed Technology K.K.
Shiba-Koen Plaza Building 9F
3-6-9, Shiba, Minato-ku
Tokyo, Japan 105-0014
Tel: +81 3 5419 1990
