vSphere Virtualization and Data Protection without Compromise

Scott D. Lowe

Rick Vanover

Tom Gillispie

Brien Posey

Introduction

Setting Expectations

Should Tier-1 Applications be Virtualized?

Strategies for Protecting Virtual Machines

Considerations for Organizing Virtual Environments

What about Agentless Backups?

Backup Storage Considerations

Is Tape Still Necessary?

How ExaGrid Can Help

How Veeam Can Help

Today companies want it all: the highest virtualization and data protection performance, the best deduplication ratios, and restores that move at the speed of business. With the right approach, that infrastructure utopia is achievable today. This Technical Brief from Veeam and ExaGrid shows how you can transform your vSphere investment into an infrastructure that lets you have it all.

To that end, this Technical Brief covers a number of topics: VMware virtualization, protection for the Modern Data Center with Veeam, and scalable disk-based backup with ExaGrid.

For much of their history, server virtualization and data protection have been somewhat at odds with one another. Virtualization offered tremendous potential for decreased hardware costs and simplified management. At the same time, however, server virtualization initially made data protection much more difficult because the backup software of the time had trouble coping with the virtualization stack.

Today, administrators no longer have to choose between server virtualization and data protection. Both vSphere and related backup products have had time to mature, and VMs can now be reliably protected without trading performance for cost or compromising on robustness.

Setting Expectations

The last several years have seen a number of changes in IT. One of the biggest changes perhaps is that executive and end user expectations have changed tremendously. Server virtualization has matured to the point that new resources can be spun up on a whim, and can be made available almost immediately.

The problem with these expectations is that in many organizations the backup solutions that are currently in place lag behind the virtualization infrastructure. Businesses have grown accustomed to getting what they want, when they want it. These expectations have crept into the area of disaster recovery. Businesses expect the IT department to perform recovery operations and to “get their stuff back” with the click of a mouse.

Although single-click recovery might not be completely realistic, technologies do exist that allow for virtual machine recovery (or for recovery of granular items within a virtual machine) to happen almost instantaneously. It is clearly in an administrator’s best interest to invest in such solutions. Not only will the administrator’s life be easier if they are able to meet expectations, but downtime costs the organization money, and faster recovery translates directly into cost savings.

From a Fellow VM Admin – Rick Vanover, Veeam Product Strategy Specialist

“There has always been a critical disconnect with the business’ expectations and what the infrastructure has traditionally been capable of doing. Today, we really can do it all. VMware vSphere is a great platform to run our workloads on scale, and with the right management and data protection strategies we can allow the infrastructure to move at the speed of the business, including when things go wrong.”

Should Tier-1 Applications be Virtualized?

Although server virtualization has been a mainstream technology for a number of years, administrators have historically been reluctant to virtualize servers that are running tier-1 applications. In some situations, these applications could not be virtualized due to hypervisor limitations (such as the inability to attach a VM directly to Fibre Channel storage). More often, however, there were concerns around reliability and the ability to protect the VM. Today, vSphere is a mature product and hypervisor limitations are much less of a factor than they once were. As such, the acceptability of virtualizing servers running tier-1 applications is often determined by factors such as performance and reliability. In essence, administrators must be able to guarantee that mission-critical applications perform just as well and function as reliably in virtual environments as they did when running on physical hardware.

From a Fellow VM Admin - Scott D. Lowe, Independent IT Consultant

“Too many times an application design that works for a physical configuration doesn’t work so well in a virtualized environment. Today, there are plenty of best practice resources to virtualize critical applications such as SQL Server and Exchange on vSphere.”

The same basic philosophy also extends to data protection. Compromising protection for the sake of running an application on virtual hardware is unacceptable. Data protection must be at least as good for the vSphere environment as it was for applications running on physical hardware.

Although it might at first seem somewhat counterintuitive, virtual environments can actually provide better performance, reliability, and protection than is possible in a physical environment. For instance, the virtualization infrastructure provides failover clustering capabilities for servers that are running applications that could not otherwise be clustered.

Strategies for Protecting Virtual Machines

Although it is ultimately up to the backup software to provide recoverability for your vSphere environment, there are some best practices that you should be adhering to as a way of helping to protect your virtual machines.

One such best practice is to define vApps for your multi-tier applications. A vApp is really nothing more than a container for a collection of virtual machines. A vApp offers resource control and portability for the VMs that are stored within the container. Furthermore, vApps allow the entire collection of VMs within to be powered on or off, suspended, shut down, or even cloned collectively.

vApps can make it a lot easier to manage multi-tier applications that span multiple virtual machines. They can also help you to protect those applications. Not every VM requires the same level of protection, and administrators will often define a series of backup jobs as a way of offering different levels of VM protection. vApps can help with this approach, because it may be possible to define a single backup job for the entire vApp rather than trying to protect each application server individually. By defining a series of vApps, administrators can potentially reduce the number of backup jobs that they must manage.

“Without a doubt, one of the most difficult things to communicate today is the need to change the application. When companies see the benefits associated with virtualization coupled with the rich restore options, those benefits simply can’t be ignored. But we can’t design applications the same way. They must fit into the VMware environment well, and the good news is that today there are plenty of resources to help application and VMware administrators get there.”

One thing that you must keep in mind with regard to container usage is that, depending on the backup software that you are using, backup jobs tend to be dynamic. This means that if you add a VM to a container, a container-specific backup job may automatically be modified to accommodate the new VM. As such, it is important to consider the impact that container modifications will have on your backup jobs prior to adding VMs to a container.
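
The dynamic behavior described above can be illustrated with a minimal Python sketch. It is not tied to any backup product: the `BackupJob` class, the `resolve_scope` method, and the `inventory` dictionary are hypothetical names used only to show a job that looks up container membership each time it runs.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class BackupJob:
    """A job scoped to a container (such as a vApp), not to a fixed VM list."""
    name: str
    container: str

    def resolve_scope(self, inventory: Dict[str, List[str]]) -> List[str]:
        # Membership is looked up at run time, so the job follows the container.
        return sorted(inventory.get(self.container, []))


# Hypothetical inventory: container name -> VMs currently inside it.
inventory = {"crm-vapp": ["crm-web01", "crm-db01"]}
job = BackupJob(name="crm-nightly", container="crm-vapp")

first_run = job.resolve_scope(inventory)

# A VM is added to the container; the job definition itself is never edited.
inventory["crm-vapp"].append("crm-app01")
second_run = job.resolve_scope(inventory)
```

In this sketch the second run protects three VMs even though nobody touched the job, which is exactly why container changes deserve a moment of thought before they are made.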

So what about VMs that are not part of a container? Container-specific backup jobs tend to be most appropriate in situations in which a multi-tier application (or a set of VMs within a container) has very specific protection requirements. Under normal circumstances it is a good idea to perform host-level backups whenever possible. Doing so allows for the creation of application-consistent, image-based backups.

Considerations for Organizing Virtual Environments

One of the primary benefits provided by server virtualization is that virtual environments tend to be far more flexible than physical datacenters. This flexibility makes it relatively easy to create a logical structure that aligns with the organization’s business processes. However, one of the best strategies for protecting your virtual datacenter is to organize your virtual machines in a way that makes it easy to achieve your data protection goals as well.

Perhaps the single most important thing that an administrator can do to ensure the health of their vSphere environment while also guaranteeing the ongoing ability to achieve the required level of protection is to develop a strategy for controlled growth.

From a Fellow VM Admin - Scott D. Lowe, Independent IT Consultant

“While virtualizing with vSphere is a technology-rich set of features and software products, there is just as much of a business discussion that needs to happen. There need to be clear expectations on what level of service applications receive. This includes not just running them, but also protecting them.”

All too often, administrators are guilty of deploying new virtual machines without doing much planning. In some ways, the organization’s expectations play into this problem. Administrators are under tremendous pressure to deliver nearly instantaneous results, so it is no wonder that virtual machines tend to be deployed very quickly. The problem with rapid VM deployment is that simply checking to make sure that vSphere has sufficient capacity to accommodate the new VM is inadequate. In addition to the required capacity, there must be sufficient storage bandwidth available to provide the required IOPS for the new VM, without introducing any extra latency in the process. As an administrator, you never want to be in a position where creating a new VM causes the demand for storage IOPS to exceed what your underlying storage infrastructure can deliver.

While these types of challenges are most commonly thought of in conjunction with a VM’s use of primary storage, it is just as important to make sure that backup storage is also able to cope with the added workload caused by the need to back up a new VM.

From a Fellow Backup Storage Admin - Tom Gillispie, ExaGrid Director Application Interop & Product Management

“Many times people forgo giving backup storage the same consideration as primary storage. That’s natural; it’s a different storage administration practice. Further, it’s difficult to balance all interests: performance, costs, features such as deduplication and more.”

This is where the concept of controlled growth comes into play. It is important for administrators to monitor performance metrics such as latency, read and write IOPS, and capacity, and have a plan in place to provide additional hardware before any predetermined thresholds are exceeded.
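
As a minimal sketch of this kind of monitoring, the Python below flags any metric that has reached 90 percent of its limit, so that hardware can be planned before the limit itself is hit. The threshold values, the `warn_at` fraction, and the `breached` function are all illustrative assumptions; real limits depend on the storage platform and the service levels agreed with the business.

```python
# Hypothetical limits; real values depend on the storage platform and SLAs.
THRESHOLDS = {
    "latency_ms": 20.0,
    "read_iops": 40000,
    "write_iops": 20000,
    "capacity_used_pct": 80.0,
}


def breached(metrics, thresholds=THRESHOLDS, warn_at=0.9):
    """Return the names of metrics that have reached warn_at of their limit."""
    return sorted(
        name
        for name, limit in thresholds.items()
        if metrics.get(name, 0) >= warn_at * limit
    )


# Hypothetical sample taken from a monitoring system.
sample = {
    "latency_ms": 12.5,
    "read_iops": 37000,
    "write_iops": 9000,
    "capacity_used_pct": 71.0,
}
needs_attention = breached(sample)
```

Warning at a fraction of the limit, rather than at the limit, is the design choice that leaves time to procure and install hardware.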

There are a number of different ways to plan for controlled growth, but one especially good approach involves the creation of pods. Pods are dedicated collections of hardware resources such as host servers and storage arrays. There are vendors who sell prebuilt pods, but pods can also be created in house.

The idea behind using pods is that each pod represents a finite set of resources. Administrators keep track of resource consumption, and when consumption levels near the predetermined threshold a new pod is deployed. The advantage to this approach (besides ease of management) is that each pod’s resources are self-contained. As such, administrators do not encounter the growing pains that so often occur as a result of adding resources to an existing cluster. For example, adding an extra host to a cluster can stress the cluster storage, whereas each pod is connected to a dedicated LUN and so the addition of a new pod has no impact on existing storage performance.
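
The pod approach can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it tracks a single resource (IOPS), and the `Pod` class, the `place_vm` function, the 10,000 IOPS budget, and the 80 percent threshold are hypothetical values, not vendor figures.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Pod:
    """A self-contained set of hosts and storage with a fixed IOPS budget."""
    name: str
    iops_budget: int
    threshold_pct: int = 80  # hypothetical: stop filling a pod at 80% of budget
    vms: List[str] = field(default_factory=list)
    iops_used: int = 0

    def has_room_for(self, iops: int) -> bool:
        # Integer arithmetic avoids floating-point surprises at the boundary.
        return (self.iops_used + iops) * 100 <= self.threshold_pct * self.iops_budget


def place_vm(pods: List[Pod], vm: str, iops: int, pod_budget: int = 10000) -> Pod:
    """Put the VM in the first pod with headroom; otherwise deploy a new pod."""
    for pod in pods:
        if pod.has_room_for(iops):
            break
    else:
        # No existing pod has headroom, so existing pods are left untouched.
        pod = Pod(name=f"pod-{len(pods) + 1}", iops_budget=pod_budget)
        pods.append(pod)
    pod.vms.append(vm)
    pod.iops_used += iops
    return pod


pods: List[Pod] = []
for vm_name, vm_iops in [("sql01", 5000), ("exch01", 3000), ("web01", 1000)]:
    place_vm(pods, vm_name, vm_iops)
```

Here the third VM lands in a second pod because the first has reached its threshold, and the workloads already running in the first pod see no change.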

What about Agentless Backups?

Although it is undeniably important to achieve comprehensive protection for virtual machines, manageability is also an important consideration. VM proliferation tends to occur over time, and as an organization’s VM count increases the difficulty of managing backup agents is also compounded. As such, administrators should strongly consider using an agentless backup solution.

Companies today may struggle with the best way to back up VMs. The reality is that the number of VMs has increased, but the operational processes around them may not be able to keep pace. One way to address this is to leverage the vSphere constructs and perform agentless image-based backups. Constructs such as the vApp, Cluster, Datastore and more can function as containers to move data. The best part is that when the backup job runs again, it will query vCenter Server to determine what components are now in the constructs and back them up accordingly. This can be done with an agentless processing engine with full awareness of the vSphere environment.

“Everyone has their story on how an agentless backup has let them down in the restore process. I’m convinced that you can have it all today with an agentless backup: application consistency, granular recovery and log truncation. Veeam Backup & Replication makes that easy!”

In the past, agents played a major role in the backup process. Backup agents not only facilitated communications between the protected resource and the backup server, but also allowed for granular recovery and application consistency.

Agentless backup products have existed for a number of years, but previously had a reputation for delivering subpar protection. Some of the earlier products that supported agentless backups gave administrators a choice of whether or not to use agents, but going agentless usually meant giving something up. The only way to achieve comprehensive protection with such products was to make use of agents (thereby negating the benefits of agentless backups).

Today this is no longer the case. Agentless backups have become a truly viable option. Backup applications exist that can create agentless backups without making any compromises. Some such products provide true application consistency (not just crash consistency), including the ability to perform log file truncation when necessary. It is also possible to achieve granular recovery using agentless backup solutions. Some such products can recover host servers, individual VMs, and files, folders, and applications within virtual machines.

Some of today’s agentless backup products are every bit as good as backup applications that still rely on agents. However, it is important to review an agentless backup product’s capabilities prior to making a purchase since some of the agentless backup applications being sold today still have major limitations.

Backup Storage Considerations

Disk-based backups have quickly become the norm because they offer an unprecedented level of flexibility that simply cannot be achieved through tape. Even so, not all storage is created equal. A dedicated backup appliance, for example, is typically going to do a much better job than JBOD storage because the appliance contains features and capabilities that are specifically geared toward the backup process.

There are also significant differences among backup appliances. Every backup storage vendor offers their own set of features and capabilities. These features and capabilities have a direct impact on the efficiency with which backup and restore operations can be completed.

At first it is easy to dismiss the notion of operational efficiency. Most modern disk-based backup solutions are designed to perform continuous data protection. Because continuous data protection is ongoing, the backup window is eliminated, and it seemingly becomes much less important to use high-performance hardware that can complete a backup as quickly as possible.

The problem with this logic, however, is that it does not take restore operations into account. When disaster strikes and a restore must be performed, time is almost always of the essence. Every minute that critical resources are unavailable costs the organization money. As such, it makes sense to invest in backup appliances that are able to perform restorations with optimal efficiency.

From a Fellow Backup Storage Admin - Tom Gillispie, ExaGrid Director Application Interop & Product Management

“Deduplication is the top priority when it comes to disk-based backups. But all deduplication appliances function differently. ExaGrid’s approach is truly ready for the future in that there are no forklift upgrades required, because the GRID can scale as you grow in storage as well as processing power, connectivity and memory.”

Of course this raises the question of what you should look for in a backup appliance. Obviously features such as high-speed data links, caching or a disk landing zone, and the use of multiple spindles are important, but sometimes the way in which certain features are implemented is just as important as the features themselves.

Take deduplication, for example. Deduplication has become a mainstream technology for storage appliances, and it can dramatically decrease storage costs by allowing capacity to be used much more efficiently. Even though deduplication is undeniably important, it can have an impact on an appliance’s overall performance. As such, deduplication must be implemented in a way that does not sacrifice performance for storage efficiency. One of the biggest problems with deduplicating backup data is that the deduplication process can slow down restore operations because the data must be rehydrated before it can be restored. Some storage appliance vendors get around this problem by taking data age into account when performing post-process deduplication. By doing so, it is possible to ensure that the most recently written data remains in its original, non-deduplicated form, thereby allowing the most recent backup to be restored without first having to be rehydrated. This approach allows the backup appliance to take advantage of the benefits that deduplication can provide without sacrificing performance in the event that it is necessary to restore a recent backup.
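
To make the age-aware idea concrete, here is a toy Python model. It is a sketch under simplifying assumptions, not a description of how any particular appliance works: the `AgeAwareStore` class is hypothetical, chunks are a fixed four bytes so the example stays readable, and only the single newest backup is kept in full.

```python
import hashlib
from typing import Dict, List, Optional, Tuple

CHUNK = 4  # tiny fixed chunk size, chosen only to keep the example readable


class AgeAwareStore:
    """Keeps the newest backup in full; deduplicates a backup once it ages out."""

    def __init__(self) -> None:
        self.latest: Optional[Tuple[str, bytes]] = None  # (backup id, raw data)
        self.chunks: Dict[str, bytes] = {}               # digest -> chunk
        self.recipes: Dict[str, List[str]] = {}          # backup id -> digests

    def ingest(self, backup_id: str, data: bytes) -> None:
        # The previous newest backup ages out and is deduplicated post-process.
        if self.latest is not None:
            self._deduplicate(*self.latest)
        self.latest = (backup_id, data)

    def _deduplicate(self, backup_id: str, data: bytes) -> None:
        recipe = []
        for i in range(0, len(data), CHUNK):
            chunk = data[i:i + CHUNK]
            digest = hashlib.sha256(chunk).hexdigest()
            self.chunks.setdefault(digest, chunk)  # store each chunk only once
            recipe.append(digest)
        self.recipes[backup_id] = recipe

    def restore(self, backup_id: str) -> Tuple[bytes, bool]:
        """Return (data, rehydrated); the newest backup needs no rehydration."""
        if self.latest is not None and self.latest[0] == backup_id:
            return self.latest[1], False
        data = b"".join(self.chunks[d] for d in self.recipes[backup_id])
        return data, True


store = AgeAwareStore()
store.ingest("mon", b"AAAABBBBCCCC")
store.ingest("tue", b"AAAABBBBDDDD")
latest_data, latest_rehydrated = store.restore("tue")
older_data, older_rehydrated = store.restore("mon")
```

Restoring the newest backup is a straight read, while the older backup has to be reassembled from its chunks first. That reassembly is the rehydration cost the age-aware approach keeps away from the most likely restore.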

Another way to minimize the impact of the deduplication process is to take a multi-tier approach. This approach involves the backup software and the backup appliance working together to perform global deduplication. The basic idea behind this concept is that most modern backup software can perform deduplication at the backup server level. The advantage to allowing the backup server to deduplicate the backup data is that the deduplication process is offloaded from the backup storage appliance, thereby improving the appliance’s performance. However, performing deduplication at the backup server level may not always deliver optimal results.

To see why this is the case, imagine what would happen if a backup server were to deduplicate backup data on a per job basis. The data within each job is deduplicated, but there might still be duplicate data across backup jobs. Allowing the backup appliance to perform a second round of deduplication eliminates cross job duplicate data. This same approach also works well in situations in which multiple backup servers are being used.
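
The two rounds of deduplication can be sketched as follows. This Python example is illustrative only: `dedupe_job` stands in for per-job deduplication on the backup server, the hypothetical `Appliance` class stands in for the appliance's global index, and whole byte strings play the role of data blocks.

```python
import hashlib
from typing import Dict, Iterable, List


def digest(block: bytes) -> str:
    return hashlib.sha256(block).hexdigest()


def dedupe_job(blocks: Iterable[bytes]) -> List[bytes]:
    """First tier: the backup server drops duplicate blocks within one job."""
    seen = set()
    unique = []
    for block in blocks:
        key = digest(block)
        if key not in seen:
            seen.add(key)
            unique.append(block)
    return unique


class Appliance:
    """Second tier: one index shared by every job and every backup server."""

    def __init__(self) -> None:
        self.index: Dict[str, bytes] = {}

    def store(self, blocks: Iterable[bytes]) -> int:
        """Store blocks and return how many were actually new."""
        new = 0
        for block in blocks:
            key = digest(block)
            if key not in self.index:
                self.index[key] = block
                new += 1
        return new


# Two hypothetical jobs that share an operating system block.
job_a = [b"os", b"os", b"sql-data"]
job_b = [b"os", b"exchange-data", b"exchange-data"]

sent_a = dedupe_job(job_a)
sent_b = dedupe_job(job_b)

appliance = Appliance()
new_a = appliance.store(sent_a)
new_b = appliance.store(sent_b)
```

Each job sends two blocks after the first tier, but the appliance stores only three in total, because the shared block from the second job is already in its index.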

Is Tape Still Necessary?

Various IT analysts and storage vendors have been proclaiming tape’s demise for nearly a decade. While such proclamations could best be described as premature, backup technology has evolved considerably over the last several years, and disk backups are far more prevalent than they were just a couple of years ago. As such, it seems fair to once again consider the question of whether or not tape is still necessary.

One must consider the ways in which tape has traditionally been used. There are a few different use cases for tape. First, some organizations use tape as a primary backup medium. As demonstrated in the previous section, disk storage can easily replace tape for this purpose.

A second use case for tape is for the purposes of data portability. The only way to truly protect data is to have multiple copies of that data. Ideally, one copy should exist on premise, while another copy is stored safely off premise. Tape is often used as a convenient medium for shipping data offsite.

A third use case for tape is that of long-term archiving. Byte for byte, tape has historically been a less expensive medium than spinning disk. As such, large organizations that need to retain data over a long period of time have often relied on a multi-tier storage architecture in which aging data is archived to tape in an effort to decrease consumption on primary storage.

Although tape works well for each of these use cases, tape is no longer the only available choice. Today there are disk-based storage solutions that can do just as good a job as tape for these purposes.

In the case of offsite storage, most modern storage appliances offer a data replication feature which allows data to be replicated to a remote datacenter. Assuming that adequate bandwidth is available, storage replication can actually be considered to be superior to tape-based storage for a few different reasons.

First, storage replication sends data offsite in near real time. In the case of tape-based storage, data is shipped to the remote location at periodic intervals. Because the tape remains on premise for a period of time, there is a potential for data loss should a fire or other disaster occur before the tape can be shipped.

A second reason why storage replication can be considered superior to tape-based storage is that there are certain hazards associated with shipping a tape. The tape could be misplaced or damaged in transit. There is also a potential for tape theft, which could lead to data exposure. Storage replication eliminates these concerns because data is never physically shipped. The storage devices remain in the datacenter at all times.

Today tape is probably most commonly used as an archive solution. Cost and the capacity limitations of spinning media have traditionally made it a poor choice for long-term storage of archive data. Recently however, archiving data to spinning disks has become much more practical. Not only have storage costs decreased dramatically in recent years, but deduplication technology has matured to the point that it is safe for use with even the most important data. Modern archiving solutions combine deduplication with changed block tracking to create archives (monthly, quarterly, annual, etc.) without consuming excessive storage.

In an era where disk backup systems are rich in options, cloud is a consideration, and tape is still an option, companies have a lot to choose from.

How ExaGrid Can Help

Landing Zone: VM Instant Recoveries in Minutes – ExaGrid’s unique landing zone means that with ExaGrid and backup applications that support instant VM recovery, you can have the benefits of both instant restore and deduplication with no tradeoff. ExaGrid’s high-speed landing zone maintains a full copy of the latest VM backups in their original, non-deduplicated formats. This allows IT staff to instantly boot a VM from the ExaGrid system for fast recovery, helping businesses to get what they want, when they want it.

Additional Data Deduplication – ExaGrid’s deduplication improves the overall deduplication rate. The total deduplication is the Veeam deduplication times the additional ExaGrid deduplication. As a result, at a certain retention, a dedicated ExaGrid appliance will cost far less than straight disk. ExaGrid has a calculator that shows how much disk is required at each retention point and the resulting cost. The calculator also shows the cutover point at which an ExaGrid appliance will actually cost less than straight disk. Due to the additional deduplication, that cutover point comes at retention periods longer than four weeks.
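
As a rough illustration of how the two deduplication ratios multiply, consider the Python sketch below. The function names are hypothetical, and the 2:1 and 5:1 ratios and the 100 TB figure are made-up inputs chosen for easy arithmetic, not published Veeam or ExaGrid numbers.

```python
def combined_ratio(software_ratio: float, appliance_ratio: float) -> float:
    """Total deduplication is the product of the two stages."""
    return software_ratio * appliance_ratio


def stored_tb(logical_tb: float, software_ratio: float, appliance_ratio: float) -> float:
    """Disk actually consumed for a given amount of logical backup data."""
    return logical_tb / combined_ratio(software_ratio, appliance_ratio)


# Made-up inputs: 2:1 in the backup software, a further 5:1 on the appliance.
total = combined_ratio(2.0, 5.0)
disk_needed = stored_tb(100.0, 2.0, 5.0)
```

With these assumed inputs the combined ratio is 10:1, so 100 TB of logical backup data would occupy 10 TB of disk. Actual ratios vary with data type and retention, which is why a vendor calculator is the better guide for real sizing.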

Scale-out Architecture – ExaGrid’s scale-out architecture adds full-server appliances with processor, memory, network bandwidth, and disk into a scalable GRID system. As the data grows, so do all four resources. By adding compute with capacity, the backup window stays fixed and does not grow over time. The GRID allows for modular growth when the customer needs it, without having to rip and replace an appliance that has been outgrown, with no forklift upgrades. Whether it’s virtualized tier-1 applications, vApps, or agentless backups, ExaGrid’s scale-out architecture ensures you will have the proper resources to protect it.

Disk-based Backup with Deduplication and Replication – Achieve the near real-time replication of backups along with cost-effective longer term retention required by today’s business environments, while reducing or eliminating the high cost of tape-based data protection.

Learn more about ExaGrid’s five guaranteed customer commitments: 1) The shortest backup window, 2) no backup window growth regardless of data growth, 3) fastest restores, tape copies, and recovery from a disaster, 4) VM instant recoveries in minutes, and 5) lowest cost solution up front and over time with no forklift upgrades or obsolescence, and ExaGrid’s price guarantee. All this plus world-class customer support. www.ExaGrid.com

ExaGrid Systems, Inc. | 2000 West Park Drive | Westborough, MA 01581 | 800.868.6985 | www.exagrid.com

ExaGrid reserves the right to change specifications or other product information without notice. ExaGrid and the ExaGrid logo are trademarks of ExaGrid Systems, Inc.

How Veeam Can Help

Veeam Backup & Replication is built to address the challenges identified in this brief and, more importantly, to let you restore data as fast as you can provision it. One example is Veeam’s Instant VM Recovery. This technique can allow a VM to be booted up and running in as little as two minutes. This can truly save the day if someone accidentally deletes a VM or you have a SAN failure.

Why is this important? Simply put, things can go wrong in a fully abstracted world. Further, the business moves fast. This is where agentless backups can make or break things today. Agentless backups, coupled with a full-featured image-based backup, can meet all of the requirements that today’s applications have. One example is Veeam Explorer for Exchange, which gives you instant visibility into your Exchange backups. You can browse, search and selectively export items (emails, notes, contacts, etc.) directly from Veeam backups.

Instant VM Recovery and Veeam Explorer for Exchange are just two examples, but see for yourself. Download a trial of Veeam Backup & Replication today at www.Veeam.com.
