Published: 13 Nov 2009
Migrating your servers to virtual machines comes with many obvious benefits: hardware consolidation, performance management, and easier disaster recovery, to name a few.
Some of the drawbacks, however, come from your old infrastructure. Backup systems in particular can come under strain because most were designed to support a fixed number of physical servers. Now that your server count can grow rapidly and the old assumptions about resource use go out the window, you need to address your backup solution before virtualization causes backups to get out of control.
Take the opportunity to build backup planning into new virtualization projects so you can intelligently compare your current approach with the alternatives that are available.
Big Mistake #1: Let it ride and do nothing
Agent-based backup is a common method for backing up servers. This involves installing the backup agent on your servers and pushing settings to it from your backup server. You can then back up those servers with nearly the same settings -- save for some application-specific differences -- at a specific time of day and within a specific time window.
Now take that same approach and apply it to virtual servers. First, you'll have a dozen or more servers running on one physical hardware host. There are also a dozen or more backups running on the same hardware with the same CPU resources, accessing disk I/O, and likely backing up much of the same data all at the same time. Add to this scenario that virtualized environments tend to grow much more rapidly than their physical equivalents, and you can see how you will begin to tap out your virtual host resources as well as your backup infrastructure.
Big Mistake #2: Change everything without thinking
Virtualization and backup vendors will sell you a whole suite of tools to deal with the problems inherent to virtual backups, but you need to think these methodologies through, especially since some solutions come with an invoice well into five or six digits and force you to throw out what you already have. Plus, they all come with their own list of pros and cons.
One backup method is host-based backup, which entails backing up the virtual machines as files. All virtual machines are really just big files on disk storage, so backing up those files from the host, without using resources inside each guest, lets you capture a whole virtual machine in one fell swoop. This allows you to back up those virtual machines using one backup job, reduces the number of agents you install and track, and minimizes the effect of resource-intensive features such as agent-side software compression.
With this solution you can lose the granularity of an agent on every virtual machine. It's much more difficult to retrieve a single file from a specific backup, let alone know where it is amongst the big files that make up your whole VM backups. You lose the file-level cataloging that most of us have come to rely on for small restorations.
In this situation, you trade simplicity in backup for complications during restoration. You also have the possibility of a real headache when dealing with features like VMware's VMotion, which allows a virtual machine to move between hosts. This can make it very difficult to track down the correct virtual machine based on a host backup.
When approaching backup, you don't necessarily need to throw the baby out with the bathwater. The fact is, an agent running on each machine brings with it significant benefits. You are using a proven method of backup that has been working in your environment for years. The agent gives you granularity. You also have a full catalog of which files were backed up and when. The downsides are the ones described earlier: contention for host resources and pressure on your backup infrastructure.
Consolidated backup, also known as hot backup or off-host backup, is a virtualization-specific backup technology. It takes a snapshot of a virtual machine and transfers it to a backup proxy server, either over the LAN or, more likely, over a storage area network (SAN). The backup proxy will take care of tracking the contents of those snapshots and where they belong; it will catalog everything.
Restoring a set of files requires that the proxy mount the virtual hard disk and pull the data. The downside is the extra infrastructure. This will require new hardware, usually disk-based backup storage, and software as part of a consolidated backup solution. This solution is slick, but it can be expensive and it requires you to start over with a new approach to backup.
Big Mistake #3: Quick decisions
It will take some work to come to the right decision for your organization. To pick the right option, look at your environment as it stands. Will the traditional agent on each server -- even virtual servers -- work for you? Deploy those agents on a test bed of VMs and measure CPU, network throughput, and time to complete. VMware technologies and Microsoft Hyper-V have their own counters you can collect information from to see the effect on the host machine.
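As a rough illustration of what to do with those test-bed numbers, the sketch below turns a few measured runs into a throughput figure and a projected time for the full data set. The `AgentTrial` record, its field names, and the sample values are hypothetical; the figures would come from whatever counters your hypervisor and backup software expose. The projection assumes jobs run one after another at the average measured rate, so treat it as a rough estimate rather than a prediction.

```python
from dataclasses import dataclass


@dataclass
class AgentTrial:
    """One measured agent backup on the test bed (hypothetical record layout)."""
    vm_name: str
    gigabytes: float      # data backed up during the run
    minutes: float        # time to complete
    peak_cpu_pct: float   # peak host CPU observed while the job ran


def throughput_mb_s(trial):
    """Average throughput of a single run in megabytes per second."""
    return trial.gigabytes * 1024 / (trial.minutes * 60)


def projected_minutes(trials, total_gigabytes):
    """Scale the measured runs up to total_gigabytes, assuming serial jobs."""
    measured_gb = sum(t.gigabytes for t in trials)
    measured_min = sum(t.minutes for t in trials)
    return total_gigabytes * measured_min / measured_gb


def worst_cpu(trials):
    """Highest host CPU peak seen across the measured runs."""
    return max(t.peak_cpu_pct for t in trials)


if __name__ == "__main__":
    trials = [
        AgentTrial("test-vm-a", gigabytes=60, minutes=20, peak_cpu_pct=40),
        AgentTrial("test-vm-b", gigabytes=30, minutes=10, peak_cpu_pct=55),
    ]
    for t in trials:
        print(t.vm_name, round(throughput_mb_s(t), 1), "MB/s")
    print("Projected minutes for 900 GB:", projected_minutes(trials, 900))
    print("Worst CPU peak:", worst_cpu(trials), "%")
```

Comparing the projected time against your backup window gives an early signal of whether agents on every VM are still workable.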
How much data is being stored in your backup environment? Will you have enough space? If you see that agent backups are going to have a negative effect on backup space or time to complete, review your backup jobs. You likely have inefficient backup jobs that are backing up more than they need to. You don't need 500 copies of C:\Windows, and there are always plenty of other examples of unnecessary duplication.
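One way to put a number on that duplication is to hash the files in each backup set and count the bytes stored more than once. The following is a minimal sketch under that assumption; the function names are illustrative, and it does not reflect how any particular backup product catalogs or deduplicates data.

```python
import hashlib
import os
from collections import defaultdict


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large files do not have to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def scan_tree(root):
    """Yield (digest, size_in_bytes) for every regular file under root."""
    for folder, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(folder, name)
            if os.path.isfile(path):
                yield sha256_of(path), os.path.getsize(path)


def redundant_bytes(records):
    """Total bytes held in second and later copies of identical files."""
    sizes = {}
    copies = defaultdict(int)
    for digest, size in records:
        sizes[digest] = size
        copies[digest] += 1
    return sum(sizes[d] * (copies[d] - 1) for d in copies)
```

Feeding the output of `scan_tree` for several servers' backup sets into `redundant_bytes` gives a rough figure for how much space identical operating system and application files are consuming.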
If system or network resources are the problem, see if there is anything you can do to stagger the jobs. On a single host, running all backup jobs at the same time may put a strain on resources, but staggered throughout the night you could see enough of an improvement to stick with those agents.
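A staggered schedule can be sketched in a few lines. The example below assigns start times in waves so that no more than a chosen number of jobs overlap on one host. The VM names, the per-job time estimate, and the concurrency limit are all assumptions for illustration; in practice you would apply the resulting times through your backup software's own scheduler.

```python
from datetime import datetime, timedelta


def stagger_jobs(vm_names, window_start, window_minutes, max_concurrent, est_minutes):
    """Return {vm_name: start_time}, running at most max_concurrent jobs per wave.

    Each wave starts est_minutes after the previous one, so jobs only overlap
    with others in the same wave as long as the estimate holds.
    """
    schedule = {}
    for index, name in enumerate(vm_names):
        wave = index // max_concurrent
        offset = wave * est_minutes
        if offset + est_minutes > window_minutes:
            raise ValueError(f"{name} does not fit inside the backup window")
        schedule[name] = window_start + timedelta(minutes=offset)
    return schedule


if __name__ == "__main__":
    vms = ["vm01", "vm02", "vm03", "vm04", "vm05", "vm06"]
    start = datetime(2009, 11, 13, 22, 0)
    plan = stagger_jobs(vms, start, window_minutes=480, max_concurrent=2, est_minutes=60)
    for name, when in plan.items():
        print(name, when.strftime("%H:%M"))
```

Raising a `ValueError` when a job falls outside the window is a deliberate choice here: it makes an overcommitted host visible instead of quietly scheduling backups into business hours.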
If network bandwidth becomes the real bottleneck, consider a separate virtual network for backups. Multiple virtual network cards and internal virtual local area networks (VLANs) on the virtual switch are common. You could consider running that traffic over a different VLAN and physical NIC to alleviate network heartburn.
When considering host-based backup, see if your host agent is virtual machine-aware. Backing up virtual machine files while the VMs are running can lead to inconsistencies, so an agent and jobs that take VMs into account are a must. Understand what SLAs apply to backups and restores, because restoring a file now means mounting a virtual machine's disk to retrieve it, and that takes longer still if those virtual hard disks are on tape. It's really a two-step process, with much more disk space and time needed to restore the virtual disk before it can be mounted.
Consolidated backup will be a brand new solution that will replace what you already have. This is not a small project. If you can remember back to when you were setting up and fine-tuning your original backup infrastructure, you know a new system will take some getting used to. Remember that it is a project just like any other in IT, with new hardware, software and configurations to support -- not a magic wand.
The right decision
Virtualization is arriving so fast that it's disrupting our core systems and support structures. Backup is a fundamental service that you can't get wrong. Take the opportunity to build backup planning into new virtualization projects so you can intelligently compare your current approach with the alternatives that are available. Applying smart decision making will help you avoid the big mistakes and allow you to have a new view on an old, reliable solution.
About the author:
Eric Beehler has been working in the IT industry since the mid-1990s, and has been playing with computer technology since well before that. His experience includes over nine years with Hewlett-Packard's Managed Services division, working with Fortune 500 companies to deliver network and server solutions and, most recently, IT experience in the insurance industry working on highly available solutions and disaster recovery. He currently provides consulting and training through his co-ownership in Consortio Services, LLC.