- Having a plan -- your plan must account for every contingency that might necessitate a recovery. At a minimum, the plan should cover hardware failures, corruption or loss of your Exchange data, failure of the infrastructure components that Exchange requires (such as Active Directory (AD) and electrical power), and interruption of physical access to your servers. For each of these contingencies, you need to have a response. This response might be simple (for example, if non-critical hardware breaks, you wait for the vendor's service technician) or complicated (if your Los Angeles data center is damaged by an earthquake, you fail over its operations to your Denver data center). The point is to accurately describe the potential problems you might run into, and to have solutions identified for them.
- Being able to follow the plan -- just having a plan is fairly useless if you don't also have the ability to put your plan into action. This action will probably require a combination of money, persuasion, education, management support, and acquisition. For every solution you identify in your disaster recovery plan, you must have the necessary mix of equipment, skills, and preparation to make it actually happen.
Every cliché you've ever heard about the value of prior planning applies here, in spades. The best way to make sure that your disaster recovery plan includes both of the necessary components is to write down the plan and then practice it. Writing down the plan is important because it sets out everything that you think should be included -- and that makes it easier to identify what's not included but should be. Practicing the plan is important because prior testing will make it much easier for you to identify shortcomings in the plan, in your equipment or infrastructure, or in the people who have to implement it.
The third component of a successful disaster recovery plan is perhaps the most often overlooked -- keeping the plan up to date as your IT operations, staffing, and business requirements change. For example, a disaster recovery plan originally written for Exchange 5.5 doesn't take advantage of some of the best new features in Exchange Server 2003, such as recovery storage groups (RSGs). A plan that assumes restore windows of 12 hours might not work well when the actual current SLA only allows for 6 hours of downtime. Performing regular and frequent tests of your disaster recovery plan will act as an antidote to this problem by highlighting areas of the plan that need to be brought up to date.
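One way to make these checks concrete is to keep the plan as a small register and review it mechanically. The following Python code is only a sketch under stated assumptions: the `Contingency` record, the `review_plan` function, the 180-day test-age threshold, and the sample entries are all hypothetical illustrations, not part of Exchange or of any backup product.

```python
from dataclasses import dataclass


@dataclass
class Contingency:
    """One entry in a hypothetical disaster recovery plan register."""
    name: str
    response: str           # documented response; empty means none written yet
    restore_hours: float    # restore window the plan currently assumes
    last_tested_days: int   # days since this response was last practiced


def review_plan(entries, sla_hours, max_test_age_days=180):
    """Return (name, problem) pairs for entries that need attention.

    The 180-day default is an illustrative assumption, not a recommendation.
    """
    findings = []
    for entry in entries:
        if not entry.response.strip():
            findings.append((entry.name, "no documented response"))
        if entry.restore_hours > sla_hours:
            findings.append(
                (entry.name,
                 f"restore window {entry.restore_hours:g}h exceeds SLA of {sla_hours:g}h")
            )
        if entry.last_tested_days > max_test_age_days:
            findings.append(
                (entry.name, f"not tested in {entry.last_tested_days} days")
            )
    return findings


# Sample register; every value here is made up for illustration.
plan = [
    Contingency("Hardware failure", "Wait for vendor service technician", 4, 30),
    Contingency("Loss of Exchange data", "Restore from backup", 12, 400),
    Contingency("Loss of physical access", "", 6, 0),
]

findings = review_plan(plan, sla_hours=6)
for name, problem in findings:
    print(f"{name}: {problem}")
```

With the sample data, the review flags the 12-hour restore window against a 6-hour SLA, the response that has not been practiced recently, and the contingency that has no written response at all. A script like this does not replace a real test of the plan; it only shows where the written plan has drifted from current requirements.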
10 tips in 10 minutes: Fundamentals of Exchange Server disaster recovery
Tip 1: Defining Exchange disaster recovery
Tip 2: How Exchange backs up data
Tip 3: Choosing a backup type for Exchange
Tip 4: Online vs. offline Exchange Server backups
Tip 5: Basic Exchange backup and restore
Tip 6: Exchange vendor snapshots and point-in-time copies
Tip 7: VSS for Exchange
Tip 8: Exchange Server replication
Tip 9: Exchange design choices and issues
Tip 10: Exchange disaster recovery planning
This chapter excerpt from the free e-book The Definitive Guide to Exchange Disaster Recovery and Availability, by Paul Robichaux, is printed with permission from Realtimepublishers, Copyright 2005.