
When the lights go out: Exchange disaster recovery, part 1

Do you have a backup plan in the event of an unexpected disaster? Here are the basics on what you need to do if a power outage or crash strikes and you are left in the dark.

Part 1 of 2-part series

Microsoft Exchange is a large and complicated system, so when there's a disaster, numerous things can go wrong on many different levels. How well you recover depends largely on whether you had backup measures in place beforehand, and how extensive they were. If you have no backups worth speaking of, you must work with whatever is left, and for that you need the right tools.

If there's been a crash or an outage, the first thing to do is see whether you can start the computer and mount the Exchange database. If the computer won't boot, place the disk in another machine so you can work on the database offline, and deal with recovering the computer itself separately. Only once you have access to the database files should you start attempting recovery of the Exchange databases.

Tools of the (disaster) trade
The two most powerful tools for disaster recovery with Exchange databases are ESEUTIL and ISINTEG, two command-line tools that come with Exchange. Any Exchange administrator worth his salt should know how they work. They can save your skin if you have no backups, or if your backups are not recent enough to restore everything from. However, they are also powerful enough that if you don't know what you're doing with them, you can easily make the damage worse.

The Exchange database (the .EDB file) is stored as a series of 4K segments called "pages," each of which has a checksum to guard against data corruption. ESEUTIL examines the database page by page, looking for checksum problems, but doesn't inspect the underlying data (such as the mail stored in the database). ISINTEG works at that higher level: after you've used ESEUTIL to repair the underlying structure, it attempts to fix what are called "application-level" errors in the mail data itself.
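To make the idea of a page-by-page checksum scan concrete, here is a toy model in Python. It is not ESE's real on-disk format or checksum algorithm. It assumes a hypothetical layout in which each 4K page begins with a 4-byte CRC32 of the rest of the page, and the names `make_page` and `find_bad_pages` are invented for the illustration.

```python
import struct
import zlib

PAGE_SIZE = 4096  # 4K pages, as described above
HEADER = 4        # hypothetical: first 4 bytes of each page hold a CRC32


def make_page(payload: bytes) -> bytes:
    """Build one toy page: CRC32 header followed by zero-padded payload."""
    body = payload.ljust(PAGE_SIZE - HEADER, b"\x00")
    if len(body) != PAGE_SIZE - HEADER:
        raise ValueError("payload too large for one page")
    return struct.pack("<I", zlib.crc32(body)) + body


def find_bad_pages(data: bytes) -> list:
    """Return the indexes of pages whose stored checksum does not match."""
    bad = []
    for offset in range(0, len(data), PAGE_SIZE):
        page = data[offset:offset + PAGE_SIZE]
        if len(page) < PAGE_SIZE:
            bad.append(offset // PAGE_SIZE)  # truncated final page
            continue
        (stored,) = struct.unpack("<I", page[:HEADER])
        if stored != zlib.crc32(page[HEADER:]):
            bad.append(offset // PAGE_SIZE)
    return bad
```

Note that the scan can say which page is damaged without knowing anything about the mail inside it, which is the same division of labor as between ESEUTIL and ISINTEG.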

The first checks to run with ESEUTIL are the /G, /M and /K checks, named for the switches used to invoke them. /G checks the integrity of the database file to determine whether there are physical problems. /M (usually invoked as /MH) prints detailed diagnostic information about a database, including the last time an incremental or full backup was made. /M can also help you determine whether the data in a damaged page is worth recovering at all; it may not be if, for instance, that page holds mail for a user with a long-closed account, or someone's deleted items. (Microsoft Knowledge Base article 262196 talks about how to do this.) The /K switch verifies the checksum of each database page and logs an event whenever a page with an invalid checksum turns up. These events can be used to determine which pages may be lost in a recovery operation.
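The three read-only checks can be collected into a small script so they are always run the same way. The sketch below only builds the command lines and does not run anything. The function name `readonly_checks` and the database path are hypothetical, and the exact arguments each switch accepts can vary between Exchange versions, so check `eseutil /?` on your own server first.

```python
def readonly_checks(db_path: str) -> list:
    """Return the three non-destructive ESEUTIL command lines, in order."""
    return [
        ["eseutil", "/g", db_path],   # integrity check of the database file
        ["eseutil", "/mh", db_path],  # dump the database header
        ["eseutil", "/k", db_path],   # verify each page's checksum
    ]


# Hypothetical path, for illustration only.
for command in readonly_checks(r"D:\work\priv1.edb"):
    print(" ".join(command))
```

Each list could be handed to `subprocess.run` on a machine where ESEUTIL is available. Keeping the commands as lists avoids quoting problems when the path contains spaces.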

Once you've run these non-destructive scans and determined what is and isn't damaged and what can and can't be repaired, you can consider using ESEUTIL to do actual repair work. Before you do, make a copy of the database and work on the copy if you can; you should never do potentially destructive work on an original. If you suspect a hardware problem, move the database to a completely different computer that has Exchange installed and work on it there, or look at Microsoft Knowledge Base article 244525 for instructions on how to use ESEUTIL on a computer without Exchange installed.
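Making the working copy is simple, but it is worth verifying that the copy is byte-for-byte identical before you point a repair tool at it. The following is a minimal sketch, assuming the store is dismounted so the file is not in use. The helper names `sha256_of` and `copy_for_repair` are made up for this example.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hash a file in 1 MB chunks so large databases don't exhaust memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copy_for_repair(original: Path, work_dir: Path) -> Path:
    """Copy the database into work_dir and confirm the copy matches."""
    work_dir.mkdir(parents=True, exist_ok=True)
    copy = work_dir / original.name
    if copy.exists():
        # Refuse to overwrite an earlier working copy by accident.
        raise FileExistsError(f"{copy} already exists")
    shutil.copy2(original, copy)
    if sha256_of(original) != sha256_of(copy):
        raise IOError("working copy does not match the original")
    return copy
```

The function returns the path of the verified copy, which is the only file the repair commands should ever be given.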

The /R switch in ESEUTIL goes through the database's transaction logs and rolls forward any as-yet-uncommitted transactions, but does not repair or touch damaged pages. This is useful when a check of the database structure shows no problems, but there are still uncommitted transactions in the logs. Microsoft Knowledge Base article 259751 tells you how to determine if there are uncommitted transactions pending; it involves running ESEUTIL /M /K on the transaction log files. When the power goes out abruptly, this will be the most common problem found, and it's not hard to recover from.
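One quick way to tell whether a log replay is likely to be needed is to look at the shutdown state in the header dump produced by /MH. The sketch below assumes the dump contains a line of the form `State: Clean Shutdown` or `State: Dirty Shutdown`; verify that against the output of your own version before relying on it. The function names are hypothetical.

```python
import re


def shutdown_state(header_dump: str):
    """Extract the value of the 'State:' line from header dump text."""
    match = re.search(r"^\s*State:\s*(.+?)\s*$", header_dump, re.MULTILINE)
    return match.group(1) if match else None


def needs_log_replay(header_dump: str) -> bool:
    """True when the header reports a dirty shutdown (assumed wording)."""
    state = shutdown_state(header_dump)
    return state is not None and state.lower().startswith("dirty")
```

If no `State:` line is found, `needs_log_replay` returns False rather than guessing, so treat a missing line as a reason to read the dump by hand.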

ESEUTIL's /P switch repairs corrupt database pages, but doesn't commit unfinished transactions. You should find out whether any transactions are pending (and commit them, if need be) before you run /P. You should also find out what the damaged pages contain, because /P restores the integrity of the database as a whole by discarding data it cannot repair. If the damaged pages hold nothing crucial, that loss is an acceptable price.

After you have finished with ESEUTIL, put the database back into its original location (if you were working on a copy) and run ISINTEG with the switches -fix -test alltests. ISINTEG works through the Information Store process to rebuild the structures stored inside the database that Exchange sees -- messages, mailboxes and so on. ISINTEG generates very verbose warnings, but don't panic -- what matters is the error count at the end. If ISINTEG continually produces errors you can't resolve, it's best to create a new database and move what you can into there manually instead of relying on a possibly broken data structure.
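The order of operations described above (replay logs if needed, repair pages if needed, then run ISINTEG) can be written down as a small planning function. This is a sketch, not a definitive procedure: `plan_repair`, the server name and the path are hypothetical, `E00` is assumed to be the log file prefix, and the arguments /R expects differ between Exchange versions.

```python
def plan_repair(db_path, server, log_prefix, dirty_shutdown, damaged_pages):
    """Return the repair commands to run, in order, for the given findings."""
    steps = []
    if dirty_shutdown:
        # Replay the transaction logs before touching any pages.
        steps.append(["eseutil", "/r", log_prefix])
    if damaged_pages:
        # Destructive: discards data in pages that cannot be repaired.
        steps.append(["eseutil", "/p", db_path])
    # Application-level fix-up always comes last.
    steps.append(["isinteg", "-s", server, "-fix", "-test", "alltests"])
    return steps


# Hypothetical values, for illustration only.
for step in plan_repair(r"D:\work\priv1.edb", "MAIL01", "E00", True, False):
    print(" ".join(step))
```

The function only decides what to run and in what order. Whether to accept the data loss that /P implies is still a judgment you make from the earlier read-only checks.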

Tools like these are no substitute for a good backup strategy, however. In part two tomorrow, I will talk about the different ways Exchange (and Exchange servers) can be backed up and restored.


Serdar Yegulalp is the editor of the Windows 2000 Power Users Newsletter.
