Although it is important to have a disaster recovery plan that includes a procedure to prevent replication failure if every domain controller in the hub site becomes unresponsive, that scenario is quite unlikely. A more likely predicament is the breakdown of a single domain controller due to a hard disk crash, a bad network card, file system corruption, corruption of the Active Directory, or any of the commonplace glitches you deal with on a regular basis.
So what is your disaster recovery plan for the failure of a single domain controller?
There are actually two answers to that question:
- Restore the domain controller from backup (Note: the Active Directory is held in the system state, thus restoration of the system state will restore the Active Directory).
- Don't restore the domain controller. Reinstall and repromote, or simply demote and repromote.
The first solution, restoring from backup, isn't as simple as it seems. Although the most direct method is to repromote the domain controller as described above, there are cases when a restore from backup is required. For instance, restoring an entire domain or an entire forest where no active domain controller exists is only possible by restoring the system state of one domain controller in the domain, then installing other servers and promoting them to domain controllers. There is no reason to restore other domain controllers from backup. In a multiple-domain forest, you must restore the root domain first, followed by the child domains.
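The root-first ordering can be written down as a small planning aid. The following is only a sketch in Python: the function name `restore_order`, the dictionary layout, and the example domain names are all hypothetical, and it assumes you already know each domain's parent. It does not query or change Active Directory.

```python
def restore_order(domains):
    """Return domain names ordered so every parent precedes its children.

    `domains` maps a domain name to its parent's name; the forest root
    domain maps to None. (Hypothetical layout, for illustration only.)
    """
    def depth(name):
        # Count the hops from this domain up to the forest root.
        hops = 0
        while domains[name] is not None:
            name = domains[name]
            hops += 1
        return hops

    # Sort by depth so the root comes first; the name breaks ties.
    return sorted(domains, key=lambda name: (depth(name), name))


# Hypothetical forest used only to demonstrate the ordering.
forest = {
    "corp.example": None,
    "emea.corp.example": "corp.example",
    "apac.corp.example": "corp.example",
    "sales.emea.corp.example": "emea.corp.example",
}

for step, domain in enumerate(restore_order(forest), start=1):
    print(step, domain)
```

Sorting by depth is one simple way to guarantee that no child domain is scheduled before its parent. Within each domain, only one domain controller is restored from backup, and the rest are promoted afterward.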
In restoring a domain controller from backup media, it is important to note the following:
- It is only necessary to restore a single domain controller in each domain (starting with the root domain if the forest has a multiple-domain, parent-child structure). After the first domain controller is restored, bring in additional domain controllers using DCPromo.
- In a true disaster recovery plan, you must allow for the fact that the restore will likely take place on different hardware from the machine on which the backup was made. Refer to Microsoft KB article 263532, "How to perform a disaster recovery restoration of Active Directory on a computer with a different hardware configuration."
- Backup tapes are only useful for 60 days, or whatever the tombstone lifetime value is set to (see Microsoft KB 216993, "Backup of the Active Directory Has 60-Day Useful Life"). Ensure that you have a process to create backup tapes regularly, validate them and store them safely.
- It is not necessary to restore a domain controller simply because it holds one or more FSMO roles. These roles can be seized by other domain controllers. If you do seize a role, the original role holder should never come back online (wipe and reload it).
- Restoring a domain controller in an existing domain from a backup tape automatically makes that domain controller out of date by the number of days since the backup was performed. The resulting synchronization will take longer than a normal replication update and have a bigger impact on the network, in proportion to the number of changes made since the backup tape was made.
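The useful-life rule in the list above lends itself to a simple pre-restore check. The sketch below is a minimal illustration in Python, not a Microsoft tool: the function name `backup_is_usable` and the constant are hypothetical, the 60-day default is the figure cited above, and treating a backup that is exactly as old as the tombstone lifetime as unusable is a conservative assumption of this example.

```python
from datetime import date

# Default cited in the article; confirm the tombstone lifetime actually
# configured in your forest before relying on this number.
DEFAULT_TOMBSTONE_DAYS = 60


def backup_is_usable(backup_date, today, tombstone_days=DEFAULT_TOMBSTONE_DAYS):
    """Return True if the backup is younger than the tombstone lifetime.

    Assumption: a backup whose age equals the tombstone lifetime is
    rejected, which errs on the side of caution.
    """
    age_days = (today - backup_date).days
    return 0 <= age_days < tombstone_days


# A 31-day-old backup passes the check; a 74-day-old backup does not.
print(backup_is_usable(date(2024, 1, 1), date(2024, 2, 1)))
print(backup_is_usable(date(2024, 1, 1), date(2024, 3, 15)))
```

A check like this could run as part of tape validation, so that an expired backup is flagged before anyone attempts a restore with it.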
I have worked with administrators who decided to restore a failed domain controller from backup tape, often with near-disastrous results. In one case, two domain controllers failed and could only be restored using tapes from different days. It took two days to get the system working again, and we did it by doing what they should have done in the first place: manually demoting the domain controller, cleaning up the Active Directory, waiting for replication and then repromoting it with the same name.
It's important to note that one of the common reasons for demoting and repromoting a domain controller is that replication is broken. But if replication is broken, then demotion via DCPromo is not going to work either.
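The choices discussed in this article can be condensed into a rough decision sketch. This Python function is only an illustration of the reasoning, with a hypothetical name and two simplified yes/no inputs. It is not an official procedure, and real incidents involve more factors than these.

```python
def choose_recovery_path(other_dcs_available, replication_healthy):
    """Suggest a recovery approach for a failed domain controller.

    Both parameters are simplified booleans assumed for this sketch.
    """
    if not other_dcs_available:
        # No working domain controller is left in the domain, so a
        # restore from backup is the only option.
        return "restore the system state of one domain controller from backup"
    if replication_healthy:
        # A normal demotion can still reach the other domain controllers.
        return "demote with DCPromo, then repromote"
    # DCPromo demotion will not work when replication is broken.
    return ("manually demote, clean up the Active Directory, "
            "wait for replication, then repromote with the same name")


print(choose_recovery_path(other_dcs_available=True, replication_healthy=False))
```

The first branch comes first because a restore from backup is required only when no active domain controller remains in the domain.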
Disaster Recovery Planning for Active Directory
Part 1: How to create an AD replication lag site to minimize disasters
Part 2: How to build redundancy in AD replication
Part 3: How to restore a domain controller from backup in AD
Part 4: How to use Install from Media to restore a domain controller
Gary Olsen is a systems software engineer for Hewlett-Packard in Global Solutions Engineering. He wrote Windows 2000: Active Directory Design and Deployment and co-authored Windows Server 2003 on HP ProLiant Servers. Olsen is a Microsoft MVP for Windows Server-File Systems.