Self-healing NTFS keeps admins one step ahead of data corruption

While it's not perfect, the new self-healing NTFS has several benefits over past Windows file systems. Learn to avoid data corruption with this new feature in Windows Server 2008.

A couple of years ago I used an external, hardware-based RAID array for a volume containing all of the data on my primary file server. To make a long story short, the cable that connected the array to the computer went bad, and my data was corrupted.

Normally, a situation like this wouldn't have been a big deal; after all, I back up all of my data religiously. The problem was that the NTFS file system of the day was not designed to automatically detect corruption. In fact, the only way to know if corruption had occurred was for a backup application to report that the data was corrupt. This was a big problem, because data was becoming corrupt before it could even be backed up.

Fortunately, things are a little bit better today. Both Windows Vista and Windows Server 2008 make use of a feature known as self-healing NTFS. Basically, when self-healing NTFS is in use, Windows monitors the file system for any signs of corruption, and then automatically corrects any errors that it finds in an effort to prevent data loss.

Before you get too excited, I have to warn you that self-healing NTFS is not perfect. In fact, I recently had to run CHKDSK on one of my Windows Server 2008 machines to correct a minor corruption issue that self-healing NTFS had missed. Even so, self-healing NTFS can usually detect and correct minor instances of corruption on an NTFS volume.

Right about now, you might be asking yourself what good is it to have a self-healing NTFS feature if it isn't as powerful as CHKDSK. Well, there are several benefits.

For starters, self-healing NTFS prevents downtime. As most Windows administrators know, any time you use CHKDSK to fix a problem, you must run it in exclusive mode. This usually involves rebooting the server and running CHKDSK as part of the boot process. Of course, the server is not available until CHKDSK has completed, and depending on the size of the volume and the extent of the corruption, the server may be down for a while.

There's another benefit to using self-healing NTFS: When corruption does occur, it can be mitigated because self-healing NTFS begins the repair process as soon as the corruption is detected. This often prevents the corruption from spreading.

This leads me to another advantage that I mentioned briefly before: Self-healing NTFS monitors NTFS volumes for corruption. If corruption is detected, and the file system cannot repair itself, Windows is smart enough to alert you to the problem so that you can run CHKDSK. As I mentioned, previous versions of Windows would not do that, and data loss would sometimes occur because the administrator did not realize there was a problem with the volume until it was too late.

New Windows edges out legacy Windows

One last feature I want to mention is that self-healing NTFS is aware of critical system files. For example, suppose you used CHKDSK on a legacy version of Windows to repair a damaged system file. CHKDSK isn't smart enough to identify system files, so its attempt to repair the damage could leave the system in an unbootable state. Windows Server 2008 and Vista, on the other hand, validate and then preserve the integrity of critical system files.

Since the whole self-healing process happens transparently, here's the question: How do you know if it's working? Well, the repair process writes events to the event log. If you happen to be monitoring the server with System Center Operations Manager, you will be notified of the operation. If you aren't using a monitoring tool and you want to know whether the file system has fixed any errors, you must search the System event log for event IDs 130 and 55, with the event source listed as NTFS.
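If you would rather not scroll through Event Viewer by hand, the search can be scripted. What follows is a minimal sketch, not a definitive tool: it builds a query for the wevtutil command-line utility and runs it only when it finds itself on Windows. The provider name 'Ntfs' is my assumption about how the NTFS source is registered in the log, and the function names and the 20-event limit are hypothetical choices made for illustration. Adjust them if the query returns nothing on your server.

```python
import subprocess
import sys

NTFS_EVENT_IDS = (55, 130)  # the two event IDs named in the article


def build_xpath(event_ids=NTFS_EVENT_IDS, provider="Ntfs"):
    """Return an XPath filter matching the given event IDs from one provider.

    The provider name is an assumption; check Event Viewer if nothing matches.
    """
    id_clause = " or ".join(f"EventID={int(i)}" for i in event_ids)
    return f"*[System[Provider[@Name='{provider}'] and ({id_clause})]]"


def build_wevtutil_command(max_events=20):
    """Return the argument list for querying the System log, newest first."""
    return [
        "wevtutil", "qe", "System",
        f"/q:{build_xpath()}",
        f"/c:{int(max_events)}",  # cap on the number of events returned
        "/rd:true",               # reverse direction: newest events first
        "/f:text",                # human-readable output instead of XML
    ]


if __name__ == "__main__":
    command = build_wevtutil_command()
    if sys.platform == "win32":
        # Run from an elevated prompt if access to the log is denied.
        result = subprocess.run(command, capture_output=True, text=True)
        print(result.stdout or result.stderr)
    else:
        # Off Windows, just show the command that would be run.
        print(" ".join(command))
```

An empty result simply means that no matching events were logged, which on a healthy volume is exactly what you hope to see.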

Of course, you don't have to wait for a file system error to find out if self-healing NTFS is enabled. Simply open a Command Prompt window as an administrator and enter the following command:

FSUTIL Repair Query c:

You can see what this command looks like in Figure A.

Figure A

In the past, some people have expressed concerns about the way the Windows file system consumes CPU resources. While I haven't seen any benchmarks on how much CPU time the monitoring and repair processes actually use, I have never seen a situation in which self-healing NTFS caused a performance problem.

Self-healing NTFS is known to cause one issue: When it makes a repair, the file that is being repaired is unavailable until the fix is complete. Even so, it's still better to have one file blocked than the entire file system blocked.

If for some reason you want to disable self-healing NTFS, you can do so by entering the following command in a Command Prompt window:

FSUTIL Repair set C: 0

If you want to re-enable self-healing NTFS later on, enter this command:

FSUTIL Repair set C: 1
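If you manage more than one volume, typing these commands for each drive letter gets tedious. The sketch below wraps the same FSUTIL commands shown in this article so that they can be run against a list of volumes. Treat it as an illustration under stated assumptions: the helper names and the volume list are hypothetical, only the 0 and 1 flag values used above are supported, and the script prints whatever FSUTIL reports rather than trying to interpret it.

```python
import subprocess
import sys


def fsutil_repair_query(volume="C:"):
    """Command that reports the self-healing state of a volume."""
    return ["fsutil", "repair", "query", volume]


def fsutil_repair_set(volume="C:", enabled=True):
    """Command that turns self-healing on (flag 1) or off (flag 0)."""
    return ["fsutil", "repair", "set", volume, "1" if enabled else "0"]


def run(command):
    """Run a command and return its combined text output."""
    result = subprocess.run(command, capture_output=True, text=True)
    return result.stdout + result.stderr


if __name__ == "__main__":
    volumes = ["C:"]  # hypothetical list; add your other NTFS volumes here
    for volume in volumes:
        query = fsutil_repair_query(volume)
        if sys.platform == "win32":
            # Requires an administrator Command Prompt, as noted above.
            print(run(query))
        else:
            # Off Windows, just show the command that would be run.
            print(" ".join(query))
```

The script only queries by default. Calling run(fsutil_repair_set("C:", enabled=False)) would disable the feature on that volume, so I have deliberately left that step out of the loop.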

It may not be the "be all, end all" of file system integrity, but self-healing NTFS is a huge step forward from what we had to work with in previous Windows operating systems. The best part is that this feature is enabled by default in both Windows Vista and Windows Server 2008.

Brien M. Posey, MCSE, has received Microsoft's Most Valuable Professional Award four times for his work with Windows Server, IIS and Exchange Server. He has served as CIO for a nationwide chain of hospitals and healthcare facilities, and was once a network administrator for Fort Knox. You can visit his personal Web site at www.brienposey.com.
