At least, that's the conventional wisdom.
Over time I've grown curious as to how much of the conventional wisdom about fragmentation—and defragging—is true. To that end, I set out to examine the subject with a fresh eye and find out just how much of a problem file fragmentation is, as well as how much of a benefit disk defragmentation is.
What is file fragmentation?
Let's begin by determining what fragmentation is. A file is stored in a file system as one or more allocation units, depending on the size of the file and the size of the allocation unit on the volume in question. As files get written, erased and rewritten, it may not be possible to write a file to a completely contiguous series of empty allocation units. One part of a file may be stored in one part of a disk, the rest of it somewhere else. In extreme cases, it may be scattered quite widely. This scattering is called file fragmentation.
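The write/erase/rewrite cycle described above can be sketched in a few lines of code. This is a toy model, not how any real file system allocates space: the "disk" is a list of allocation units, and a naive first-fit allocator places each file into the earliest free units it finds, contiguous or not.

```python
# Toy model of how write/erase/rewrite cycles fragment a file.
# The allocator and disk layout here are invented for illustration.

def allocate(disk, name, units):
    """Place `units` allocation units of `name` into free slots (first-fit)."""
    placed = []
    for i, owner in enumerate(disk):
        if owner is None:
            disk[i] = name
            placed.append(i)
            if len(placed) == units:
                return placed
    raise RuntimeError("disk full")

def erase(disk, name):
    """Free every allocation unit belonging to `name`."""
    for i, owner in enumerate(disk):
        if owner == name:
            disk[i] = None

def fragments(disk, name):
    """Count contiguous runs of units belonging to `name`."""
    runs, prev = 0, False
    for owner in disk:
        cur = (owner == name)
        if cur and not prev:
            runs += 1
        prev = cur
    return runs

disk = [None] * 12
allocate(disk, "A", 4)       # A fills units 0-3
allocate(disk, "B", 4)       # B fills units 4-7
erase(disk, "A")             # freeing A leaves a hole in front of B
allocate(disk, "C", 6)       # C lands in units 0-3 and 8-9: two fragments
print(fragments(disk, "C"))  # -> 2
```

File C needs six units but the largest free hole holds only four, so it is split across the hole and the space after B: exactly the scattering the article describes.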
The more fragmented a file, the more work the computer has to do to read it. Usually, this comes down to how fast the hard drive can seek to a specific sector and read the allocation units in question. If the computer has to read several fragmented files at once, the number of head movements and the amount of contention for disk access go up, and read performance drops accordingly.
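A rough way to quantify that extra work: reading a file sequentially costs roughly one head seek per contiguous run of clusters, so every break in contiguity adds a seek. The cluster numbers below are invented for illustration; real mappings come from the file system.

```python
# Rough cost model: one extra seek per break in cluster contiguity.
# Cluster numbers are hypothetical, purely for illustration.

def extra_seeks(clusters):
    """Seeks needed beyond the first, i.e. one per break in contiguity."""
    return sum(1 for a, b in zip(clusters, clusters[1:]) if b != a + 1)

contiguous = list(range(100, 108))                        # 8 clusters, one run
fragmented = [100, 101, 5000, 5001, 220, 221, 222, 9000]  # same size, 4 runs

print(extra_seeks(contiguous))  # -> 0
print(extra_seeks(fragmented))  # -> 3
```

Both files occupy eight clusters, but the fragmented one forces three additional head movements per read, which is where the performance penalty comes from.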
File fragmentation became widely recognized as a problem in the early '90s. Hard disk drives were relatively small and slow (compared to today), and so fragmentation had a correspondingly big impact on system speed. To address the problem, Microsoft shipped DOS 6.0 with the command-line DEFRAG utility (licensed from Symantec). Some editions of Windows licensed a stripped-down version of what is still the most popular third-party disk defragmentation program for Windows: Executive Software's Diskeeper. Depending on their budget, as well as their inclination, most admins use either Diskeeper or Windows' native Disk Defragmenter tool. Either way, the prevailing sentiment is to defrag early and often.
Most of the conventional wisdom regarding defragging evolved from the way storage worked in the early '90s. The file systems used on PCs then were FAT12 and FAT16, holdovers from the days of MS-DOS. When Windows 3.1 and Windows 95 appeared, the limitations of FAT12 and FAT16 came to the fore. Neither file system could support volumes more than 4 GB in size or file names longer than the old 8.3 format (eight characters plus a three-character extension). Both were notoriously prone to errors and data loss and were particularly prone to fragmentation.
Dump the FAT and move to NTFS
Microsoft introduced FAT32 as a way around some of these issues. But the best long-term solution was to dump the File Allocation Table (FAT) file system entirely and move to NTFS, a more robust file system that had been in use on Windows NT for some time. One of the many improvements NTFS provided was a reduced propensity for fragmentation. It doesn't eliminate file fragmentation entirely, but it does guard against it much better than any version of FAT ever did.
Today, no new Windows system ships with anything older than Windows XP or Windows Server 2003, and the drives are almost always formatted with NTFS. So much of the impact of file fragmentation is lessened by NTFS' handling of the problem.
Nevertheless, file fragmentation can still create problems in NTFS. One of NTFS' quirks is that much of the metadata on a given volume is stored as a series of files, hidden to the user and prefixed with a dollar sign, e.g., $MFT for the volume's Master File Table and $QUOTA for the volume's user-quota data.
This allows NTFS to be extended quite elegantly in future iterations: If you want to create a new repository for metadata on a volume, simply create a new metadata file, and it should work with a high degree of backwards compatibility. In fact, NTFS 3.0 implemented new file-system features in precisely this manner, without completely breaking backwards compatibility with older Windows systems. (The one caveat: You couldn't perform a CHKDSK operation on an NTFS 3.0 volume if you were running a version of Windows that didn't support it completely.)
The downside of this mechanism is that the metadata itself can become fragmented, since it's stored as nothing more than a file. So the real issue with file fragmentation today (at least on NTFS) is not so much that individual files become fragmented, but that larger structures become fragmented, for instance, the files in a given directory, or the metadata used by NTFS. Does this, then, affect performance in a way that fragmentation of the files themselves might not?
An upcoming article on this subject will show how the way hard disk drives work has significantly changed the way file fragmentation works, as well as how we deal with it.
This was first published in September 2006