How long file names complicate data recovery

Long file names can complicate data recovery because the FAT and FAT-32 file systems don't natively support them. Learn how long file names are actually stored on disk and what that means when you go to recover a file.

In the previous article in the series, I discussed how disks use clusters for storing files. Hopefully, you now understand how to determine how many clusters a file uses, because calculating the number of clusters in use will be critical when we actually go on to discussing data recovery.

Before we get to that point, I need to discuss another concept: long file names. A previous article explained that when you delete a file, the operating system replaces the first character of the file name with a marker byte (E5h) that shows up as a sigma sign (or, on very old systems, a question mark). If we were still using the old eight dot three (8.3) naming convention, that's really all you'd need to know about file names. However, long file names complicate things, so it's important to understand how they're saved to a disk.

Why are long file names a big deal?

Why are long file names such a big deal? Because the FAT and the FAT-32 file systems don't natively support them. The FAT file system dates back to the late 1970s and was the file system used by the very first version of MS-DOS. Back then, only the eight dot three naming convention was supported. When Windows 95 was released in 1995, the FAT file system was still the only file system available to it. (NTFS existed, but was not supported by Windows 95.)

Microsoft wanted Windows 95 to support long file names, so it came up with a technique for retrofitting the FAT file system so that multiple directory entries could be used in conjunction with each other to store a long file name. When Microsoft released OEM Service Release 2 (otherwise known as OSR2) of Windows 95, it introduced the FAT-32 file system. Although FAT-32 overcomes many of the limits of FAT, it is still an adaptation of the FAT file system rather than a completely new file system.

How long file names are stored on disk

With that in mind, let's take a look at how long file names are stored on a disk. When Windows stores a long file name on a FAT or FAT-32 partition, it must do so in a way that allows the file system to recognize the file outside of Windows. This may sound a little strange. But remember, despite what the marketing folks might tell you, all versions of Windows (except for the ones based on the NT Kernel) ride on top of DOS. This means that for those versions of Windows (3.x, 95, 98 and ME), DOS is the actual operating system and, technically, Windows is a platform. As such, files written to the drive must be readable from DOS, even though some versions of DOS do not support long file names.

To maintain this backward compatibility, files that use long file names make use of a DOS alias -- an alternate file name that fully conforms to DOS naming conventions. This is necessary not only because of the length of the filename, but also because long file names can contain characters, such as spaces, that are not normally allowed in DOS file names.

Because of these limitations, it is impossible to save a long file name directly to a FAT or FAT-32 file system. When you save a file with a long file name to these types of file systems, the file is actually saved under its DOS alias. For example, if I saved a file named "Briens File.txt" to a FAT file system, the file would actually be saved as BRIENS~1.TXT.

Let's take a closer look at how the DOS alias works. You will notice in my example that the ~ is inserted where the space would normally occur. This is just a coincidence. The ~ does not take the place of a space. Instead, the DOS alias is created by taking the first six valid characters of the filename, and then appending a ~, then a number.

The number is there because several long file names can reduce to the same DOS alias. For example, the file names "Briens File.txt" and "Briens Document.txt" would both have the same DOS alias because the first six characters of the filenames are identical. To prevent this from becoming a problem, DOS aliases are numbered. If these two files existed in the same directory, their DOS aliases would be BRIENS~1.TXT and BRIENS~2.TXT.
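The alias rule described above (first six valid characters, a ~, then a number) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the algorithm Windows actually uses: the function name dos_alias and the VALID character set are hypothetical choices made for this example, and real versions of Windows handle more edge cases.

```python
import string

# Characters this sketch treats as valid in an alias (an assumption made for
# the example). Spaces and periods are not in the set, so they are skipped.
VALID = set(string.ascii_uppercase + string.digits + "$%'-_@~`!(){}^#&")

def dos_alias(long_name, existing):
    """Return a simplified 8.3 alias for long_name that is not in existing."""
    base, dot, ext = long_name.rpartition(".")
    if not dot:                                  # the name has no extension
        base, ext = long_name, ""
    stem = "".join(c for c in base.upper() if c in VALID)[:6]
    ext = "".join(c for c in ext.upper() if c in VALID)[:3]
    n = 1
    while True:
        suffix = f"~{n}"
        # Keep the name part within 8 characters even when n has two digits.
        alias = stem[:8 - len(suffix)] + suffix + (f".{ext}" if ext else "")
        if alias not in existing:
            return alias
        n += 1

print(dos_alias("Briens File.txt", set()))                  # BRIENS~1.TXT
print(dos_alias("Briens Document.txt", {"BRIENS~1.TXT"}))   # BRIENS~2.TXT
```

The second call shows why the number matters: the first six valid characters are the same, so the alias only becomes unique once the counter moves on to 2.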

Okay, so files with long names are saved to the disk using their DOS alias rather than their long file names. Where does the rest of the file name go? As I alluded to earlier, Windows is able to save long file names to the disk by making use of multiple directory entries. Any time a file with a long name is written to the disk, the file itself is saved using the file's DOS alias as a file name. However, Windows also writes at least one additional directory entry that looks like a second file but exists only to hold the file's long file name.

If you look at a long file name entry through Disk Editor, you'll see it is not actually a file. When you view a directory through Disk Editor, the entry in the ID column identifies the object type of each directory entry. Usually, this object type is set to either File or Erased. But in the case of a long file name entry, the ID column is set to LFN, indicating that the entry is a part of a long file name rather than an actual file. On the disk itself, a long file name entry is recognizable by its attribute byte, which is set to 0Fh, a combination of the read-only, hidden, system and volume label bits that no real file would use.
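To make the File / Erased / LFN distinction concrete, here is a rough Python sketch of how a raw 32-byte directory entry could be classified. The function name entry_type is hypothetical, and the labels are borrowed from the ID column described above. The sketch assumes the standard FAT layout in which the attribute byte sits at offset 11, a long file name entry carries the attribute value 0Fh, and an erased entry begins with E5h. For brevity it lumps subdirectories and volume labels in with files.

```python
ATTR_OFFSET = 11      # position of the attribute byte within the entry
ATTR_LFN = 0x0F       # read-only + hidden + system + volume label bits
DELETED_MARK = 0xE5   # first byte of an erased entry (shown as a sigma)

def entry_type(entry: bytes) -> str:
    """Classify one raw 32-byte FAT directory entry."""
    if len(entry) != 32:
        raise ValueError("a FAT directory entry is exactly 32 bytes")
    if entry[0] == 0x00:
        return "Unused"
    if entry[0] == DELETED_MARK:
        return "Erased"          # applies to erased LFN entries as well
    if entry[ATTR_OFFSET] & 0x3F == ATTR_LFN:
        return "LFN"
    return "File"                # subdirectories and labels land here too

# Three hand-built entries: a DOS alias, an LFN piece and an erased file.
alias_entry = b"BRIENS~1TXT" + bytes([0x20]) + bytes(20)
lfn_entry = bytes([0x41]) + bytes(10) + bytes([ATTR_LFN]) + bytes(20)
erased_entry = b"\xe5RIENS~1TXT" + bytes([0x20]) + bytes(20)

for e in (alias_entry, lfn_entry, erased_entry):
    print(entry_type(e))
```

Checking for the erased marker first is a design choice in this sketch: when a file is deleted, its long file name entries are marked as erased too, so they would otherwise still be reported as LFN.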

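Each long file name directory entry can hold up to 13 characters of the name. It is tempting to connect that figure to a normal DOS file name, which can contain up to 12 characters (the 8-character filename, a period and a 3-character file extension), but the similarity is a coincidence. A directory entry is 32 bytes long, and a long file name entry stores its characters as 2-byte Unicode values in three separate fields that hold 5, 6 and 2 characters respectively. That accounts for 26 bytes and 13 characters. The remaining six bytes hold a sequence number, the attribute byte that marks the entry as LFN, a type byte, a checksum of the DOS alias and an unused cluster field.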
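For readers who want to see where the 13 characters live, the following Python sketch pulls the name characters out of a single raw long file name entry. It assumes the standard layout of such an entry: three character fields at byte offsets 1-10, 14-25 and 28-31 (5 + 6 + 2 characters), stored as 2-byte little-endian Unicode values, with a name that ends early terminated by 0000h and padded with FFFFh. The function name lfn_chars is hypothetical.

```python
def lfn_chars(entry: bytes) -> str:
    """Extract the name characters held in one 32-byte LFN entry."""
    if len(entry) != 32:
        raise ValueError("a FAT directory entry is exactly 32 bytes")
    # Three character fields: 5 + 6 + 2 = 13 two-byte characters.
    raw = entry[1:11] + entry[14:26] + entry[28:32]
    text = raw.decode("utf-16-le")
    # Anything after the 0000h terminator is FFFFh padding, so drop it.
    return text.split("\x00", 1)[0]

# Build a sample entry by hand for the 11-character piece "e Names.doc".
units = ("e Names.doc" + "\x00" + "\uffff").encode("utf-16-le")  # 13 units
sample = (bytes([0x43])               # sequence number of this piece
          + units[0:10]               # first 5 characters
          + bytes([0x0F, 0x00, 0x00]) # attribute, type, checksum (dummy)
          + units[10:22]              # next 6 characters
          + bytes(2)                  # cluster field, always zero
          + units[22:26])             # last 2 characters

print(lfn_chars(sample))   # e Names.doc
```

The checksum byte in the sample is a dummy value; a real entry would carry a checksum computed from the DOS alias.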

Long file names and directory entries

One last thing to keep in mind about long file names is that sometimes two directory entries aren't sufficient to store the long file name. For example, the name of the file I'm working on right now is Data Recovery and Long File Names.doc. If I were saving this file to a FAT or to a FAT-32 volume, the file's DOS alias would be DATARE~1.DOC, because spaces are not valid in a DOS alias and are skipped (the exact alias might vary slightly among versions of Windows). However, at 37 characters, the long file name is longer than what can be accommodated by a single long file name entry's 13-character limit. As such, multiple directory entries would be created. The following list contains the entries that would actually be written to the disk's directory:

DATARE~1.DOC
Data Recovery
and Long Fil
e Names.doc

You would only see these directory entries when viewing the directory through Disk Editor. If you look at the disk's directory through DOS, you might see the long file names, or you might see the DOS aliases, depending on the version of DOS. If you view the directory through Windows, you'll see the long file names. But if you use Disk Editor, you'll see something like my example above, which is how the directory entries are actually made on the disk.

In the example above, I arranged the long filename entries in a way that makes them semi-readable. On the disk, though, long filename entries are stored in reverse order, immediately ahead of the DOS alias entry they belong to: the last piece of the name comes first and the DOS alias comes last. They would look something like this:

e Names.doc
and Long Fil
Data Recovery
DATARE~1.DOC
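The way a long name is cut into 13-character pieces and laid out back to front can be sketched in Python as follows. The function name lfn_chunks is hypothetical. The sketch assumes the usual numbering convention, in which the pieces are numbered from 1 and the final piece has 40h added to its sequence number so that it can be recognized as the end of the name. It deals only with the text of each piece, not with the full 32-byte entries.

```python
def lfn_chunks(long_name, size=13):
    """Return (sequence_byte, text) pairs in the order they sit on disk."""
    pieces = [long_name[i:i + size] for i in range(0, len(long_name), size)]
    entries = []
    for seq, text in enumerate(pieces, start=1):
        if seq == len(pieces):
            seq |= 0x40          # flag that marks the final piece of the name
        entries.append((seq, text))
    return entries[::-1]         # the last piece is stored first

name = "Data Recovery and Long File Names.doc"
on_disk = lfn_chunks(name)
for seq, text in on_disk:
    print(f"{seq:02X}h {text!r}")
# 43h 'e Names.doc'
# 02h ' and Long Fil'
# 01h 'Data Recovery'

# Reading the pieces from the bottom up restores the original name.
print("".join(text for _, text in reversed(on_disk)) == name)   # True
```

Note that the second piece begins with a space. Spaces are ordinary characters in a long file name, so they are stored like any other character, which matters if you ever have to piece a name back together by hand.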


Data Recovery Techniques for Windows
- Introduction
- How to recover data
- How to create a boot disk to run Norton Disk Editor
- How disk clusters size affects data recovery processes
- How long file names complicate data recovery
- How to recover deleted files on FAT via Disk Editor
- How data recovery for NTFS differs from FAT
- How to recover corrupt NTFS boot sectors
- Signature-based data recovery: A last ditch technique

About the author: Brien M. Posey, MCSE, is a Microsoft Most Valuable Professional for his work with Windows 2000 Server, Exchange Server and IIS. He has served as CIO for a nationwide chain of hospitals and was once in charge of IT security for Fort Knox. He writes regularly for SearchWinComputing.com and other TechTarget sites.

This was first published in June 2006
