Let's first take a moment to define DFS from a non-Microsoft standpoint. Essentially, a distributed file system is a layer of abstraction between the user and a file system. This abstraction layer allows for the creation of a virtual file system that may span multiple servers. Users see the virtual file system as a single entity, and as such, they do not need to know where files are physically located in order to use them – the entire file structure is accessible through a single namespace. This approach also gives administrators the freedom to move files from one location to another without confusing users in the process.
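To make the namespace idea concrete, here is a minimal Python sketch of the abstraction layer. It is a toy model rather than any real product's implementation, and the class name, method names, server names and paths are all hypothetical. It shows only that the virtual path a user sees can stay fixed while the physical location behind it changes.

```python
class Namespace:
    """Toy model: maps virtual paths to (server, physical path) pairs."""

    def __init__(self):
        self._locations = {}

    def publish(self, virtual_path, server, physical_path):
        # Make a physical file visible under a virtual path.
        self._locations[virtual_path] = (server, physical_path)

    def resolve(self, virtual_path):
        # What a client would do behind the scenes when opening a file.
        return self._locations[virtual_path]

    def relocate(self, virtual_path, new_server, new_physical_path):
        # An administrator moves the file; the virtual path is unchanged.
        if virtual_path not in self._locations:
            raise KeyError(virtual_path)
        self._locations[virtual_path] = (new_server, new_physical_path)


ns = Namespace()
ns.publish("/shared/reports/q2.xls", "server-a", "/export/vol1/q2.xls")
before = ns.resolve("/shared/reports/q2.xls")

# The file is moved to another server, but users keep using the same path.
ns.relocate("/shared/reports/q2.xls", "server-b", "/data/archive/q2.xls")
after = ns.resolve("/shared/reports/q2.xls")
```

In this sketch the user only ever refers to /shared/reports/q2.xls. Which server actually holds the file is a detail the namespace hides.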
One of the most popular non-Microsoft DFS products is GlusterFS. As with many open source applications, GlusterFS is designed for Unix and Linux. The server component is natively supported on Linux, FreeBSD and OpenSolaris, but the required client component is only supported on Linux. It is possible, however, for Windows administrators to use a utility such as Cygwin to run GlusterFS in a Windows environment.
GlusterFS supports many of the same features as the Windows DFS feature. For instance, files can be replicated to multiple servers as a way of establishing fault tolerance. Additionally, GlusterFS includes a Web interface that can be used for volume management and resource monitoring. The resource monitor tracks CPU, memory and disk utilization across the servers that make up the distributed file system.
Ceph is another open source DFS application. It is designed to run on Linux and to be highly scalable, handling tens of thousands of clients at a time. Unlike some of the other available DFS solutions, Ceph strives to make scalability as seamless as possible. For instance, users can expand volumes by simply adding disks.
Besides scalability, Ceph is also designed for fault tolerance. All data is replicated to multiple storage nodes, so if a storage node fails, the remaining storage nodes automatically replicate its data to an additional storage node. The underlying architecture has also been designed in a way that prevents bottlenecks during the rebuilding of a storage node.
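The recovery behavior described above can be illustrated with a short Python sketch. This is a simplified model of re-replication in general, not Ceph's actual placement algorithm. The function name, the node names and the "least loaded node first" rule are assumptions made purely for illustration.

```python
def fail_node(placement, failed, replicas=2):
    """Return a new placement after `failed` is lost.

    placement maps node name -> set of object names stored on that node.
    Objects that drop below `replicas` copies are copied to surviving
    nodes that do not already hold them (least loaded first; this rule
    is an assumption of the sketch).
    """
    survivors = {n: set(objs) for n, objs in placement.items() if n != failed}
    for obj in sorted(placement[failed]):
        holders = [n for n, objs in survivors.items() if obj in objs]
        missing = max(0, replicas - len(holders))
        candidates = sorted(
            (n for n in survivors if obj not in survivors[n]),
            key=lambda n: (len(survivors[n]), n),
        )
        for target in candidates[:missing]:
            survivors[target].add(obj)
    return survivors


placement = {
    "node1": {"a", "b"},
    "node2": {"a", "c"},
    "node3": {"b", "c"},
    "node4": set(),
}

# node1 fails; objects "a" and "b" each lose one copy and are re-replicated.
after = fail_node(placement, "node1")
copies = {obj: sum(obj in objs for objs in after.values()) for obj in "abc"}
```

After the simulated failure, every object is back to two copies on the surviving nodes. A production system has to make the same decision at far larger scale and without a central bottleneck, which is where the real design effort goes.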
Like GlusterFS and Ceph, MooseFS is an open source distributed file system application that can be downloaded for free. MooseFS provides all of the standard DFS features, such as relatively easy scalability and the ability to replicate data to multiple servers.
MooseFS also has one additional feature that caught my attention: it allows for the configuration of a file system-level "recycle bin" that works across the entire file system. That way, if a user deletes a file, that file is retained in the recycle bin for as long as the administrator wants to keep it. Files in the recycle bin are automatically purged after a configurable amount of time.
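As a rough illustration of how a time-limited recycle bin behaves, consider the Python sketch below. It is not MooseFS's interface or implementation. The class, its methods and the one-day retention period are hypothetical, and timestamps are passed in explicitly as plain numbers of seconds so the behavior is easy to follow.

```python
class TrashBin:
    """Toy recycle bin: deleted files are kept until a retention period expires."""

    def __init__(self, retention_seconds):
        self.retention_seconds = retention_seconds
        self._deleted = {}  # path -> time of deletion (seconds)

    def delete(self, path, now):
        # Instead of destroying the file, remember when it was deleted.
        self._deleted[path] = now

    def restore(self, path):
        # Returns True if the file was still in the bin and has been restored.
        return self._deleted.pop(path, None) is not None

    def purge(self, now):
        # Permanently remove everything older than the retention period.
        expired = [
            p for p, t in self._deleted.items()
            if now - t >= self.retention_seconds
        ]
        for p in expired:
            del self._deleted[p]
        return sorted(expired)

    def contents(self):
        return sorted(self._deleted)


trash = TrashBin(retention_seconds=86400)  # keep deleted files for one day
trash.delete("/docs/a.txt", now=0)
trash.delete("/docs/b.txt", now=50000)

purged = trash.purge(now=90000)        # only a.txt is older than one day
remaining = trash.contents()
restored = trash.restore("/docs/b.txt")
```

The point of the sketch is the trade-off the administrator controls: a longer retention period gives users more time to recover a mistaken deletion, at the cost of the disk space the deleted files continue to occupy.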
Note that although MooseFS is designed for Unix and Linux, it can run on any operating system that has a FUSE implementation. This includes Windows (via Cygwin) and even Mac OS X.
As you can see, Microsoft does not have a monopoly on distributed file system technology. GlusterFS, Ceph, and MooseFS are just three examples of competing DFS products, and there are others out there as well.
Although each of these products is freely available, there are a couple of extremely important considerations that you must take into account before you commit to using them in a production environment. First, you must consider whether or not these solutions will be compatible with your existing network. For example, you may find that some free DFS products do not support NTFS file permissions. Second, it is important to determine whether or not you will be able to get reliable technical support for your DFS application of choice.
ABOUT THE AUTHOR
Brien M. Posey has received Microsoft's Most Valuable Professional award six times for his work with Windows Server, IIS, file systems/storage and Exchange Server. He has served as CIO for a nationwide chain of hospitals and healthcare facilities and was once a network administrator for Fort Knox.
This was first published in July 2010