Most administrators wear a variety of hats when it comes to job responsibilities. However, there are two responsibilities
that almost always take precedence over all others: protecting your organization's data and keeping your systems up and running.
One way administrators can help protect their organizations' data (and, in some cases, give the illusion that the systems are up and running) is through the use of replication software.
Replication software is designed to mirror an organization's data to an alternate location. If a hard drive fails or becomes corrupted, the replicated data provides an up-to-date copy of all of the organization's data -- stored safely on a different server.
Not all replication software is created equal, though. Prices and features vary widely. Some replication software is free (included with the Windows Server operating system), while other packages cost thousands of dollars.
Bearing in mind that there is a wide range of products out there, here is a checklist of features you should look for when shopping for replication software.
|What to look for in replication software|
|How scalable and flexible is the software?|
|Your first consideration should be the software's scalability and flexibility. Replication software is usually rated as supporting "one to one," "one to many" or "many to one." Some packages support a combination of these configurations.
"One to one" means that the software supports mirroring the data from one server to another server. "One to many" means that the data from one server can be mirrored to multiple replicas. "Many to one" means that the replication software can consolidate the data from multiple servers by storing all of the replicas on a single server. Some higher-end replication software packages also support a configuration known as "many to many." This means that the software can replicate data from all of your servers onto as many replicas as you need.
|How frequently does replication occur?|
|Another important consideration is the frequency at which replication occurs. Some packages perform replication as soon as a change to the data occurs. Other packages replicate data according to a schedule (typically hourly). Both have their advantages and disadvantages.
Real-time replication provides you with the most up-to-date replicas, but it also consumes more system resources. This type of replication typically guzzles a lot of CPU and memory resources on the server containing the source data and a lot of disk space on the server containing the replica. Scheduled replication tends not to be as resource intensive, but it does not give you an up-to-the-minute snapshot of your data. With an hourly schedule, a failure on your primary server could mean that you lose up to an hour's worth of data.
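The trade-off comes down to simple arithmetic: the longer the gap between replication runs, the more changes are at risk. The following Python sketch is a rough estimate only. The function name and the figure of five file changes per minute are assumptions made for illustration, and real-time replication is idealized as an interval of zero.

```python
def worst_case_changes_lost(interval_minutes, changes_per_minute):
    """Estimate how many file changes are at risk if the primary server
    fails just before the next scheduled replication run."""
    if interval_minutes < 0 or changes_per_minute < 0:
        raise ValueError("inputs must be non-negative")
    return interval_minutes * changes_per_minute


# Hypothetical workload: 5 file changes per minute.
schedules = [("real-time", 0), ("every 15 minutes", 15), ("hourly", 60)]
for label, interval in schedules:
    at_risk = worst_case_changes_lost(interval, 5)
    print(f"{label}: up to {at_risk} changes at risk")
```

Plugging in your own change rate gives you a quick way to decide whether a scheduled product's shortest interval is good enough.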
|Does it support unidirectional or bidirectional replication?|
|When shopping for replication software, you should find out whether the product you are considering supports bidirectional or unidirectional replication. Unidirectional replication means that data is replicated in one direction only, from the source server to the destination server. Bidirectional replication means that data can be written to any of the servers in the replica set and it will be replicated to all of the other servers in the set. Typically, a replication package that supports bidirectional replication is used for load balancing. Users access data through a virtual path that points to any number of file servers with identical data sets. This prevents any one file server from being overworked.|
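The difference between the two modes can be sketched in a few lines of Python. The `ReplicaSet` class below is a hypothetical in-memory model, not any vendor's API. It treats the first server in the list as the source, and it ignores what happens when two servers change the same file at once, a conflict that a real bidirectional product has to resolve.

```python
class ReplicaSet:
    """Toy model of a replica set; each server is a dict of path -> data."""

    def __init__(self, server_names, bidirectional):
        if not server_names:
            raise ValueError("a replica set needs at least one server")
        self.stores = {name: {} for name in server_names}
        self.source = server_names[0]  # assumption: first server is the source
        self.bidirectional = bidirectional

    def write(self, server, path, data):
        self.stores[server][path] = data
        # Unidirectional: only writes made on the source are propagated.
        # Bidirectional: a write made on any server is propagated.
        if self.bidirectional or server == self.source:
            for name, store in self.stores.items():
                if name != server:
                    store[path] = data


one_way = ReplicaSet(["fs1", "fs2"], bidirectional=False)
one_way.write("fs2", "report.doc", "edited on fs2")
print("report.doc" in one_way.stores["fs1"])   # False: change stays on fs2

two_way = ReplicaSet(["fs1", "fs2"], bidirectional=True)
two_way.write("fs2", "report.doc", "edited on fs2")
print("report.doc" in two_way.stores["fs1"])   # True: change reaches fs1
```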
|Does the software mirror data or an entire server?|
|What is the software mirroring: your data or an entire server? Server mirroring typically provides a high degree of fault tolerance. If the primary server fails, then the mirror server can take over for it, similar to the way that a cluster would operate. Replication software that mirrors only data usually doesn't offer this type of failover protection.
Again, there are advantages and disadvantages to both types of mirroring. Server mirroring usually gives you a high degree of fault tolerance, but it typically has very stringent hardware requirements. Data mirroring usually doesn't give you true, server-level failover support, but it often gives you more flexibility than server mirroring does. Data mirroring is usually less strict about the hardware that you use and often supports multiple replicas. Furthermore, data mirroring sometimes supports the retention of multiple versions of your files.
|Does the software offer versioning support?|
|Versioning support is an important consideration when shopping for replication software. Versioning allows your replicas to store multiple versions of each file. This means that when a user makes a change to a file, both the new and the old version remain available. This allows you to roll back files if you discover a mistake.
Versioning is important for another reason, too. I once saw a situation in which a hard drive became corrupted. The data was mirrored to another server, and the replication software saw each corrupt file as a changed file. It therefore mirrored the corrupt data to the replica, overwriting good data. The software did not support versioning, so the company had no choice but to restore a backup. Had the software supported versioning, though, the replica could have been rolled back to a state in which all of the data was good.
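The scenario above is easy to model. The following Python sketch uses a hypothetical `VersionedReplica` class that keeps the last few versions of each file in memory. A real product would store them on disk and handle retention differently, so treat this only as an illustration of the idea.

```python
class VersionedReplica:
    """Toy replica that keeps up to `max_versions` copies of each file."""

    def __init__(self, max_versions=5):
        if max_versions < 1:
            raise ValueError("max_versions must be at least 1")
        self.max_versions = max_versions
        self.history = {}  # path -> list of versions, oldest first

    def replicate(self, path, data):
        versions = self.history.setdefault(path, [])
        versions.append(data)
        # Discard the oldest versions beyond the retention limit.
        del versions[:-self.max_versions]

    def current(self, path):
        return self.history[path][-1]

    def roll_back(self, path, steps=1):
        versions = self.history[path]
        if not 1 <= steps < len(versions):
            raise ValueError("not enough stored versions to roll back that far")
        del versions[-steps:]
        return versions[-1]


replica = VersionedReplica()
replica.replicate("payroll.xls", "good data")
replica.replicate("payroll.xls", "corrupt data")  # corruption gets mirrored
print(replica.current("payroll.xls"))             # corrupt data
print(replica.roll_back("payroll.xls"))           # good data
```

Without the version history, the second `replicate` call would simply have overwritten the good copy, which is exactly what happened to the company in the story.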
|Is "on demand" replication available?|
|This is a minor feature, but it's useful. You don't really need on-demand replication if you use real-time replication, but it's handy for software that uses scheduled replication. The idea is that the software has a button or a menu option that you can click to create a replica on the spot -- even if it isn't time to replicate yet.|
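As a rough sketch of how the two triggers can coexist, the Python class below runs a copy job on a schedule and also exposes a `replicate_now` method that plays the role of the button. The class, its method names and the decision to restart the schedule clock after an on-demand run are all assumptions made for illustration.

```python
class ScheduledReplicator:
    """Toy scheduler; times are plain numbers of seconds supplied by the caller."""

    def __init__(self, interval_seconds, copy_fn):
        self.interval = interval_seconds
        self.copy_fn = copy_fn   # hypothetical callable that performs the copy
        self.last_run = None

    def tick(self, now):
        """Run the copy if the interval has elapsed; return True if it ran."""
        if self.last_run is None or now - self.last_run >= self.interval:
            self._run(now)
            return True
        return False

    def replicate_now(self, now):
        """On-demand replication: run immediately, regardless of schedule."""
        self._run(now)

    def _run(self, now):
        self.copy_fn()
        self.last_run = now


runs = []
replicator = ScheduledReplicator(3600, lambda: runs.append("copied"))
replicator.tick(0)                # first scheduled run
replicator.tick(1800)             # too early, nothing happens
replicator.replicate_now(1800)    # administrator clicks the button
print(len(runs))                  # 2
```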
|Are compression and encryption features offered?|
|Any replication package that you buy should offer some degree of compression and encryption. Both matter when you send data to the replica server: compression conserves bandwidth, and encryption helps prevent data theft while the data is in transit.
Many replication packages also compress the data on the replica server's hard disk to conserve disk space. That is especially important if you plan to store multiple versions of each file.
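To get a feel for how much bandwidth compression can save, here is a small Python sketch that uses the standard library's `zlib` module. The function names and sample data are hypothetical. Encryption is left out because Python's standard library does not include a symmetric cipher; a product that does both would typically compress first and then encrypt, since encrypted data compresses poorly.

```python
import zlib


def pack_for_transfer(data: bytes, level: int = 6) -> bytes:
    """Compress a file's contents before sending them to the replica."""
    return zlib.compress(data, level)


def unpack_on_replica(payload: bytes) -> bytes:
    """Restore the original contents on the replica server."""
    return zlib.decompress(payload)


# Hypothetical, highly repetitive sample data; real savings depend on file type.
original = b"2024-01-01 INFO backup completed successfully\n" * 1000
payload = pack_for_transfer(original)

print(f"original: {len(original)} bytes, sent: {len(payload)} bytes")
assert unpack_on_replica(payload) == original
```

Text and log files usually shrink a great deal, while files that are already compressed (such as JPEG images or ZIP archives) gain little.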
|Is information queuing supported?|
|Not every replication package supports information queuing, but I think it is an important feature if you are going to use versioning. The idea behind information queuing is that it protects you from a network communications failure.
In a typical situation, if the network link between the primary server and the replica server fails, the replica server would not be updated until the link comes back up. By this time, a file could have been updated multiple times, but none of those versions are captured because the link was down. Information queuing stores each version of the files locally until the link comes back up and the queue's contents can be sent to the replica.
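The queuing behavior described above can be sketched as follows. The `QueuedReplicator` class is a hypothetical in-memory model in Python. A real product would persist the queue to disk so that it survives a reboot of the primary server.

```python
from collections import deque


class QueuedReplicator:
    """Toy model: changes wait in a local queue while the link is down."""

    def __init__(self):
        self.link_up = True
        self.pending = deque()        # versions held on the primary server
        self.replica_versions = []    # versions received by the replica

    def file_changed(self, path, data):
        self.pending.append((path, data))
        if self.link_up:
            self.flush()

    def set_link(self, up):
        self.link_up = up
        if up:
            self.flush()              # drain the backlog once the link returns

    def flush(self):
        # Send queued versions in the order in which they were made.
        while self.pending:
            self.replica_versions.append(self.pending.popleft())


rep = QueuedReplicator()
rep.set_link(False)                       # network link fails
rep.file_changed("plan.doc", "draft 2")
rep.file_changed("plan.doc", "draft 3")
print(len(rep.replica_versions))          # 0: nothing has reached the replica
rep.set_link(True)                        # link comes back up
print(rep.replica_versions)               # both drafts arrive, in order
```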
|Does the software offer cross-platform support?|
|One last thing to look for is cross-platform support. This might not be important to everyone, but if you have a mixture of Linux, Windows and Macintosh servers, then it would be nice if your software could replicate data from all of them.|
|ABOUT THE AUTHOR:|
|Brien Posey, MCSE, is a Microsoft Most Valuable Professional for his work with Windows 2000 Server and IIS. Brien has served as the CIO for a nationwide chain of hospitals and was once in charge of IT security for Fort Knox. As a freelance technical writer he has written for Microsoft, CNET, ZDNet, MSD2D, Relevant Technologies and other technology companies. You can visit Brien's personal Web site at http://www.brienposey.com.|