It's a dreaded event: The primary server in your cluster goes down.
If you have high-availability Windows clusters, another server in the cluster will quickly -- often, almost immediately -- continue at the point where the failed server stopped.
More companies are finding it necessary to adopt this type of technology, which is generally reserved for mission-critical applications. In the Microsoft Windows world, we're talking about Microsoft SQL Server-based production applications and, more recently, Microsoft Exchange as e-mail messaging becomes the lifeblood of more organizations and the foundation of business processes.
The roadblock is cost: for many small and midsized companies, Microsoft's Windows Clustering Services is simply out of reach.
"You will need at least two copies of the Enterprise Editions of Windows Server and SQL Server or Exchange," as well as more server and storage hardware, noted Peter Pawlak, senior analyst for server applications at Directions on Microsoft, an independent research firm based in Kirkland, Wash.
While the ultimate cost varies with an organization's specific needs, you would at the very least have the expense of Microsoft Windows Server 2003 (the Standard Edition lists for $600) plus the cost of two licenses for the Enterprise Edition of your application, either Microsoft Exchange Server or Microsoft SQL Server. Enterprise Editions of SQL Server 2000 start at about $20,000 for one CPU.
The good news for companies that don't have big budgets is that less expensive non-Microsoft alternatives are available. Case in point: The cost for NSI Software Inc.'s GeoCluster starts at about $7,500. You still need to buy Microsoft Windows Server, but Enterprise Editions of the software are not necessary.
Among the alternatives to Windows high-availability clustering are NSI's Double-Take and GeoCluster, which is based on Double-Take; SteelEye Technology Inc.'s LifeKeeper; Neverfail from Neverfail Group Ltd.; and PolyServe Inc.'s cluster file system software. Although the products differ in their specific approaches, each gives an organization a level of high availability for Windows servers.
Stop counting sheep; monitor servers while you sleep
Mattress Discounters Corp., in Upper Marlboro, Md., is a company that needed high-availability clustering but couldn't afford the price tag. The company relied on one server to handle its Microsoft Exchange instant messaging. One day, that server experienced a sudden hardware failure, and the company's Exchange-based messaging stopped dead. "We had it repaired in one day, but it was a critical time of year for us. This kind of failure was not acceptable," said Craig Foss, the company's network support manager. The company set out to prevent such a situation from happening again.
A Microsoft Exchange cluster could have prevented the loss of Exchange services, but "a Microsoft Exchange cluster was too expensive," Foss said, noting the added expense of multiple enterprise copies of the software and the added hardware costs. The company's search for a solution instead led it to Neverfail.
Neverfail offers a variety of failover software products for Windows, including Neverfail for Exchange. The Exchange product proactively monitors the health of the physical server hardware, network infrastructure, operating system and Exchange itself. When Neverfail senses a problem, it can take any of a variety of preemptive and corrective actions to resolve the issue. As a last resort, Neverfail will initiate a full system failover to a standby server. When the problems have been resolved, Neverfail will switch back and resynchronize the primary server.
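The escalation described above (monitor, try a corrective action, fail over only as a last resort, then switch back) can be sketched in a few lines of Python. This is an illustrative sketch only, not Neverfail's implementation: the `Check` and `FailoverController` names, the `probe` and `repair` callbacks, and the "primary"/"standby" labels are all assumptions made up for the example.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Check:
    """One monitored layer (hypothetical): hardware, network, OS or Exchange."""
    name: str
    probe: Callable[[], bool]    # returns True when the layer is healthy
    repair: Callable[[], bool]   # attempts a corrective action; True on success


@dataclass
class FailoverController:
    """Hypothetical controller that escalates from repair to failover."""
    checks: List[Check]
    active: str = "primary"
    log: List[str] = field(default_factory=list)

    def tick(self) -> str:
        """Run one monitoring pass and return the name of the active server."""
        if self.active == "primary":
            for check in self.checks:
                if check.probe():
                    continue
                self.log.append(f"{check.name}: unhealthy, attempting repair")
                if check.repair() and check.probe():
                    self.log.append(f"{check.name}: repaired in place")
                    continue
                # Last resort: hand the workload to the standby server.
                self.log.append(f"{check.name}: repair failed, failing over")
                self.active = "standby"
                break
        elif all(check.probe() for check in self.checks):
            # Primary is healthy again: resynchronize, then switch back.
            self.log.append("primary healthy: resynchronizing and switching back")
            self.active = "primary"
        return self.active
```

Calling `tick()` on a schedule would keep the primary in service for as long as every probe passes or can be repaired in place. A real product would also have to replicate data to the standby and redirect clients, which this sketch leaves out.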
"Neverfail let us use hardware we already had and gave us true failover much less expensively. And it was a simple solution that was easy to set up," Foss said. Although the company has not experienced another sudden server failure, it has simulated an Exchange crash to test its newfound high availability. Neverfail worked as intended, switching the company to the secondary server and later bringing back the primary server.
Aggies activate high availability
Texas A&M University also wanted high availability for its two-node Windows server cluster running a point-of-sale system and meal plan application with Microsoft SQL Server. The obvious solution would be Microsoft's own cluster services, but "Microsoft required us to have Enterprise licenses and we only had Standard licenses," said Steve Stone, IT associate at the school.
PolyServe, which provides a cluster file system that enables multiple servers to share the same data, works with servers running Windows Standard licenses, Stone discovered. It offers switchover capabilities for SQL Server database failover and a general-purpose cluster file system that allows any server in the cluster to take over the workload of any other server. It also allows the cluster to be managed from a single control point.
Texas A&M set up a PolyServe-based cluster in active-active mode, where both servers are able to perform the work concurrently.
"Microsoft only lets you do active-passive unless you pay more," Stone noted. In active-passive mode, one server sits idle until a problem with the active server forces a failover to the passive server, which suddenly becomes active.
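The difference between the two modes is easiest to see in how work is handed out. The following Python sketch is a toy model under stated assumptions, not PolyServe's or Microsoft's scheduler: the `assign` function, the node names and the round-robin policy are invented for illustration.

```python
from itertools import cycle
from typing import Dict, List


def assign(requests: List[str], nodes: List[str], mode: str) -> Dict[str, List[str]]:
    """Distribute requests across the healthy nodes under a given cluster mode."""
    if not nodes:
        raise RuntimeError("no healthy nodes left in the cluster")
    work: Dict[str, List[str]] = {node: [] for node in nodes}
    if mode == "active-passive":
        # Only the first healthy node serves; the others sit idle as standbys.
        work[nodes[0]].extend(requests)
    elif mode == "active-active":
        # Every healthy node takes a share of the work (simple round-robin).
        for request, node in zip(requests, cycle(nodes)):
            work[node].append(request)
    else:
        raise ValueError(f"unknown mode: {mode}")
    return work
```

With two healthy nodes, active-passive sends every request to the first node, while active-active splits them between both. If one node fails and drops out of the `nodes` list, the survivor carries the full load in either mode.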
Court time for NSI
The Boston Celtics, the legendary National Basketball Association team, opted for a different alternative: NSI's Double-Take. And it did so with Microsoft's blessing.
"Microsoft's clustering has multiple servers using the same storage. I wanted the storage replicated, too. So Microsoft referred us to NSI," said Jay Wessel, the Celtics' senior director of technology.
The Celtics run 10 Windows servers and wanted high availability for three of them. The problem is that the team has no shared storage, neither network-attached storage (NAS) nor a storage area network (SAN). "I guess it would make it easier if we had a big SAN with replication, but that would cost a lot more," Wessel said. Instead, the Celtics rely on internal RAID storage on the three servers.
The Celtics' high-availability target was the servers running Microsoft Exchange. The team, however, wanted more than just failover in the event of a problem with a server. It wanted failover to a remote location for disaster recovery purposes. It installed Double-Take on a server at headquarters and on an off-site server 50 miles away. Presto: Wessel had both high availability and disaster recovery.
Double-Take continuously captures and replicates server activity as it occurs at the byte level using standard network connections. In the event of a problem, it will fail over to its twin automatically.
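The general idea of byte-level replication, capturing only the byte ranges that change and replaying them in order on a twin, can be shown with a small in-memory Python model. This is a sketch of the concept only and makes no claim about how Double-Take works internally: the `ReplicatedFile` class, its `queue` and the `replicate` method are hypothetical, and a real product would ship the changes over the network rather than copy them between two local buffers.

```python
from typing import List, Tuple

WriteOp = Tuple[int, bytes]  # (byte offset, data written)


class ReplicatedFile:
    """In-memory stand-in for a file whose writes are mirrored to a twin."""

    def __init__(self, size: int) -> None:
        self.source = bytearray(size)
        self.target = bytearray(size)    # would live on the remote server
        self.queue: List[WriteOp] = []   # changes captured but not yet sent

    def write(self, offset: int, data: bytes) -> None:
        """Apply a write locally and capture only the changed byte range."""
        if offset < 0 or offset + len(data) > len(self.source):
            raise ValueError("write falls outside the file")
        self.source[offset:offset + len(data)] = data
        self.queue.append((offset, data))

    def replicate(self) -> int:
        """Replay queued changes on the target in order; return bytes sent."""
        sent = 0
        for offset, data in self.queue:
            self.target[offset:offset + len(data)] = data
            sent += len(data)
        self.queue.clear()
        return sent
```

Because only the changed ranges travel, a small write to a large file costs a few bytes of traffic rather than a full copy. Replaying the writes in their original order is what keeps overlapping changes consistent on the twin.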
Whether you are looking for continuous operations or disaster recovery, Windows clustering promises a solution. However, if the Microsoft offering is too pricey or otherwise doesn't meet your needs, there are alternatives that give you more implementation options and can save you money.
Alan Radding is a freelance writer specializing in business and technology. He can be reached at alanradding.net, www.technologywriter.com/.