The systems that support the operations of the Iowa Department of Corrections can never stop. There is always something happening, 24 hours a day, 365 days a year. "Our requirement is that the software never goes down," said John Baldwin, deputy director. The department runs a system called ICON (Iowa Corrections Offender Network) and to ensure the system never goes down, it runs 8 to 10 Microsoft Windows servers in clustered pairs.
Munder Capital Management, an investment firm based in Birmingham, Mich., delivers financial information in real time to its customers. It, too, cannot allow its systems to go down. To deliver the kind of reliability its customers demand, it operates six pairs of Windows servers running Microsoft SQL Server and takes advantage of Windows high-availability clusters.
Clustering is a proven technique for achieving both high availability and improved performance. Unix systems have long used clustering to deliver industrial strength robustness. Linux also touts its clustering capabilities as evidence of enterprise-class capabilities. Microsoft has offered basic clustering for its enterprise and data center applications, such as SQL Server and Microsoft Exchange, since Windows NT.
Once reserved for the largest companies and the most mission-critical applications, clustering is being adopted today by midsized organizations like Munder Capital and the Iowa Department of Corrections. This is due to managers' growing
"While transaction processing and customer relationship applications have traditionally been considered critical applications, studies recently highlight the growing importance of e-mail as a mission-critical application for the majority of customers," said Ryan Rand, Microsoft's senior technical product manager, Windows Server Division. With e-mail becoming mission-critical in many organizations, interest in Windows clustering is rising.
Microsoft defines a cluster as a group of computers working together to run a common set of applications and to show a unified system to the client and application. The computers are physically connected by cables and programmatically connected by cluster software. These connections allow computers to take advantage of failover and load balancing. Failover is a high-availability technique in which one server in the cluster quickly takes over the workload if the primary server in the cluster fails. Load balancing is a technique to spread the workload evenly across all servers in the cluster.
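To make the load-balancing idea concrete, here is a minimal sketch in Python. It is an illustration only, not how Windows clustering is implemented: the `Node` and `RoundRobinBalancer` names, the node names and the simple round-robin policy are all assumptions made for the example. It shows requests being spread evenly across the servers and a failed server being skipped.

```python
from dataclasses import dataclass
from itertools import cycle


@dataclass
class Node:
    """Hypothetical cluster member; 'healthy' stands in for a real health check."""
    name: str
    healthy: bool = True


class RoundRobinBalancer:
    """Spreads requests evenly across the healthy nodes of a cluster."""

    def __init__(self, nodes):
        self.nodes = list(nodes)
        self._ring = cycle(self.nodes)

    def pick(self):
        # Try each node at most once per request, skipping failed ones.
        for _ in range(len(self.nodes)):
            node = next(self._ring)
            if node.healthy:
                return node
        raise RuntimeError("no healthy nodes left in the cluster")


nodes = [Node("web1"), Node("web2"), Node("web3")]
balancer = RoundRobinBalancer(nodes)

# With all nodes healthy, each one receives a request in turn.
first_pass = [balancer.pick().name for _ in range(3)]

# After web2 fails, the remaining nodes share the work.
nodes[1].healthy = False
second_pass = [balancer.pick().name for _ in range(4)]
```

The client in this sketch only ever calls `pick()`, which mirrors the "unified system" idea: callers do not need to know which server handles a request or which one has failed.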
Clustering the Windows way
Windows has long supported clustering, and Microsoft has enhanced the offering with every Windows upgrade. In Windows Server 2003, Microsoft expanded support to clusters as large as eight nodes, up from four nodes. It also added integration with Active Directory and 64-bit support.
The basic Windows high-availability cluster is two nodes -- two servers connected in an active-active or more likely active-passive arrangement. In the active-passive arrangement, which is easier to set up and more common, the passive server sits idle until it is called to take over the workload in the event of a problem with the active server.
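The active-passive arrangement can be sketched in a few lines of Python. This is a simplified model under stated assumptions, not the Windows cluster service: the `ActivePassivePair` class, the heartbeat mechanism, the `max_missed` threshold and the server names are hypothetical and chosen only to show the passive server taking over after the active one stops responding.

```python
class ActivePassivePair:
    """Two-node cluster in which the passive node idles until failover."""

    def __init__(self, active, passive, max_missed=3):
        self.active = active
        self.passive = passive
        self.max_missed = max_missed  # assumed timeout, in heartbeat intervals
        self.missed = 0

    def heartbeat(self, received):
        """Record one heartbeat interval; return True if a failover happened."""
        if received:
            self.missed = 0
            return False
        self.missed += 1
        if self.missed >= self.max_missed:
            # The passive node takes over the workload and the roles swap.
            self.active, self.passive = self.passive, self.active
            self.missed = 0
            return True
        return False


pair = ActivePassivePair("sql-a", "sql-b")

# One missed beat is tolerated; three in a row trigger the failover.
events = [pair.heartbeat(ok) for ok in (True, False, True, False, False, False)]
```

Requiring several consecutive missed heartbeats is a design choice in this sketch: it avoids failing over on a single dropped signal, at the price of a slightly longer detection time.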
Microsoft, however, keeps rolling out enhancements to Windows clustering. "With Windows Server 2003 Service Pack 1, we now have a much more comprehensive agent for Microsoft Operations Manager 2005. It allows application clusters based on Windows clustering technologies to more effectively notify operations staff of minor problems and suggests the best course of action to resolve them," said Rand.
Windows also supports load-balancing clusters, called NLB (network load balancing) clusters. Companies typically use NLB clusters to support horizontal scaling of IP-based applications, such as those found in e-commerce. They also can use NLB clustering to improve performance of the Web and mobile portions of an MS Exchange solution.
Challenges of deploying Windows clusters
Despite the improvements Microsoft has made in Windows clustering, it is not simple to deploy. To begin, clusters need shared storage, which usually means implementing a storage-area network (SAN).
"We set up the SAN using Compellent SAN technology," said Michael Dufek, director of information systems at Munder Capital. The company configured its six pairs of SQL Server machines for active-passive operation over a GigaMAN link, an optical IP Ethernet connection. Dufek also split the cluster pairs across two geographically separate locations to give the company both high availability and disaster resilience. At the second site, the company installed another Compellent SAN. Data for the cluster is stored at the primary site and immediately replicated to the other site.
Munder Capital ran into some of that complexity. "There are some setup and configuration issues," Dufek noted. For example, the company needed to find the correct StorPort drivers. "You need to have the right HBA [host bus adapter] drivers and configure things correctly for timeouts," he explained. Even so, the company completed the deployment without involving Microsoft, except for the StorPort driver.
The Iowa Department of Corrections relied on its software developer, Advanced Technologies Group Inc., in West Des Moines, Iowa, to build and deploy ICON and set up the clusters. The vendor also maintains the system, "although the clusters require very little maintenance," noted Baldwin.
Clusters at what cost?
Beyond their increased complexity, clusters entail added expense. "To begin with, you need at least two copies of the enterprise editions of Windows Server 2003 and your Windows applications," said Peter Pawlak, senior analyst at Directions on Microsoft, a research firm based in Kirkland, Wash.
You will also need a SAN, either Fibre Channel or iSCSI. "A SAN entry point is about $15,000 anyway," said Pawlak. Besides at least two servers for a basic cluster, you'll need a switch for the SAN and the usual storage backup infrastructure.
Ultimately, the cluster decision comes down to the cost of downtime. "We didn't think about the costs. The drawbacks of being down far outweighed the costs. That was not a conversation we would even consider," Baldwin said.
There are lower cost alternatives to Windows clustering for high availability. "We looked at other options, but nothing at that time gave us real-time failover," said Dufek. For Munder Capital, increased high availability without the real-time aspect wasn't worth the cost savings.
Windows high-availability clustering is a powerful capability. For organizations willing to pay for the assurance that their database applications or their e-mail will be there whenever they need them, Windows clustering will be worth it.
Alan Radding is a freelance writer specializing in business and technology. He can be reached at firstname.lastname@example.org, www.technologywriter.com/.