I occasionally receive e-mail messages from people asking whether I think it is a good idea for them to cluster their Exchange Servers. In environments in which high availability is a requirement, I believe clustering is a good idea. However, before you jump blindly into clustering, you need to know what you're getting yourself into.
There's a common misconception that installing Exchange onto a cluster is no different from installing Exchange onto a single server. In reality, the process is quite different.
Clustered environments make use of virtual servers. A virtual server is a logical server that is node-independent. An application, such as Exchange, is assigned to a virtual server, which runs on a specific clustered node. If that node fails, the virtual server is moved to another node in the cluster. Clients are never aware of the failure (aside from a brief interruption in service) because they are communicating with a portable, virtual server rather than a physical server.
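To make the idea concrete, here is a minimal Python sketch of a virtual server moving between nodes. It is a toy model rather than how the Windows cluster service is implemented, and the names (Node, VirtualServer, Cluster, NODE-A, EXCHANGE-VS) are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    healthy: bool = True


@dataclass
class VirtualServer:
    name: str    # the network name clients connect to (hypothetical)
    owner: Node  # the physical node currently hosting the virtual server


class Cluster:
    def __init__(self, nodes):
        self.nodes = list(nodes)

    def fail_over(self, vs):
        """Move the virtual server to the first healthy node other than its owner."""
        for node in self.nodes:
            if node.healthy and node is not vs.owner:
                vs.owner = node
                return node
        raise RuntimeError("no healthy node available")

    def report_failure(self, node, vs):
        """Mark a node as failed and fail over the virtual server if it lived there."""
        node.healthy = False
        if vs.owner is node:
            return self.fail_over(vs)
        return vs.owner


node_a, node_b = Node("NODE-A"), Node("NODE-B")
cluster = Cluster([node_a, node_b])
exchange_vs = VirtualServer("EXCHANGE-VS", owner=node_a)

cluster.report_failure(node_a, exchange_vs)
# Clients still address EXCHANGE-VS; only the owning node has changed.
print(exchange_vs.name, "now runs on", exchange_vs.owner.name)
```

The point of the sketch is that the name clients use never changes; only the owner behind it does.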
Step 1: Get some training. Because of the inherent complexities in setting up and maintaining a clustered Exchange environment, I strongly recommend getting some cluster-specific training prior to beginning the implementation.
Step 2: Choose Microsoft-compatible hardware. Plan carefully when choosing your cluster hardware. You need to make absolutely sure that the hardware you choose is on Microsoft's hardware compatibility list. Although it's always a good idea to check the hardware compatibility list, it's very common to see Windows running perfectly well on hardware that isn't on it. Even so, guaranteeing hardware compatibility is absolutely essential for a clustered environment because of the unique demands being placed on the hardware.
Step 3: Select the hardware workhorses. Another thing to consider when choosing your hardware is performance. Remember that clustering is different from network load balancing. In a clustered environment, one server handles the entire workload, while the other just waits to take over if necessary. The hardware that you choose must be capable of comfortably handling all of the demands that Exchange places on it, while still leaving resources available for the clustering service and for future growth.
Step 4: Redundancy is a good thing. Another consideration when choosing your clustered hardware is redundancy. Think about it this way: You are thinking about installing a cluster as a way of providing high availability. Why not guarantee even higher availability by building as much redundancy as possible into each of the cluster's nodes? By building redundancy into each node, you reduce the chances of a failover ever happening, which is a good thing. Depending on your configuration, failovers can keep your server online, but preparing to bring the failed server back online can get really messy. Believe me when I say that you really want to prevent a failover from ever happening if you can.
Step 5: Variety is not a good thing. Make sure you select identical hardware for each node. I have seen instances in which someone has made a cluster work with dissimilar hardware, but the cluster seems to be more stable when identical hardware is used. This is especially true of the cluster's NICs. All NICs should be of the same make and model and should have the same firmware version. You should also use the same NIC driver across all nodes.
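If you keep a hardware inventory for your nodes, a quick consistency check can catch mismatches before they cause trouble. The Python sketch below assumes a hand-maintained inventory; the node names, attribute names, and version strings are all made up for illustration, and the code does not query real hardware.

```python
from collections import defaultdict

# Hypothetical, hand-maintained inventory; every value here is an example.
inventory = {
    "NODE-A": {"nic_model": "ExampleNIC 1000", "nic_firmware": "1.2.0", "nic_driver": "5.0.1"},
    "NODE-B": {"nic_model": "ExampleNIC 1000", "nic_firmware": "1.3.0", "nic_driver": "5.0.1"},
}


def find_mismatches(inventory):
    """Return {attribute: {value: [nodes]}} for each attribute that differs between nodes."""
    mismatches = {}
    attributes = sorted({attr for spec in inventory.values() for attr in spec})
    for attr in attributes:
        values = defaultdict(list)
        for node, spec in sorted(inventory.items()):
            values[spec.get(attr)].append(node)
        if len(values) > 1:  # more than one distinct value means the nodes disagree
            mismatches[attr] = dict(values)
    return mismatches


for attr, values in find_mismatches(inventory).items():
    print("Mismatch in", attr, "->", values)
```

With the sample data above, only the firmware version is flagged, because the model and driver match on both nodes.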
Step 6: Remember that Exchange is not self-contained. If you really want to prevent Exchange failures, then you must understand that Exchange isn't entirely self-contained. Exchange is very dependent on both domain controllers and global catalog servers. Nowhere is this more true than in a clustered environment.
For example, imagine that you have a two-node Exchange cluster. Now imagine that the network contains two other servers: a domain controller and a domain controller/global catalog server. Now imagine that the global catalog server goes offline. This might not seem like a huge deal at first because Exchange can still access the Active Directory via the remaining domain controller. However, if DSAccess can't contact a global catalog server, it will cause the System Attendant to fail. If the System Attendant fails, the Information Store will also fail. This will cause the cluster to fail over, but the failover will not fix the problem because the problem is external to the node.
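The dependency chain in this scenario can be sketched in a few lines of Python. This is a deliberately simplified model of the reasoning above, not of Exchange's actual internals, and the function and node names are hypothetical.

```python
def exchange_services_up(node_healthy, gc_reachable):
    """Model the chain: global catalog -> System Attendant -> Information Store."""
    # The System Attendant needs a healthy node AND a reachable global catalog.
    system_attendant = node_healthy and gc_reachable
    # The Information Store fails whenever the System Attendant fails.
    information_store = system_attendant
    return {"System Attendant": system_attendant, "Information Store": information_store}


def try_failover(nodes_healthy, gc_reachable):
    """Return the first node on which every service could run, or None if no node helps."""
    for name, healthy in nodes_healthy.items():
        if all(exchange_services_up(healthy, gc_reachable).values()):
            return name
    return None


nodes = {"NODE-A": True, "NODE-B": True}
# Both nodes are healthy, but the only global catalog server is offline.
print(try_failover(nodes, gc_reachable=False))
```

Because the global catalog is a dependency shared by every node, the function returns None here: moving the virtual server to another node changes nothing.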
Step 7: Plan on two global catalog servers. One way to avoid this kind of Exchange failure is to have at least two global catalog servers on the same network as the Exchange Server. This way, if one global catalog server fails, Exchange will not fail as a result.
You could tilt the odds in your favor even further by placing multiple NICs in each global catalog server and connecting each NIC to a different network segment. That way, if a hardware failure causes a communications failure over an entire network segment, the Exchange server could use an alternate segment to continue communicating with the global catalog servers.
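As a rough illustration of why both kinds of redundancy help, the Python sketch below works out which global catalog servers remain reachable after a failure. It assumes the Exchange server is itself attached to each segment, and the server and segment names (GC1, GC2, seg-a, seg-b) are hypothetical.

```python
def reachable_gcs(gc_segments, exchange_segments, failed_gcs=(), failed_segments=()):
    """Return the global catalog servers Exchange can still reach.

    gc_segments maps each global catalog server to the segments its NICs are on.
    A server is reachable if it is running and shares a working segment with Exchange.
    """
    usable = set(exchange_segments) - set(failed_segments)
    return sorted(
        gc for gc, segments in gc_segments.items()
        if gc not in failed_gcs and usable & set(segments)
    )


# Two global catalog servers, each with a NIC on both segments (assumed layout).
gc_segments = {"GC1": ["seg-a", "seg-b"], "GC2": ["seg-a", "seg-b"]}
exchange_segments = ["seg-a", "seg-b"]

# One segment and one global catalog server fail at the same time.
print(reachable_gcs(gc_segments, exchange_segments,
                    failed_gcs=["GC1"], failed_segments=["seg-a"]))
```

In this layout Exchange can still reach GC2 over seg-b. If each global catalog server had only one NIC on seg-a, the same segment failure would leave none reachable.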
Brien M. Posey, MCSE, is a Microsoft Most Valuable Professional for his work with Windows 2000 Server and IIS. Brien has served as the CIO for a nationwide chain of hospitals and was once in charge of IT security for Fort Knox. As a freelance technical writer he has written for Microsoft, CNET, ZDNet, Tech Target, MSD2D, Relevant Technologies, and numerous other technology companies. You can visit Brien's personal Web site at http://www.brienposey.com.