Some of the toughest network issues to troubleshoot are those dealing with Windows failover clusters. Fortunately, there is a tool, Validate, built into the Windows Server 2008 and Windows Server 2008 R2 operating systems that can assist with isolating network failures.
The original intent of Validate, or the Cluster Validation Wizard, was to check potential servers to determine if all the components were configured properly to support a failover cluster. A variety of tests are performed to ensure the various components are working consistently across the different servers. Once all the tests have passed, the cluster can be created and supported by Microsoft. Validate can also be used to inventory a variety of components such as network adapters, driver revision levels and firmware settings.
As Validate runs the tests and collects configuration data, it generates an HTML report that can be viewed with a Web browser. The report is organized into sections such as network configuration and storage with hyperlinks that display further details.
Validate is included with the Windows Failover Cluster Management MMC snap-in. This software is installed when you use Server Manager to add the Failover Clustering feature. Once you open the Failover Cluster Management snap-in, the "Validate a Configuration" action becomes available.
Validate can also be used to troubleshoot an existing cluster. The tool allows you to select specific tests that focus on a particular component such as the network configuration or storage. This allows you to isolate a network problem without disrupting the storage. In the following figure, Validate prompts to run all tests or just the ones selected.
To troubleshoot a network problem, select the various tests under the Network section as seen below. Clear the check box for all the other tests such as system configuration and storage to avoid unnecessary disruption. The network tests will verify network communication, IP and subnet configuration, binding order, firewall and cluster network configuration (Figure 1).
More specifically, the List Network Binding Order test lists the order in which networks are bound to adapters. You always want to favor the public adapter over the heartbeat (private) adapter, since outbound connections to clients will only be resolved across the public interface. The IP configuration test verifies that each IP address is unique across the cluster and that a node does not have multiple adapters in the same subnet.
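The two rules the IP configuration test enforces can be illustrated with a short Python sketch. This is not how Validate is implemented; it only mirrors the logic described above using the standard ipaddress module. The function name, the adapter inventory format and the sample addresses are hypothetical, and the sketch assumes the same-subnet rule applies per node.

```python
import ipaddress
from collections import Counter


def check_ip_configuration(adapters):
    """Return a list of problems found in a hypothetical adapter inventory.

    adapters: list of (node, adapter_name, "address/prefix") tuples.
    """
    problems = []
    interfaces = [
        (node, name, ipaddress.ip_interface(cidr))
        for node, name, cidr in adapters
    ]

    # Rule 1: every IP address must be unique across the cluster.
    counts = Counter(iface.ip for _, _, iface in interfaces)
    for ip, count in counts.items():
        if count > 1:
            problems.append(f"duplicate address {ip} used {count} times")

    # Rule 2 (assumed to apply per node): no two adapters on the
    # same node may sit in the same subnet.
    seen = {}
    for node, name, iface in interfaces:
        key = (node, iface.network)
        if key in seen:
            problems.append(
                f"{node}: {seen[key]} and {name} share subnet {iface.network}"
            )
        else:
            seen[key] = name

    return problems


if __name__ == "__main__":
    # Made-up sample data for illustration only.
    sample = [
        ("node1", "Public", "192.168.1.10/24"),
        ("node1", "Heartbeat", "192.168.1.11/24"),
        ("node2", "Public", "192.168.1.10/24"),
    ]
    for problem in check_ip_configuration(sample):
        print(problem)
```

With the sample inventory, the sketch flags the address shared by node1 and node2 and the two node1 adapters that sit in the same subnet, which is the kind of misconfiguration the IP configuration test is meant to catch.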
The Firewall test validates that the firewall configuration is properly set up to support failover clusters. The Cluster Network test ensures that all IP addresses are either static or DHCP, but not both. It also confirms that IPv4 and IPv6 are set consistently across all network adapters. Finally, the Network Communication test verifies that servers can communicate across multiple interfaces with acceptable latencies and without a single point of failure. As the tests are executed, a status window displays progress and results. The tests that pass are marked in green, while those resulting in errors are flagged in red. In the following example, the IP configuration test failed, so it is marked in red (Figure 2).
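The consistency rules checked by the Cluster Network test can be sketched the same way. Again, this is only an illustration of the logic described above, not Validate's actual implementation. The function name and the dictionary keys ("name", "dhcp", "families") are hypothetical, and the sketch reads "consistently set" as meaning every adapter has the same set of IP versions enabled.

```python
def check_cluster_network(adapters):
    """Return a list of consistency problems for a hypothetical adapter list.

    adapters: list of dicts with the assumed keys
      "name"     - adapter label
      "dhcp"     - True if the address comes from DHCP, False if static
      "families" - set of enabled IP versions, e.g. {"IPv4", "IPv6"}
    """
    problems = []

    # Addresses must be all static or all DHCP, never a mix of both.
    modes = {"DHCP" if adapter["dhcp"] else "static" for adapter in adapters}
    if len(modes) > 1:
        problems.append("adapters mix static and DHCP addressing")

    # Assumed reading of "consistently set": every adapter has the
    # same IP versions enabled.
    family_sets = {frozenset(adapter["families"]) for adapter in adapters}
    if len(family_sets) > 1:
        problems.append("adapters do not share the same IPv4/IPv6 settings")

    return problems


if __name__ == "__main__":
    # Made-up sample data for illustration only.
    sample = [
        {"name": "node1-Public", "dhcp": False, "families": {"IPv4"}},
        {"name": "node2-Public", "dhcp": True, "families": {"IPv4", "IPv6"}},
    ]
    for problem in check_cluster_network(sample):
        print(problem)
```

An empty result means the hypothetical inventory satisfies both rules; any entry in the list corresponds to a condition that the Cluster Network test would be expected to report.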
After Validate finishes executing the tests, a summary window is displayed that lists the result of each one. Validate also generates a report containing hyperlinks (Figure 3) that you can follow to drill down for more troubleshooting information.
Select "View Report" to open the Web browser to display the Validation Report. Clicking the hyperlink for the Cluster Network Configuration test displays the error message encountered during testing in a red box. You can use this information (Figure 4) to troubleshoot what is causing network-related problems.
The Cluster Validation Wizard is a menu-driven tool that can be used to troubleshoot a variety of cluster network-related problems, in addition to storage and system configuration issues. You can use the tool selectively to validate certain components, depending on the symptoms.
ABOUT THE AUTHOR
Bruce Mackenzie-Low, MCSE/MCSA, is a master consultant at HP providing third-level worldwide support on Microsoft Windows-based products, including Clusters and Crash Dump Analysis. With over 25 years of computing experience at Digital, Compaq and HP, Bruce is a well-known resource for resolving highly complex problems involving clusters, SANs, networking and internals. He has taught extensively throughout his career, always leaving his audience energized with his enthusiasm for technology.