Troubleshoot Windows server clusters with ClusDiag

Learn how Microsoft's Cluster Diagnostics Tool (ClusDiag) simplifies the way admins investigate and report on problems with server clusters in Windows environments.


Previously in this three-part series, we took a look at the installation process for Microsoft's ClusDiag tool, as well as some of its useful reporting capabilities. Part three examines the tool's powerful troubleshooting functionality, including cluster log file manipulation, filtering and color-coding events.

As mentioned earlier in this series, ClusDiag has these two modes of operation:

  • online mode -- used for configuration verification and reporting
  • offline mode -- used for troubleshooting cluster and event logs

Remember that before you can use the tool for troubleshooting, you first need to capture the logs (as described in part two) or use Microsoft's MPS Reports to gather the troubleshooting data.

ClusDiag was specifically designed to read MPS Reports in order to provide troubleshooting data for remote servers. In fact, MPS Reports is the principal tool used by Microsoft support to gather troubleshooting data. There are several variants of MPS Reports, and your choice depends on what type of configuration you're troubleshooting (cluster, Active Directory, performance/setup, etc.). For our purposes, you would use the Cluster variant of MPS Reports to gather the required data and extract it to a folder. You can find an overview of MPS Reports, including how to download a free copy, on Microsoft's website.

Once you've gathered the data using ClusDiag or MPS Reports, the next step is to invoke ClusDiag in offline mode. To do this, simply select the Offline radio button and specify the location of the log files using the arrow or browse button, as seen in Figure 1 below.

Figure 1

ClusDiag will then read the information and construct a view similar to the Cluster Administrator utility. It will list the various nodes, groups, resources and log files, and highlighting any of the cluster objects in the left-hand pane provides details in the right-hand pane. See Figure 2 for an example of ClusDiag in offline mode.

Figure 2

On the left side you will notice the various log files grouped by All Files, Event Logs and Test and Cluster Logs. Expanding the Log Files tree in the left-hand pane displays the individual log files in the right-hand pane. Clicking on any of those log files brings up the appropriate viewer. For example, if you click on an EVT (event log) file, the Event Viewer will open it, as illustrated in Figure 3 below.

Figure 3

Similarly, if you click on a cluster log file, a viewer will launch displaying the contents of the cluster log. By default, ClusDiag displays the cluster log file in a filtered format, color-coding each entry by severity. Informational entries appear in black, warnings in maroon and errors in red. This allows you to quickly identify which records are noise or informational and which ones are warnings or errors. You can customize the color coding and filter criteria by opening the View pull-down menu and selecting Color Code. Figure 4 shows an example of a cluster log revealing the error entries in red.

Figure 4
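To make the idea of severity-based filtering concrete, here is a minimal Python sketch of the same approach applied to raw log text. It is not how ClusDiag itself is implemented. The line layout in the regular expression and the level keywords INFO, WARN and ERR are assumptions about the cluster log format, and the function names are hypothetical. Only the three colors come from the defaults described above.

```python
import re

# Assumed line layout (not confirmed by this article):
#   <process>.<thread>::<YYYY/MM/DD-HH:MM:SS.mmm> <LEVEL> <message>
LINE_RE = re.compile(
    r"^(?P<ids>[0-9a-fA-F]+\.[0-9a-fA-F]+)::"
    r"(?P<stamp>\d{4}/\d{2}/\d{2}-\d{2}:\d{2}:\d{2}\.\d{3})\s+"
    r"(?P<level>INFO|WARN|ERR)\s+(?P<message>.*)$"
)

# Default colors as described in the article.
SEVERITY_COLORS = {"INFO": "black", "WARN": "maroon", "ERR": "red"}


def classify(line):
    """Return (level, color) for a log line, or None if it does not match."""
    match = LINE_RE.match(line)
    if match is None:
        return None
    level = match.group("level")
    return level, SEVERITY_COLORS[level]


def filter_by_level(lines, wanted=("WARN", "ERR")):
    """Keep only lines whose level is in `wanted`, dropping informational noise."""
    kept = []
    for line in lines:
        result = classify(line)
        if result is not None and result[0] in wanted:
            kept.append(line)
    return kept
```

Calling `filter_by_level` on the lines of a log file would leave only the warning and error records, which is roughly the effect of narrowing the filter criteria in the viewer.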

You can quickly jump from node to node by opening multiple cluster log files simultaneously in different panes. It is also possible to synchronize the entries across those log files according to the timestamp or GUM (Global Update Manager) sequence number. To do so, right-click an entry to bring up the context-sensitive menu, then select either Synchronize Time Stamp or Synchronize GUM. In Figure 5, you can see an example of ClusDiag displaying three cluster log files.

Figure 5
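Conceptually, synchronizing by timestamp means taking the time of the selected entry and finding the closest entry in each of the other logs. The sketch below shows one way to do that with a binary search. It illustrates the idea only and does not describe ClusDiag's internals. The timestamp layout and the helper names `parse_stamp` and `nearest_index` are assumptions made for this example.

```python
from bisect import bisect_left
from datetime import datetime

# Assumed timestamp layout, e.g. 2006/05/10-15:20:31.123
STAMP_FORMAT = "%Y/%m/%d-%H:%M:%S.%f"


def parse_stamp(line):
    """Extract the timestamp that follows '::' at the start of a log line."""
    stamp = line.split("::", 1)[1].split(" ", 1)[0]
    return datetime.strptime(stamp, STAMP_FORMAT)


def nearest_index(stamps, target):
    """Index of the entry in the sorted list `stamps` closest in time to `target`."""
    if not stamps:
        raise ValueError("log has no entries")
    pos = bisect_left(stamps, target)
    if pos == 0:
        return 0
    if pos == len(stamps):
        return len(stamps) - 1
    before, after = stamps[pos - 1], stamps[pos]
    # Prefer the earlier entry when both neighbors are equally close.
    return pos if (after - target) < (target - before) else pos - 1
```

Given the parsed timestamps of a second node's log, `nearest_index` returns the row a viewer would scroll to. Keep in mind that a match like this is only as good as the clock agreement between the nodes.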

Finally, one of the handiest ClusDiag features gives you the ability to merge multiple cluster logs according to timestamp. This allows you to view a single merged cluster log from different nodes, with a different background color for each node so you can tell which records belong to which node. To accomplish this, select multiple cluster logs by holding down the Ctrl key, then right-click to bring up the context-sensitive menu and select Find In Selected Files. Figure 6 shows a merged cluster log with records interspersed from different nodes with different colors.

Figure 6
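A timestamp merge of this kind can be approximated outside the tool in a few lines of Python. The following is a rough sketch, assuming each node's log is already in chronological order and that every line carries a timestamp after a `::` separator in the layout shown. The function names and node labels are hypothetical. Tagging each record with its node name plays the role of the background color.

```python
import heapq
from datetime import datetime

# Assumed timestamp layout, e.g. 2006/05/10-15:20:31.123
STAMP_FORMAT = "%Y/%m/%d-%H:%M:%S.%f"


def tagged_entries(node, lines):
    """Yield (timestamp, node, line) tuples for one node's log."""
    for line in lines:
        stamp = line.split("::", 1)[1].split(" ", 1)[0]
        yield datetime.strptime(stamp, STAMP_FORMAT), node, line


def merge_logs(logs):
    """Merge a {node: lines} mapping into one list of (node, line) pairs
    ordered by timestamp. Each input log must already be sorted by time."""
    streams = [tagged_entries(node, lines) for node, lines in logs.items()]
    return [(node, line) for _, node, line in heapq.merge(*streams)]
```

Because `heapq.merge` reads the inputs lazily, this approach works even for large logs. As with any cross-node comparison, the resulting order is only meaningful if the nodes' clocks are reasonably close.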

That concludes this series on ClusDiag. The tool's complete online help explores some other options available to you, such as displaying records in local time versus GMT or using Ctrl-T to follow different threads within the cluster logs. Still, it is plain to see that ClusDiag's online reporting capabilities and offline troubleshooting features give Windows administrators a very powerful tool for managing Windows server clusters.

 Part 1: Overview and installation
 Part 2: Reporting capabilities
 Part 3: Troubleshooting server clusters

Bruce Mackenzie-Low, MCSE/MCSA, is a systems software engineer with HP providing third-level worldwide support on Microsoft Windows-based products including Clusters and Crash Dump Analysis. With more than 20 years of computing experience at Digital, Compaq and HP, Bruce is a well known resource for resolving highly complex problems involving clusters, SANs, networking and internals.
