Troubleshooting Windows application crashes or hangs


Bruce Mackenzie-Low, Contributor
10.02.2009


One of the most challenging issues for Windows administrators to troubleshoot is when a user application unexpectedly hangs or crashes. Due to the intermittent nature of crashes and hangs, it can be very difficult to "catch" the application misbehaving, leaving you with very few clues as to what caused the problem.

Fear not! There are a few simple tools you can use to help isolate the issue to a particular program, DLL, error, or condition that may lead you to a documented workaround or patch. This article will survey a variety of free tools, including Mark Russinovich's new ProcDump utility, which can assist you with troubleshooting applications that crash or hang so you can intelligently search the World Wide Web for a solution.

Free tools

Everyone loves free tools, but sometimes there's still a price to pay for them on the Internet. Free tools often require you to provide an email address before you download them so you can be spammed by product offerings for years to come. They can also open the floodgates for spyware or other Trojans which can negatively impact your server. For these reasons, I rarely download non-Microsoft tools.

Fortunately, Microsoft offers a variety of free tools that can be used to troubleshoot hanging applications and outages. For years now, a tool called Dr. Watson has been available as part of the Windows operating system. When properly configured, Dr. Watson will detect applications that crash and provide a log file and user dump file for troubleshooting. Analyzing this data will often lead to a known error code or condition that has a documented workaround or hotfix. For more details on using Dr. Watson, you can refer to the Microsoft KB articles 246084 and 278689, or the Drwtsn32.exe online help. You can also review my previous article on installing and using the Windows debugger, also known as Windbg.

Perhaps a little more useful than Dr. Watson is a tool called ADPlus, which can be downloaded with the Debugging Tools for Windows. ADPlus is a VBScript that can be used to monitor an application for an unexpected condition and capture a user dump file when it occurs. The tool can also be used to force a crash dump on a hung user application, allowing you to analyze the dump with the Windows debugger. Extensive documentation on using ADPlus can be found in Microsoft KB article 286350 or my tip on troubleshooting Windows print spooler outages.

If your troublesome application involves Microsoft Internet Information Services (IIS), then the tool of choice would be Microsoft's DebugDiag. This is a comprehensive tool used to identify a variety of problems including Web server hangs, slow performance, crashes and memory leaks. The tool can also be used with simple Win32 applications that don't involve IIS. You can download DebugDiag from Microsoft, and there is abundant documentation on its use, including Microsoft KB article 931370 and Tim Fenner's tip on using Debug Diagnostics to troubleshoot IIS issues.

Finally, there is a new tool from Microsoft called ProcDump. This utility combines many of the features found in the tools mentioned above, but also contains a very handy feature to dump a process when CPU activity spikes to a predetermined level for a specified period of time. The remainder of this article will examine ProcDump in detail, using Windbg to analyze the dump.

Using the ProcDump utility

ProcDump is a command line tool that can be used to troubleshoot a variety of issues. Like Dr. Watson, ADPlus and DebugDiag, ProcDump can be used to capture a process memory dump when an unexpected condition or exception occurs. Also like ADPlus and DebugDiag, it can be used to force a process dump on a hung application. But unlike any of its predecessors, ProcDump can be used to dump a process when its CPU activity spikes to a particular level. This can be especially useful for those intermittent performance issues where it is hard to predict when the problem will occur.

A single executable (procdump.exe) is provided which accepts a number of different options. Without any options, the ProcDump tool will force a memory dump and leave the application running. For example, the following command will force a dump of the Microsoft Outlook application, capturing the memory contents in outlook.dmp:
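A minimal sketch of that invocation, scripted from Python. It assumes procdump.exe is on the PATH and takes a process name followed by a dump file name, as the paragraph above describes. The helper name basic_dump_command is hypothetical and not part of ProcDump.

```python
import shutil
import subprocess


def basic_dump_command(process, dump_file):
    """Build the argument list for an immediate dump with no options."""
    # Assumed form: procdump <process name or PID> <dump file>
    return ["procdump", str(process), dump_file]


if __name__ == "__main__":
    cmd = basic_dump_command("outlook", "outlook.dmp")
    print(" ".join(cmd))
    # Only launch the tool if it is actually installed and on the PATH.
    if shutil.which("procdump"):
        subprocess.run(cmd, check=False)
```

Keeping the command as a list rather than a single string avoids quoting problems if the dump path contains spaces.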

By default, only thread and handle information is captured in the process memory dump. By using the option -ma, a complete process memory dump will be performed. This will allow the debugger to identify more information about the application including the thread environment (!teb), the process environment (!peb), and the locking information (!locks -v) as illustrated in the following debugger output of winword.exe:

Figure 1
[IMAGE]

By using the -h option, ProcDump will detect a hung Windows application and force a memory dump. This is similar to the functionality found in ADPlus and DebugDiag. Using the -e option will cause ProcDump to detect an unhandled exception with the application and capture a process dump. By subsequently analyzing the process dump, you can determine what program, DLL and error condition were present at the time of the outage. This will allow you to intelligently search the Web for similar scenarios to determine if a known issue exists or whether you need to contact the vendor.
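The switches described so far (-ma, -h and -e) can be combined on one command line. The sketch below shows one way to assemble them from Python. The function name and its keyword arguments are hypothetical; only the three switches themselves come from the text above.

```python
def triggered_dump_command(process, dump_file, full=False,
                           on_hang=False, on_exception=False):
    """Combine the ProcDump switches discussed above into one argument list."""
    cmd = ["procdump"]
    if full:
        cmd.append("-ma")  # complete process memory instead of the default
    if on_hang:
        cmd.append("-h")   # write a dump when the application hangs
    if on_exception:
        cmd.append("-e")   # write a dump on an unhandled exception
    cmd += [str(process), dump_file]
    return cmd


if __name__ == "__main__":
    # Full dump of Word if it hangs; the dump file name is illustrative.
    print(" ".join(triggered_dump_command("winword", "winword.dmp",
                                          full=True, on_hang=True)))
```

A full dump is considerably larger than the default, so it is worth confirming there is enough free disk space before leaving a -ma monitor running unattended.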

What sets ProcDump apart from its predecessors is the ability to detect CPU spikes and collect a process dump when they occur. This is especially useful for intermittent issues when no one is around to intervene. Three options are available to implement this functionality:

To clarify the usage, let's take a look at a simple example. The first command below will start a separate window with a directory command that traverses the system disk. You can then use the tlist command to quickly find the PID (Process ID). Finally, using the ProcDump command specifying a CPU threshold of 10% for at least two seconds on PID 1432 (in this example) should generate a process dump:

Of course this is an unrealistic example of its usage, as one would typically specify a much higher CPU threshold for a longer period of time to avoid transient CPU spikes. But the example does provide a simple illustration of how these options work together to trigger a process dump.
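As a hedged sketch of how the CPU trigger might be scripted, the helper below builds the command line for a given threshold and duration. It assumes the three options are -c (CPU threshold in percent), -s (consecutive seconds above the threshold) and -n (number of dumps to write before exiting), as listed in ProcDump's usage text; run procdump -? to confirm them for your version. The function name, the dump file name cmd.dmp and the 80%/30-second values are illustrative only.

```python
def cpu_spike_command(pid, dump_file, threshold_pct, seconds, dumps=1):
    """Build a ProcDump argument list that waits for a sustained CPU spike."""
    if not 1 <= threshold_pct <= 100:
        raise ValueError("threshold_pct must be between 1 and 100")
    if seconds < 1 or dumps < 1:
        raise ValueError("seconds and dumps must be at least 1")
    return [
        "procdump",
        "-c", str(threshold_pct),  # assumed: CPU threshold in percent
        "-s", str(seconds),        # assumed: consecutive seconds above threshold
        "-n", str(dumps),          # assumed: number of dumps before exiting
        str(pid),
        dump_file,
    ]


if __name__ == "__main__":
    # Values from the walkthrough above: 10% for two seconds on PID 1432.
    print(" ".join(cpu_spike_command(1432, "cmd.dmp", 10, 2)))
    # A more conservative, purely illustrative setting.
    print(" ".join(cpu_spike_command(1432, "cmd.dmp", 80, 30)))
```

Validating the numbers before launching the monitor avoids leaving a mistyped threshold waiting for a condition that can never occur.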

Once the process dump has been written, you can use Windbg to analyze the dump. You will notice in the figure below how the debugger comments that a process has exceeded 10% CPU for two seconds and the corresponding thread ID (0xc2c). The thread ID is important because a process may have multiple threads, so you should focus on the stack pattern of the thread that caused the CPU spike.

Figure 2
[IMAGE]

Finally, by issuing the debugger ~*kv command, the stack pattern will be revealed for all threads, providing you with the names of the functions being executed. In the example below, you can see how the directory command executes several functions such as cmd!FileIsConsole, cmd!WriteEol, cmd!NewDisplayFile and cmd!WalkTree. While these function names may mean little to us, they can be used as keywords when searching the Web for possible solutions to our runaway process.

Figure 3
[IMAGE]
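If the debugger output is saved to a text file, the function names can be pulled out automatically for use as search keywords. The sketch below is one possible approach rather than a Windbg feature: it assumes stack frames appear in module!function form, and the sample text is a made-up excerpt that reuses the function names mentioned above with placeholder offsets.

```python
import re

# Matches symbols such as cmd!WalkTree; any trailing +0x... offset is ignored.
SYMBOL = re.compile(r"\b([A-Za-z_]\w*)!([A-Za-z_]\w*)")


def stack_symbols(debugger_output):
    """Return unique module!function names in the order they first appear."""
    seen = []
    for module, func in SYMBOL.findall(debugger_output):
        name = f"{module}!{func}"
        if name not in seen:
            seen.append(name)
    return seen


if __name__ == "__main__":
    # Hypothetical excerpt; the offsets are placeholders, not real output.
    sample = (
        "cmd!FileIsConsole+0x10\n"
        "cmd!WriteEol+0x2a\n"
        "cmd!NewDisplayFile+0x1c4\n"
        "cmd!WalkTree+0x5f\n"
        "cmd!WalkTree+0x9b\n"
    )
    for name in stack_symbols(sample):
        print(name)
```

Debugger commands such as !teb or !locks are not matched, because the pattern requires a module name in front of the exclamation mark.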

As you can see, there are several free tools available to troubleshoot application crashes and hangs. We saw how Dr. Watson, ADPlus, DebugDiag and ProcDump all provide the capability to capture a process dump. Then by using Windbg, you can analyze the dump and review the stack pattern of the current thread for a crash scenario, or the runaway thread as identified by ProcDump.


Bruce Mackenzie-Low, MCSE/MCSA, is a systems software engineer with HP providing third-level worldwide support on Microsoft Windows-based products including Clusters and Crash Dump Analysis. With more than 20 years of computing experience at Digital, Compaq and HP, Bruce is a well-known resource for resolving highly complex problems involving clusters, SANs, networking and internals.

