Using Xperf stack walking to pinpoint kernel performance issues

The Xperf tool helps admins diagnose performance issues, but how do you granularly determine what function calls are eating up precious resources?

Monitoring the kernel of the Windows operating system to diagnose performance issues can be a very challenging

endeavor. Sure Perfmon, PAL and Xperf can show that the OS is spending x amount of time executing in kernel mode, but how can one determine what portions of the kernel (function calls) are consuming significant amounts of time?

In the past, it was necessary to force multiple crash dumps in an effort to profile where the kernel was spending its time. By forcing a memory dump, the debugger is able to display the stack traces of the current threads to determine where the operating system is executing. This can help identify what kernel functions are being called by which threads in an effort to troubleshoot excessive kernel mode times. There has to be an easier way!

Stack walking
Fortunately, there is an easier way: Stack walking. This feature is part of ETW, Event Tracing for Windows. The Xperf tool is used to enable the stack walking functionality that is built into Windows and collect stack traces for threads that encounter various kernel functions or events. This allows the ability to configure which portions of the kernel collect stack traces for threads. Stack walking is available for primitive events such as process and thread creation, file and registry manipulation, and memory allocation. Issue the command Xperf –help stackwalk to see a list of kernel events (flags) that support stack walking.

For example, imagine the server is experiencing high kernel mode times that  seem to be caused by excessive registry updates. Xperf can reveal which threads are calling which registry functions. Xperf will also summarize the data to show  the percentage of time a particular function was executed, giving it a weight. The higher the weight, the more frequently the function is executed. This enables an admin to profile the kernel functions according to what threads are executing them and their frequency.

For example, the following Xperf flags are available for stack tracing when registry events occur:

RegQueryKey                         RegEnumerateKey                 RegEnumerateValueKey

RegDeleteKey                        RegCreateKey                         RegOpenKey

RegSetValue                           RegDeleteValue                     RegQueryValue

RegQueryMultipleValue        RegSetInformation                 RegFlush

RegKcbCreate                         RegKcbDelete                         RegVirtualize

RegCloseKeystackwalk

A typical series of Xperf commands for tracing registry events might look like:

  1. Start the event collection.
    xperf -on SysProf+REGISTRY –stackwalk

    RegQueryKey+RegEnumerateKey+RegDeleteKey+RegCreateKey+RegOpenKey+RegSetValue+

    RegDeleteValue+RegQueryValue+RegQueryMultipleValue+RegSetInformation+RegFlush+

    RegKcbCreate+RegVirtualize+RegCloseKey

  2. Reproduce the problem.
  3. Stop the collection.

xperf –d stacks.etl

The result will be a file called stacks.etl that is viewable with the Xperf command:

    xperf stacks.etl

Configuring and loading symbols
Before call stack information is viewable, it is necessary to establish the symbol path. The symbol path tells Xperf to reference Microsoft’s symbol server on the internet so the tool can lookup module and function names. This allows Xperf to summarize all the call stack information to show which functions are being executed by which threads.

There are several ways to accomplish this. The SET command or the System applet in the control panel can be used to establish the system environment variable _NT_SYMBOL_PATH. Or, establish the symbol path from within Xperf by using the Trace pull-down menu and selecting “Configure Symbol Paths”. Regardless of the process to establish the symbol path, it should point to the following URL:

  _NT_SYMBOL_PATH= srv*C:\symbols*http://msdl.microsoft.com/downloads/symbols

After establishing the symbol path, it is necessary to “load” the symbols. Xperf will connect to Microsoft’s symbol server and download any symbol files that are needed to resolve references to module and function names. There are multiple ways to load the symbols: Use the Trace pull-down menu to select “Load Symbols,” or just right-click a graph and choose “Load Symbols” as seen below:

Figure 1: Windows Performance Analyzer (click to enlarge)

Windows performance analyzer

Viewing the results
After configuring the symbol path and loading the symbols, view the stackwalk data to find what functions are being called by which processes. One of the charts will be the “CPU Sampling” graph where you can right-click the chart to select “Summary Table.” This will produce a table that summarizes processes and the functions they called related to the ETW providers and stackwalk flags that were specified during the event collection.

In the previous example, we used Xperf to collect data on registry functions, because we hypothetically suspected that was causing our high kernel mode time. That's not all Xperf can do; it's just as easy to collect data for any other component of the kernel that supports stackwalk flags such as the file system, process creation and memory allocation. In Figure 2, notice how the regedit.exe process is responsible for the vast majority of registry accesses. You can determine the actual registry functions that were called by expanding the “+” to reveal the call stacks.

Figure 2: Reveal call stacks (click to enlarge)
reveal call stacks

While the tool is fairly intuitive to use, there is plenty of help available. For assistance, reference the help file:

  C:\Program files\Microsoft Windows Performance Toolkit\WindowsPerformanceToolkit.chm

Troubleshooting kernel mode performance problems is never an easy task. But with tools like Xperf and Event Tracing for Windows, much of the guesswork can be taken out by isolating the issue with kernel stack walking.

ABOUT THE AUTHOR
Bruce Mackenzie-Low
, MCSE/MCSA, is a master consultant at HP providing 3rd level worldwide support on Microsoft Windows based products including Clusters and Crash Dump Analysis. With over 25 years of computing experience at Digital, Compaq and HP, Bruce is a well known resource for resolving highly complex problems involving clusters, SAN’s, networking and internals. He has taught extensively throughout his career always leaving his audience energized with his enthusiasm for technology.

This was first published in January 2012

Dig deeper on Windows Server Monitoring and Administration

Pro+

Features

Enjoy the benefits of Pro+ membership, learn more and join.

0 comments

Oldest 

Forgot Password?

No problem! Submit your e-mail address below. We'll send you an email containing your password.

Your password has been sent to:

SearchServerVirtualization

SearchCloudComputing

SearchExchange

SearchSQLServer

SearchWinIT

SearchEnterpriseDesktop

SearchVirtualDesktop

Close