There are a number of new features in Exchange 2013, but one feature admins should know about is Managed Availability.
It's built into Exchange as a monitoring and remediation platform -- and it's the remediation part that can be confusing. Once you've learned what Managed Availability is and how it looks on Exchange Server, you can understand where Managed Availability stores information and how to access it.
Managed Availability mainly stores information for configuration and logging in two different places. The better part of its configuration information is stored in a number of xml files in the <installdrive>:\Program
Whenever the Health Manager Service starts up, it reads these files and applies the settings they define. While it's possible to alter the files, I don't recommend it. Not only is it dangerous -- an error might prevent the service from starting up -- but most of the settings are undocumented or poorly documented. Editing these files should be a last resort when you need to make a change.
Active Directory & server registry
In addition to the local configuration files, Managed Availability also uses Active Directory and the server's local registry to store additional configuration information. More specifically, overrides are either stored in Active Directory -- so they're available to all servers -- or in the server's local registry, if they only apply to the local server.
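To check which overrides currently exist at each scope, the Exchange Management Shell offers one cmdlet per scope. The lines below are a minimal sketch; EX01 is a hypothetical server name used only for illustration.

```powershell
# Global overrides: stored in Active Directory, so they apply to all servers.
Get-GlobalMonitoringOverride

# Server overrides: stored in the local registry of the named server.
# "EX01" is a hypothetical server name; substitute your own.
Get-ServerMonitoringOverride -Server EX01
```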
Logging information is stored in the Crimson Channel of the Windows event logs. As mentioned, Managed Availability logs every action it takes. At startup, the probe, monitor and responder definitions are written to the corresponding event logs under Microsoft/Exchange/ActiveMonitoring (Figure 1).
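The same logs can also be listed from PowerShell instead of Event Viewer. This sketch assumes it runs locally on an Exchange 2013 server and that the channel names begin with Microsoft-Exchange-ActiveMonitoring/, which is how the Event Viewer path above is usually written on the command line.

```powershell
# List every event log in the ActiveMonitoring channel with its record count.
# The channel prefix is an assumption based on the Event Viewer path.
Get-WinEvent -ListLog "Microsoft-Exchange-ActiveMonitoring/*" |
    Sort-Object LogName |
    Format-Table LogName, RecordCount -AutoSize
```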
Managed Availability structures stored information into HealthSets, Probes, Monitors and Overrides. Here's a look at how each structures Managed Availability's information.
HealthSets. Managed Availability uses a set of hundreds of probes, monitors and responders to determine a server's health. To keep track of all of them, each set that relates to a specific component is grouped into a so-called HealthSet. Running the Get-ServerHealth cmdlet on an Exchange Server reveals these HealthSets (Figure 2).
For instance, there is a HealthSet named "ActiveSync." This HealthSet groups all of the probes, monitors and responders responsible for monitoring and mitigating the server's ActiveSync component. To view which monitors are part of this HealthSet, you can again use the Get-ServerHealth cmdlet and add the -HealthSet parameter to narrow down the results (Figure 3), as follows:
Get-ServerHealth <servername> -HealthSet ActiveSync
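When the full list is too long to scan, it can help to show only the monitors that aren't healthy. The following is a sketch, assuming a hypothetical server named EX01 and that the objects returned by Get-ServerHealth expose the Name, HealthSetName, TargetResource and AlertValue properties seen in the cmdlet's default output.

```powershell
# Show only monitors whose alert value is something other than Healthy.
# "EX01" is a hypothetical server name.
Get-ServerHealth -Identity EX01 |
    Where-Object { $_.AlertValue -ne "Healthy" } |
    Format-Table Name, HealthSetName, TargetResource, AlertValue -AutoSize
```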
Probes. Probes query or test a specific component on the server. There are a number of probe types, ranging from simple ones, which will fetch the value of a specific performance counter, to more complex ones, which will carry out a battery of tests like mimicking a user's behavior. These tests are also referred to as "synthetic transactions." Probes just fetch the information and carry out tests; they don't evaluate the results or values that are returned.
To see what probes run on a specific server, you can have a look at the ProbeDefinition event log in the Crimson Channel. When the Health Manager Service starts, it writes the definitions of the probes that will run on the system to this log. The easiest way to get the information is through the GUI, or you can use PowerShell (Figure 4).
The relevant portion of the information is written in XML format, but is a little more readable in Friendly View. Typically, there are two types of information you would get from probes: what probes are running on the system and what resources (e.g., Mailbox Database) they run on.
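For the PowerShell route, one common approach is to convert each event to XML and read the probe definition from its user data. This is a sketch rather than the exact command behind Figure 4; it assumes the log is named Microsoft-Exchange-ActiveMonitoring/ProbeDefinition and that each definition carries Name, ServiceName (the HealthSet) and TargetResource elements.

```powershell
# Read probe definitions from the Crimson Channel on the local server.
# The log name and the element names below are assumptions.
$log = "Microsoft-Exchange-ActiveMonitoring/ProbeDefinition"

Get-WinEvent -LogName $log |
    ForEach-Object { ([xml]$_.ToXml()).Event.UserData.EventXml } |
    Sort-Object ServiceName, Name |
    Format-Table ServiceName, Name, TargetResource -AutoSize
```

Sorting by ServiceName groups the probes by HealthSet, which answers both questions at once: which probes run, and against which resource.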
Having that information is the first step to understanding what actions Managed Availability might have taken to remediate a problem it found.
Monitors. The next step in deciphering Managed Availability is looking at the different monitors that exist on the system. As with probes, their definitions are stored in the Crimson Channel, in the MonitorDefinition event log.
Monitors behave in different ways and have different stages. Every time a monitor fails, it will move to the next stage, which will call upon a different responder to solve the issue. If the issue isn't solved after a set number of failed attempts, it escalates to an admin. The point at which a monitor shifts into a new stage is defined by the TimeOutInSeconds value of each transition in the monitor's StateTransitionXml property (Figure 5).
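To make the stage mechanism concrete, the sketch below parses a StateTransitionXml value into a small table. The XML is a hypothetical sample written for illustration: the element layout, the attribute spelling, the state names and the timeouts are all assumptions, not values copied from a real monitor.

```powershell
# Hypothetical StateTransitionXml value; states and timeouts are illustrative only.
$stateTransitionXml = @'
<StateTransitions>
  <Transition ToState="Degraded" TimeoutInSeconds="0" />
  <Transition ToState="Unhealthy" TimeoutInSeconds="600" />
  <Transition ToState="Unrecoverable" TimeoutInSeconds="1200" />
</StateTransitions>
'@

# Turn each transition into an object: the target state and the timeout (in seconds) that governs the shift.
$stages = ([xml]$stateTransitionXml).StateTransitions.Transition |
    ForEach-Object {
        [pscustomobject]@{
            ToState          = $_.ToState
            TimeoutInSeconds = [int]$_.TimeoutInSeconds
        }
    }

$stages | Format-Table -AutoSize
```

Each row represents one stage, and a different responder can be tied to each stage.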
By using the Get-WinEvent cmdlet on the MonitorDefinition Event Log, you can see the different stages that are configured for a specific monitor; this article describes the process. The GUI is also an option, but PowerShell is more efficient. Note that there are different stages that exist for the AutodiscoverProxyTest monitor (Figure 5).
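A query along these lines can retrieve the stages for that monitor. It is a sketch, not necessarily the command used for Figure 5; the log name and the Name and StateTransitionXml element names are assumptions, and the wildcard filter is used because the exact monitor name may carry a suffix.

```powershell
# Read monitor definitions from the local server and show the configured stages.
# The log name and the element names below are assumptions.
$log = "Microsoft-Exchange-ActiveMonitoring/MonitorDefinition"

Get-WinEvent -LogName $log |
    ForEach-Object { ([xml]$_.ToXml()).Event.UserData.EventXml } |
    Where-Object { $_.Name -like "*AutodiscoverProxyTest*" } |
    Format-List Name, StateTransitionXml
```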
A monitor can have multiple states, most of which aren't exposed to the admin. Typically, an admin would only see a monitor being Healthy or Unhealthy. The other states (such as Degraded and Unrecoverable) are hidden and only visible through PowerShell.
Overrides. Sometimes you don't want a specific monitor to run with its defaults -- because it causes more trouble than it solves, for example, or because one of the configured defaults doesn't meet your requirements. Using monitoring overrides, you can reconfigure a monitor to use other threshold values. For example, a popular monitoring override changes the threshold Managed Availability uses to determine if there's enough free disk space left on a database log disk.
You can use the following command to reset the value to 10 GB from the default value and configure it to last for 90 days. It's not possible to add an override that lasts indefinitely, so keep track of when an override expires so you can reconfigure it afterward.
Add-GlobalMonitoringOverride -Identity MailboxSpace\DatabaseSizeMonitor -ItemType Monitor -PropertyName ExtensionAttributes.DatabaseLogsThreshold -PropertyValue 10GB -Duration 90.00:00:00
This command will add a global monitoring override that applies to all servers in the environment. That's also why global overrides are stored in Active Directory. If you want the override to apply only to a single server, you should use the Add-ServerMonitoringOverride cmdlet instead.
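The server-scoped variant takes the same parameters plus the target server. The sketch below reuses the values from the global example above and assumes a hypothetical server named EX01.

```powershell
# Apply the same threshold override to one server only.
# "EX01" is a hypothetical server name; the other values mirror the global example.
Add-ServerMonitoringOverride -Server EX01 -Identity MailboxSpace\DatabaseSizeMonitor -ItemType Monitor -PropertyName ExtensionAttributes.DatabaseLogsThreshold -PropertyValue 10GB -Duration 90.00:00:00

# Review the overrides stored for that server, including when they expire.
Get-ServerMonitoringOverride -Server EX01 | Format-List
```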
About the author:
Michael Van Horenbeeck is a technology consultant, Microsoft Certified Trainer and Exchange MVP from Belgium, mainly working with Exchange Server, Office 365, Active Directory and a bit of Lync. He has been active in the industry for 12 years and is a frequent blogger, a member of the Belgian Unified Communications User Group Pro-Exchange and a regular contributor to The UC Architects podcast.
This is part two of a series on Managed Availability.
Part one introduces the Managed Availability feature and how it looks in Exchange 2013.
Stay tuned for part three, which discusses responders and explains how to retrieve what actions Managed Availability took.