Managed Availability is one feature in Exchange 2013 that admins should take the time to learn. The built-in monitoring and remediation platform can be confusing, but taking the time to understand the inner workings can help clear up that confusion.
Earlier parts of this series covered what Managed Availability is, how the feature looks on Exchange, where it stores information and how to access that information. With those aspects covered, it's time to learn about responders and how to retrieve the actions Managed Availability took in a given situation.
A monitor initiates responders, and each responder has a specific task. These tasks can include restarting a service and bugchecking (rebooting) a server.
There are too many responders to list here, but they fall into several types, including:
- Service Restart Responder
- Failover Responder
- Online/Offline Responder
- Reset AppPool Responder
- Bugcheck Responder
- Escalate Responder
By now, the names of these responders should make sense; they do exactly what their names indicate. Just like probes and monitors, responders are initiated during the startup of the Health Manager Service and can be found in the Crimson Channel in the Microsoft/Exchange/ActiveMonitoring/ResponderDefinition event log.
Just like monitors, responders are also throttled. This ensures that a given action is not indefinitely repeated. After all, you wouldn't want a server to be bugchecked over and over again, would you?
The throttling configuration for each responder is exposed in the information you can find in the event log (Figure 1).
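As a rough sketch of how the responder definitions could be read without the Event Viewer GUI, the snippet below queries the channel with Get-WinEvent. It assumes the channel is addressed as `Microsoft-Exchange-ActiveMonitoring/ResponderDefinition` and that each event carries its definition as XML user data. The `Name` field used in the filter is an assumption, so the first command dumps one full definition so you can check which fields (including the throttling settings) your build actually exposes.

```powershell
# Assumption: run in an elevated shell on the Exchange 2013 server itself.
$logName = 'Microsoft-Exchange-ActiveMonitoring/ResponderDefinition'

# Each event holds one responder definition as XML in its user data.
$responders = Get-WinEvent -LogName $logName |
    ForEach-Object { ([xml]$_.ToXml()).Event.UserData.EventXml }

# Dump a single definition in full to see which fields are available.
$responders | Select-Object -First 1 | Format-List *

# 'Name' is an assumed field name; adjust it to match the output above.
$responders |
    Where-Object { $_.Name -like '*Reboot*' } |
    Format-List *
```

Dumping everything first and filtering second is deliberate: field names can differ between cumulative updates, so it is safer to inspect one record than to rely on a fixed property list.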
Server Component States
In addition to a server health report, Managed Availability exposes the current state of several workload components, such as ActiveSync or Transport, using Server Component States. These states can be viewed through PowerShell using the Get-ServerComponentState cmdlet (Figure 2).
The Exchange server itself uses this information, and so do other servers in the environment, to determine whether a server is able to serve a specific request.
If you look back at the list of responders, you'll see an Online/Offline responder. It takes care of changing the state of a component from Active to Inactive or the other way around.
Admins can also use the information from the Server Component States to control or check the behavior of a specific server. For instance, an admin could manually change the "ServerWideOffline" component to the Inactive state to place the server in a sort of maintenance mode. When this happens, all workloads on the server will no longer service requests until you change the state for the ServerWideOffline component back to Active.
Changing a server's component state is done through the Set-ServerComponentState cmdlet, for example:
Set-ServerComponentState -Identity e15-04 -Component FrontEndTransport -State Inactive -Requester Maintenance
When looking at the command, notice that a requester is specified. This information is useful to determine who or what changed a component's state. There are several requesters used, including:
- HealthAPI
- Maintenance
- Sidelined
- Functional
- Deployment
When a component's state has been changed by the HealthAPI requester, you immediately know that it's Managed Availability that ordered the state change. The other requesters are each used in different scenarios. Deployment, for example, is used when a server is upgraded.
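The maintenance-mode idea described above can be sketched as follows. This is a minimal illustration rather than a complete maintenance procedure: the server name `e15-04` is taken from the earlier example, and the `LocalStates` property used to show which requester set a state is an assumption about what Get-ServerComponentState returns on your build.

```powershell
# Server name reused from the article's example; replace with your own.
$server = 'e15-04'

# Show the current state of every component on the server.
Get-ServerComponentState -Identity $server |
    Format-Table Component, State -AutoSize

# Take all workloads out of service, recording Maintenance as the requester.
Set-ServerComponentState -Identity $server -Component ServerWideOffline `
    -State Inactive -Requester Maintenance

# Assumed property: LocalStates lists the per-requester entries for a component.
(Get-ServerComponentState -Identity $server -Component ServerWideOffline).LocalStates

# ... perform the maintenance work here ...

# Return the component to service using the same requester that disabled it.
Set-ServerComponentState -Identity $server -Component ServerWideOffline `
    -State Active -Requester Maintenance
```

Using the same requester for both changes matters: if another requester still holds the component Inactive, setting it Active as Maintenance alone would not be expected to bring it back into service.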
Putting it together
We've already taken a look at the different components that make up Managed Availability. Separately, they provide little to no value. The only exception is the Server Component States, which can be helpful for putting a server into maintenance mode. But when you bring this information together, you can find out why certain actions took place. For instance, you should be able to answer why your server rebooted unexpectedly.
Every action Managed Availability takes is recorded in a different part of the event logs. Rebooting a server is something a responder triggers, so a good place to start looking would be in the following event log (Figure 3).
Using either PowerShell or the GUI, you can search for recovery actions of the ForceReboot type (Figure 4).
You'll notice that every recovery action should have one event indicating the start of the action and another marking its completion (Figure 5).
Using the "Requester" from these events, you can backtrack which monitor and probe may have triggered specific behavior.
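A PowerShell version of this search might look like the sketch below. It assumes the recovery actions are logged to the `Microsoft-Exchange-ManagedAvailability/RecoveryActionResults` channel. The XML field names `Id`, `RequestorName` and `ResourceName` are assumptions as well, so verify them against a raw event before relying on the output.

```powershell
# Assumption: this is the channel where recovery actions are recorded.
$logName = 'Microsoft-Exchange-ManagedAvailability/RecoveryActionResults'

Get-WinEvent -LogName $logName |
    # Match on the raw XML so the filter does not depend on field names.
    Where-Object { $_.ToXml() -match 'ForceReboot' } |
    ForEach-Object {
        $data = ([xml]$_.ToXml()).Event.UserData.EventXml
        [pscustomobject]@{
            EventId   = $_.Id                 # distinguishes start and completion events
            Logged    = $_.TimeCreated
            Action    = $data.Id              # assumed field name
            Requester = $data.RequestorName   # assumed field name
            Resource  = $data.ResourceName    # assumed field name
        }
    } |
    Sort-Object Logged |
    Format-Table -AutoSize
```

Sorting by time puts each start event next to its completion event, and the Requester column gives you the name to look up in the monitor and probe definitions.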
Finding out why a server rebooted is only one of many reasons to go through the events Managed Availability creates. In general, searching is most useful when a specific behavior doesn't meet expectations. A practical example is Exchange 2013 Cumulative Update 2, which changed certain thresholds in Managed Availability and caused unwanted server reboots. The new thresholds worked fine in Office 365, but they weren't ideal for on-premises customers.
Using the above approach through PowerShell, you could backtrack what caused the reboots and come up with the ActiveDirectoryConnectivityConfigDCProbe. In turn, that information could be used to create a Monitoring Override to temporarily disable the probe, thus preventing it from calling the bugcheck responder that resulted in a server reboot.
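As an illustration only, a server-level override that disables the probe for a limited time could look like the sketch below. The `AD` health set prefix in the identity and the seven-day duration are assumptions chosen for the example, as is reusing the server name `e15-04`. Check the KB article for the exact item and override that applied to this issue.

```powershell
# Hypothetical values: the server name, the 'AD' health set prefix and the
# duration are illustrative assumptions, not taken from the KB article.
$override = @{
    Server        = 'e15-04'
    Identity      = 'AD\ActiveDirectoryConnectivityConfigDCProbe'
    ItemType      = 'Probe'
    PropertyName  = 'Enabled'
    PropertyValue = '0'
    Duration      = '7.00:00:00'   # days.hours:minutes:seconds
}

# Create the temporary override on this one server.
Add-ServerMonitoringOverride @override

# List the overrides of this item type currently defined for the server.
Get-ServerMonitoringOverride -Server $override.Server
```

A duration-based override expires on its own, which suits a workaround that should disappear once a fixed cumulative update is installed.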
You really have all the tools at hand to modify Managed Availability's behavior if you want to. To find out more about this older issue and how the approach and information from this series can be put to use, have a look at this KB article. It is no longer directly relevant since the recent release of Exchange 2013 Cumulative Update 6, but it makes for good reading on Managed Availability and its use cases.
I personally don't come across many situations where I need to fiddle with the thresholds or settings, even though they can be unrealistic. I try to stay away from it unless there's a need to go in and find out more information.
On the other hand, I encourage you to use the Server Component States. These can be really helpful when trying to manage your Exchange deployment. On my website, you can find a script which will use the Server Component States to place a server automatically into maintenance mode.
About the author:
Michael Van Horenbeeck is a technology consultant, Microsoft Certified Trainer and Exchange MVP from Belgium, mainly working with Exchange Server, Office 365, Active Directory and a bit of Lync. He has been active in the industry for 12 years and is a frequent blogger, a member of the Belgian Unified Communications User Group Pro-Exchange and a regular contributor to The UC Architects podcast.
This is part three in a series on Exchange 2013's Managed Availability feature.
Part one introduced the feature and discussed how it looks on Exchange.
Part two explained where the feature stores information and how to access it.