Examining Exchange 2013 load balancing options

Effective Exchange 2013 load balancing starts with understanding your options. We outline the pros and cons of the four different choices.

In a previous article, we looked at some load-balancing basics for Exchange 2013. So with those under your belt, now it's time to examine the different load-balancing options that are available for Exchange 2013, as well as the advantages and disadvantages each offers.

Scenario #1: Layer 4, single IP address

This is the most basic Exchange 2013 load-balancing scenario. When you implement it, all Exchange 2013 workloads are configured to use a single IP address. This means that although you can configure different host names for each workload (like OWA, OAB, EWS, etc.), they all essentially point to the same IP address.

The biggest limitation is the fact that it only allows for a single health check for all Exchange workloads. Therefore, choosing the right health check is very important. Let's examine this behavior via a simple example.

The logical setup of this scenario is much like what you see in Figure 1. If the health check fails, the entire server is removed from the pool and does not receive any new connections -- no matter what type of traffic -- until the problem is solved.

Figure 1. A look at a Layer 4, single IP address Exchange 2013 load-balancing setup.

While that might be OK if the entire server is down, you're effectively wasting resources if only a single component on that server has failed. This could, for example, be the case when the health check is configured to see whether or not the OWA virtual directory is up and running. When the OWA app pool crashes, it results in a health-check failure and the server stops receiving all traffic, even though other components like Outlook Anywhere are still functioning correctly.
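To make the trade-off concrete, here is a minimal Python sketch of how a single health check decides pool membership for all traffic. The server names, workload keys and the `layer4_pool` function are hypothetical and only model the behavior described above; they are not taken from any load balancer's configuration.

```python
from dataclasses import dataclass, field


@dataclass
class Server:
    name: str
    # Hypothetical per-workload status: True means the workload responds.
    workloads: dict = field(default_factory=dict)


def layer4_pool(servers, probe_workload):
    """Return the servers that stay in the pool when ONE probe decides for all traffic."""
    return [s.name for s in servers if s.workloads.get(probe_workload, False)]


servers = [
    Server("CAS1", {"owa": True, "ews": True, "oa": True}),
    # Assumed failure: the OWA app pool crashed, everything else still works.
    Server("CAS2", {"owa": False, "ews": True, "oa": True}),
]

# With OWA as the single health check, CAS2 loses ALL traffic.
print(layer4_pool(servers, "owa"))  # ['CAS1']
```

Probing a different workload would keep CAS2 in the pool, but OWA users would then still be sent to a server that cannot serve them. That is why the choice of health check matters so much in this scenario.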

On the bright side, this scenario keeps requirements for the load balancer quite low. It only needs to forward traffic to the destination, based on the virtual service that accepted the client's connection. Depending on the size of your environment, this means that you can either go with a lower-end load balancer or possibly even Windows Network Load Balancing (WNLB).

Scenario #2: Layer 7, single IP address

In scenario 2, we've swapped Layer 4 for Layer 7. Because most of Exchange's workloads use SSL, the load balancer needs to decrypt incoming traffic, read it and re-encrypt it on its way out. This means that WNLB is no longer an option and that you'll have to take the additional load into account when purchasing a load balancer.

Today, most load balancers let you make routing decisions based on traffic content. This means that the load balancer can differentiate traffic sent to a single IP address by reading the content stream. This is accomplished by looking at the URLs that a client sends traffic to.

As you can see, there's not much of a difference between the second load-balancing scenario and the first. However, because we're operating at Layer 7, we can configure routing logic based on the content, which -- in most load balancers -- is accomplished by defining sub-virtual services. If you're new to the term, sub-virtual services are exactly what they sound like: virtual services within a larger virtual service.

Traffic is then forwarded to one of the sub-virtual services based on the content (URL). This is also known as content switching.
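Content switching can be sketched as a lookup from URL path to sub-virtual service. The Python below is only an illustration: the prefix table and the `pick_sub_virtual_service` function are assumptions, and a real load balancer expresses this in its own rule syntax after decrypting the request.

```python
# Assumed mapping of URL path prefixes to sub-virtual service names.
SUB_VIRTUAL_SERVICES = {
    "/owa": "owa",
    "/ews": "ews",
    "/oab": "oab",
    "/rpc": "outlook-anywhere",
}


def pick_sub_virtual_service(path, default="default"):
    """Choose a sub-virtual service from the request path (case-insensitive)."""
    lowered = path.lower()
    for prefix, service in SUB_VIRTUAL_SERVICES.items():
        # Match the prefix itself or anything below it, but not "/owad".
        if lowered == prefix or lowered.startswith(prefix + "/"):
            return service
    return default


print(pick_sub_virtual_service("/owa/auth/logon.aspx"))  # owa
print(pick_sub_virtual_service("/EWS/Exchange.asmx"))    # ews
```

Requests that match no prefix fall through to a default service, which is one common way to avoid dropping traffic the rules did not anticipate.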

To get an idea of what I'm describing, look to Figure 2.

Figure 2. A look at a Layer 7, single IP address Exchange 2013 load-balancing setup.

The advantage of this setup is that each sub-virtual service has its own health check. If, for example, an OWA health check fails, that server would be removed from the pool for OWA, but would still receive traffic for Outlook Anywhere or Exchange Web Services (EWS). Another benefit here is that the load-balancing setup now mirrors how Exchange itself sees things.

For example, Managed Availability -- a built-in self-monitoring component in Exchange 2013 -- also notices when OWA crashes on a server, but it would still use that server for other workloads.
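The difference from the first scenario shows up when pool membership is tracked per workload. This is a rough Python sketch under the same assumed failure (OWA down on one server); the `layer7_pools` function and the status dictionary are hypothetical.

```python
def layer7_pools(status):
    """Build one pool per workload from {server: {workload: healthy?}}."""
    pools = {}
    for server, workloads in status.items():
        for workload, healthy in workloads.items():
            pools.setdefault(workload, [])
            if healthy:
                pools[workload].append(server)
    return pools


status = {
    "CAS1": {"owa": True, "ews": True},
    "CAS2": {"owa": False, "ews": True},  # assumed: only OWA has failed
}

pools = layer7_pools(status)
print(pools["owa"])  # ['CAS1']
print(pools["ews"])  # ['CAS1', 'CAS2']
```

CAS2 drops out of the OWA pool only, so its remaining capacity is still used for EWS.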

The disadvantage in this scenario is that you'll have to buy a more expensive load balancer, and its configuration will likely be considerably more complex than in the first scenario.

Scenario #3: Layer 4, multiple IP addresses

I like to describe this scenario as the best of both worlds. It combines the simplicity of Layer 4 with the per-workload health checks of Layer 7, but doesn't require a high-end load balancer. It almost sounds too good to be true, doesn't it? Well, there is a small trade-off.

As you know, a virtual service is defined by a unique combination of an IP address and a TCP port. In this scenario, because we can't perform content-switching, we must create a virtual service for each Exchange workload. This means you'll need a different IP address for each workload.

This usually isn't a problem for internal deployments, but the fact that you must use multiple external IP addresses means that you'll probably need to purchase additional ones. You'll also have to add all the additional names to your certificate, which raises the cost of the certificate.

Figure 3. A look at a Layer 4, multiple IP addresses Exchange 2013 load-balancing setup.

Figure 3 details this scenario.

This scenario still allows only a single health check per virtual service. But because each workload now has its own virtual service, and therefore its own health check, a failure in one workload does not affect the others. The result is the same granularity you get with Layer 7 load balancing, without needing to configure sub-virtual services or perform traffic decryption.
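As a sketch of what this means in practice, the Python below models one virtual service per workload and counts the host names and IP addresses that result. All host names and addresses are placeholders (the `example.com` names and the 192.0.2.0/24 documentation range), and both functions are hypothetical helpers.

```python
# Assumed layout: workload -> (host name, dedicated IP address).
WORKLOADS = {
    "owa": ("owa.example.com", "192.0.2.10"),
    "ews": ("ews.example.com", "192.0.2.11"),
    "outlook-anywhere": ("oa.example.com", "192.0.2.12"),
}
PORT = 443


def virtual_services(workloads, port=PORT):
    """Each (IP, port) pair identifies exactly one workload, so no content inspection is needed."""
    return {(ip, port): name for name, (_host, ip) in workloads.items()}


def requirements(workloads):
    """Return the host names the certificate must cover and the IPs to allocate."""
    hosts = sorted({host for host, _ip in workloads.values()})
    ips = sorted({ip for _host, ip in workloads.values()})
    return hosts, ips


hosts, ips = requirements(WORKLOADS)
print(virtual_services(WORKLOADS)[("192.0.2.11", 443)])  # ews
print(len(hosts), len(ips))  # 3 3
```

Every workload added to the table costs one more certificate name and one more IP address, which is the trade-off this scenario makes for its simplicity.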

Scenario #4: DNS round-robin

The final option is to use domain name system (DNS) round robin as your Exchange 2013 load-balancing mechanism. Although fully supported, using DNS round robin means abandoning most of the benefits a load balancer offers. What you typically get in return is lower cost.

The idea here is pretty simple: For each workload (in this case, host name), you create multiple entries in DNS, pointing to each of the servers in the array.

For example, look at Table 1.

Host name Server Type A A A
... ... ...

Table 1. DNS entries.

When the client hits the DNS to translate the host name into an IP address, the three IP addresses are returned. With every request that hits the DNS, each entry is shifted up a slot.

For example, the first request would look like the following:

  4. ...

And the second request would look like this:

  3. ...

As a result, each client will connect to a different client access server first, until the DNS cycles through the list of records and starts over again.
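The rotation can be modeled in a few lines of Python. This is a simplified sketch, not how any particular DNS server is implemented: the `RoundRobinZone` class is hypothetical and the addresses come from the 192.0.2.0/24 documentation range.

```python
from collections import deque


class RoundRobinZone:
    """Toy model of a DNS server rotating the A records for one host name."""

    def __init__(self, addresses):
        self._records = deque(addresses)

    def resolve(self):
        """Return all records, then shift each entry up one slot for the next query."""
        answer = list(self._records)
        self._records.rotate(-1)
        return answer


zone = RoundRobinZone(["192.0.2.1", "192.0.2.2", "192.0.2.3"])
print(zone.resolve())  # ['192.0.2.1', '192.0.2.2', '192.0.2.3']
print(zone.resolve())  # ['192.0.2.2', '192.0.2.3', '192.0.2.1']
```

Note that nothing in this model knows whether a server is actually reachable, which is the weakness of the approach.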

The problem here is what happens after a server failure or service failure occurs. Considering that DNS doesn't perform health checks, it doesn't know when to -- and when not to -- hand out a specific server's IP address. This results in clients potentially still connecting to a server even if one of the services is down. Inevitably, the result is downtime for the end user. The answer here is to manually remove the record from DNS and wait until the client's local cache has expired so that it will request the records again from the DNS server.

The amount of time it takes for the record to expire on the client is controlled by the record's Time-To-Live. The default is 3,600 seconds, but if you're using DNS round robin for actual load balancing, lowering it to 300 seconds is a good idea. If you don't, you risk clients waiting up to an hour before hitting DNS again.
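The effect of the TTL is simple arithmetic, sketched here in Python. The `stale_seconds` function and the timestamps are hypothetical; it assumes the client cached the record at one point in time and the administrator removed it from DNS at a later one.

```python
def stale_seconds(ttl, cached_at, removed_at):
    """Seconds a client keeps using a cached record after it was removed from DNS."""
    expires_at = cached_at + ttl
    return max(0, expires_at - removed_at)


# Assumed worst case: the record is removed one second after the client cached it.
print(stale_seconds(3600, cached_at=0, removed_at=1))  # 3599
print(stale_seconds(300, cached_at=0, removed_at=1))   # 299
```

Lowering the TTL shortens the worst-case window from roughly an hour to roughly five minutes, at the price of more frequent DNS queries.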

That said, once the clients go back to DNS, the list the client received no longer contains the server with the failed service and all is good. This works, but it's not a very good user experience, is it?

Things get crazier when considering a full server outage. Every client that receives a response back from DNS (with that server as the first entry) will have a problem. Each client will try to connect to the server, but won't receive a response. The HTTP stack within the operating system (which is used to initiate the connection to the server) will see this and try the second server on the list.

Before attempting a second connection, however, it will wait for a timeout. This can take anywhere from a few seconds to a minute or even more. While awaiting the timeout, the end user remains disconnected from Exchange in Outlook. To make things worse, the administrator still has to go into DNS and remove the record for the failed server to prevent something similar from happening with other clients. After the server is recovered, the record must be manually re-added to DNS.
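The client-side behavior can be approximated with a small Python simulation. It makes no real network calls: `connect_with_failover` is a hypothetical function, `is_up` stands in for an actual connection attempt, and the 20-second timeout is an assumed value chosen only for illustration.

```python
def connect_with_failover(addresses, is_up, timeout_seconds=20.0):
    """Try each address in the order DNS returned it.

    Returns (address that answered or None, total seconds spent waiting on timeouts).
    """
    waited = 0.0
    for address in addresses:
        if is_up(address):
            return address, waited
        # A dead server gives no response, so the client waits out the full timeout.
        waited += timeout_seconds
    return None, waited


dns_answer = ["192.0.2.1", "192.0.2.2", "192.0.2.3"]
down = {"192.0.2.1"}  # assumed: the first server in the list is offline

print(connect_with_failover(dns_answer, lambda a: a not in down))
# ('192.0.2.2', 20.0)
```

Clients that happen to receive the failed server first pay the timeout on every new connection until the record is removed from DNS and their cached copy expires.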

On a positive note, this setup does offer some sort of automatic failover mechanism. Also, there's no doubt that DNS round robin is the cheapest way to achieve some higher availability for Exchange 2013, even if it is only usable in the smallest deployments.

All this said, this setup doesn't come close to the efficiency a load balancer can offer.

Final thoughts

Below is a quick reference for each of the discussed scenarios, as well as the advantages and drawbacks of each:

Layer 4, single IP
  Advantages:
  • Easy
  • Only requires a single IP
  • Inexpensive
  Drawbacks:
  • No granular health checks

Layer 7
  Advantages:
  • Only a single IP required
  • Granular health checks
  Drawbacks:
  • More expensive
  • Can be more complicated to set up

Layer 4, multiple IP
  Advantages:
  • Granular health checks
  • Relatively simple
  Drawbacks:
  • More names on certificates
  • Multiple IP addresses needed
  • Can prove more expensive than anticipated because of the certificate and IP requirements

DNS round robin
  Advantages:
  • Free
  Drawbacks:
  • Failovers not always seamless, manual interaction required
  • Depends on client-side features/logic (timeout)
  • No health checks

About the author:
Michael Van Horenbeeck is a technology consultant, Microsoft Certified Trainer and Exchange MVP from Belgium, mainly working with Exchange Server, Office 365, Active Directory and a bit of Lync. He has been active in the industry for 12 years and is a frequent blogger, member of the Belgian Unified Communications User Group Pro-Exchange and a regular contributor to The UC Architects podcast.