DNS troubleshooting is a vital skill in Active Directory. It is important to keep DNS healthy and to know how to repair it when it breaks.
Let's take a look at some common DNS problems and the tools to use for DNS troubleshooting.
Typical DNS errors include the following:
- DNS Lookup Failure -- usually reported in events in the System or Directory Service event logs.
- Unable to Find a Domain Controller for the Domain Logon Failure -- occurs when the domain can't be contacted. Typically, the "unable to contact the domain" error means that no DC for the domain can be reached, which is usually due to a DNS failure.
These and similar errors can show up in a variety of places -- often in the description of an event. For instance, the famous replication error, event 1311, often lists "DNS Lookup Failure" in the description. The repadmin /showrepl command may expose a DNS failure, or DCPromo may indicate "unable to contact domain." You get the idea.
An AD replication error may report that replication to a DC failed and identify that DC by its alias (CNAME) record name, such as
Diagnosing the problem usually starts with simple tests. If the error occurs on an interactive operation, such as a logon or DCPromo, a quick ping of the fully qualified domain name (FQDN) of the target server is helpful. If pinging the FQDN fails, ping the IP address. If the IP address answers, the problem is DNS; if it does not, the problem is in the network. Remember, you can also ping the domain name itself, and it will return the IP address of one of the domain controllers for that domain, which are often the DNS servers as well.
If DCPromo fails with a DNS error, see if you can ping the domain name from that server.
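The ping sequence described above amounts to a small decision procedure, sketched here in Python. This is only an illustration of the reasoning: the function name `classify` and its return labels are hypothetical, and the two inputs are assumed to be the results of your own manual pings.

```python
def classify(fqdn_ping_ok, ip_ping_ok):
    """Map the two ping results onto the likely problem area."""
    if fqdn_ping_ok:
        return "ok"       # name resolved and the host answered
    if ip_ping_ok:
        return "dns"      # host answers by IP, so name resolution is at fault
    return "network"      # host unreachable even by IP


# The FQDN ping failed but the IP ping worked: look at DNS.
print(classify(False, True))  # dns
```

Only the middle case points at DNS. If the IP address does not answer either, fix connectivity first and then retest name resolution.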
NSLookup is a helpful tool for checking whether a name can be resolved, and it works for Internet DNS names as well as local names. Remember that NSLookup does a reverse lookup of the DNS server's IP address when it starts, so without a reverse lookup zone it reports that it can't find the server name, even though the queries themselves may still succeed. There are also some useful Web sites for testing DNS registrations, such as www.zoneEdit.com, but these only work for external names known on the Internet.
One of the most common causes of DNS failure is misconfiguration of the TCP/IP properties where the DNS servers are defined. Be sure that the only DNS servers in that list are servers that are authoritative for the domain. There is no reason to add the ISP's DNS server or other DNS servers.
The client directs its DNS requests, including requests to locate AD services, to any server in that list. If the request goes to a server that does not hold the domain's records, such as the ISP's server, name resolution fails.
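As a rough illustration of that rule, the check below compares a client's configured DNS servers against the servers known to be authoritative for the domain. It is a sketch only: the function name and all addresses are hypothetical, and you would need to supply both lists from your own environment.

```python
def unexpected_dns_servers(configured, authoritative):
    """Return configured DNS servers that are not authoritative for the domain."""
    allowed = set(authoritative)
    return [ip for ip in configured if ip not in allowed]


# Hypothetical addresses for illustration only.
configured = ["10.0.0.10", "10.0.0.11", "203.0.113.53"]
authoritative = ["10.0.0.10", "10.0.0.11"]
print(unexpected_dns_servers(configured, authoritative))  # ['203.0.113.53']
```

Any address this returns is a candidate for removal from the client's DNS server list.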
You might be tempted to look in the DNS event log for clues, but that log mainly lists events concerning the DNS server itself and rarely provides any clues for name resolution errors.
One of the most powerful tools we have for DNS troubleshooting is an option in DCDiag. It gives you a general DNS health check.
DCDiag /test:DNS /e /v >dns.txt
Where:
- /e = every DNS server
- /v = verbose
- Output is saved in dns.txt (or any file name)
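If you want to run this check on a schedule, the command can be wrapped in a short script. The following Python sketch assumes dcdiag is on the PATH of the machine where it runs; the function names and the default report path are hypothetical, and only the switches shown above are used.

```python
import shutil
import subprocess


def dcdiag_dns_command():
    """Build the DNS health check command shown above."""
    return ["dcdiag", "/test:DNS", "/e", "/v"]


def run_dns_check(report_path="dns.txt"):
    """Run the check and save its output; return False if dcdiag is missing."""
    if shutil.which("dcdiag") is None:
        return False  # dcdiag is not on PATH on this machine
    with open(report_path, "w") as report:
        # Equivalent to redirecting the command's output with >dns.txt
        subprocess.run(dcdiag_dns_command(), stdout=report, check=False)
    return True
```

Calling `run_dns_check()` writes the report to dns.txt in the current directory, just as the redirected command does.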
DCDiag is a powerful tool because it queries each DNS server and runs up to seven tests. Six run by default: authentication, basic connectivity, forwarders, delegation, dynamic registration enabled and resource records registered. The seventh, external name resolution, does not run by default.
The resulting DNS troubleshooting report has two main sections. The first section is a detailed report showing the results of these tests on each DNS server. Here is an example:
DC: corp-dc1.corp.net Domain: corp.net
TEST: Authentication (Auth)
Authentication test: Successfully completed
TEST: Basic (Basc)
Microsoft Windows 2003
NETLOGON service is running
kdc service is running
DNSCACHE service is running
DNS service is running
DC is a DNS server
Network adapters information:
Adapter  Intel(R) 82544GC Based
The A record for this DC was found
The SOA record for the Active Directory zone was Found
The Active Directory zone on this DC/DNS server was found
(primary)Root zone on this DC/DNS server was not found
This is an abbreviated report, but you can see how helpful it is to get this information on each DNS server -- including any event log errors and warnings -- without searching event logs on each server. My favorite part is the summary table at the end:
Auth Basc Forw Del Dyn RReg Ext
Notice how much information this table gives. It lists all the DNS servers, organized by domain, and indicates which of the tests each passed or failed.
For Corp-DC5, every test from the Forwarding test onward shows n/a. Because basic connectivity didn't work on that server, there was no reason to run the remaining tests.
We also see that all of the DNS servers in the NA domain failed every test except authentication, so that domain is pretty broken. Once we see this summary, we can look elsewhere in the report to find the specific errors.
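With many DNS servers, it can help to pull the failures out of the summary table automatically. The Python sketch below assumes a simplified layout: a "Domain: name" line followed by rows holding a server name and seven whitespace-separated results in the column order shown above. The function names and the sample rows are hypothetical, made up to mirror the cases described in the text, and real DCDiag output may need extra cleanup before parsing.

```python
COLUMNS = ["Auth", "Basc", "Forw", "Del", "Dyn", "RReg", "Ext"]


def parse_summary(lines):
    """Return {domain: {server: {test: result}}} from summary rows."""
    summary = {}
    domain = None
    for line in lines:
        parts = line.split()
        if len(parts) == 2 and parts[0] == "Domain:":
            domain = parts[1]
            summary[domain] = {}
        elif domain is not None and len(parts) == len(COLUMNS) + 1:
            # First field is the server name, the rest are the test results.
            summary[domain][parts[0]] = dict(zip(COLUMNS, parts[1:]))
    return summary


def failed_tests(summary):
    """List (domain, server, test) for every FAIL entry."""
    return [
        (domain, server, test)
        for domain, servers in summary.items()
        for server, results in servers.items()
        for test, result in results.items()
        if result == "FAIL"
    ]


# Hypothetical rows, not taken from a real report.
sample = [
    "Domain: corp.net",
    "corp-dc1 PASS PASS PASS PASS PASS PASS n/a",
    "corp-dc5 PASS FAIL n/a n/a n/a n/a n/a",
]
print(failed_tests(parse_summary(sample)))  # [('corp.net', 'corp-dc5', 'Basc')]
```

Each tuple in the result tells you which server and test to look up in the detailed section of the report.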
A much larger issue with DNS is having an efficient design. DNS itself is quite simple, but large organizations with many different DNS requirements can make it complicated. If you have frequent DNS problems, it is important that you engage independent consultants to evaluate your DNS structure. Sometimes it's hard to be objective about the infrastructure you work with every day.
Gary Olsen is a systems software engineer for Hewlett-Packard in Global Solutions Engineering. He authored Windows 2000: Active Directory Design and Deployment and co-authored Windows Server 2003 on HP ProLiant Servers. Olsen is a Microsoft MVP for Directory Services and formerly for Windows File Systems.