This document provides a basic step-by-step method for the first and primary stage of troubleshooting when a domain controller is down.

1. Check recent changes

This can include patch management, services stopping unexpectedly, etc.

Start with the event logs in Event Viewer. The System, Application, Security, and Group Policy logs are the most common, but a domain controller also has Active Directory, DNS, file services, and AD Web Services logs, among others, that can all help point to the issue. Also review services.msc: make sure all required services are started, and note any that are stopped or crashing. Make sure the domain controller's NIC configuration is correct. If you are running IPsec, verify that the policy is correct and valid on both ends.
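The service check above can also be scripted. The following is a minimal sketch, not a definitive tool: it assumes Python is available on the domain controller and reads the state reported by the built-in `sc query` command. The names in REQUIRED_SERVICES are an assumed starting list and should be adjusted to the roles your domain controller actually runs; the function names are hypothetical.

```python
import re
import subprocess

# Assumed list of service short names to verify; adjust it to the roles
# installed on your domain controller.
REQUIRED_SERVICES = ["NTDS", "DNS", "Netlogon", "KDC", "W32Time", "DFSR", "ADWS"]


def parse_sc_state(sc_output):
    """Return the STATE word (for example 'RUNNING') from `sc query` output.

    Returns None when no STATE line is present, for example when the
    service does not exist.
    """
    match = re.search(r"STATE\s*:\s*\d+\s+(\w+)", sc_output)
    return match.group(1) if match else None


def service_state(name):
    """Query a single service with `sc query` (Windows only)."""
    result = subprocess.run(["sc", "query", name], capture_output=True, text=True)
    return parse_sc_state(result.stdout)


def services_needing_attention(names):
    """Return a {service: state} dict for every service that is not RUNNING."""
    states = {name: service_state(name) for name in names}
    return {name: state for name, state in states.items() if state != "RUNNING"}
```

Calling `services_needing_attention(REQUIRED_SERVICES)` on the domain controller returns only the services that need a closer look. A service reported as None was not found under that name.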

 

2. Is everything on the domain controller working as intended?

To verify this, open an elevated (administrator) command prompt and run the following commands:

  • NET SHARE (Shows whether SYSVOL is shared)
  • IPCONFIG /FLUSHDNS (Clears the local DNS resolver cache; cached entries can give the false impression that DNS "round robin" is not taking place between the DNS server and the Windows client)
  • REPADMIN /KCC (Forces the Knowledge Consistency Checker, or KCC, to recalculate the replication topology of your Active Directory infrastructure)
  • REPADMIN /SYNCALL (Forces the domain controller to replicate with its replication partners)
  • REPADMIN /SYNCALL /AdeP (Forces replication to all domain controllers; the switches are case-sensitive: /A all partitions, /d identify servers by distinguished name, /e across sites, /P push changes outward)
  • GPUPDATE /FORCE (Reapplies all Group Policy settings assigned to the domain controller)
  • DCDIAG /C /V >C:\dcdiag.txt (Runs a comprehensive DCDIAG in verbose mode and saves the output to a text file)
  • Portqry.exe (This command-line utility is used to verify recommended AD port communication)
  • w32tm /query /status (Verify Time Synchronization)

If any of these commands report errors, note the error and research from there.
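To collect the results of these commands in one pass, a small wrapper can run them in sequence and flag the ones that need attention. This is only a sketch under stated assumptions: it assumes Python is installed on the domain controller and is run from an elevated prompt, the function names are hypothetical, and the check for the phrase "failed test" reflects how DCDIAG commonly words a failing test rather than a documented contract. A clean result here does not replace reading the output.

```python
import subprocess

# Commands from the list above, run without redirection so the output
# can be captured. Portqry.exe is omitted because it needs target arguments.
COMMANDS = [
    ["net", "share"],
    ["ipconfig", "/flushdns"],
    ["repadmin", "/kcc"],
    ["repadmin", "/syncall"],
    ["gpupdate", "/force"],
    ["dcdiag", "/c", "/v"],
    ["w32tm", "/query", "/status"],
]


def run_checks(commands, runner=subprocess.run):
    """Run each command and return a list of (command line, exit code, output).

    `runner` defaults to subprocess.run and can be swapped out for testing.
    """
    results = []
    for cmd in commands:
        line = " ".join(cmd)
        try:
            proc = runner(cmd, capture_output=True, text=True)
            results.append((line, proc.returncode, proc.stdout + proc.stderr))
        except FileNotFoundError:
            # The tool is not installed or not on PATH.
            results.append((line, None, "command not found"))
    return results


def failed_checks(results):
    """Return command lines with a non-zero exit code or a 'failed test' line."""
    return [
        line
        for line, code, output in results
        if code != 0 or "failed test" in output.lower()
    ]
```

On the domain controller, `failed_checks(run_checks(COMMANDS))` returns the commands worth investigating first; the captured output of each command is the third item of every tuple returned by `run_checks`.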

 

3. Were networking devices altered in any way?

This involves routers, switches, firewalls, etc. Too often a network engineer makes changes that break an Active Directory infrastructure, and the best way to troubleshoot this is to work through the following:

  • Can you ping the device?
  • Can you resolve DNS queries using NSLOOKUP? (For example, nslookup ipaddress for a reverse lookup)
  • Using LDP.EXE, can you bind to the Active Directory store?
  • Using a port query tool, are the required AD ports open/listening rather than closed/filtered?
  • Talk with the network engineer. Ask whether he or she changed the VLANs or ACLs that your AD infrastructure uses.
  • Did he or she change/update the intrusion prevention system?
  • Did the switch port security kick in?
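When no port query tool is at hand, the port check in the list above can be approximated from any machine with Python. The following is a minimal sketch with stated limits: the port list is a commonly cited set of Active Directory TCP ports and should be treated as an assumption to verify against your environment, the function names are hypothetical, UDP is not covered, and a failed connection does not distinguish a closed port from a filtered one the way PortQry does.

```python
import socket

# Commonly cited TCP ports used by Active Directory (an assumed baseline;
# the dynamic RPC port range is not included).
AD_TCP_PORTS = {
    53: "DNS",
    88: "Kerberos",
    135: "RPC endpoint mapper",
    389: "LDAP",
    445: "SMB",
    464: "Kerberos password change",
    636: "LDAPS",
    3268: "Global catalog",
    3269: "Global catalog over SSL",
}


def tcp_port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


def check_ports(host, ports=AD_TCP_PORTS, timeout=3.0):
    """Return a {port: bool} dict showing which ports accepted a connection."""
    return {port: tcp_port_open(host, port, timeout) for port in ports}
```

For example, `check_ports("dc01.example.local")` (a hypothetical host name) returns a dictionary keyed by port number; any port mapped to False is a candidate for a firewall or ACL change to raise with the network engineer.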

 


As stated at the beginning, this is a very basic approach to troubleshooting. Every infrastructure is different, and the approach must be adapted accordingly. Even so, this guide is a baseline for the most basic troubleshooting and can help even the most seasoned administrator find the exact problem quickly and get a solution implemented.