Server 2008 Domain Controller Hangs at Shutdown, Prevents User Logon

  • Friday, May 11, 2012 6:57 PM
     
     

    I have a Server 2008 domain controller which is one of two domain controllers in our domain. If I reboot either of the domain controllers, it hangs at "Shutting down..." and I have to physically power it off. The problem is that, while it is hung, logon requests are not being serviced by the other domain controller, which is still alive and healthy.

    How is this possible? Shouldn't the other domain controller service the logon requests?

    I can ping the server that is hung during shutdown, and I receive a response.

    I cannot open a remote MMC to Services or the event log on the hung server because the RPC server is not available.

    So what is the point of having two domain controllers if a hung one prevents any user logon?

All Replies

  • Friday, May 11, 2012 7:13 PM
     
     

    Servers freezing during shutdown may have nothing to do with them being domain controllers, so I want to start with the basics.

    I see two different problems here: first, the DCs hang during shutdown; second, when one DC is down or offline, the other won't serve logon requests. These two issues could be related or independent.

    To investigate the first issue, look at Event Viewer and see what errors are logged when the server hangs. You may see a Group Policy related error, for example, that could point us in the right direction. I would also like to know what happens when you unplug the network cable and then shut down the DC: do you see the same problem? If not, something on the network may be causing the freeze. Also check for any shutdown or logoff scripts and Group Policies that are configured. Disable third-party services and software on the DCs, and also try disabling the antivirus and firewall, then test again.

    The second issue may be because the two DCs are not synchronized properly; that is, replication between them could be broken, or DNS might be configured incorrectly. For example, if you have configured one DC to point to the other for DNS and the other fails, the first DC will have DNS issues. Please post the output of dcdiag /q and ipconfig /all from both DCs so that I can check for possible misconfiguration.


    -
     Sachin Gadhave
    MCP, MCSA, MCTS

  • Friday, May 11, 2012 7:32 PM
     
     

    Hello,

    The last event I see when the server hangs is that the Event Log service has stopped, so no further events are logged.

    The network cable test made no difference: the same freeze occurs. The failure to service login requests happens regardless of which server I reboot (force hang).

    All FSMO roles are held by one server.

    The DCs are synced properly: repadmin /showrepl returns success on all tests, and dcdiag returns passed on all tests.

    As soon as I physically remove power from the hung server, login requests are serviced properly.

    I have a workaround for the hang. The server is a Dell R300 with embedded server management; if I set it to power cycle when it detects an operating system hang lasting 120 seconds, it forces the server to reboot.

    However, this does not explain why a perfectly good domain controller is not used to service the login requests.

    Thanks,

    -Bryan

  • Friday, May 11, 2012 7:45 PM
     
     

    Bryan,

     If one domain controller hangs or shuts down, clients should be authenticated by the other DC, provided its SRV records are present in DNS. How is your DNS set up? You need to make sure that all of your DNS servers have the SRV records of all the DCs. (Registration is automatic: each DC's Netlogon service registers its own records, and automatic site coverage extends this to sites that have no DC of their own.)

    Automatic site coverage

    http://policelli.com/blog/archive/2008/01/15/automatic-site-coverage-in-active-directory-domain-services/

    If any of the SRV records are missing, I would recommend that you stop and restart the Netlogon service on the DC whose SRV records are missing.
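
    To check the SRV records by hand, it helps to know which names to look up. Below is a minimal Python sketch that builds the usual DC locator record names for a domain and prints an nslookup command for each. The function name dc_locator_srv_names is made up for this example, and the site name Default-First-Site-Name is only an assumption (it is the default in a new forest); substitute your own domain and site.

```python
def dc_locator_srv_names(domain, site=None):
    """Return SRV record names that clients commonly query to find a DC.

    `domain` is the AD DNS domain name; `site` is an optional AD site name.
    Both values are supplied by the caller; nothing is looked up here.
    """
    domain = domain.strip().rstrip(".").lower()
    names = [
        f"_ldap._tcp.dc._msdcs.{domain}",      # any DC in the domain
        f"_kerberos._tcp.dc._msdcs.{domain}",  # any KDC in the domain
    ]
    if site:
        # Site-specific record, normally tried before the domain-wide one.
        names.insert(0, f"_ldap._tcp.{site}._sites.dc._msdcs.{domain}")
    return names


if __name__ == "__main__":
    # Assumed values for illustration only.
    for name in dc_locator_srv_names("DomainX.Local", "Default-First-Site-Name"):
        print(f"nslookup -type=SRV {name}")
```

    Run each printed command against both DNS servers in turn. Each query should list both DCs as targets; if one DC is missing from an answer, that is the DC whose Netlogon service should be restarted.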

    You can follow the links below, from Jorge, to understand how the DC locator process works in AD:

    http://blogs.dirteam.com/blogs/jorge/archive/2007/07/02/dc-locator-process-in-w2k-w2k3-r2-and-w2k8-part-1.aspx

    http://blogs.dirteam.com/blogs/jorge/archive/2007/07/02/dc-locator-process-in-w2k-w2k3-r2-and-w2k8-part-2.aspx

    http://blogs.dirteam.com/blogs/jorge/archive/2007/07/02/dc-locator-process-in-w2k-w2k3-r2-and-w2k8-part-3.aspx

    Regards,

    _Prashant_


    MCSA|MCITP SA|Microsoft Exchange 2003 Blog - http://prashant1987.wordpress.com Disclaimer: This posting is provided AS-IS with no warranties/guarantees and confers no rights.

  • Friday, May 11, 2012 7:48 PM
     
     

    "As soon as I physically remove power to the hung server, login requests are serviced properly."

    This makes me think that there is a possible DNS misconfiguration. Under the NIC properties, make sure each DC points to itself as primary DNS and to the other DC as secondary DNS (assuming that both DCs also run AD-integrated DNS).
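
    As a rough aid for comparing the two DCs, here is a small Python sketch that pulls the DNS server list out of saved ipconfig /all text and checks it against the order suggested above. The helper names and all IP addresses in SAMPLE are invented for illustration, and the parser assumes English-language output, a single adapter, and IPv4 addresses on the continuation lines.

```python
import re

IPV4 = re.compile(r"^\d{1,3}(\.\d{1,3}){3}$")


def dns_servers(ipconfig_text):
    """Extract the DNS server list from `ipconfig /all` text, in order."""
    servers = []
    in_block = False
    for line in ipconfig_text.splitlines():
        if "DNS Servers" in line and ":" in line:
            in_block = True
            value = line.split(":", 1)[1].strip()
            if value:
                servers.append(value)
        elif in_block and IPV4.match(line.strip()):
            # Additional servers appear alone on the following lines.
            servers.append(line.strip())
        else:
            in_block = False
    return servers


def self_then_partner(servers, own_ip, partner_ip):
    """True if the DC lists itself first and the other DC second."""
    return servers[:2] == [own_ip, partner_ip]


# Hypothetical output from one DC; the addresses are made up.
SAMPLE = """\
Ethernet adapter Local Area Connection:
   IPv4 Address. . . . . . . . . . . : 192.168.1.10(Preferred)
   DNS Servers . . . . . . . . . . . : 192.168.1.10
                                       192.168.1.11
   NetBIOS over Tcpip. . . . . . . . : Enabled
"""

if __name__ == "__main__":
    found = dns_servers(SAMPLE)
    print(found, self_then_partner(found, "192.168.1.10", "192.168.1.11"))
```

    Clients are a separate check: each client should list both DCs as DNS servers so that it can still resolve the SRV records when one DC is unavailable.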

    Did you try a clean boot to narrow down the server hang issue? See http://support.microsoft.com/kb/929135


    -
     Sachin Gadhave
    MCP, MCSA, MCTS

  • Friday, May 11, 2012 8:20 PM
     
     

    DNS is configured properly; SRV records exist for both DCs in both DCs' DNS configuration.

    AD-integrated DNS looks to be configured properly.

    Is there any way to test that automatic site coverage is working properly?

    I haven't tried a clean boot. I have a workaround for the OS hang issue that I need to follow up with Dell on; I don't think that is a Microsoft thing. However, I am very concerned that while the OS is hung, my other perfectly good DC is essentially useless because it is not servicing login requests.

  • Friday, May 11, 2012 8:42 PM
     
     

    Have you configured sites and subnets in your AD environment?

    Regards,

    _Prashant_



  • Saturday, May 12, 2012 5:57 AM
     
     

    My understanding was that you have both DCs in a single site; correct me if I'm wrong. Are these DCs in different sites or subnets?

    -
     Sachin Gadhave
    MCP, MCSA, MCTS

  • Monday, May 14, 2012 3:56 AM
     
     

    Yes, both DCs are in one single site, not connected to the internet.

    DomainX.Local

  • Monday, May 14, 2012 5:45 AM
     
     Answered

    Bryan,

      Are both DCs global catalog servers? Can you please post the unedited ipconfig /all output here?

    From your post it is clear that the SRV records of both DCs exist in DNS, but clients are not able to locate the other DC when the first DC goes down.
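
    Since the hung DC still answers ping but not RPC, it may also be worth recording which services it still accepts connections on while it is stuck. The Python sketch below does a plain TCP connect test against the well-known ports for DNS, Kerberos, the RPC endpoint mapper, LDAP and SMB. The names port_open and probe_dc are made up for this example, and an open port only shows that something is listening, not that the service behind it is answering requests.

```python
import socket

# Well-known TCP ports used by domain controller services.
DC_PORTS = {
    "DNS": 53,
    "Kerberos": 88,
    "RPC endpoint mapper": 135,
    "LDAP": 389,
    "SMB": 445,
}


def port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds in time."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False


def probe_dc(host, ports=DC_PORTS, timeout=2.0):
    """Map each service name to whether its port accepted a connection."""
    return {name: port_open(host, port, timeout) for name, port in ports.items()}
```

    For example, run probe_dc("dc1.domainx.local") from a client (the host name is a placeholder) once against the healthy DC and once against the hung one, and compare the two results. If the hung DC still accepts LDAP or Kerberos connections without answering them, that would be one possible reason clients keep waiting on it instead of moving to the other DC.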

    You should go through the links below to understand this better.

    If one DC or DNS server goes down, why can't I log on to the other DC, or use the second DNS address to find another DC?

    See the article by Ace: http://msmvps.com/blogs/acefekay/archive/2009/11/29/dns-wins-netbios-amp-the-client-side-resolver-browser-service-disabling-netbios-direct-hosted-smb-directsmb-if-one-dc-is-down-does-a-client-logon-to-another-dc-and-dns-forwarders-algorithm.aspx

    Troubleshooting AD availability

    http://www.tech-faq.com/troubleshooting-active-directory-availability.html

    Refer to the DC locator process write-up by Jorge and try running the commands provided in the link to understand which DC and site the clients are using for authentication:

    http://blogs.dirteam.com/blogs/jorge/archive/2007/07/02/dc-locator-process-in-w2k-w2k3-r2-and-w2k8-part-1.aspx

    Also check DynamicSiteName under HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Netlogon\Parameters on a client system and verify that it contains the correct site name.

    Regards,

    _Prashant_



  • Monday, May 14, 2012 8:21 PM
     
     

    Prashant,

    Yes, both DCs have the global catalog. I will post ipconfig data as soon as I have access to the system. Thanks for the links!