While the PDC is down, users hit authentication issues and other login problems

    Question

  • we currently have 3 AD sites

    azure - 2x 2012r2 dc
    datacenter A - 3x 2016 dc
    datacenter b - 3x 2016 dc

    all with ad integrated dns.

    The PDC (the DC holding all the FSMO roles) sits in datacenter A.

    We had an issue where a backup caused that domain controller to hang, so we had to shut it down.

    The server was down for 30+ minutes, and during that time:

    no one could log in, Exchange was down, Citrix servers couldn't authenticate, etc.

    We are not sure what is causing this, but we were able to recreate the issue later on by restarting that same PDC.

    The DNS setup for the NICs on the DC is as follows:
    nic dns entry 1 - another dc
    nic dns entry 2 - its own local ip
    nic dns entry 3 - 127.0.0.1

    We have an F5 load balancing DNS across all the DCs. Every DHCP scope hands out that single F5 VIP as its DNS server option, and all servers with hardcoded DNS entries point to the same F5 VIP.

    We did notice that if you rebooted a server or a PC, you could log back in.
    But we have over 350 servers and 3k PCs, and I doubt I should have to reboot all of them.

    We have subnets defined in AD for most if not all VLANs on our network (definitely the server VLANs, which are in the same AD site as the PDC), and each subnet is associated with an AD site.
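
    As a rough aid for checking the site setup described above: clients find domain controllers through DNS SRV records, looking in their own AD site first and then domain-wide, and the PDC emulator registers a record of its own. The sketch below only builds those record names so they can be looked up by hand (for example with nslookup -type=SRV) against the F5 VIP and against an individual DC. The domain name and site name are made-up placeholders, not values from this environment.

```python
from typing import List, Optional


def dc_locator_srv_names(domain: str, site: Optional[str] = None) -> List[str]:
    """Return the LDAP SRV record names for a domain, site-specific name first."""
    domain = domain.strip(".").lower()
    names = []
    if site:
        # Site-specific record: only DCs covering this AD site register here.
        names.append(f"_ldap._tcp.{site}._sites.dc._msdcs.{domain}")
    # Domain-wide record: every DC in the domain registers here.
    names.append(f"_ldap._tcp.dc._msdcs.{domain}")
    return names


def pdc_srv_name(domain: str) -> str:
    """Return the SRV record name registered only by the PDC emulator."""
    return f"_ldap._tcp.pdc._msdcs.{domain.strip('.').lower()}"


if __name__ == "__main__":
    # "corp.example.com" and "DatacenterA" are hypothetical placeholders.
    for name in dc_locator_srv_names("corp.example.com", "DatacenterA"):
        print(name)
    print(pdc_srv_name("corp.example.com"))
```

    If the site-specific name returns several DCs while the PDC is off, clients in that site should still have somewhere to authenticate.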

    From what I've read, some people say this is expected behavior and some say it should not work like that. I am leaning towards the latter, as I can't imagine having to restart everything every time the PDC goes down.

    any help would be appreciated.

    Tuesday, February 14, 2017 4:45 AM

All replies

  • Get rid of the DNS load balancer. It's not needed and only introduces complexity to something that should be really simple.

    Your DNS settings don't make sense. Entries 2 & 3 are doing the same thing.

    Your DNS settings in Datacenter A for DCs should be
    1st - The other domain controller in datacenter A
    2nd - A domain controller in remote datacenter
    3rd - Local 127.0.0.1

    All other devices in datacenter A should point to the domain controllers in datacenter A for primary/secondary and to a remote location for the 3rd DNS entry (you can have more than 2).

    Repeat for your other 2 locations and you'll have a DNS infrastructure that can withstand multiple server reboots without any clients ever noticing.
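
    To make the ordering above concrete, here is a minimal sketch that works out the DNS server list for a DC and for an ordinary client from a site-to-DC map. The site names and IP addresses are invented for illustration, and picking "the first remote DC in the map" is just one simple assumption; in practice you would choose the remote site deliberately.

```python
from typing import Dict, List

# Hypothetical inventory: AD site -> DC IP addresses (not real values).
SITE_DCS: Dict[str, List[str]] = {
    "datacenter-a": ["10.1.0.11", "10.1.0.12", "10.1.0.13"],
    "datacenter-b": ["10.2.0.11", "10.2.0.12", "10.2.0.13"],
    "azure": ["10.3.0.11", "10.3.0.12"],
}


def remote_dcs(local_site: str, site_dcs: Dict[str, List[str]]) -> List[str]:
    """All DC addresses that live outside the given site, in map order."""
    return [ip for site, ips in site_dcs.items() if site != local_site for ip in ips]


def dc_dns_order(dc_ip: str, local_site: str, site_dcs: Dict[str, List[str]]) -> List[str]:
    """DNS entries for a DC: a same-site peer, a remote DC, then loopback."""
    peers = [ip for ip in site_dcs[local_site] if ip != dc_ip]
    order = peers[:1] + remote_dcs(local_site, site_dcs)[:1]
    order.append("127.0.0.1")  # the DC itself, listed last
    return order


def client_dns_order(local_site: str, site_dcs: Dict[str, List[str]]) -> List[str]:
    """DNS entries for other devices: two local DCs, then one remote DC."""
    return site_dcs[local_site][:2] + remote_dcs(local_site, site_dcs)[:1]


if __name__ == "__main__":
    print(dc_dns_order("10.1.0.11", "datacenter-a", SITE_DCS))
    print(client_dns_order("datacenter-a", SITE_DCS))
```

    With lists like these, a device still has a working resolver when any single DC, or even a whole site, is offline.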

    MJ

    • Proposed as answer by Mike Jenne Tuesday, February 14, 2017 5:28 AM
    Tuesday, February 14, 2017 4:58 AM
  • I will try what you're saying on a test PC and bypass the load-balanced DNS.

    Do you agree that the DC holding the PDC FSMO role going down should not cause authentication issues for users and servers?

    Tuesday, February 14, 2017 4:27 PM
  • Hi,

    I am checking how the issue is going. If you still have any questions, please feel free to contact us.

    If the replies above are helpful, we would appreciate it if you marked them as answers. If you resolved the issue with your own solution, please share your experience and solution here; it will be a great help to others who have the same question.

    Thank you for your feedback.

    Best regards,

    Wendy


    Please remember to mark the replies as answers if they help.
    If you have feedback for TechNet Subscriber Support, contact tnmff@microsoft.com

    Friday, February 17, 2017 1:37 AM
    Moderator
  • The only impact for users during such a short outage of the DC with the PDC Emulator role is on password changes. When users change their password, the change cannot be forwarded to the PDC Emulator. But the change will synchronize when the DC restarts.

    But bad password attempts are also not forwarded to the PDC Emulator, so the total count of bad password attempts across all DCs can get messed up. In my experience, this can actually cause accounts to get locked out sooner than normal, apparently because several attempts are made to forward to the PDC Emulator and the count increments too many times on the DC that authenticates the user. But users should have no problems if they authenticate with their correct password. And I would expect one or a few bad password attempts to not be a problem, unless your policy is to lock out after just a few attempts (fewer than 6).
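
    A toy model may help picture this. It is a deliberately simplified sketch, not how AD is implemented: each DC keeps its own copy of passwords, a password change is also pushed to the PDC Emulator when it is reachable, and a DC that sees a mismatch re-checks against the PDC Emulator before failing the logon. All class, method and DC names are invented for illustration, and lockout counting is left out entirely.

```python
from typing import Dict, List, Tuple


class ToyDomain:
    """Simplified multi-DC password store with an optional PDC re-check."""

    def __init__(self, dcs: List[str], pdc: str) -> None:
        # dc name -> user -> (version, password)
        self.store: Dict[str, Dict[str, Tuple[int, str]]] = {dc: {} for dc in dcs}
        self.pdc = pdc
        self.pdc_online = True

    def set_password(self, dc: str, user: str, password: str) -> None:
        """Write the change on one DC and push it to the PDC if reachable."""
        version = max(s.get(user, (0, ""))[0] for s in self.store.values()) + 1
        self.store[dc][user] = (version, password)
        if self.pdc_online:
            self.store[self.pdc][user] = (version, password)

    def logon(self, dc: str, user: str, password: str) -> bool:
        """Check locally; on mismatch, retry at the PDC if it is reachable."""
        if self.store[dc].get(user, (0, None))[1] == password:
            return True
        if self.pdc_online and dc != self.pdc:
            return self.store[self.pdc].get(user, (0, None))[1] == password
        return False

    def replicate(self) -> None:
        """Normal replication: every DC ends up with the newest version."""
        users = {u for s in self.store.values() for u in s}
        for user in users:
            newest = max(s[user] for s in self.store.values() if user in s)
            for s in self.store.values():
                s[user] = newest


if __name__ == "__main__":
    d = ToyDomain(["dc-a1", "dc-a2", "dc-b1"], pdc="dc-a1")
    d.set_password("dc-a2", "alice", "old")
    d.replicate()
    d.pdc_online = False
    print(d.logon("dc-b1", "alice", "old"))  # unchanged password still works
```

    In this model an unchanged, correct password is accepted by every DC while the PDC Emulator is off; only a password changed during the outage is rejected on DCs that have not yet replicated it.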


    Richard Mueller - MVP Enterprise Mobility (Identity and Access)

    Friday, February 17, 2017 2:08 AM