Maintenance complete datacenter for some hours + split brain mode

  • Question

  • Hi,

    We have a planned maintenance this weekend for which the networking team will completely disconnect both datacenters for some hours.

    It's foreseen that Exchange will not be reachable in this timeframe, so we'll be in a split brain mode during this time.

    Can someone tell us what impact to expect, and whether Exchange will recover automatically once both datacenters are reachable again?
    If so, which actions should we take before and after the intervention?




    • Edited by Lyncer2013 Wednesday, May 29, 2019 11:22 AM
    Wednesday, May 29, 2019 11:21 AM

Answers

  • Hi,
    Thanks for the info.
    Both datacentres will be unreachable during the network intervention, so if I understand correctly, this is what we should do?

    * Migrate all DBs to one of the two datacentres.
    * Put all servers in maintenance mode in the datacentre that no longer hosts active mailbox DBs.
    * When the network is back, take the servers out of maintenance mode.
    * Redistribute the DBs.

    So if this is correct, I do not need to use the cmdlets Stop/Restore/Start-DatabaseAvailabilityGroup?

    You may need to read this: Part 7: Datacenter Activation Coordination: When to run start-databaseavailabilitygroup to bring members back into the DAG after a datacenter switchover…

    After a datacenter switchover has occurred, unless the original datacenter is gone or otherwise unrecoverable, eventually services in the primary datacenter will be restored.  When services are restored, including full network connectivity, database availability group (DAG) administrators can begin the switchback process by using the Start-DatabaseAvailabilityGroup cmdlet.
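
    As a rough sketch of the switchback step described above, and only relevant if members were previously stopped with Stop-DatabaseAvailabilityGroup during a switchover: the names DAG1 and PrimarySite are hypothetical placeholders, and the commands are meant for the Exchange Management Shell once full network connectivity is back.

    ```powershell
    # Hypothetical names: DAG1 is the DAG, PrimarySite is the Active Directory
    # site of the restored primary datacenter. Replace both with your own values.

    # Show which members the DAG currently considers started and stopped.
    Get-DatabaseAvailabilityGroup -Identity DAG1 -Status |
        Format-List Name,StartedMailboxServers,StoppedMailboxServers

    # Bring the members of the restored site back into the DAG.
    Start-DatabaseAvailabilityGroup -Identity DAG1 -ActiveDirectorySite PrimarySite

    # Run without other parameters so the DAG re-evaluates its quorum model.
    Set-DatabaseAvailabilityGroup -Identity DAG1
    ```

    If no switchover was performed and the servers were only placed in maintenance mode, these cmdlets are probably not needed.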

    Regards,

    Manu Meng


    Please remember to mark the replies as answers if they helped. If you have feedback for TechNet Subscriber Support, contact tnsf@microsoft.com.


    • Marked as answer by Lyncer2013 Monday, June 17, 2019 12:32 PM
    Tuesday, June 4, 2019 10:08 AM
    Moderator

All replies

  • First, enable DAC mode to prevent split brain by running the following command:

    Set-DatabaseAvailabilityGroup -Identity <DAG_Name> -DatacenterActivationMode DagOnly

    Then put the DAG members in maintenance mode in batches. To begin with, disconnect the nodes in the first datacenter and activate the database copies in the second datacenter. Once the power is restored, because the DAG is enabled for DAC mode, the nodes in the first datacenter that have quorum won't mount their databases. You should then manually activate the copies in the first datacenter and disconnect the nodes in the second datacenter. Finally, restore the power in the second datacenter, and the DAG works as before.

    For reference: Datacenter Activation Coordination mode
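
    A minimal sketch of how the DAC setting and the copy moves could be checked from the Exchange Management Shell. The names DAG1, DB01, MBX-A1 and MBX-B1 are hypothetical placeholders, not values from this thread.

    ```powershell
    # Hypothetical names: DAG1 (the DAG), DB01 (a database),
    # MBX-A1 (server in the first datacenter), MBX-B1 (server in the second).

    # Confirm that DAC mode is enabled; the expected value is DagOnly.
    Get-DatabaseAvailabilityGroup -Identity DAG1 |
        Format-List Name,DatacenterActivationMode

    # Check that the passive copies on the target server are healthy before moving.
    Get-MailboxDatabaseCopyStatus -Server MBX-B1 |
        Format-Table Name,Status,CopyQueueLength,ReplayQueueLength

    # Activate a single database on the server in the second datacenter.
    Move-ActiveMailboxDatabase -Identity DB01 -ActivateOnServer MBX-B1 -Confirm:$false

    # Or move every active copy off one server in a single command.
    Move-ActiveMailboxDatabase -Server MBX-A1 -ActivateOnServer MBX-B1 -Confirm:$false
    ```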

    Regards,

    Manu Meng



    Thursday, May 30, 2019 7:02 AM
    Moderator
  • Hi,
    Thanks for the info.
    Both datacentres will be unreachable during the network intervention, so if I understand correctly, this is what we should do?

    * Migrate all DBs to one of the two datacentres.
    * Put all servers in maintenance mode in the datacentre that no longer hosts active mailbox DBs.
    * When the network is back, take the servers out of maintenance mode.
    * Redistribute the DBs.

    So if this is correct, I do not need to use the cmdlets Stop/Restore/Start-DatabaseAvailabilityGroup?
    Monday, June 3, 2019 7:50 AM
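
    The four steps in the post above could look roughly like this in the Exchange Management Shell. This is a sketch that assumes Exchange 2013 or later and follows the commonly documented DAG maintenance sequence; the names DAG1, MBX-B1 and MBX-A1.contoso.com are hypothetical, and the block would be repeated for each server in the datacentre being taken out of service.

    ```powershell
    # Hypothetical names; replace with your own.
    $server = 'MBX-B1'              # server to put in maintenance mode
    $target = 'MBX-A1.contoso.com'  # FQDN of a server that stays in service

    # --- Before the intervention: enter maintenance mode ---
    Set-ServerComponentState -Identity $server -Component HubTransport -State Draining -Requester Maintenance
    Redirect-Message -Server $server -Target $target -Confirm:$false
    Suspend-ClusterNode -Name $server
    # Move active copies away and block automatic activation on this server.
    Set-MailboxServer -Identity $server -DatabaseCopyActivationDisabledAndMoveNow $true
    Set-MailboxServer -Identity $server -DatabaseCopyAutoActivationPolicy Blocked
    Set-ServerComponentState -Identity $server -Component ServerWideOffline -State Inactive -Requester Maintenance

    # --- After the network is back: leave maintenance mode ---
    Set-ServerComponentState -Identity $server -Component ServerWideOffline -State Active -Requester Maintenance
    Resume-ClusterNode -Name $server
    Set-MailboxServer -Identity $server -DatabaseCopyActivationDisabledAndMoveNow $false
    Set-MailboxServer -Identity $server -DatabaseCopyAutoActivationPolicy Unrestricted
    Set-ServerComponentState -Identity $server -Component HubTransport -State Active -Requester Maintenance

    # --- Redistribute the DBs according to activation preference ---
    Set-Location $exscripts
    .\RedistributeActiveDatabases.ps1 -DagName DAG1 -BalanceDbsByActivationPreference -Confirm:$false
    ```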
  • Just checking in to see if the above information was helpful. Please let us know if you would like further assistance.

    Regards, 

    Manu Meng



    Tuesday, June 11, 2019 10:11 AM
    Moderator
  • I brought down all servers within one datacentre to be sure.
    I did have some issues afterwards on the active site: some DBs were unable to mount, and 3 servers were evicted from the cluster. They rejoined automatically about 1 hour later.
    Monday, June 17, 2019 12:32 PM
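
    For symptoms like the ones in the last post (databases that will not mount, nodes evicted from the cluster), a few read-only checks may help after connectivity returns. This is a sketch with hypothetical names (MBX-A1, DB01), not a record of what was run in this thread; Get-ClusterNode comes from the FailoverClusters module on a DAG member.

    ```powershell
    # Hypothetical names: MBX-A1 is a DAG member, DB01 a database that did not mount.

    # Cluster membership: evicted or down nodes show up here.
    Get-ClusterNode | Format-Table Name,State

    # Replication and cluster health checks for one DAG member.
    Test-ReplicationHealth -Identity MBX-A1

    # Copy status and queue lengths for all copies on that server.
    Get-MailboxDatabaseCopyStatus -Server MBX-A1 |
        Format-Table Name,Status,CopyQueueLength,ReplayQueueLength

    # Mount a database manually once its server is healthy again.
    Mount-Database -Identity DB01
    ```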