Cluster quorum best practice for SQL AlwaysON

  • Question

  • Hello,

    I have a SQL Server Always On setup between on-premises and a SQL Server VM in Azure (IaaS). It is a 2-node cluster, and I have configured a file share witness for the quorum (on one of the Azure servers). A site-to-site VPN connects the two sites.

    Lately, our primary databases have been going down because of cluster instability. We noticed that when our network goes down, the secondary node and the file share witness become inaccessible and the cluster is affected. As a result, all the primary databases become inaccessible and the application suffers downtime.

    This setup was primarily built for DR purposes, and accordingly we placed the quorum witness on the Azure side.

    How can we avoid this situation? I assume it happens because the Azure node and the file share witness (also in Azure) become inaccessible at the same time whenever the network goes down.

    Are there any configuration changes I can make to keep the on-prem node stable irrespective of the status of the secondary node?

    What quorum method should I adopt for this kind of situation? Please help.
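
The failure described in this question can be sketched as simple vote arithmetic. This is a minimal illustration, not a model of the real cluster service: it assumes a static node-and-witness majority in which every member holds exactly one vote, and it ignores dynamic quorum adjustments. The names `has_quorum`, `votes_at`, `partition_outcome`, and the `witness_onprem` layout are hypothetical and exist only for this example.

```python
from typing import Dict

# Simplified static-majority model (assumption: one vote per member,
# dynamic quorum ignored). Keys are members, values are their site.


def has_quorum(reachable_votes: int, total_votes: int) -> bool:
    """A partition keeps quorum only with a strict majority of all votes."""
    return reachable_votes > total_votes // 2


def votes_at(site: str, members: Dict[str, str]) -> int:
    """Count the votes located at the given site."""
    return sum(1 for location in members.values() if location == site)


def partition_outcome(members: Dict[str, str]) -> Dict[str, bool]:
    """For a full split between sites, report which side keeps quorum."""
    total = len(members)
    sites = sorted(set(members.values()))
    return {site: has_quorum(votes_at(site, members), total) for site in sites}


# Layout from the question: witness sits on the Azure side.
current = {
    "onprem_node": "onprem",
    "azure_node": "azure",
    "file_share_witness": "azure",
}

# Hypothetical alternative layout, shown only to compare the arithmetic.
witness_onprem = {
    "onprem_node": "onprem",
    "azure_node": "azure",
    "file_share_witness": "onprem",
}

print(partition_outcome(current))         # {'azure': True, 'onprem': False}
print(partition_outcome(witness_onprem))  # {'azure': False, 'onprem': True}
```

Under these assumptions, a VPN outage leaves the on-prem node with 1 of 3 votes, which matches the observed loss of the primary. Moving the witness reverses the outcome, so in this model the Azure side could no longer gain quorum on its own if the on-prem site were lost.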

    Tuesday, September 3, 2019 7:24 AM

All replies

  • If you are looking for 'best practices' for configuring highly available SQL environments, you are best served by asking the SQL HA experts in their forum - https://social.technet.microsoft.com/Forums/en-US/home?forum=sqldisasterrecovery



    tim

    Tuesday, September 3, 2019 1:03 PM
  • Hi Tim, I had a thread opened there, I was asked to open a new one here :) 

    Looking for help.. 

    Tuesday, September 3, 2019 2:10 PM
  • Hi,

    I agree with Tim, this is more a question for the SQL HA forums.

    This might also provide some information:
    Recommendations and Best Practices When Deploying SQL Server AlwaysOn Availability Groups in Microsoft Azure (IaaS)

    Best regards,
    Leon


    Blog: https://thesystemcenterblog.com

    Tuesday, September 3, 2019 2:25 PM
  • Hi,

    Good day!

    >>I had a thread opened there, I was asked to open a new one here.

    Why did they ask you to post a new thread here? This is more related to SQL HA.

    Our forum focuses more on general Windows Server failover cluster questions; for best practices on SQL cluster quorum, the SQL experts will have more to say than we do.

    Thanks for your understanding!

    Best Regards,

    Daniel



    Wednesday, September 4, 2019 2:11 AM
  • If this is specific to SQL, it doesn't make sense to ask here.  This forum is for general clustering issues not related to any particular application.

    From a general standpoint, if I am reading your problem statement correctly, you are experiencing issues when your "network goes down".  Do you have just a single network connection to Azure?  If so, that is a problematic configuration.  You should never design a cluster that has a single point of failure.  If you have a dual path to Azure, and you lose both paths to cause this failure, then the cluster is operating as expected.  A cluster is not designed to ride through a dual failure.  If you need the ability to ride through a dual network failure, then you need to configure your network with at least three paths.
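
The path-count reasoning above can be written out as a small check. This is only a sketch of the arithmetic, assuming the paths are fully independent and that any single surviving path is enough to keep the sites connected; `tolerated_failures` and `site_reachable` are hypothetical helper names.

```python
def tolerated_failures(paths: int) -> int:
    """Number of simultaneous path failures that still leave one path up."""
    return max(paths - 1, 0)


def site_reachable(paths: int, failed: int) -> bool:
    """True if at least one independent path survives the failures."""
    return paths - failed >= 1


for paths in (1, 2, 3):
    print(paths, "path(s) tolerate", tolerated_failures(paths), "failure(s)")
```

With one path, a single failure isolates the site; with two paths, a dual failure does; three paths are the minimum that survives a dual failure under these assumptions.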


    tim

    Wednesday, September 4, 2019 1:20 PM