Does the Cluster Validation Tool need downtime?

    Question

  • Hi,

    Do we need downtime to run the Cluster Validation Tool? (Windows Server 2012 R2)

    8-node cluster - all nodes on Azure - SQL DB on local disk - no cluster disk - SQL 2014 DB always in sync on all nodes - quorum is a shared folder on one server (File Share Witness).

    I know that if a cluster disk is configured for storage, you either need downtime or have to skip the storage tests. But what about local disks that are not configured as cluster disks? I don't see them on the cluster disk tab of the Failover Cluster Manager console.

    Please help ...

    Monday, September 10, 2018 4:08 AM

All replies

  • Hi,

    Thanks for your question.

    When the “Validate a Configuration” wizard is launched it offers the choice to run all tests or a subset of tests.  With this granularity it is possible to select all the tests which do not impact the cluster and skip those which can impact high-availability for a group.  In fact almost all of the tests can be run while the cluster is online, without impacting anything running on the cluster, other than a slight performance hit due to processing the tests themselves. 


    Across these dozens of tests, only a few will impact running cluster workloads, and these are all within the storage category, so skipping this entire category is an easy way to avoid disruptive tests.  Listing All Disks and Potential Cluster Disks will not impact anything.  Validating Disk Access Latency, File System, Microsoft MPIO-based disks, and SCSI device Vital Product Data (VPD) can reduce a disk’s performance because the tests run against a disk that is in use; however, they will cause no downtime unless disk latency rises enough to trigger an alert.  Several tests will actually trigger failovers and move the disks and groups to different cluster nodes, which will cause downtime. These are Validating Disk Arbitration, Disk Failover, Multiple Arbitration, SCSI-3 Persistent Reservation, and Simultaneous Failover.  So if you want to test the majority of your cluster's functionality without impacting availability, exclude these tests.
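
    As a rough illustration of the grouping above, here is a minimal Python sketch that sorts the storage tests by impact and builds (but does not run) a Test-Cluster command line that skips the disruptive ones. The helper names and node names are hypothetical, and the test names are taken from this post; the exact names on your system may differ, so compare them with the output of Test-Cluster -List before relying on them.

```python
# Impact groups as described in the reply; test names are taken from the post
# and should be checked against the names your cluster actually reports.
NO_IMPACT = ["List All Disks", "List Potential Cluster Disks"]
PERFORMANCE_IMPACT = [
    "Validate Disk Access Latency",
    "Validate File System",
    "Validate Microsoft MPIO-based disks",
    "Validate SCSI device Vital Product Data (VPD)",
]
CAUSES_DOWNTIME = [
    "Validate Disk Arbitration",
    "Validate Disk Failover",
    "Validate Multiple Arbitration",
    "Validate SCSI-3 Persistent Reservation",
    "Validate Simultaneous Failover",
]


def tests_to_skip(avoid_performance_hit=False):
    """Return the storage tests to exclude (hypothetical helper)."""
    skip = list(CAUSES_DOWNTIME)
    if avoid_performance_hit:
        skip += PERFORMANCE_IMPACT
    return skip


def build_test_cluster_command(nodes, skip=None):
    """Build a Test-Cluster command string; nothing is executed here.

    With skip=None the whole Storage category is ignored, which is the
    simplest way to avoid every disruptive test.
    """
    if skip is None:
        ignore = "Storage"
    else:
        ignore = ",".join('"{}"'.format(name) for name in skip)
    return "Test-Cluster -Node {} -Ignore {}".format(",".join(nodes), ignore)


if __name__ == "__main__":
    # "node1" and "node2" are placeholder node names.
    print(build_test_cluster_command(["node1", "node2"]))
    print(build_test_cluster_command(["node1", "node2"], tests_to_skip()))
```

    Ignoring the whole Storage category is the simpler option; listing individual tests keeps the harmless inventory tests in the report.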

    Failover Clustering does come with a built-in safeguard to prevent accidental downtime when running the storage tests in Validate.  If the cluster has any online groups when Validate is initiated, and the storage tests remain selected, it will prompt the user to confirm whether they want to run all the tests (and cause downtime) or skip testing the disks of any online groups to avoid downtime.  If the entire storage category was excluded from testing, this prompt is not displayed.  Either way, the cluster can be validated with no downtime, but the validation is not complete because some tests have been skipped, and according to the Microsoft Support Policy, “the proposed solution must pass the full Validate test.”
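
    The safeguard described above can be summarized as a small decision table. The following Python sketch only models the behavior as this post describes it; the function and disk names are hypothetical and it does not call any cluster API.

```python
def prompts_for_confirmation(has_online_groups, storage_tests_selected):
    """Validate asks for confirmation only when both conditions hold."""
    return has_online_groups and storage_tests_selected


def disks_to_test(all_disks, online_group_disks, storage_tests_selected,
                  run_all_tests=False):
    """Return the disks the storage tests would touch in this model.

    run_all_tests=True stands for answering the prompt with
    "run all the tests", which accepts downtime for online groups.
    """
    if not storage_tests_selected:
        return []  # storage category excluded: no disk is tested
    if run_all_tests:
        return list(all_disks)  # includes disks of online groups
    # Default safe choice: skip disks that belong to online groups.
    return [d for d in all_disks if d not in online_group_disks]


if __name__ == "__main__":
    disks = ["Disk1", "Disk2", "Disk3"]   # placeholder disk names
    online = ["Disk1", "Disk2"]           # disks owned by online groups
    print(prompts_for_confirmation(True, True))
    print(disks_to_test(disks, online, storage_tests_selected=True))
```

    In this model, a spare disk that belongs to no online group is the only one tested when the safe option is chosen.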

    It is still possible to run the full set of tests without downtime.  Create a new cluster disk from the same storage array, expose it to all nodes, and run all tests against just that disk.  This gives you the benefit of validating that type of disk to ensure that it will work, without risking any downtime to production workloads.  To do this, run Validate and select all the tests, but keep any running services or applications online, so that only the disks that are not part of an online group are tested.

    For more details, see the following blog:

    https://blogs.msdn.microsoft.com/clustering/2011/06/28/validating-a-cluster-with-zero-downtime/

    For the Validation Wizard to test your clustered storage it must have storage to test and more importantly, storage that is in an offline state. For example, if you have a SQL cluster the odds are you will have to take your SQL Service offline to test any associated storage hardware. The average cluster generally has no free storage to test.

    https://blogs.technet.microsoft.com/askpfeplat/2016/10/17/windows-failover-cluster-storage-quick-test/

    Best regards,

    Michael


    Please remember to mark the replies as answers if they help.
    If you have feedback for TechNet Subscriber Support, contact tnmff@microsoft.com

    Tuesday, September 11, 2018 2:13 AM
    Moderator
  • Hi,

    Just checking in to see if the information provided was helpful. Please let us know if you would like further assistance.

    Best Regards,

    Michael



    Friday, September 14, 2018 10:09 AM
    Moderator
  • Hi,

    Was the above reply helpful? If so, please mark it as the answer; if not, feel free to follow up.

    Best Regards,

    Michael



    Monday, September 17, 2018 1:10 PM
    Moderator