Server 2019 S2D Nested Resiliency - Storage Pool Quorum

  • Question

  • On this page - https://docs.microsoft.com/en-us/windows-server/storage/storage-spaces/nested-resiliency - a scenario is detailed where a 2-node S2D cluster loses one entire node plus a disk on the second node. This results in 3 out of 8 of the cluster's disks remaining online. The S2D quorum documentation - https://docs.microsoft.com/en-us/windows-server/storage/storage-spaces/understand-quorum - still indicates that to maintain storage pool quorum a cluster must have 50% + 1 disks available (with the pool resource owner counting as a disk vote). This is definitely the case in Server 2016... was this changed in Server 2019 and not documented? How does Storage Pool Quorum work in S2D on Server 2019 such that it allows this nested resiliency failure scenario?

    Thank you.

    Thursday, November 15, 2018 10:35 PM

All replies

  • Hi,

    Thanks for posting in our forum!

    Please see my reply below:

    1. In Windows Server 2019, the cluster must still have 50% + 1 of its disks available to maintain storage pool quorum.

    2. How does storage pool quorum work in Windows Server 2019?

    When drives fail, or when some subset of drives loses contact with another subset, surviving drives need to verify that they constitute the majority of the pool to remain online. If they can’t verify that, they’ll go offline. The pool is the entity that goes offline or stays online based on whether it has enough disks for quorum (50% + 1). The pool resource owner (active cluster node) can be the +1.

    But pool quorum works differently from cluster quorum in the following ways:

    • the pool uses one node in the cluster (the pool resource owner) as a witness, acting as a tie-breaker so the pool can survive the loss of half of its drives
    • the pool does NOT have dynamic quorum
    • the pool does NOT implement its own version of removing a vote
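
    The rule described above can be sketched in a few lines of Python. This is only an illustration of the documented 50% + 1 rule as read in this thread, not how S2D implements it; the function name `pool_has_quorum` and its parameters are hypothetical.

```python
def pool_has_quorum(total_disks: int, online_disks: int, has_pool_owner: bool = True) -> bool:
    """Illustrative check of the 50% + 1 pool quorum rule (assumed reading of the docs)."""
    if not 0 <= online_disks <= total_disks:
        raise ValueError("online_disks must be between 0 and total_disks")
    # A strict majority of disks keeps the pool online on its own.
    if 2 * online_disks > total_disks:
        return True
    # With exactly half the disks, the pool resource owner acts as the tie-breaker.
    return 2 * online_disks == total_disks and has_pool_owner


# Scenario from the question: 8 disks in the pool, 3 still online.
print(pool_has_quorum(total_disks=8, online_disks=3))  # False
# Losing exactly half the disks is survivable thanks to the owner's vote.
print(pool_has_quorum(total_disks=8, online_disks=4))  # True
```

    Under this reading, the 3-of-8 scenario from the nested resiliency page would not keep pool quorum, which is the discrepancy the question asks about.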

    For more information, please refer to the link below. This official document applies to both Windows Server 2019 and Windows Server 2016.

    https://docs.microsoft.com/en-us/windows-server/storage/storage-spaces/understand-quorum

    Thanks for your time, if you have any question, please feel free to let me know.

    Best Regards,

    Daniel


    Please remember to mark the replies as answers if they help.
    If you have feedback for TechNet Subscriber Support, contact tnmff@microsoft.com.

    Friday, November 16, 2018 3:23 AM
    Moderator
  • Hi Daniel,

    Thanks for the response. That is my understanding of the Storage Pool Quorum documentation as well.

    Can I then conclude that the documentation on this page is inaccurate? https://docs.microsoft.com/en-us/windows-server/storage/storage-spaces/nested-resiliency

    It very clearly shows scenarios with S2D 2019 where more than half the disks have been lost and indicates that the storage would still be available in those scenarios - which very clearly contradicts the Storage Pool Quorum documentation as well as your statements above.

    Thanks,

    Dan

    Monday, November 19, 2018 12:22 AM
  • Hi Dan,

    Thanks for your reply!

    There is nothing wrong with the statement in this document. You may have misunderstood what it means.

    https://docs.microsoft.com/en-us/windows-server/storage/storage-spaces/nested-resiliency

    The official documentation shows that nested resiliency can only be used on a cluster with exactly two nodes. If there are more than two nodes, this situation is not possible.

    As I understand it, a cluster with two nodes needs a witness configured, so that when one server goes offline the cluster can still survive. In addition, RAID5 allows us to lose a disk, so although we have lost 5 of the 8 cluster disks, it can still survive.

    Hope this information can help you, if you have any other question, please feel free to let me know.

    Best Regards,

    Daniel



    Monday, November 19, 2018 3:30 AM
    Moderator
  • Hi Daniel,

    Thanks for your response. I still don't quite understand however. The image you provided above is indeed what I was referring to. However I am not sure what RAID5/single-parity have to do with anything -- Storage Pool Quorum doesn't have anything to do with the resiliency scheme chosen on any Virtual Disks as far as I understand. In that picture Storage Pool Quorum would very clearly be lost per the quorum documentation. I believe the quorum documentation also indicates that the witness (whether cloud or file share) does NOT play a role in Storage Pool Quorum - only Cluster Quorum. What am I missing?
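
    The distinction between the two quorums can be sketched with simple vote counting. This is an illustrative model only: it assumes a plain majority-of-votes rule for both, ignores dynamic quorum on the cluster side, and uses a hypothetical helper named `has_majority`.

```python
def has_majority(votes_present: int, votes_total: int) -> bool:
    """Return True when the present votes are a strict majority of all votes."""
    return 2 * votes_present > votes_total


# Cluster quorum: two nodes plus a witness give 3 votes.
# One node down leaves the surviving node and the witness: 2 of 3.
cluster_ok = has_majority(votes_present=2, votes_total=3)

# Pool quorum: 8 disks plus the pool resource owner give 9 votes.
# The witness is not counted here. 3 disks plus the owner: 4 of 9.
pool_ok = has_majority(votes_present=3 + 1, votes_total=8 + 1)

print(cluster_ok, pool_ok)  # True False
```

    In this model the witness keeps the cluster up after a node failure, but it contributes nothing to the pool's vote count.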

    -Dan

    Monday, November 19, 2018 3:41 PM
  • Hi Dan,

    Thanks for your reply!

    Please understand that this situation can only happen on a two-node cluster. With three or more nodes it will not happen. You may have misunderstood the meaning of the document, so please read it again when convenient.

    If you have any question, please let me know.

    Best Regards,

    Daniel



    Tuesday, November 20, 2018 7:16 AM
    Moderator
  • Hi Dan,

    This is Daniel and wish you all the best!

    I just want to confirm the current status of the problem. If you have any other questions, please feel free to let me know.

    Best Regards,

    Daniel



    Thursday, November 22, 2018 2:52 AM
    Moderator
  • Hi Dan,

    Wish you all the best!

    I just want to confirm the current status of the issue. If you have any questions, please feel free to let me know.

    Best Regards,

    Daniel



    Monday, November 26, 2018 3:56 AM
    Moderator
  • Hi Daniel,

    My question still hasn't really been answered. The Storage Pool Quorum page and the Nested Resiliency page very clearly contradict each other. The scenarios described on that Nested Resiliency page that it indicates are "survivable" would actually have a loss of Storage Pool Quorum per the quorum documentation.

    Thank you,

    Dan

    Friday, November 30, 2018 4:10 PM
  • Hi Dan,

    Thanks for your reply!

    Please understand that this can only happen in a clustered environment with two nodes. So there is no contradiction between the two official documents.

    Thanks for your time and have a nice day!

    Best Regards,

    Daniel



    Saturday, December 1, 2018 12:40 AM
    Moderator
  • Hi,
    This is Daniel and wish you all the best!
    Was your issue solved?
    If you have any question, please feel free to let me know.
    Best Regards,
    Daniel


    Monday, December 3, 2018 1:24 AM
    Moderator
  • Hi,

    Hope you are doing well!

    I haven't received your reply for a while. How is the problem now? If it is solved, would you please share your experience with us? If there is anything you need help with, please let me know.

    In addition, if anyone's reply was helpful, please mark it as an answer at your convenience so that other people who have a similar problem can refer to it. I appreciate your cooperation!

    Thanks for your time and have a nice day!


    Best Regards,

    Daniel



    Wednesday, December 5, 2018 1:39 AM
    Moderator