Storage Spaces Direct (S2D) in a 3-node Switchless Topology

  • Question

  • I was running a two-node S2D cluster with crossover connections on 10GbE cards (ConnectX-3 with RDMA); I have now extended it to three nodes, with three dual-port cards in a ring topology. I did this to avoid needing a DCB/PFC switch in a homelab environment.

    With dual-port cards, each server connects directly to each of the others (1<->2, 1<->3, 2<->3) over three separate subnets. This means each subnet involves only two of the hosts at a time.
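    Concretely, the per-link addressing on one node looks like this (a sketch of my setup; the interface aliases are placeholders for whatever the ports are actually named):

        # On hv1: one point-to-point /24 per port (aliases are placeholders)
        New-NetIPAddress -InterfaceAlias "SLOT 2 Port 1" -IPAddress 10.0.215.1 -PrefixLength 24  # direct link to hv2
        New-NetIPAddress -InterfaceAlias "SLOT 2 Port 2" -IPAddress 10.0.216.1 -PrefixLength 24  # direct link to hv3
        # hv1 has no interface on 10.0.217.0/24; that subnet is the hv2<->hv3 link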

    Everything seems great with both Windows clustering and S2D, except for one little warning:

    > Get-StorageSubSystem *Cluster* | Debug-StorageSubSystem

    Severity: Minor

    Reason         : The server 'hv1' has missing network adapter(s) connected to cluster network 'Cluster Network 217'.
    Recommendation : Connect the server to the missing cluster network.
    Location       : Server 'hv1', LENOVO, ThinkServer TS140, XXX, Network Adapter connected to 'Cluster Network 217'
    Description    : Network adapter connected to cluster network 'Cluster Network 217'

    Reason         : The server 'hv2' has missing network adapter(s) connected to cluster network 'Cluster Network 216'.
    Recommendation : Connect the server to the missing cluster network.
    Location       : Server 'hv2', LENOVO, ThinkServer TS140, XXX, Network Adapter connected to 'Cluster Network 216'
    Description    : Network adapter connected to cluster network 'Cluster Network 216'

    Reason         : The server 'hv3' has missing network adapter(s) connected to cluster network 'Cluster Network 215'.
    Recommendation : Connect the server to the missing cluster network.
    Location       : Server 'hv3', LENOVO, ThinkServer TS140, XXX, Network Adapter connected to 'Cluster Network 215'
    Description    : Network adapter connected to cluster network 'Cluster Network 215'

    It looks like these Minor-severity warnings have no impact on the setup; they just flag that the network configuration is not identical on every cluster node.

    For a homelab environment, this shouldn't be impactful, correct? Safe to ignore, I hope?

    Tuesday, May 16, 2017 3:34 AM

Answers

  • We never designed nor tested any such configuration... so we can't provide any guarantees as to its functionality.
    Sunday, May 21, 2017 3:03 PM

All replies

  • Hi rwojo,

    >This involves dual port cards, each server connected to each other (1<->2, 1<->3, 2<->3) over three subnets. Clearly this means each subnet only involves two of the hosts at a time.

    I'm not sure whether the error is related to this. Could you please test using a switch for the cluster networks, then run the check again and compare the results?

    Best Regards,

    Mary



    Tuesday, May 16, 2017 7:05 AM
  • The minor-severity item does go away if I use a switch instead, with all servers on one subnet.

    My current setup, though, looks like this, with SFP Twinax cables in a ring topology:

    Cluster Network 215 - 10.0.215.0/24, contains hv1 (10.0.215.1) & hv2 (10.0.215.2) -- note hv3 is missing

    Cluster Network 216 - 10.0.216.0/24, contains hv1 (10.0.216.1) & hv3 (10.0.216.3) -- note hv2 is missing

    Cluster Network 217 - 10.0.217.0/24, contains hv2 (10.0.217.2) & hv3 (10.0.217.3) -- note hv1 is missing

    Deciphering one of the messages, for example the one for subnet 10.0.217.0/24 (Cluster Network 217): the cmdlet is saying that one server, hv1, is missing from that subnet. This makes sense; that network is the point-to-point connection between hv2 and hv3, so hv1 has no presence on it.

    The same applies to each subnet: 215 is missing hv3 and 216 is missing hv2. The cmdlet warnings match this exactly, each flagging the one server that is not part of that subnet.
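    You can see the same thing from any node by grouping the cluster network interfaces by network (a quick sketch; requires the FailoverClusters module):

        # Each cluster network should list exactly two member nodes in this ring topology
        Get-ClusterNetworkInterface |
            Sort-Object -Property Network |
            Format-Table -Property Network, Node, Adapter, Address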

    Tuesday, May 16, 2017 11:42 AM
  • Hi,

    Could you please check the configuration guidance for Windows Failover Cluster networks?

    https://blogs.technet.microsoft.com/askcore/2014/02/19/configuring-windows-failover-cluster-networks/

    For now, I couldn't find this exact error documented by Microsoft. Based on your description, in my personal view, it may be related to your network configuration.

    You might also raise this issue in the clustering forum for more suggestions:

    https://social.technet.microsoft.com/Forums/windowsserver/en-US/home?forum=winserverClustering

    Best Regards,

    Mary



    Thursday, May 18, 2017 8:45 AM
  • It is related to the non-uniform subnet/network layout, of course. My question is directed at the S2D subsystem and whether that is an issue or not. The cluster itself is fine.
    Thursday, May 18, 2017 10:59 AM
  • Hi rwojo,

    Maybe you could also refer to this thread about a similar error:

    https://social.technet.microsoft.com/Forums/windowsserver/en-US/4e417747-a119-4cad-a013-5401335f42bc/windows-2016-cluster-repoted-incorrect-information-in-validate-system-health-section?forum=winserverClustering

    Best Regards,

    Mary



    Friday, May 19, 2017 8:56 AM
  • That is similar, but not what I'm running into.

    In fact, I expect that with 3 nodes, each node can reach each of the other nodes over only one of the three networks. That is inherent to a ring topology.

    More specifically, the question is: is this okay for the operation of S2D, given that the storage subsystem is raising a warning? It functions correctly; I just want to make sure it won't cause problems down the road.
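    For what it's worth, this is how I'm checking in the meantime that the storage traffic actually rides the direct links (a sketch; adapter names and output will vary):

        # RDMA should show as enabled on the ConnectX-3 ports
        Get-NetAdapterRdma | Format-Table Name, Enabled

        # S2D's east/west traffic rides on SMB; the live connections should show
        # client/server IPs on the three point-to-point subnets
        Get-SmbMultichannelConnection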

    Friday, May 19, 2017 3:09 PM
  • We never designed nor tested any such configuration... so we can't provide any guarantees as to its functionality.
    Sunday, May 21, 2017 3:03 PM
  • Fair enough. FWIW, for others who see this: it seems to run well in my lab environment.
    Monday, May 22, 2017 1:22 AM
  • Hey rwojo, 

    As far as I know, solutions such as HPE VSA or StarWind Virtual SAN can be configured in a 3-node switchless cluster. StarWind can use a grid architecture, which gives you a higher fault-tolerance rate.

    https://www.starwindsoftware.com/starwind-virtual-san-hyper-converged-3-nodes-scenario-with-hyper-v-cluster


    Cheers,

    Alex Bykovskyi

    StarWind Software


    Note: Posts are provided “AS IS” without warranty of any kind, either expressed or implied, including but not limited to the implied warranties of merchantability and/or fitness for a particular purpose.

    Wednesday, May 24, 2017 5:07 PM
  • I was using a 3-node setup with direct-connect cables and I never had any network-related errors like you mentioned. I went to a switch configuration when Mellanox ran a promotion of $5k off their Spectrum switches. I was using the Mellanox 100Gb cards (directly connected): 1->2, 1->3, 2->3.
    Wednesday, October 11, 2017 1:49 PM
  • I think all 3 nodes should be on the same IP network, so that each node can talk to the other 2 nodes over that one network. Example: Node 1 at 10.0.0.1/24, Node 2 at 10.0.0.2/24, and Node 3 at 10.0.0.3/24 for east/west traffic.
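    A minimal sketch of that layout, assuming a switch and a storage adapter alias of "Storage" on each node (the alias is a placeholder):

        # Run once per node, substituting that node's own address
        New-NetIPAddress -InterfaceAlias "Storage" -IPAddress 10.0.0.1 -PrefixLength 24  # Node 1
        # Node 2 uses 10.0.0.2 and Node 3 uses 10.0.0.3; all three then share one cluster network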
    Monday, November 4, 2019 8:05 PM