Failover Cluster - heartbeat on a separate VLAN?

    Question

  • Hi everyone,

    I'm writing documentation on creating failover clusters in Windows Server 2012 for our company.
    I'm unsure whether to use a separate VLAN for heartbeat communication.
    We're using HP ProLiant c-Class BladeSystems connected via pass-through to Cisco 2k switches, and from there to Cisco Nexus 5k. Each side (that is, each of the two onboard NICs of the blades) is fully redundant up to the Nexus 5k (or even a step further, I'm not sure...), and we're using network teaming (Network Fault Tolerance only, active/passive) on the blades.
    Since every VLAN I configure on the blades will take exactly the same logical path, I wonder why I should still use a separate heartbeat VLAN at all. (Yes, a separate heartbeat VLAN is meant to reduce single points of failure, but since every VLAN uses the same logical path, I don't think it gives me any more redundancy. Or am I wrong?)

    My question:
    Is there still a reason to use separate, unrouted VLANs for heartbeat communication? For example, does the heartbeat carry unencrypted information that someone could use to harm the system or network? (Should heartbeat communication be protected against spoofing and sniffing?)

    Thanks in advance and best regards,

    Ville

    Thursday, February 14, 2013 2:47 PM

Answers

  • The recommendation was really to use a separate *physical* LAN for heartbeats to reduce SPoF.  VLANs really will not do so. I guess one could argue that a separate VLAN prevents administrative errors, but that is thin.  Once Failover Clustering across routed networks became supported, the argument for a separate heartbeat network became even weaker, as few companies would invest in separate network connectivity for heartbeats between two physical sites.  In a typical private cloud deployment such as yours, with aggregated links through redundant switching fabrics, the point of separating heartbeats is almost non-existent.

    You may still be concerned about issues such as a packet storm on your data VLAN interrupting heartbeats (which are UDP), or, again, administrative error.  You can modify the cluster settings to change the heartbeat frequency and threshold if you are concerned about missed heartbeats due to congestion, so maybe this isn't a big risk for you.  Your own decision to use a separate VLAN will be driven primarily by an analysis of your infrastructure and your own risk models, but personally I'd opt for lower administrative complexity unless I had a specific requirement otherwise.
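
    To make the frequency/threshold trade-off concrete, here is a minimal Python sketch of the arithmetic. It is only a model, not how the cluster service is implemented. The two values correspond to the cluster common properties SameSubnetDelay and SameSubnetThreshold (CrossSubnetDelay and CrossSubnetThreshold for routed networks). The figures of 1000 ms and 5 missed heartbeats are assumed defaults for Windows Server 2012, so check your own cluster (for example with Get-Cluster in PowerShell) before relying on them. The function names are hypothetical.

```python
from typing import List, Optional

# Illustrative model only; not the cluster service's actual implementation.


def detection_time_ms(delay_ms: int, threshold: int) -> int:
    """Approximate time before a silent peer is declared unreachable."""
    return delay_ms * threshold


def first_failure_index(received: List[bool], threshold: int) -> Optional[int]:
    """Return the index of the heartbeat at which `threshold` consecutive
    misses is first reached, or None if that never happens."""
    misses = 0
    for i, ok in enumerate(received):
        # A received heartbeat resets the counter; a miss increments it.
        misses = 0 if ok else misses + 1
        if misses >= threshold:
            return i
    return None


if __name__ == "__main__":
    # Assumed defaults: 1000 ms between heartbeats, 5 misses tolerated.
    print(detection_time_ms(1000, 5))
    # A short congestion burst: three heartbeats lost in a row, then recovery.
    burst = [True, False, False, False, True, False, False]
    print(first_failure_index(burst, 5))  # tolerant threshold
    print(first_failure_index(burst, 3))  # strict threshold
```

    With the tolerant threshold the three-heartbeat burst is ridden out, while the strict one would declare the peer unreachable. Raising the threshold or the delay buys tolerance for congestion at the cost of slower failure detection.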

    Heartbeat communications are essentially UDP pings.  The information isn't sensitive.  There may be a risk of spoofing or ARP poisoning being used to force missed heartbeats, but these risks can also be mitigated.  It may be something to consider in your risk model; however, it is probably not a significant risk for most datacenters.
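
    As a rough illustration of what a "UDP ping" amounts to, the Python sketch below sends a sequence number over loopback and echoes it back. The 4-byte payload and the function names are hypothetical; the real cluster heartbeat has its own message format (it is commonly documented as using UDP port 3343). The point is only that such a datagram carries nothing sensitive, and that plain UDP by itself does not prove who sent it.

```python
import socket


def encode(seq: int) -> bytes:
    """Hypothetical payload: just a 4-byte big-endian sequence number."""
    return seq.to_bytes(4, "big")


def decode(payload: bytes) -> int:
    return int.from_bytes(payload, "big")


def loopback_ping(seq: int, timeout_s: float = 1.0) -> int:
    """Send one heartbeat from node A to node B on loopback and return
    the sequence number that B echoes back to A."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as a, \
         socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as b:
        a.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
        b.bind(("127.0.0.1", 0))
        a.settimeout(timeout_s)
        b.settimeout(timeout_s)

        a.sendto(encode(seq), b.getsockname())
        # B only sees a source address and a payload; nothing here
        # authenticates the sender.
        data, sender = b.recvfrom(16)
        b.sendto(data, sender)

        reply, _ = a.recvfrom(16)
        return decode(reply)


if __name__ == "__main__":
    print(loopback_ping(7))
```

    If the reply does not arrive within the timeout, recvfrom raises socket.timeout, which is the "missed heartbeat" case a monitor would count.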

    • Marked as answer by POCIT Friday, February 15, 2013 7:24 AM
    Thursday, February 14, 2013 5:15 PM

All replies

  • Hi Neil,

    Thank you! That's about everything I wanted to know :)

    Best regards,

    Ville

    Friday, February 15, 2013 7:24 AM