HyperV Service behavior during iSCSI storage failover

    Question

  • We have a 4-node 2008 R2 cluster using Cluster Shared Volumes for Hyper-V guests, connected to iSCSI storage that is also clustered.  The storage is a 2-node Sun 7410.  When I fail over the Sun storage, the guests behave differently each time: sometimes they pause until the storage comes back online, sometimes they migrate to a different 2008 R2 node, and sometimes they turn off and then boot.  This seems random, since the two guests I'm testing with do different things each time I fail between the Sun storage heads.

    What is the typical behavior of a clustered application when its connection to iSCSI storage is temporarily lost to all the nodes? 

    Is there a way to modify this behavior?

    The storage drops between 2 and 6 pings before it responds when failing between the heads.  It takes a minute or two before the iSCSI connection comes back up.

    Before the servers were in a cluster, the guests would just pause for 5 to 60 seconds until the iSCSI connections reconnected.


    Jon
    Monday, July 26, 2010 7:39 PM

Answers

  • Hi Jon

    You may like to check whether the settings of both storage nodes are identical. I think you should involve your storage vendor to check the settings. [This may not be the right forum to discuss this.]

    sorry for the generic answer


    Gaurav Anand Visit my Blog @ http://itinfras.blogspot.com/
    Saturday, July 31, 2010 4:32 AM
  • It turns out that MS clustering requires a storage identifier (reservation key) to be assigned to the disk.  When the connection is restored, the cluster uses this key to verify that it is the correct and only owner of the storage.  If the value does not match, it balks and the Hyper-V guests crash.  Below is a better explanation from MS Tech:

    "Cluster in 2008 uses a Persistent  Reserve In command for a keep alive. This will return the reservation key from the drive. Then cluster compares the key returned with the internal key that it last used when the reservation was set. If the keys do not compare, the “aborted error appears”.  This is because cluster sends a request into clusdisk and when the IRP completes with an error means something went wrong with reservations.  When the keys do not compare the request completes with that error.

    When the IP addresses are switched, all I/O is held up in MPIO, including the Reserve In commands. When the commands are resumed on the new array controller, no reservation key exists, so the cluster fails.  This is by design and cannot be changed."

    This feature was supported in early versions of the Sun 7000 storage but at some point was left out.  They have a fix for it that is not yet released, but hopefully it will soon be rolled into the latest builds.
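
    To make the mechanism concrete, here is a rough conceptual sketch of the comparison the quote describes. It is purely illustrative Python, not the actual clusdisk logic, and the key value and helper names are made up.

        # Conceptual sketch only, not real cluster/clusdisk code: illustrate the
        # keep-alive check described in the quote above. Key values are made up.

        def persistent_reserve_in(lun):
            """Stand-in for the SCSI PERSISTENT RESERVE IN command: return the
            reservation key the array currently holds for the LUN (None if none)."""
            return lun.get("reservation_key")

        def keep_alive_check(lun, expected_key):
            """Compare the key on the disk with the key the cluster last registered."""
            actual_key = persistent_reserve_in(lun)
            if actual_key != expected_key:
                # Mirrors the failover case: the new controller has no reservation
                # state, the keys do not compare, and the request is aborted.
                raise IOError("reservation key mismatch - request aborted")

        # After a head failover the new controller knows nothing about the reservation:
        lun_after_failover = {"reservation_key": None}
        try:
            keep_alive_check(lun_after_failover, expected_key=0x1A2B3C)
        except IOError as err:
            print(err)  # the cluster treats this as a reservation failure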

    Thanks for everyone’s help!

     

    • Marked as answer by Jon_AD_Admin Wednesday, February 02, 2011 7:35 PM
    Wednesday, February 02, 2011 7:33 PM

All replies

  • Most multi-head SANs have a helper application that uses the Multi-Path IO APIs to smooth this transition. It sounds like you are using the standard iSCSI initiator without any helper application.  Check with Sun and see what MPIO drivers they have.

    Clustering is a bit sensitive to storage failure since that is a key dependency for almost any cluster-aware application.


    Geoff N. Hiten Principal Consultant Microsoft SQL Server MVP

    Thursday, July 29, 2010 12:48 AM
  • Hi Jon

    What DSM is being used: msdsm.sys or a vendor-based DSM? And what policy is set for MPIO?

    Also, check these settings:

    PathVerifyEnabled: This flag enables path verification by MPIO on all paths every N seconds (where N depends on the value set in PathVerificationPeriod).

    PathVerificationPeriod: This setting is used to indicate the periodicity (in seconds) with which MPIO has been requested to perform path verification. This field is only honored if PathVerifyEnabled is TRUE.
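
    If you want to confirm what is currently in effect before changing anything, here is a minimal sketch that reads those values. It assumes the Microsoft DSM stores them as DWORDs under HKLM\SYSTEM\CurrentControlSet\Services\msdsm\Parameters on 2008 R2; please verify the exact key against the MPIO documentation for your build, as this is only an illustration.

        # Minimal sketch (not vendor guidance): read the current DSM timer values.
        # Assumption: the Microsoft DSM stores these as DWORDs under the key below
        # on 2008 R2; confirm the exact path for your build before relying on it.
        import winreg

        MSDSM_PARAMS = r"SYSTEM\CurrentControlSet\Services\msdsm\Parameters"
        NAMES = ("PathVerifyEnabled", "PathVerificationPeriod",
                 "RetryInterval", "RetryCount", "PDORemovePeriod")

        with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, MSDSM_PARAMS) as key:
            for name in NAMES:
                try:
                    value, _ = winreg.QueryValueEx(key, name)
                    print(f"{name} = {value}")
                except FileNotFoundError:
                    print(f"{name} not set (DSM default applies)")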

    I would suggest engaging your SAN vendor.

    Gaurav Anand Visit my Blog @ http://itinfras.blogspot.com/
    Thursday, July 29, 2010 5:05 AM
  • Geoff,

    On their Unified Storage 7000 series, they do not offer a Windows-based MPIO driver.  In their Fibre Channel doc they state, "ALUA multipathing is supported by native Windows 2008/R2 MPIO only".

    Thanks for the info.

    Jon

    Thursday, July 29, 2010 1:53 PM
  • So after the previous post from Geoff, I happened to stumble on the Sun FC config doc, which also uses MPIO.  It states that "the default MS DSM timer counters may be insufficient to ensure that I/O continues".  It also lists which settings are recommended to modify and their values.  They are as follows (a sketch of applying them is below):

    PathVerifyEnabled --> Enabled

    PathVerificationPeriod --> 5 sec

    RetryInterval --> 5 sec

    RetryCount --> 300

    PDORemovePeriod --> 1500 sec

    I also wonder if I need to tune the Advanced Policies basic and thorough health check time periods.  Any advice?  Since the virtual machine's dependency is on the iSCSI MPIO-enabled Cluster Shared Volume, will the health checks pause for the above settings?

    I'll try the DSM settings and see what happens.
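
    For reference, here is a minimal sketch of applying the values listed above. It assumes the Microsoft DSM reads them as DWORDs from HKLM\SYSTEM\CurrentControlSet\Services\msdsm\Parameters on 2008 R2; verify the key for your build and back it up first, since this is only an illustration and not Sun's or Microsoft's documented procedure.

        # Minimal sketch (not vendor guidance): apply the recommended DSM timer values.
        # Assumption: the Microsoft DSM reads these DWORDs from the key below on
        # 2008 R2; confirm the path, back up the key, and run this elevated.
        import winreg

        MSDSM_PARAMS = r"SYSTEM\CurrentControlSet\Services\msdsm\Parameters"
        RECOMMENDED = {
            "PathVerifyEnabled": 1,       # enable periodic path verification
            "PathVerificationPeriod": 5,  # seconds between path verifications
            "RetryInterval": 5,           # seconds between I/O retries
            "RetryCount": 300,            # retries before the I/O is failed
            "PDORemovePeriod": 1500,      # seconds a failed path object is retained
        }

        key = winreg.CreateKeyEx(winreg.HKEY_LOCAL_MACHINE, MSDSM_PARAMS, 0,
                                 winreg.KEY_SET_VALUE)
        try:
            for name, value in RECOMMENDED.items():
                winreg.SetValueEx(key, name, 0, winreg.REG_DWORD, value)
        finally:
            winreg.CloseKey(key)
        print("DSM timer values written; a reboot may be needed for them to take effect.")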

    Gaurav and Geoff, thanks for your help!

    Jon

    Thursday, July 29, 2010 2:10 PM
  • Gaurav,

    Will I have to reboot the HyperV host after making the DSM changes?

    Thanks!

    Jon


    Jon
    Thursday, July 29, 2010 2:57 PM
  • Hi Jon

    I am not sure, but I guess you need not if you change those DSM settings. You may like to read this:

    http://itinfras.blogspot.com/2010/05/what-is-mpio-and-best-practices-of-mpio.html

     


    Gaurav Anand Visit my Blog @ http://itinfras.blogspot.com/
    Thursday, July 29, 2010 3:44 PM
  • I rebooted the Hyper-V servers after making the DSM policy changes.  The virtual guest servers stay up and running when I fail the Sun storage from head A to B, which takes approximately 30 to 35 seconds.  When I fail from B to A, which takes approximately 45 to 50 seconds for the target to respond to ping requests, the test servers blue screen.  Any ideas?

    Jon


    Jon
    Friday, July 30, 2010 4:20 PM
  • Hi Jon

    You may like to check whether the settings of both storage nodes are identical. I think you should involve your storage vendor to check the settings. [This may not be the right forum to discuss this.]

    sorry for the generic answer


    Gaurav Anand Visit my Blog @ http://itinfras.blogspot.com/
    Saturday, July 31, 2010 4:32 AM
  • Sorry it took so long to respond.  Sun has already looked into it and everything is good on their side.  I've opened a case with MS.  So far, there has not been any progress.  They did have me try this hotfix, but it did not resolve the issue: http://support.microsoft.com/kb/981379/en-us.

    If anyone has any ideas, let me know.  Hopefully the MS support group can resolve this.  I'll post the resolution when it's up and running as it should. 

    Thanks!


    Jon
    Friday, August 06, 2010 5:30 PM
  • It turns out that MS clustering requires a storage identifier (reservation key) to be assigned to the disk.  When the connection is restored, the cluster uses this key to verify that it is the correct and only owner of the storage.  If the value does not match, it balks and the Hyper-V guests crash.  Below is a better explanation from MS Tech:

    "Cluster in 2008 uses a Persistent  Reserve In command for a keep alive. This will return the reservation key from the drive. Then cluster compares the key returned with the internal key that it last used when the reservation was set. If the keys do not compare, the “aborted error appears”.  This is because cluster sends a request into clusdisk and when the IRP completes with an error means something went wrong with reservations.  When the keys do not compare the request completes with that error.

    When the IP addresses are switched, all I/O is held up in MPIO, including the Reserve In commands. When the commands are resumed on the new array controller, no reservation key exists, so the cluster fails.  This is by design and cannot be changed."

    This feature was supported in early versions of the Sun 7000 storage but at some point was left out.  They have a fix for it that is not yet released, but hopefully it will soon be rolled into the latest builds.

    Thanks for everyone’s help!

     

    • Marked as answer by Jon_AD_Admin Wednesday, February 02, 2011 7:35 PM
    Wednesday, February 02, 2011 7:33 PM