Loss of VMs when server is restarted but its roles had already been drained to another server

    Question

  • Hi,

    We have a failover cluster of two Windows Server 2012 Datacenter Hyper-V hosts that are NOT managed by VMM. All management is done via Failover Cluster Manager. They use Cluster Shared Volumes (CSVs) for their storage.

    When the patching admin wants to patch the servers, he drains roles on the first host in Failover Cluster Manager and the VMs are live migrated to the second host. The VMs are running and accessible at this point. However, when he patches the first server and restarts it, the VMs on the second server go down.

    My understanding was that this shouldn't happen, since the storage is on CSVs, is accessible to both hosts, and the live VM workload is running on the second server. But it seems the VMs lose access to storage and things go awry: you can no longer ping the VMs on host 2, and Failover Cluster Manager on host 2 freezes until the first host finishes rebooting. Once the first host comes back up, the VMs go live again.

    I've had a look at Failover Cluster Manager and can't see anything amiss. I did note that the disk's Owner Node in the Disks section is the first host. Do we need to manually change the Owner Node to the second host, which is carrying the live VM load, before we take down the first? I would have thought that control of the storage would be seamless, since with Cluster Shared Volumes both hosts have access to the disk at all times.

    Appreciating any assistance!

    Wednesday, May 15, 2019 11:55 PM

All replies

  • Hi,

    Are there any errors in Event Viewer on host 1?

    Did you patch the host by installing the update package manually?

    Please check whether Dynamic Quorum is enabled.
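
    For reference, a minimal sketch of one way to check this from PowerShell on a cluster node. It assumes the FailoverClusters module that ships with the Failover Clustering feature is available; adjust as needed for your environment.

```powershell
# Load the failover clustering cmdlets (assumed to be installed on the node).
Import-Module FailoverClusters

# DynamicQuorum is 1 when dynamic quorum is enabled, 0 when it is disabled.
(Get-Cluster).DynamicQuorum

# List the nodes with their state and configured quorum vote.
Get-ClusterNode | Format-Table Name, State, NodeWeight -AutoSize
```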

    >>you can no longer ping them VMs on host 2 and failover cluster manager on host 2 freezes until the 1st host finishes rebooting.


    Please check CPU and memory usage in Task Manager when it freezes.

    >>I've had a look at Failover Cluster Manager and cant see anything amiss. I did note that the disk's Owner Node in the disks section is the 1st host.

    In general, disk ownership moves automatically to another node when the owner node goes down.

    So try changing the owner node manually to check whether it succeeds.
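
    A minimal sketch of doing this from PowerShell instead of the GUI. The disk name "Cluster Disk 1" and the node name "HOST2" are placeholders, not values from this thread.

```powershell
Import-Module FailoverClusters

# List each CSV with its current owner node and state.
Get-ClusterSharedVolume | Format-Table Name, OwnerNode, State -AutoSize

# Move ownership of one CSV to the second host.
# "Cluster Disk 1" and "HOST2" are hypothetical names; substitute your own.
Move-ClusterSharedVolume -Name "Cluster Disk 1" -Node "HOST2"

# Confirm whether the owner actually changed.
Get-ClusterSharedVolume -Name "Cluster Disk 1" | Format-Table Name, OwnerNode, State -AutoSize
```

    If the move fails, the error text returned by the cmdlet may be more informative than the GUI.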

    In addition, CAU (Cluster-Aware Updating) is a good tool for patching hosts.

    https://docs.microsoft.com/en-us/windows-server/failover-clustering/cluster-aware-updating

    Best Regards,

    Frank


    Please remember to mark the replies as answers if they help.
    If you have feedback for TechNet Subscriber Support, contact tnmff@microsoft.com



    Thursday, May 16, 2019 2:32 AM
    Moderator
  • Hi,

    I haven't been able to find any relevant errors in the Windows event logs (System/Application/FailoverCluster-Manager).

    Patches are being pushed out via SCCM.

    I checked dynamic quorum and it returns: Dynamic Quorum : 1

    I tried to manually move ownership of the disks from the first node to the second node, and nothing happens. Node 1 remains the owner and no errors are reported.



    • Edited by sdewar83 Friday, May 17, 2019 12:14 AM
    Friday, May 17, 2019 12:12 AM
  • Hi,

    It is a strange situation.

    Are there any error events related to the issue in the following location?

    I suggest running cluster validation to check the cluster's health.

    Note:

    A few of the tests will impact running cluster workloads, and these are all in the Storage category.

     Cluster Validation Tool need downtime

    https://social.technet.microsoft.com/Forums/en-US/e546ffaf-94af-40fd-8daa-dfc91fb467ed/cluster-validation-tool-need-downtime?forum=WinServerPreview
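
    A minimal sketch of running validation from PowerShell while skipping the Storage category. This avoids the disruptive tests, at the cost of not validating storage at all, so a full run in a maintenance window may still be worthwhile.

```powershell
Import-Module FailoverClusters

# Validate the local cluster, skipping the Storage category
# (the tests that can disrupt running clustered workloads).
Test-Cluster -Ignore "Storage"

# Full validation, including Storage; run only during a planned outage.
# Test-Cluster
```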


    Best Regards,

    Frank



    Friday, May 17, 2019 7:58 AM
    Moderator
  • Hi,

    Just checking in on the status of your issue.

    Best Regards,
    Frank


    Monday, May 20, 2019 8:58 AM
    Moderator
  • Hi Frank,

    Failover Clustering-CsvFs doesn't show up as an option in Event Viewer. I only have FailoverClustering, FailoverCluster-Manager, and FailoverClustering-WMIProvider. Possibly because this cluster is only running Server 2012, not 2012 R2?

    No events show at all under Cluster Events in Failover Cluster Manager.

    Given that a full cluster validation will require an outage, I've scheduled one and will see what comes out of the validation report.

    Thanks!

    Tuesday, May 21, 2019 1:33 AM
  • Hi,

    Thanks for your reply. I appreciate your patience and support.

    Best Regards,

    Frank



    Tuesday, May 21, 2019 7:01 AM
    Moderator