Event ID 1069: The error code was '0x139f' ('The group or resource is not in the correct state to perform the requested operation.').

  • Question

  • We have a 2-node Windows Server 2019 cluster with S2D. Live migration works perfectly fine when it is done gracefully.

    Cluster validation shows no problems

    When we power off a server for testing, the migration gets to 59%, hangs for a bit, then stops, and the following is logged:

    Event ID 1069: Cluster resource 'Virtual Machine Configuration TEST-VM' of type 'Virtual Machine Configuration' in clustered role 'TEST-VM' failed. The error code was '0x139f' ('The group or resource is not in the correct state to perform the requested operation.'). The migration times out, then TEST-VM restarts on the other node.
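
    As a quick cross-check, the hex code in the event can be converted to its decimal Win32 error number. This is only a minimal Python sketch: the one-entry lookup table is illustrative, and the symbolic name comes from the Windows system error code list rather than from the event itself.

```python
# Convert the hex error code from Event ID 1069 into its decimal Win32 form.
# KNOWN_CODES is an illustrative one-entry table, not a complete list.
KNOWN_CODES = {5023: "ERROR_INVALID_STATE"}


def describe_win32_code(hex_code: str) -> str:
    """Return the hex code, its decimal value and a symbolic name if known."""
    value = int(hex_code, 16)  # int() accepts the "0x" prefix when base is 16
    name = KNOWN_CODES.get(value, "unknown")
    return f"{hex_code} = {value} ({name})"


print(describe_win32_code("0x139f"))
```

    This only confirms that the code is the generic "not in the correct state" error; it does not say which resource caused it.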

    Further testing shows that this only happens when TEST-VM has 8 or 10 CPUs and 32 GB of RAM.
    If we use only 1 CPU and 1024 MB of RAM, the failover works as it should when we test the power-off.

    The hosts are Dell PowerEdge R640s with 2 x 8-core CPUs and 128 GB RAM each.

    We are stuck here, as 1069 is a generic error meaning that something isn't available. We think the virtual machine configuration file cannot be found or moved. The only other events relate to the networking going down, but as we turned a node off this is to be expected.

    Tuesday, April 9, 2019 11:02 AM

All replies

  • Hi,

    Please give more information about your setup.

    Is this a 2-node cluster for Storage Spaces, or a 2-node cluster for Hyper-V virtual machines?

    Or a 2-node guest cluster?

    What was powered off for testing: the VM or the Hyper-V server node?

    If I understand correctly, you are not able to migrate the VM with the higher configuration. Please check for a resource crunch on the destination node. Memory overcommitment is not supported in Hyper-V, i.e. if a host has 128 GB of RAM, it can accommodate VMs only if the sum of their maximum memory is less than 128 GB.

    Regards,
    Bala N

    Tuesday, April 9, 2019 11:33 AM
  • Hi Bala,

    It is a 2-node cluster for Hyper-V that also runs S2D. There is only one VM on the cluster at present, and both server nodes are identical R640s with 128 GB RAM and 2 x 8-core CPUs.

    The Hyper-V server node was powered off to simulate failure.

    The VM migrates fine if we shut down or restart from SConfig on either of the Hyper-V server nodes.

    We can live migrate between the Hyper-V server nodes fine as well.

    So to summarise:

    If we power off a node abruptly, the VM fails over if it has 1 CPU and 1 GB of RAM.

    If we do the same with 8 or 10 CPUs and 32 GB of RAM, it fails with the errors mentioned.

    Thanks


    Tuesday, April 9, 2019 3:31 PM
  • hi,

    I strongly suspect a resource crunch, as I said earlier. To isolate it, keep the CPU count as it is, reduce only the RAM to a lower limit, and try again.

    I also suggest going through the logs on both nodes at the time of the failure:

    Eventvwr --> Applications and Services Logs --> Microsoft --> Windows --> Hyper-V-VMMS and Hyper-V-High-Availability; there are a few more Hyper-V related event logs as well.

    This should definitely give some idea about the issue being faced.

    Regards,
    Bala N

    Tuesday, April 9, 2019 3:40 PM
  • Hi BenzylAlkonium

    I agree with Balasubramanian Naathan, please reduce the RAM of the VM and retry the live migration.


    Furthermore, if you’re moving clustered virtual machines, the Cluster Events node in Failover Cluster Manager usually collects all the relevant events. If they’ve been reset or expired from that display, you can still use Event Viewer at these paths:

    Applications and Service Logs\Microsoft\Windows\Hyper-V-High-Availability\Admin

    Applications and Service Logs\Microsoft\Windows\Hyper-V-VMMS\Admin

    The “Hyper-V-High-Availability” tree usually has the better messages, although it has a few nearly useless ones, such as event ID 21111, “Live migration of ‘Virtual Machine VMName’ failed.” Most Live Migration errors come with one of three statements:

    • Migration operation for VMName failed
    • Live Migration did not succeed at the source
    • Live Migration did not succeed at the destination

    These will usually, but not always, be accompanied by supporting text that further describes the problem. “Source” messages often mean that the problem is so bad and obvious that Hyper-V can’t even attempt to move the virtual machine. These usually have the most helpful accompanying text. “Destination” messages usually mean that either there is a configuration mismatch that prevents migration or the problem did not surface until the migration was either underway or nearly completed. You might find that these have no additional information or that what is given is not very helpful. In that case, specifically check for permissions issues and that the destination host isn’t having problems accessing the virtual machine’s storage.

    Furthermore, here are some blog posts that discuss troubleshooting Hyper-V VM Live Migration in detail; I hope they help.

    https://www.altaro.com/hyper-v/troubleshooting-hyper-v-live-migration/

    https://searchservervirtualization.techtarget.com/tip/Troubleshoot-common-VM-live-migration-errors

    Please Note: Since the web site is not hosted by Microsoft, the link may change without notice. Microsoft does not guarantee the accuracy of this information.

    Highly appreciate your effort and time. If you have any question or concern, please feel free to let me know.

    Best regards,

    Michael


    Please remember to mark the replies as an answers if they help.
    If you have feedback for TechNet Subscriber Support, contact tnmff@microsoft.com

    Wednesday, April 10, 2019 7:46 AM
    Moderator
  • Hi Bala,

    If it is a resource crunch, why does the VM migrate fine when we swap nodes while everything is up, and why does it swap nodes fine when we restart a node in the proper way?

    In the Hyper-V VMMS logs there are these errors:

    Several Event ID 1106 entries:
    vm\service\fr\frutilities.cpp(11496)\vmms.exe!00007FF7AB66D32A: (caller: 00007FF7AB808231) Exception(9) tid(b28) 80070002 The system cannot find the file specified
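
    For reference, the 80070002 value in that entry is an HRESULT, and it can be split into its fields to see where the error comes from. The Python sketch below assumes the standard HRESULT bit layout (severity in the top bit, an 11-bit facility, a 16-bit code); the function name is made up for illustration.

```python
def decode_hresult(hresult: int) -> dict:
    """Split a 32-bit HRESULT into its failure flag, facility and code fields."""
    return {
        "failure": bool((hresult >> 31) & 0x1),  # top bit set means failure
        "facility": (hresult >> 16) & 0x7FF,     # 11-bit facility field
        "code": hresult & 0xFFFF,                # low 16 bits: the error code
    }


print(decode_hresult(0x80070002))
```

    Facility 7 is the Win32 facility and code 2 is the Win32 "file not found" error, which matches the text in the log entry. It does not identify which file was being looked up.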

    Thanks




    Wednesday, April 10, 2019 8:43 AM
  • Hi Michael,

    Update to testing

    We added a second test VM with 1 GB RAM and 1 CPU.

    We dropped the first VM to 8 GB of RAM and 8 CPUs.

    Live migration between nodes works fine with no issues (as it did with 32 GB of RAM and 10 CPUs).

    We powered off a server node. The first VM (8 GB RAM, 8 CPUs) failed over and kept running, losing only one ping. The second VM (1 GB RAM, 1 CPU) sat in the Queued state until the timeout and then restarted on the remaining server node. We then removed the second VM and repeated the test with only the original VM (8 GB RAM, 8 CPUs). It live migrates fine between nodes, but when we power off the server node it does not fail over; it times out and restarts on the node that is still online.

    We have just dropped the VM from 8 GB RAM and 8 CPUs to 1 GB RAM and 8 CPUs, and the failover completes fine and the VM stays up.

    All the articles we have read, including the ones you linked, seem to be aimed at cases where no migration works at all, rather than only the power-off scenario that simulates a total host failure.

    What really makes no sense is that live migration works fine when both nodes are up, or when we gracefully shut down or restart a server node.

    Thanks for looking at this as we are really struggling to find a solution.

     


    Wednesday, April 10, 2019 10:49 AM
  • I am confused by the description of your problem. When you say that you are 'powering off' a node, how are you doing that? Are you requesting a graceful power off, or are you asking for an immediate shutdown (to simulate a power failure)? If the latter, you should not expect any sort of migration from the host that was shut off. The VM should simply start on the second node. There is no way for a VM to migrate, because that requires interaction between the nodes, and if there is no power on the one node, it can't cooperate.

    However, if you are initiating a controlled power down on a node after a VM migration was initiated, then it makes sense that a small VM might complete its migration while a larger one would not.  Larger VMs take more time to migrate than do small VMs.  Powering down is not going to wait for the completion of the migration.  It would be expected that an in process migration of a larger VM would fail and then the VM would be restarted on a surviving node of the cluster.
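
    To put rough numbers on "larger VMs take more time", here is a minimal Python sketch of a lower bound on the time needed to copy a VM's memory once. The 10 Gbit/s link speed is a hypothetical figure (the thread does not state the network speed), and the estimate ignores re-copying of changed memory pages, compression and protocol overhead.

```python
def estimate_copy_seconds(vm_memory_gb: float, link_gbit_per_s: float) -> float:
    """Lower-bound time in seconds to copy a VM's memory once over the link."""
    return vm_memory_gb * 8 / link_gbit_per_s  # gigabytes -> gigabits


ASSUMED_LINK_GBIT_PER_S = 10.0  # hypothetical; not taken from the thread

for memory_gb in (1, 8, 32):
    seconds = estimate_copy_seconds(memory_gb, ASSUMED_LINK_GBIT_PER_S)
    print(f"{memory_gb} GB -> at least {seconds:.1f} s")
```

    Under that assumption a 32 GB VM needs well over twenty seconds for a single pass while a 1 GB VM needs about one, so a node that powers down shortly after a migration starts would interrupt the larger VM first.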


    tim

    Wednesday, April 10, 2019 1:05 PM
  • hi,

    When you live migrate, only this VM migrates, and I guess this would be your last VM to migrate when initiating live migration. By then there may not be a resource crunch on the destination. But when the node goes down abruptly, all the VMs try to move to the secondary node, which may cause a resource crunch on the destination node.

    I would also suggest one more thing: try increasing the priority of this VM in the failover cluster and setting the other VMs to low or medium. In that case, there is a chance this VM will migrate successfully while one or a few other VMs fail with the same error.

    If that is the case, then it is clear that there is a resource crunch.

    You said it was smooth when the resources were at reduced limits, so I suggested that to pin down the exact issue.

    But if you want to do a manual calculation, consider the following.

    A single node has 128 GB, as you said in your first post. So what is the total amount of RAM for all the VMs in the cluster? The sum of the maximum RAM allocated to all the VMs in the cluster should be less than 128 GB to accommodate all the VMs during failover. Memory overcommit is not available in Hyper-V, so even if the VMs are allocated dynamic RAM, the maximum memory allocated should be less than the physical memory available in the Hyper-V server.
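
    The manual calculation above can be sketched in a few lines of Python. The function and parameter names are made up for illustration, and the optional host reserve is an assumption (the host operating system needs some memory too) rather than a figure from this thread.

```python
def fits_on_surviving_node(vm_max_memory_gb, node_memory_gb, host_reserve_gb=0.0):
    """Return True if the summed maximum VM memory fits on a single node."""
    required = sum(vm_max_memory_gb)
    available = node_memory_gb - host_reserve_gb  # host_reserve_gb is an assumption
    return required < available


# Figures from the thread: one VM with 32 GB on nodes with 128 GB each.
print(fits_on_surviving_node([32], 128))

# A made-up heavier load for comparison.
print(fits_on_surviving_node([64, 64, 8], 128))
```

    With the figures given so far (a single 32 GB VM on 128 GB nodes) the check passes, so this calculation only matters once the total across all VMs approaches the memory of one node.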

    Regards,
    Bala N

    Wednesday, April 10, 2019 1:52 PM
  • Hi Bala,

    We are only testing the cluster at present, so there is only the one VM, and failover only works if it has 1 GB of RAM. When I mentioned 2 VMs, this was because we added another test VM, which gave an even weirder result: the 8 GB RAM VM failed over but the 1 GB one didn't.

    So to sum it up, the most RAM we have ever allocated is 32 GB on one single VM.

    Thanks


    Wednesday, April 10, 2019 2:07 PM
  • Hi,

    Could you please check for further events in the Hyper-V VMMS log on both server nodes (source and destination)?

    You may see 1106, 21111 or 1840.

    Please share those logs so that it is easier to find the cause.

    Regards,
    Bala N

    Wednesday, April 10, 2019 2:50 PM
  • Hi Bala,

    Yes, we are getting 1106 in the operational log; I mentioned it in a previous post.

    We get the same 1106 errors even if the live migration completes:

    Error,10/04/2019 15:11:38,Microsoft-Windows-Hyper-V-VMMS,1106,None,"vm\service\ethernet\vmethernetswitchutilities.cpp(124)\vmms.exe!00007FF73129CD27: (caller: 00007FF7313D70D9) ReturnHr(7) tid(ff0) 80070002 The system cannot find the file specified.
        Msg:[vm\service\ethernet\vmethernetswitchutilities.cpp(82)\vmms.exe!00007FF7312E58CF: (caller: 00007FF7311C3DDE) Exception(16) tid(ff0) 80070002 The system cannot find the file specified.

    "
    Error,10/04/2019 15:11:38,Microsoft-Windows-Hyper-V-VMMS,1106,None,"vm\service\ethernet\vmethernetswitchutilities.cpp(82)\vmms.exe!00007FF7312E58CF: (caller: 00007FF7311C3DDE) Exception(16) tid(ff0) 80070002 The system cannot find the file specified.
    "
    Error,10/04/2019 15:11:36,Microsoft-Windows-Hyper-V-VMMS,1106,None,"vm\service\fr\frctutilities.cpp(949)\vmms.exe!00007FF73141C35E: (caller: 00007FF7315EAC55) Exception(13) tid(ff0) 80070002 The system cannot find the file specified.
    "
    Error,10/04/2019 15:11:36,Microsoft-Windows-Hyper-V-VMMS,1106,None,"vm\service\fr\frutilities.cpp(11422)\vmms.exe!00007FF73146E1AE: (caller: 00007FF7316082D9) LogHr(5) tid(ff0) 80070002 The system cannot find the file specified.
        Msg:[vm\service\fr\frutilities.cpp(11388)\vmms.exe!00007FF73146DF45: (caller: 00007FF7316082D9) Exception(12) tid(ff0) 80070002 The system cannot find the file specified.

    "
    Error,10/04/2019 15:11:36,Microsoft-Windows-Hyper-V-VMMS,1106,None,"vm\service\fr\frutilities.cpp(11388)\vmms.exe!00007FF73146DF45: (caller: 00007FF7316082D9) Exception(12) tid(ff0) 80070002 The system cannot find the file specified.
    "
    Error,10/04/2019 15:11:36,Microsoft-Windows-Hyper-V-VMMS,1106,None,"vm\service\fr\frutilities.cpp(11526)\vmms.exe!00007FF73146D4F2: (caller: 00007FF731608231) LogHr(4) tid(ff0) 80070002 The system cannot find the file specified.
        Msg:[vm\service\fr\frutilities.cpp(11496)\vmms.exe!00007FF73146D32A: (caller: 00007FF731608231) Exception(11) tid(ff0) 80070002 The system cannot find the file specified.

    "
    Error,10/04/2019 15:11:36,Microsoft-Windows-Hyper-V-VMMS,1106,None,"vm\service\fr\frutilities.cpp(11496)\vmms.exe!00007FF73146D32A: (caller: 00007FF731608231) Exception(11) tid(ff0) 80070002 The system cannot find the file specified.
    "

    Thanks
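
    As a side note, an export like the one above can be summarised by source file to see which parts of vmms.exe log the 1106 entries. This is only a Python sketch: the regular expression assumes each entry's message starts with a quoted `vm\...\file.cpp(line)` location, as in this export, and the sample contains just two lines copied from it.

```python
import re
from collections import Counter

# Matches the quoted "vm\...\file.cpp(line)" location that opens each message.
# Nested "Msg:[...]" locations are not preceded by a quote, so they are skipped.
LOCATION = re.compile(r'"(vm\\[^"(]+\.cpp)\((\d+)\)')

SAMPLE = r'''Error,10/04/2019 15:11:38,Microsoft-Windows-Hyper-V-VMMS,1106,None,"vm\service\ethernet\vmethernetswitchutilities.cpp(124)\vmms.exe!00007FF73129CD27: (caller: 00007FF7313D70D9) ReturnHr(7) tid(ff0) 80070002 The system cannot find the file specified.
Error,10/04/2019 15:11:36,Microsoft-Windows-Hyper-V-VMMS,1106,None,"vm\service\fr\frutilities.cpp(11496)\vmms.exe!00007FF73146D32A: (caller: 00007FF731608231) Exception(11) tid(ff0) 80070002 The system cannot find the file specified.
'''


def count_sources(log_text: str) -> Counter:
    """Count log entries per source file, using the first location in each entry."""
    return Counter(match.group(1) for match in LOCATION.finditer(log_text))


for source, count in count_sources(SAMPLE).items():
    print(count, source)
```

    Grouping the entries this way separates the virtual switch lookups (`ethernet`) from the `fr` ones, which may help when comparing a failed failover with a successful live migration.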

    Wednesday, April 10, 2019 3:04 PM
  • Hi,

    Is your S2D clustered, or does it depend on only one node?

    Does the path exist on both nodes?

    Get-VMHostCluster -ClusterName <yourclustername>

    Check that the path is available on both cluster nodes.

    I would also ask you to check the S2D configuration and the disk configuration further. Does the secondary node have enough space, and is it able to access the path if the primary node fails?

    Regards,
    Bala N

    Wednesday, April 10, 2019 5:18 PM
  • Hi,

    Is your S2D clustered? or depending on only one node? - Yes, S2D is clustered.

    Does the path exist in both the nodes?  

    Get-VMHostCluster -ClusterName <yourclustername> - This shows nothing under SharedStoragePath; no idea why not.

    ClusterName     SharedStoragePath
    -----------     -----------------
    NAME

    check the path is available in both the cluster nodes

    Also request you to check more on the S2D configuration and the disks configuration. Does the secondary node have enough space or the node can able to access the path in case of the primary node's failure? - There is more than enough space on both.



    Thursday, April 11, 2019 7:38 AM
  • Hi

    Where are your VHDs placed, and what is the location of the config file of the VM? Something like C:\cluster\storage.....

    Is that location available on the secondary node?

    Are the virtual switches configured with the same name on both hosts?

    Regards,
    Bala N

    Thursday, April 11, 2019 10:35 AM
  • They are in C:\ClusterStorage\VM-NAME\Hyper-V, which is the same on both nodes.

    Yes, all virtual switches have the same names.

    Thursday, April 11, 2019 3:26 PM
  • Hi,

    Thanks for your update.

    This is a quick note to let you know that I am currently performing research on this issue and will get back to you as soon as possible. I appreciate your patience.

    If you have any updates during this process, please feel free to let me know.

    Best regards,

    Michael




    Monday, April 15, 2019 9:44 AM
    Moderator
  • Hi Michael,

    I was on holiday last week and have just come back to the office today so I will spend some time on it today.

    We have seen this in Windows Admin Center, though:

    Memory usage for the cluster shows 100%. (I took a screenshot but cannot post it here until I am verified, and I am not sure how to verify myself.)

    When we look at each host node individually, the node not hosting the VM is only using a few GB, and the node hosting the guest is using a few GB plus the 32 GB for the virtual machine.

    Very odd



    Tuesday, April 23, 2019 7:46 AM
  • Hi,

    Sorry for my delay. How are things going on?

    So far I don't have detailed information about your current situation. I'd like to confirm that the primary issue is that Live Migration failed and that one of the host nodes shows high memory usage. Is that right?

    Would you mind running the following command with Administrator rights and showing me the result?

    tasklist /v

    This lets us check all the task information on this node. Please refer to this article for the details:

    https://www.digitalcitizen.life/how-print-list-running-processes-windows-7-windows-8


    Besides, you can use the following link to verify your TechNet account. After that you will be able to upload things like screenshots. For your security, please remember to cover up any sensitive or private information.

    https://social.msdn.microsoft.com/Forums/en-US/090972cb-b81f-498f-b718-948caca975c4/verify-account-41?forum=reportabug

    Highly appreciate your effort and time. If you have any question or concern, please feel free to let me know.

    Best regards,

    Michael



    Monday, April 29, 2019 10:43 AM
    Moderator