Live Migration of VMs failing after moving to Server 2016 Hyper-V Hosts

    Question

  • Hi everyone, I am hoping that someone can help me out here.

    I have a peculiar issue after moving all my 2012 R2 Hyper-V based Virtual Machines from some really old Dell rack server hardware to a brand new Cisco UCS.  On the existing 2012 R2 cluster, each host had a virtual switch called "VM-TEAM" and each VM was connected to this external switch.  I then built three new Server 2016 hosts on the Cisco UCS and added them into the existing 2012 R2 Hyper-V cluster (thank you Microsoft for enabling mixed mode!!).

    In order to be able to live migrate these VMs across to the new Server 2016 servers, I had to name the external VM switch on the Cisco UCS "VM-TEAM", which wasn't the new naming convention I wanted, but I thought I could rename it at a later stage.  Once the VMs were live migrated across to the new Server 2016 servers, I evicted the 2012 R2 servers from the cluster and upgraded the cluster functional level, etc.  I then renamed the Hyper-V switch to my new standard, as the Cisco UCS blades now have two external switches called VM-SWITCH-A and VM-SWITCH-B.  All the even-numbered VMs go on VM-SWITCH-A and all the odd-numbered VMs go on VM-SWITCH-B, so that there is some load balancing between the two Fabric Interconnects.
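    For reference, the PowerShell equivalent of that switch setup would be roughly the following (a sketch only; I did the rename via Hyper-V Manager, and the Fabric A adapter name "vNIC-FabricA" is a placeholder):

        # List the external switches currently defined on this host.
        Get-VMSwitch | Select-Object Name, SwitchType, NetAdapterInterfaceDescription

        # Rename the switch carried over from the old cluster to the new standard.
        Rename-VMSwitch -Name "VM-TEAM" -NewName "VM-SWITCH-B"

        # Second external switch, bound to the Fabric A vNIC (adapter name is hypothetical).
        New-VMSwitch -Name "VM-SWITCH-A" -NetAdapterName "vNIC-FabricA" -AllowManagementOS $false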

    After renaming the external switch from VM-TEAM to VM-SWITCH-B, I am unable to live migrate VMs between the Server 2016 hosts, as the migration still seems to be looking for the VM-TEAM switch even though the VM is connected to VM-SWITCH-B.  This is appearing in the event log:

    Live migration of 'Virtual Machine VMNAME' failed.

    Virtual machine migration operation for 'VMNAME' failed at migration destination 'HOSTNAME'. (Virtual machine ID 97606A03-1EF0-40B1-9D03-XXXXXXXX)

    The virtual machine 'VMNAME' is not compatible with physical computer 'HOSTNAME'. (Virtual machine ID 97606A03-1EF0-40B1-9D03-XXXXXXXX)

    Could not find Ethernet switch 'VM-TEAM'.

    This is the VM Switch that the VM is connected to:

    How can I resolve this issue without renaming the Hyper-V switch back to VM-TEAM?  I would have thought that because the VM now sees the virtual switch name as VM-SWITCH-B, it would be looking for that name on the other hosts?
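    For reference, this is roughly how I can check which switch each adapter reports from a host, and what a forced reconnect would look like (a sketch using the Hyper-V PowerShell module; the VM name is a placeholder):

        # Show which switch each VM network adapter reports on this host.
        Get-VM | Get-VMNetworkAdapter | Select-Object VMName, SwitchName, Status

        # Forcing a reconnect should rewrite the switch reference in the VM configuration.
        Connect-VMNetworkAdapter -VMName "VMNAME" -SwitchName "VM-SWITCH-B"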

    Any help is greatly appreciated!


    Robert Milner | Website: http://www.remilner.co.uk | Twitter: @robm82



    • Edited by robm82 Thursday, March 30, 2017 2:38 PM
    Thursday, March 30, 2017 2:36 PM

All replies

  • Hi Sir,

    I'd suggest you run "cluster validation" first for the 2016 cluster (excluding the storage validation).
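    For example, something along these lines (a sketch; node names are placeholders, and the storage tests are skipped so running VMs are not disturbed):

        # Validate the 2016 cluster nodes, skipping the storage tests.
        Test-Cluster -Node "HOST1","HOST2","HOST3" -Ignore "Storage"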

    If you have any further information, please feel free to let us know.

    (I tried changing the switch name in my 2012 R2 cluster environment, and live migration works fine.)

    Best Regards,

    Elton


    Please remember to mark the replies as answers if they help.
    If you have feedback for TechNet Subscriber Support, contact tnmff@microsoft.com.

    Friday, March 31, 2017 6:22 AM
    Moderator
  • Hi Elton_Ji,

    I have run the cluster validation wizard as you suggested, and there are some warnings which I have reviewed; they are due to some VMs being powered off and a couple of patches missing from one of the hosts.  If you have any further ideas on how I can get this fixed, it would be greatly appreciated.

    Many thanks.


    Robert Milner | Website: http://www.remilner.co.uk | Twitter: @robm82

    Friday, March 31, 2017 7:38 AM
  • Stay with VM-TEAM instead of using two virtual switches.  By defining two virtual switches and manually assigning them to the VMs, you are trying to do what UCS does normally.  It is Cisco's recommendation to create a single vNIC that is enabled for failover to both fabrics and then create your virtual switch on this, instead of creating two.  You will find a document on Cisco's web site defining recommended practices for configuring Hyper-V on Cisco UCS: http://www.cisco.com/c/dam/en/us/solutions/collateral/borderless-networks/advanced-services/Whitepaper_C11-730725.pdf  I am the author of this document.


    tim

    Friday, March 31, 2017 2:08 PM
  • Hi Tim,

    Thanks for the reply - I agree with that approach; in past deployments I have only issued a single vNIC for VM traffic, and I did have failover enabled within the Cisco UCS.  However, after a bit more reading on the Internet, I decided to have two vNICs on the Cisco UCS for VM traffic this time - one on Fabric A with failover enabled and one on Fabric B with failover enabled.  So if a fabric does fail, the vNICs will fail over.  The idea behind deploying it this way was that I could load balance the VM traffic between the two FIs a bit better, rather than all traffic going through a single FI and down a single 10Gb link.

    I wouldn't have thought this would cause a problem from either a Cisco UCS standpoint or a Hyper-V standpoint.  As previously stated, the error only appears on some of the VMs.  There are only about 5 or 6 VMs that are refusing to live migrate, and they give the error stated above.  To me it feels like the Virtual Machine configuration just hasn't refreshed...!
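    If it is just a stale copy of the configuration held by the cluster, something like this might be worth trying (a sketch using the FailoverClusters module; the VM name is a placeholder):

        # Ask the cluster to re-read the VM's settings (including its switch
        # connection) after changes made in Hyper-V Manager.
        Update-ClusterVirtualMachineConfiguration -Name "VMNAME"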


    Robert Milner | Website: http://www.remilner.co.uk | Twitter: @robm82

    Friday, March 31, 2017 3:47 PM
  • No, if you are going to use physical NIC teaming, DO NOT use UCS fabric failover.  That is NOT the way Cisco recommends it.  The FIs are already in active/active mode, so your efforts to make it more efficient mean you are trying to 'out-think' the engineering work the UCS engineers have already done.  If you want to use NIC teaming, do not use fabric failover.  Using two forms of failover on the same network is a way to run into issues.

    You will find many documents on Cisco's site that were developed by Cisco engineers explaining their recommended configurations for Hyper-V.  You will not find one that uses the configuration you are trying to put together.  In fact, I wrote the document I referenced earlier (I recently retired from my engineering position at Cisco) in part to get the word out to the Cisco field on the recommended way to configure VM networks on Hyper-V.  We engineers were taking too many calls to explain the same thing over and over.

    As for why the change to the virtual switch is not taking, I'm not sure.  You say you 'renamed' the switch.  I think if you had created a new switch and reconnected the VMs to it, the change would have taken.  That is how I generally did it.
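    Roughly what I mean, as a sketch (the adapter name "vNIC-FabricB" is a placeholder):

        # Create a new switch instead of renaming, move the VM adapters over,
        # then remove the old switch.
        New-VMSwitch -Name "VM-SWITCH-B" -NetAdapterName "vNIC-FabricB" -AllowManagementOS $false
        Get-VM | Get-VMNetworkAdapter |
            Where-Object SwitchName -eq "VM-TEAM" |
            Connect-VMNetworkAdapter -SwitchName "VM-SWITCH-B"
        Remove-VMSwitch -Name "VM-TEAM" -Force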


    tim

    Saturday, April 1, 2017 1:44 PM
  • Hi Tim,

    Thanks for the feedback.  I am not using physical NIC teaming - what I am doing is presenting a vNIC from Fabric A and a vNIC from Fabric B from the UCS to the Hyper-V nodes.  The even-numbered VMs are assigned to the Fabric A vNIC, and the odd-numbered VMs are assigned to the Fabric B vNIC.  I am letting the Cisco UCS handle all the failover, as I tried NIC teaming in my first Cisco UCS deployment and had a lot of problems with it.  I read on a blog that you should let the Cisco UCS do all the clever stuff and just keep the NIC deployment simple, which is hopefully what I have done.  The reason for doing it this way is so that I can try to have an even amount of virtual machine traffic going down each FI, rather than having a single FI overloaded with VM traffic.
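    As a rough sketch of that split (hypothetical, assuming each VM name ends in a number):

        # Hypothetical helper: connect even-numbered VMs to VM-SWITCH-A and
        # odd-numbered VMs to VM-SWITCH-B.
        foreach ($vm in Get-VM) {
            if ($vm.Name -match '(\d+)$') {
                $switch = if ([int]$Matches[1] % 2 -eq 0) { "VM-SWITCH-A" } else { "VM-SWITCH-B" }
                $vm | Get-VMNetworkAdapter | Connect-VMNetworkAdapter -SwitchName $switch
            }
        }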

    Yes, I renamed the switch in Hyper-V Manager on each Hyper-V host, but only a couple of VMs are having the issue.  It certainly is a strange one, to say the least!

    Thanks for all the input so far.


    Robert Milner | Website: http://www.remilner.co.uk | Twitter: @robm82

    Friday, April 7, 2017 2:50 PM