Virtual machine loses network connectivity during live migration

  • Wednesday, May 16, 2012 08:23
     
     

    Hi all,

    I have a two-node Hyper-V cluster running Windows Server 2008 R2 SP1 Datacenter Edition. Each node has 4 NICs:

    1 - management

    2 - heart beat

    3 - live migration and CSV

    4 - Hyper-V networking

    When I live migrate any virtual machine between the two hosts, I lose network connectivity to the VM for a few seconds right at the end of the live migration process (after 99%).

    What can I do?

All replies

  • Wednesday, May 16, 2012 08:47
     
     
    Please check your Event Viewer, Hyper-V, and VMM logs for related information and let us know what you find so that we can troubleshoot the issue further.

    Mahair Ashaboon MCSE & Network+ www.extra-it.com

  • Wednesday, May 16, 2012 09:09
     
     

    Hi!

    I don't have any official documentation links to present to you on this, so I can't guarantee MSFT authenticity on the below statement. This is simply what I've discovered:

    At the final stage of live migration, it's common to lose one ping to the VM. This is when the VM disconnects from the source host's virtual switch and connects to the destination host's virtual switch. It's like unplugging a network cable and quickly plugging it into another port.
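
    To put a number on that gap, one option is to log timestamped probe results (for example, one ping per second) during a migration and compute the longest run of failures. The sketch below is only an illustration: the function name `longest_outage` and the sample format are assumptions, not part of any Hyper-V tooling, and how the probes are collected is left open.

```python
from typing import Iterable, Optional, Tuple


def longest_outage(samples: Iterable[Tuple[float, bool]]) -> float:
    """Return the longest outage, in seconds, seen in a probe log.

    Each sample is (timestamp_in_seconds, probe_succeeded). An outage is
    measured from the first failed probe to the next successful probe.
    A failure run that never recovers within the log is not counted.
    """
    longest = 0.0
    outage_start: Optional[float] = None
    for timestamp, ok in samples:
        if not ok and outage_start is None:
            # First failed probe of a new outage.
            outage_start = timestamp
        elif ok and outage_start is not None:
            # Connectivity is back: close the outage and keep the longest.
            longest = max(longest, timestamp - outage_start)
            outage_start = None
    return longest


# Hypothetical log: one probe per second, two probes lost near the end.
probe_log = [(0.0, True), (1.0, True), (2.0, False), (3.0, False), (4.0, True)]
print(longest_outage(probe_log))
```

    The resolution is limited by the probe interval, so a one-second ping can only show the outage to the nearest second.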

  • Wednesday, May 16, 2012 09:54
     
     

    Hi Andrea,

    Are you using virtual LAN identification or only a virtual network? Also, don't forget to unmount any media.

    Are you using SAN storage, and do you connect to it over Fibre Channel or iSCSI?


    • Edited by Ultra_Cool Wednesday, May 16, 2012 11:23
  • Wednesday, May 16, 2012 15:30
    Moderator
     
     

    At the very end of a live migration there are moments where the VM is actually saved, the copy is finished and the VM is resumed.

    This produces an outage that can affect anywhere from one ping to a few, depending on the speed of your live migration network.

    The applications that commonly have problems with this are client-side applications that connect to a back-end database server, where the client application never included any logic to reconnect to the database if the connection was interrupted in any way, even for a moment.

    The best solution is to fix the client side of the application to support reconnections, or to tweak its settings to allow for retries, or to increase connection timeout settings.
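
    As a rough illustration of the retry idea, here is a minimal Python sketch. Everything in it is an assumption made for the example: the helper name `call_with_retry`, its parameters, and the simulated `flaky_query` stand in for whatever the real client application does when it talks to the database.

```python
import time


def call_with_retry(operation, attempts=5, delay=1.0,
                    retry_on=(ConnectionError, TimeoutError),
                    sleep=time.sleep):
    """Call operation(), retrying when the connection drops.

    Retries up to `attempts` times in total, waiting `delay` seconds
    between tries, and re-raises the last error if every try fails.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(attempts):
        try:
            return operation()
        except retry_on as error:
            last_error = error
            if attempt < attempts - 1:
                # Wait before retrying so a short outage has time to pass.
                sleep(delay)
    raise last_error


# Hypothetical stand-in for a database call that fails twice, as it might
# during the brief outage at the end of a live migration.
state = {"calls": 0}


def flaky_query():
    state["calls"] += 1
    if state["calls"] < 3:
        raise ConnectionError("connection interrupted")
    return "row data"


print(call_with_retry(flaky_query, attempts=5, delay=0.01))
```

    With a few attempts spaced a second or so apart, the total retry window comfortably covers an interruption of a few seconds. Only retry operations that are safe to repeat.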


    Brian Ehlert
    http://ITProctology.blogspot.com
    Learn. Apply. Repeat.
    Disclaimer: Attempting change is of your own free will.

  • Thursday, May 17, 2012 06:55
     
     

    Hi,

    Event Viewer -> which section?

    Hyper-V -> where are the logs?

    VMM logs -> where are they?

  • Thursday, May 17, 2012 06:57
     
     
    I do not use virtual LAN identification. No media is mounted. The SAN storage is Fibre Channel at 8 Gb/s with dual active-active controllers managed by Microsoft MPIO. The problem occurs with any VM, even one with no activity!
  • Thursday, May 17, 2012 13:15
    Moderator
     
     

    We are all used to the System, Security, Application logs at the root. 

    You need to go to: "Applications and Services Logs" -> Microsoft -> Windows

    Under there you will find logs for Clustering, Hyper-V, etc.


    Brian Ehlert
    http://ITProctology.blogspot.com
    Learn. Apply. Repeat.
    Disclaimer: Attempting change is of your own free will.

  • Thursday, May 17, 2012 14:07
     
     

    Have you followed the guidance in this article - http://technet.microsoft.com/en-us/library/dd446679(WS.10).aspx - for defining the network to be used for Live Migration?  If you have not done anything in your configuration to ensure that you are using the network you want for Live Migration, the Cluster might be using a network for Live Migration that is already in heavy use for other functions.  That could impact the Live Migration, too.

    Here is another article - http://technet.microsoft.com/en-us/library/ff428137(v=WS.10).aspx - that gives some more detail about ensuring the proper configuration of networks for Hyper-V.

    tim


    timcerl

  • Thursday, May 17, 2012 15:09
     
     
    No errors were found in the event logs mentioned above.
  • Thursday, May 17, 2012 15:10
     
     
    Yes, I've followed both articles step by step.
  • Thursday, May 17, 2012 15:18
    Moderator
     
     

    Both Mike and I have pointed out that the VM will lose network connectivity for a moment at the end of the migration, and that this is normal and expected.

    http://blogs.msdn.com/b/virtual_pc_guy/archive/2009/01/29/hyper-v-r2-live-migration-overview-architecture.aspx

    http://blogs.msdn.com/b/virtual_pc_guy/archive/2010/03/01/hyper-v-whitepapers-guides-live-migration-configuration-storage-performance.aspx

    I guess what I don't understand is if this is new behavior for you? 

    Do you expect that you can design this out?  (Server 2012 with RDMA hardware can make this unnoticeable.)

    Have you installed the latest Integration Components into the VMs?  (so that the ICs in the VMs match the patch level of the host.)

    This does not happen by itself or through Windows Update - if you have been patching the host and not manually upgrading the ICs in the VMs they will be out of sync.  This generally manifests first as problems at the network layer.


    Brian Ehlert
    http://ITProctology.blogspot.com
    Learn. Apply. Repeat.
    Disclaimer: Attempting change is of your own free will.

  • Thursday, May 17, 2012 15:36
     
     

    Hi Brian,

    First: how can I check that the Integration Components match the host's patch level?

    Second: this is not my first Windows Server 2008 R2 SP1 cluster, and I know that at the end of a live migration there is a very short loss of connectivity. Normally that interruption is noticeable only with an ICMP ping: if I have a Remote Desktop session open to a VM that is being live migrated, I do not lose the session. In this case, however, I lose connectivity for 2 or 3 seconds and the session goes down!

  • Thursday, May 17, 2012 15:56
    Moderator
     
     

    Okay.  With an RDP session, the Remote Desktop Client has all the connection logic on its side and knows how to reconnect and deal with the outage.  The 7.1 client usually does not report a disconnect unless it lasts about 10 seconds; it catches up.

    So, you know all about naming the virtual networks, and refreshing the VM configuration in the Cluster Manager if it is changed, etc.  Excellent.

    The easy way is to simply manually update the ICs in the VM using the Hyper-V Manager console window.  Just plan for an outage to handle the VM reboot.  This covers you if it is needed or not.  If not, they are not upgraded, if needed, they are.

    But the versions can also be checked, and this is not as straightforward as it once was, as driver versions are now all over the place.  Try to focus on the virtual network service on the host and the virtual NIC driver in the VM, since that is where your issue is.

    http://itproctology.blogspot.com/2008/04/determining-versions-under-hyper-v.html


    Brian Ehlert
    http://ITProctology.blogspot.com
    Learn. Apply. Repeat.
    Disclaimer: Attempting change is of your own free will.

  • Thursday, May 17, 2012 19:42
     
     

    Can you turn off the VM, do a quick migration, and then start the VM on the other host? If you cannot start the VM on the other host, then the problem is with the SAN.

    We had a similar problem in our environment (when I tried to live migrate a VM, it got to 99% and then failed), and I solved it by reconfiguring the SAN. It was a problem with the quorum and storage disks. Something was wrong with the configuration; after I deleted both disks and created them again, everything worked fine.


    • Edited by Ultra_Cool Friday, May 18, 2012 06:40