NIC Team - Live Migration performance

    Question

  • I've got a 2-node Server 2012 Hyper-V cluster, and each node has 6 gigabit NIC ports. 2 are dedicated to iSCSI traffic, but the other 4 are in a team. When I Live Migrate 20+ VMs, it seems to me that I should be able to achieve transfers of at least 2-3Gbps (adjusting for network overhead and such). I realize there are a great number of ways to configure a team in Server 2012, and I've experimented with all of them: Static Teaming, Switch Independent, and LACP, each of which can use either Address Hash or Hyper-V Port. I've occasionally seen brief spikes up to 1.5Gbps, but that's it. The rest of the time it's at a very steady 1Gbps.

    I've got these servers connected to 2 ProCurve 2910al switches with a 10gb backplane, and the switches are not being taxed with any other workloads. When I use the LACP teaming, I have tried both LACP-enabled trunks and enabling Active LACP on the relevant ports. 

    My biggest question is: is this even feasible? Has anyone achieved 2Gbps+ Live Migration speeds without using 10gb NIC cards?

    Anyone? Bueller?

    Wednesday, February 20, 2013 22:34


All replies

  • I can't say that I've seen 2Gbps on live migration. But I did find I had to update my NIC drivers before I could get decent speed with the migrations (due to a bug in the RX buffer). What type of NICs are you using, and what drivers are being used? I was using Broadcom NICs in a Dell PowerEdge 420.

    John Moore

    Wednesday, February 20, 2013 22:53
  • I've got a Dell R310 and an HP DL360p Gen8, with the latest drivers for all the NICs. For the HP, it's all HP (331T and 331FLR), and for the Dell it's a mix of Broadcom and HP (BCM5716C and 331T). What speed did you see on your migrations? Any idea why we can't exceed 1Gbps?
    Wednesday, February 20, 2013 23:01
  • I only have one 1Gb NIC for the live transfer, so I've never seen it any faster, but I wasn't even getting that fast until my drivers were updated. Are all four NICs in one team? Are they all Broadcom or HP?

    John Moore

    Thursday, February 21, 2013 00:05
  • Hi,

    > I've got a 2 node Server 2012 Hyper-V cluster, and each node has 6 gigabit NIC ports.
    > 2 are dedicated to iSCSI traffic, but the other 4 are in a Team.

    With your teaming layout, some network bandwidth is also consumed by virtual machine access and management traffic.

    As a best practice, for each node of the failover cluster, use more than one network adapter and configure at least one network adapter for the private network. We recommend that you configure separate dedicated networks, with gigabit or faster speed, for live migration traffic and cluster communication, and that these networks be separate from the network used by the management operating system and from the network used by the virtual machines.

    You may dedicate 1 NIC to virtual machine access and management (management bandwidth can be capped at 10%), then use the remaining 3 NICs to create a team for live migration.

    For more information please refer to following MS articles:

    Hyper-V: Live Migration Network Configuration Guide
    http://technet.microsoft.com/en-us/library/ff428137(v=WS.10).aspx
    Hyper-V : How many network cards do I need?
    http://blogs.technet.com/b/gavinmcshera/archive/2011/03/27/3416313.aspx

    Hope this helps!

    TechNet Subscriber Support

    If you are a TechNet Subscription user and have any feedback on our support quality, please send your feedback here.

     



    Lawrence

    TechNet Community Support

    Thursday, February 21, 2013 05:25
    Moderator
  • I'm still looking for a simple yes or no answer to my basic question, so I'll repeat it here: Can you achieve 2Gbps+ Live Migration speeds without using 10gb NIC cards?
    Thursday, February 21, 2013 20:51
  • Are you using jumbo frames on the NICs?  There is a lot less overhead with jumbo frames for Live Migration.  Of course, the fact that you have CSV, management, and cluster communications on the same NICs is not a best practice, either.  I can't say that I have ever tried with 1GE NICs, as I'm fortunate enough to have 10GE whenever I want it.

    .:|:.:|:. tim

    Thursday, February 21, 2013 21:46
  • That's really great for you, Tim. I sincerely doubt that either jumbo frames or some piddly Management and cluster communications would take up 3Gbps of my traffic, thus explaining why I'm only getting 1Gbps. My CSV traffic is over my iSCSI network, which is separate, and has jumbo frames enabled.

    Again, I would really like an answer to this question: Can a multiple gigabit NIC Team achieve 2Gbps+ Live Migration speeds without using 10gb NIC cards?

    Thursday, February 21, 2013 21:49
  • Hi,

    > Can a multiple gigabit NIC Team achieve 2Gbps+ Live Migration speeds without using 10gb NIC cards?

    Currently I don't have such a physical lab environment at hand, so I can't confirm that. But I think the answer is yes.

    My point of view is that:

    NIC teaming itself aggregates bandwidth from multiple network adapters, so a team of four 1-gigabit network adapters can provide a maximum of 4Gbps of throughput.
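    To make that concrete: with flow-based teaming every TCP stream is pinned to a single team member, so the aggregate you actually see is bounded by both the member count and the number of concurrent streams. A minimal sketch of that arithmetic (the 1Gbps member speed and the perfect-hashing assumption are simplifications, not measurements):

```python
# Best-case model of flow-based NIC teaming: each TCP stream is pinned
# to one team member, so a single stream never exceeds one member's speed.
def team_throughput_gbps(num_members, num_streams, member_speed_gbps=1.0):
    """Aggregate throughput, assuming the hash spreads streams perfectly."""
    active_members = min(num_members, num_streams)
    return active_members * member_speed_gbps

# Four 1Gb NICs, one Live Migration stream (2-node cluster): 1.0 Gbps.
print(team_throughput_gbps(4, 1))
# Four 1Gb NICs, four streams to four different hosts: 4.0 Gbps.
print(team_throughput_gbps(4, 4))
# More streams than members: still capped at 4.0 Gbps.
print(team_throughput_gbps(4, 20))
```

    That is a best case; a real hash can also put two streams on the same member.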

    For your scenario, I think the bottleneck is not the 4-NIC live migration team but the 2 gigabit NICs for iSCSI. They can carry a maximum of 2Gbps of read and write traffic, and I think that is why you see about 1.5Gbps live migration speed.

    Again, we recommend configuring separate dedicated networks for live migration traffic and cluster communication, separate from the network used by the management operating system and from the network used by the virtual machines.

    For more information please refer to following MS articles:

    Live Migration Network Configuration Guide
    http://technet.microsoft.com/en-us/library/ff428137(v=WS.10).aspx
    Requirements for Using Cluster Shared Volumes in a Failover Cluster in Windows Server 2008 R2
    http://technet.microsoft.com/en-us/library/ff182358(WS.10).aspx
    Network Interface Card (NIC) Teaming
    http://technet.microsoft.com/en-us/library/hh997031.aspx

    Hope this helps!

    TechNet Subscriber Support

    If you are a TechNet Subscription user and have any feedback on our support quality, please send your feedback here.

     



    Lawrence

    TechNet Community Support

    Friday, February 22, 2013 07:25
    Moderator
  • This would almost be funny if it wasn't so important that I get a good answer.

    I'll try asking again: Can a REAL WORLD ENVIRONMENT (not a theoretical one) with 2 or more nodes, each having FOUR (or more) gigabit NIC cards in a team, achieve 2Gbps+ Live Migration speeds (when multiple VMs are migrating simultaneously)?

    Unhelpful answers include: "Gee, I've got 10gb, and it works great there," or "In theory, yes," or "Best practices dictate blah blah blah"

    An answer I'm desperately trying to get would be: "I've actually done this and I see x Gbps transfers." Honestly, if you don't have something similar to that to say, please stop responding here, because it's not helping. I've got a SANDBOX with absolutely no traffic of any significance in it, so trying to blame this on needing a separate network for cluster communication is just ridiculous.

    Friday, February 22, 2013 16:34
  • This was actually one of my questions to clarify with my MVP peers (though only for a 10Gbit team). So the answer I got is:

    No, not with a 2-node cluster.

    Yes, with more than 2 nodes.

    Because all Live Migration traffic between two hosts uses one TCP stream, and teaming needs more than one stream to spread load.

    But I didn't test it yet :-)


    Grüße/Regards Carsten Rachfahl | MVP Virtual Machine | MCT | MCITP | MCSA | CCA | Husband and Papa | www.hyper-v-server.de | First German Gold Virtualisation Kompetenz Partner ---- If my answer is helpful please mark it as answer or press the green arrow.

    Friday, February 22, 2013 18:14
  • Why would TCP streams be capped at 1Gbps? 

    What did you mean by "clarify with my MVP peers (only for 10GBit team)"?

    Saturday, February 23, 2013 00:00
  • What do you expect from a team? Magic? One stream only uses one NIC.


    Grüße/Regards Carsten Rachfahl | MVP Virtual Machine | MCT | MCITP | MCSA | CCA | Husband and Papa | www.hyper-v-server.de | First German Gold Virtualisation Kompetenz Partner ---- If my answer is helpful please mark it as answer or press the green arrow.

    Saturday, February 23, 2013 00:31
  • That's just how TCP works... I think there is some confusion about what you expect the team to actually do. A single stream can only run at 1Gbps, but you can now run 3 simultaneous streams... and don't forget the obvious link-failure tolerance advantages of a team.
    Saturday, February 23, 2013 01:33
  • I was hoping that each Live Migration could start its own stream.
    Saturday, February 23, 2013 04:54
  • That is what it does. Each migration would be a separate stream between the source and destination hosts (that's a very basic explanation). Therefore, if you had 3 migrations going, each one would most likely choose a different NIC from your team (perfect world) and you would get 3x1Gbps transfers.
    Saturday, February 23, 2013 05:08
  • OMG, are you all trying to drive me insane??? I've written this about 5 times in the previous posts above, but I'll post it again here, since apparently no one can read anything except the previous post:

    Can a REAL WORLD ENVIRONMENT (not a theoretical one, not how it should work) with 2 or more nodes, each having FOUR (or more) gigabit NIC cards in a team, achieve 2Gbps+ Live Migration speeds in total when multiple VMs are migrating simultaneously?

    Saturday, February 23, 2013 05:12
  • "In total" I think is the key you were leaving out before. Yes, of course you can hit those speeds (cumulatively). The way you were describing it before sounded like you meant on a single transfer (regardless of how many were going simultaneously). Anyway, yes.
    Saturday, February 23, 2013 05:16
  • Glad to hear it's possible, so now the obvious question is: why am I capped at 1Gbps? Does anyone have any specific configurations that they know to work here? For example: LACP, Hyper-V Port, or Switch Independent with Address Hash?
    Saturday, February 23, 2013 22:22
  • Now I'm confused on where your problem is. After going back to your original post, are you saying you run multiple transfers with a total speed capped at 1Gbps?
    Saturday, February 23, 2013 22:58
  • YES! That's what I'm saying. I'm deliriously happy that I've finally communicated the problem. Woohoo!!!

    It doesn't matter if I simultaneously Live Migrate 1 VM or 20 VMs, I can only ever utilize a total of 1Gbps of my 4 NIC team.

    Sunday, February 24, 2013 18:38
  • Let's check your team status first. If you open server manager and click on the "NIC Teaming" link, you can check the status of your team's adapters in the lower right. Do they all say active?

    Sunday, February 24, 2013 18:57
  • Yes they do.
    Sunday, February 24, 2013 18:59
  • Hello,

    Same problem here... a 2 x 1Gbps team (Hyper-V Port, switch independent) -> always a max speed of 1Gbps (with 1, 2, 3, ... simultaneous LMs).

    We also tried vNICs (converged network); same 1Gb limit.

    Tuesday, February 26, 2013 09:32
  • Apparently we should be able to go faster than 1Gbps, but no one seems to know more than that. Very frustrating.
    Tuesday, February 26, 2013 20:54
  • Frustrating for us, too.  I'm with the others in that I interpreted your initial explanation in a totally different manner from what you meant to say.  Now we are getting shot at when trying to figure out what might be the issue.

    Yes, you can get aggregate throughput greater than the speed of a single NIC.  Yes, I have 10 GE connections, but the same principles apply.  I get nearly 20 Gbps of throughput while running two or more live migrations.

    Are you using SCVMM?  If so, have you set it up to allow for multiple live migrations to occur simultaneously?  By default, there is a maximum number of live migrations that can occur simultaneously.  How do you start the simultaneous LMs?  How many do you see running simultaneously?  How are you measuring your throughput?


    .:|:.:|:. tim

    Wednesday, February 27, 2013 13:27
  • Yes, you can get aggregate throughput greater than the speed of a single NIC.  Yes, I have 10 GE connections, but the same principles apply.  I get nearly 20 Gbps of throughput while running two or more live migrations.

    Tim, are you live migrating to the same host or to more than one host when you saturate 20Gbit?

    Grüße/Regards Carsten Rachfahl | MVP Virtual Machine | MCT | MCITP | MCSA | CCA | Husband and Papa | www.hyper-v-server.de | First German Gold Virtualisation Kompetenz Partner ---- If my answer is helpful please mark it as answer or press the green arrow.

    Wednesday, February 27, 2013 13:33
  • I have VMM 2012 SP1, but haven't really used it for much yet. I created the cluster using Failover Cluster Manager, and have set the max number of simultaneous migrations to 14 using Hyper-V Manager on each node of the cluster. SCVMM detects that maximum number, though, and displays it.
    Wednesday, February 27, 2013 17:03
  • I measure the speed by using the new-and-improved Task Manager in Server 2012 on both nodes.
    Thursday, February 28, 2013 16:49
  • What team mode are your teams configured with now?
    Thursday, February 28, 2013 17:03
  • I've tried them all, but at the moment I'm using LACP/Address Hash on both nodes. I've created LACP trunks on my switches.
    Thursday, February 28, 2013 17:04
  • I would read this (and the surrounding sections) to understand the different teaming methods. In fact, I would read the whole doc (as should anyone who doesn't fully understand teaming). Using address hash (when setup correctly), you should expect to utilize more than one member in a team when migrating to separate hosts.

    http://www.aidanfinn.com/?p=14032

    Thursday, February 28, 2013 17:15
  • Yeah, I have read it, and I do understand teaming, and it's still not working.

    I've also tried telling the servers that they should work - so far that hasn't helped.

    Thursday, February 28, 2013 17:20
  • As I understand it, you have a two-node cluster => so it is working: you get 1Gbit/s Live Migration. Only with three or more nodes can you actually get more than 1Gbit Live Migration. As I said before, it is not magic like SMB3.

    Grüße/Regards Carsten Rachfahl | MVP Virtual Machine | MCT | MCITP | MCSA | CCA | Husband and Papa | www.hyper-v-server.de | First German Gold Virtualisation Kompetenz Partner ---- If my answer is helpful please mark it as answer or press the green arrow.

    Thursday, February 28, 2013 17:52
  • JakesterPDX:

    Not only can you achieve better than 1 Gbps in live migration, you should be able to approach 4Gbps in your 4 NIC team subject to the following restrictions:

    1. The Live Migrations must not all be destined for the same remote machine.  Live migration will only use one TCP stream between any pair of hosts.  Since both Windows NIC Teaming and the adjacent switch will not spread traffic from a single stream across multiple interfaces (because it causes potential out-of-order delivery of the TCP packets) live migration between host A and host B, no matter how many VMs you're migrating, will only use one NIC's bandwidth.  To get more bandwidth try migrating your 20 VMs to 5 other hosts (e.g., 4 to each other host) and see how well it does.

    2. You must use Address Hash (TCP ports) for the NIC Teaming. HyperVPorts mode will put all the outbound traffic, in this case, on a single NIC. 

    How do I know this?  I'm the PM for NIC Teaming in Windows Networking.  All of this is covered, by the way, in the NIC Teaming User's Guide (search for Windows Server 2012 NIC Teaming Deployment and Management).  Feedback on the document is welcome.  Email me at don.stanwyck -at- microsoft.com.  I'm not usually on this forum.
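    The two restrictions above can be sketched with a toy hash (an illustration only - this is not the actual Windows teaming hash, and the IPs and the default live-migration port 6600 are just example values):

```python
import zlib

TEAM_SIZE = 4  # four 1Gb team members

def pick_member(src_ip, src_port, dst_ip, dst_port):
    # Hash the TCP/IP 4-tuple to choose a team member.  Every packet of a
    # given stream hashes identically, so the stream sticks to one NIC and
    # packets are never reordered across links.
    key = f"{src_ip}:{src_port}-{dst_ip}:{dst_port}".encode()
    return zlib.crc32(key) % TEAM_SIZE

# 20 VMs migrated to ONE destination host: still a single live-migration
# stream between the host pair, so only one member ever carries traffic.
members_one_host = {pick_member("10.0.0.1", 6600, "10.0.0.2", 6600)
                    for _ in range(20)}
print(len(members_one_host))  # 1 -> capped at a single NIC's 1 Gbps

# The same migrations fanned out to five destination hosts: one stream
# per host pair, so the hashes can land on different members.
members_fanned = {pick_member("10.0.0.1", 6600, f"10.0.0.{d}", 6600)
                  for d in range(2, 7)}
print(sorted(members_fanned))  # typically spread across several members
```

    The same logic explains restriction 2: Hyper-V Port mode distributes by source switch port rather than by TCP tuple, so host-generated traffic like live migration all lands on one member.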

    Don Stanwyck

    Thursday, February 28, 2013 18:44
  • Aha! It finally all makes sense. Thanks so much, Don. Now I understand why I've gotten seemingly-contradictory answers: if you've got a lot of nodes, and you simultaneously migrate to all of them, then you can utilize the full bandwidth of your team. If you're like me, and you've only got 2 nodes, you're never going to see more than 1Gbps.

    So, now I have to decide the best way to use my 8 NICs. Instead of my original plan of teaming them all together, I'm now thinking of something like this:

    2 for iSCSI
    3 in a team for VMs
    2 in a team for Live Migration and Heartbeat
    1 for management

    I also have a 9th NIC for out-of-band management, so if the management NIC dies I can still get to the server.

    Thursday, February 28, 2013 22:31
  • Why would Live Migration be restricted to a single TCP stream between two hosts?

    In my reading, I missed that, and was working off an incorrect interpretation of 'simultaneous migrations' as being separate processes (including simultaneous, multiple TCP streams).

    I am in the same sort of configuration as the original poster - small shop, and transfers from one host would very likely all be headed for the same physical host, maybe two different ones at best. So Hyper-V being able to use multiple TCP streams for live migrations would be a *significant* benefit for us.

    Is this enabled in 2012 R2 perhaps?

    Thanks, Bill

    Monday, July 1, 2013 15:57
  • Hi Bill,

    I think there are two improvements in R2: compression and SMB 3, but it sounds like they don't work in conjunction - you pick one or the other, and in small shops (such as ours), compression is going to yield the most benefit. For some bizarre reason this forum isn't allowing me to post links so I'll have to get creative about the link I read on this. It's on aidanfinn dot com, then add /?p=14907 to the end.

    Monday, July 1, 2013 16:55