Hyper-V Network connection dies randomly
-
Wednesday, May 12, 2010 11:42 PM
I have 3 virtual hosts (Dell R610s) using an MD3000i SAN and two Dell switches. Windows 2008 R2 is the OS for both the hosts and the virtual machines. There are usually 3-4 virtual servers running on each host. All systems run great, but every couple of weeks the virtual NIC on one of the two database servers (MSSQL 2008) becomes unresponsive. When I log in to the affected machine through cluster manager, the NIC shows no errors but cannot ping. Repairing the network connection has no effect. If we reboot the virtual server, it works fine again. The other virtual servers on the same host do not lose connectivity. Also, moving the virtual server to another host doesn't resolve the issue. Any advice would be appreciated.
Answers
-
Thursday, May 13, 2010 2:34 AM Moderator
Hi,
Please refer to the following posts to check whether you can resolve the issue.
Remote Desktop issue? When Virtual network created (PC only have one NIC adaptor)
Hyper-V VM (Guest OS Win 2008) weird intermittent connection loss to Oracle database or unable to maintain steady connection to Oracle database
Vincent Hu
- Marked As Answer by Vincent Hu, Microsoft Contingent Staff, Moderator, Tuesday, May 18, 2010 9:03 AM
All Replies
-
Thursday, May 13, 2010 3:59 AM
We also have a few hosts with 4-5 VMs on each, and one VM behaves exactly as you described.
I suspected a MAC collision at first, but that was not the case.
Take a look at the ping error messages.
From any computer except the host, it is “timeout”.
From the host itself, it was “host not reachable”.
I’m sure something gets messed up on the virtual switch after about two weeks (coincidence?).
Strangely enough, only one VM is affected; everything else runs smoothly.
The VM has two NICs, and only one is affected.
Only a VM reboot or migrating it to another host restores connectivity.
Next time I want to try:
Disabling and re-enabling the VM NIC
Switching it to a different virtual network and back again
IMO, this looks like a subtle bug…
To Vincent:
Is there a way to query the virtual switch, e.g. which MAC addresses it knows?
Or to flush its MAC table?
-
Thursday, May 13, 2010 5:42 AM Moderator
Hi,
You can refer to the following post.
Script to get Hyper-V virtual switch info - NVSPINFO.JS
http://social.technet.microsoft.com/Forums/en-US/ITCG/thread/c358f4ad-fd04-44f0-b77f-af824e53445b
Vincent Hu
-
Monday, May 17, 2010 7:23 AM
Same issue here: 2 Dell R710 servers as hosts, MD3000i, Dell switches. 4 Hyper-V guests (3x 2008 R2, 1x 2003). All of the 2008 guests show those random connection losses (usually after 1-4 weeks). Deactivating the NIC on the host (dedicated NICs for every VM) or changing the external NIC for the guest does not help. In the guest OS the NIC stays active and does not show any errors. However I am unable to deactivate it (no response from the GUI). After a reboot of the VM everything is back to normal.
Just one thing to add: the 2003 VM has never shown any problems.
-
Wednesday, May 19, 2010 12:16 AM
Vincent,
Just now a VM with one NIC lost connectivity.
I ran the following, as you suggested:
$NameSpace = "Root\Virtualization"
$HyperV = "."
$VM = "1CA9125D-EDDA-4BF8-8715-BCAF546E1A46"
Get-WmiObject -Namespace $NameSpace -ComputerName $HyperV -Class Msvm_SyntheticEthernetPort | Where-Object { $_.SystemName -eq $VM } | Out-File "C:\$VM" -Encoding ASCII
and then again with
$VM = "5513D1FD-3866-4080-BAA8-532A0F5AEC4C"
I compared the two outputs but found nothing different (except MACs/GUIDs).
Pinging from the host to the VM: Reply from XYZ: Destination host unreachable.
Do you want me to run some diagnostics to get more info?
I can still access the VM through the host and the external NIC.
If I reboot it or fail it over, it will recover…
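The manual file comparison described above could also be done directly in the session. The following is a minimal sketch, assuming PowerShell 2.0 on the Hyper-V host and the same Root\Virtualization namespace used in the post. The helper names Get-PortProperty and Compare-PortProperty are hypothetical, and for a VM with more than one NIC the rows of all its ports are mixed together.

```powershell
# Hypothetical helper: flatten every WMI property of a VM's synthetic NIC(s)
# into Name/Value rows so that two VMs can be diffed.
function Get-PortProperty([string]$VmGuid) {
    Get-WmiObject -Namespace 'Root\Virtualization' -ComputerName '.' `
                  -Class Msvm_SyntheticEthernetPort |
        Where-Object { $_.SystemName -eq $VmGuid } |
        ForEach-Object { $_.psbase.Properties } |
        ForEach-Object {
            # Values are stringified so arrays and nulls compare as plain text.
            New-Object PSObject -Property @{ Name = $_.Name; Value = "$($_.Value)" }
        }
}

# Hypothetical helper: return only the rows that differ between two row sets.
function Compare-PortProperty($Left, $Right) {
    Compare-Object -ReferenceObject @($Left) -DifferenceObject @($Right) -Property Name, Value
}

# Example call on the host, using the two GUIDs from the post:
# $vmA = '1CA9125D-EDDA-4BF8-8715-BCAF546E1A46'
# $vmB = '5513D1FD-3866-4080-BAA8-532A0F5AEC4C'
# Compare-PortProperty (Get-PortProperty $vmA) (Get-PortProperty $vmB) | Format-Table -AutoSize
```

Rows that differ only in MAC address or GUID are expected; anything else, such as a different state value, would be worth a closer look.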
-
Wednesday, May 19, 2010 12:33 AM
One more thing I want to add:
If I ping the internal IP (the lost one) from either of the two hosts, I get:
Destination host unreachable
And “arp -a” has no record for the IP...
But if I ping the IP from another VM on the same host or on a different one, I get:
Request timed out.
But “arp -a” does show an IP/MAC record for the IP!
I’m puzzled…
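One common reading of these two symptoms, offered as an assumption rather than a diagnosis: on Windows, "Destination host unreachable" for an on-link address usually means the sender could not resolve a MAC address via ARP, while "Request timed out" with an ARP entry present means the MAC is known (possibly from cache) but no echo reply came back. The sketch below encodes that reading for captured ping.exe and arp.exe text. The function name Get-ReachabilityHint and the example IP are hypothetical.

```powershell
# Hypothetical helper: interpret captured ping.exe / arp.exe text for one IP.
function Get-ReachabilityHint([string]$PingText, [string]$ArpText, [string]$Ip) {
    # An ARP table row starts with the IP followed by whitespace.
    $hasArp = $ArpText -match ('(?m)^\s*' + [regex]::Escape($Ip) + '\s')
    if ($PingText -match 'Destination host unreachable') {
        if ($hasArp) { 'unreachable reported although an ARP entry exists' }
        else { 'ARP resolution failed: the sender never learned a MAC for the IP' }
    }
    elseif ($PingText -match 'Request timed out') {
        if ($hasArp) { 'MAC is known (possibly cached), but no echo reply came back' }
        else { 'no ARP entry and no reply' }
    }
    elseif ($PingText -match 'Reply from .*bytes=') { 'reachable' }
    else { 'unrecognized ping output' }
}

# Example call on a host or VM (the IP is a placeholder):
# $ip = '192.0.2.10'
# Get-ReachabilityHint ((ping.exe -n 2 $ip) -join "`n") ((arp.exe -a) -join "`n") $ip
```

Running it from both a host and a neighbouring VM would record the difference described above in a comparable form.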
-
Tuesday, June 29, 2010 9:00 AM
Just had the same problem.
A Hyper-V server with 1 guest running off the local hard disk and 2 guests running as part of a cluster off an iSCSI array. All guests are connected to the same virtual switch. The guests are all Win2008R2.
Local guest - no connection until reboot. 1 clustered guest - no connection until live migration to another host (or reboot). 1 clustered guest - OK.
I looked through the event logs on the Hyper-V server and the guest, but can't find anything relating to network issues.
-
Wednesday, August 04, 2010 2:27 AM
I have the same problem, with the same hardware and SAN. Did you ever figure it out?
-
Thursday, August 05, 2010 6:58 AM
Unfortunately not. At the moment we do weekly reboots of the host servers, which seems to help. Dell has asked us to download and install the latest firmware DVD (doh...), which we will try in the near future.
Any real help is still welcome!
-
Sunday, August 08, 2010 9:38 PM
Hi,
I had a similar problem in a two-node cluster, where the guests on one host couldn't ping the guests on the other host, and one physical server couldn't ping those guests (pinging to/from other physical servers was fine; I have no idea what's different about that one physical server). Shifting all guests onto one host 'fixed' the problem, but the problem would then start on the 'good' host. My hosts use identical hardware; the teams created for the VM virtual switch are on Intel Pro 1000 quad port PT LP adapters.
Here's what I did to fix it. I live migrated the guests off so I could muck around with a host, deleted the virtual switch, deleted the team that the switch was using, and updated the network drivers to the latest package. Then, on each network adapter that belonged to the team (my adapter is a quad port, so I did this for all 4 ports), I went into the Advanced tab and disabled 'Large Send Offload (IPv4)' and 'Large Send Offload (IPv6)', then went into TCP Offloading Options on the same tab and disabled 'TCP Checksum Offload (IPv4)', 'TCP Checksum Offload (IPv6)', 'UDP Checksum Offload (IPv4)' and 'UDP Checksum Offload (IPv6)'. I rebooted after that, recreated the team, and recreated the virtual switch. I then live migrated the guests to that host and did the same to the other host. I also disabled spanning-tree on the switches for the ports in the VM team, but I don't think I had a problem there.
This fixed it for me. I only worked this out 2 days ago though, so time will tell.
Here's a couple of links that helped me decide to disable the offloading feature
http://hyper-v.nu/blogs/hans/?p=162
I hope this helps someone. If it does, please post back with your results, as I'm keen to know if disabling TCP Offload is a long-term fix.
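The per-adapter options above were changed in the driver's Advanced tab, and that is not what the following does. As a related check only, the commands below show the OS-wide offload state on a host using netsh. This is a different, global switch and is shown here as an assumption about what may be worth inspecting, not as the fix described in the post.

```powershell
# Show the global TCP settings on the host (Chimney Offload state, RSS, ...).
netsh int tcp show global

# Show the global IPv4 settings; the output should include the Task Offload state.
netsh int ip show global

# Assumption: to rule out OS-level task offload as a factor, it can be switched
# off globally. This affects every adapter on the host, so test it deliberately.
# netsh int ip set global taskoffload=disabled
```

Recording this output before and after any driver or teaming change makes it easier to tell which setting actually changed.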
-
Tuesday, August 10, 2010 11:58 AM
I have the same problem with a single cluster on a Dell PowerEdge R710.
Randomly, the guest machine's network card becomes unreachable, and the only way to solve the problem is to restart the guest machine.
Thanks in advance to anyone who can help me.
-
Monday, August 16, 2010 3:32 PM
Guys,
If a “host NIC or TEAM” is connected to a Virtual Switch, the “host NIC or TEAM” will be bound to the “Microsoft Virtual Network Switch Protocol”. It won’t have any binding to TCP or TCP6.
Could somebody explain how the “TCP/UDP Offloading Options” can interfere then?

