Network problem within Hyper-V (DNS)
-
Tuesday, December 16, 2008 12:18 PM
Quick info
We have a small environment: one physical server hosting 4 virtual servers within Hyper-V. We experience network problems, i.e. the connection drops occasionally, causing damaged network packets. This is especially a problem for the DNS server, as it is sometimes unable to look up DNS names on the Internet. Only network traffic through Hyper-V causes problems; traffic between the physical server and a hardware gateway works fine.
Further, more detailed Info
The physical server has no problems pinging the gateway 10.0.1.1. Further, NSLOOKUP to an Internet DNS server always succeeds, with fast responses.
Only traffic to/from the virtual servers causes problems. When running ping on the DC/DNS server at 10.0.1.9, most pings to the gateway at 10.0.1.1 succeed, but some fail. I've also run NSLOOKUP locally on the virtual DC/DNS server, where I discovered that lookups of names for which the DNS server is authoritative always get a fast reply. However, when the DNS server has to forward a query to other DNS servers on the Internet, problems sometimes occur.
In NSLOOKUP, every time a DNS query fails, repeating the same query one or two times makes the lookup succeed. I have dumps from NSLOOKUP showing this:
--
> www.microsoft.fr.
Server: xxx.intern.xxx.dk
Address: 10.0.1.9
------------
SendRequest(), len 34
HEADER:
opcode = QUERY, id = 30, rcode = NOERROR
header flags: query, want recursion
questions = 1, answers = 0, authority records = 0, additional = 0
QUESTIONS:
www.microsoft.fr, type = A, class = IN
------------
DNS request timed out.
timeout was 2 seconds.
timeout (2 secs)
SendRequest failed
------------
SendRequest(), len 34
HEADER:
opcode = QUERY, id = 31, rcode = NOERROR
header flags: query, want recursion
questions = 1, answers = 0, authority records = 0, additional = 0
QUESTIONS:
www.microsoft.fr, type = AAAA, class = IN
------------
------------
Got answer (61 bytes):
HEADER:
opcode = QUERY, id = 31, rcode = NOERROR
header flags: response, want recursion, recursion avail.
questions = 1, answers = 1, authority records = 0, additional = 0
QUESTIONS:
www.microsoft.fr, type = AAAA, class = IN
ANSWERS:
-> www.microsoft.fr
type = CNAME, class = IN, dlen = 15
canonical name = m110.pictime.fr
ttl = 86398 (23 hours 59 mins 58 secs)
------------
*** Request to xxx.intern.xxx.dk timed-out
> www.microsoft.fr.
Server: xxx.intern.xxx.dk
Address: 10.0.1.9
------------
SendRequest(), len 34
HEADER:
opcode = QUERY, id = 32, rcode = NOERROR
header flags: query, want recursion
questions = 1, answers = 0, authority records = 0, additional = 0
QUESTIONS:
www.microsoft.fr, type = A, class = IN
------------
------------
Got answer (77 bytes):
HEADER:
opcode = QUERY, id = 32, rcode = NOERROR
header flags: response, want recursion, recursion avail.
questions = 1, answers = 2, authority records = 0, additional = 0
QUESTIONS:
www.microsoft.fr, type = A, class = IN
ANSWERS:
-> www.microsoft.fr
type = CNAME, class = IN, dlen = 15
canonical name = m110.pictime.fr
ttl = 86393 (23 hours 59 mins 53 secs)
-> m110.pictime.fr
type = A, class = IN, dlen = 4
internet address = x.x.x.x
ttl = 1193 (19 mins 53 secs)
------------
Non-authoritative answer:
------------
SendRequest(), len 34
HEADER:
opcode = QUERY, id = 33, rcode = NOERROR
header flags: query, want recursion
questions = 1, answers = 0, authority records = 0, additional = 0
QUESTIONS:
www.microsoft.fr, type = AAAA, class = IN
------------
------------
Got answer (117 bytes):
HEADER:
opcode = QUERY, id = 33, rcode = NOERROR
header flags: response, want recursion, recursion avail.
questions = 1, answers = 1, authority records = 1, additional = 0
QUESTIONS:
www.microsoft.fr, type = AAAA, class = IN
ANSWERS:
-> www.microsoft.fr
type = CNAME, class = IN, dlen = 15
canonical name = m110.pictime.fr
ttl = 86393 (23 hours 59 mins 53 secs)
AUTHORITY RECORDS:
-> pictime.fr
type = SOA, class = IN, dlen = 36
ttl = 1195 (19 mins 55 secs)
primary name server = edison.pictime.fr
responsible mail addr = root.pictime.fr
serial = 2008110707
refresh = 600 (10 mins)
retry = 600 (10 mins)
expire = 3600000 (41 days 16 hours)
default TTL = 1200 (20 mins)
------------
Name: m110.pictime.fr
Address: 193.238.151.9
Aliases: www.microsoft.fr
--
My findings indicate that the DNS server itself works correctly, but networking through Hyper-V is unstable.
When executed on one of the other virtual machines, pings to the hardware gateway at 10.0.1.1 work fine, and NSLOOKUPs towards Internet DNS servers work fine. However, DNS queries against the local DNS server at 10.0.1.9 sometimes fail. Again, it seems that the local DNS server has problems forwarding DNS queries to the Internet.
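To put a number on how often lookups against the local DNS server time out, a repeated probe can help. The following is only a minimal Python sketch, not something used in this thread: the function names `build_query` and `probe` are hypothetical, the 2-second timeout mirrors the NSLOOKUP timeout in the dump above, and the packet it builds for www.microsoft.fr is 34 bytes, matching "SendRequest(), len 34".

```python
import socket
import struct


def build_query(name, query_id, qtype=1):
    """Build a minimal DNS query packet (qtype 1 = A, 28 = AAAA)."""
    # Header: id, flags (0x0100 = recursion desired), 1 question, no records.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    labels = name.rstrip(".").split(".")
    # Each label is length-prefixed; a zero byte terminates the name.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii") for label in labels
    ) + b"\x00"
    return header + qname + struct.pack("!HH", qtype, 1)  # class 1 = IN


def probe(server, name, attempts=20, timeout=2.0):
    """Send `attempts` A queries over UDP and return how many failed."""
    failures = 0
    for i in range(attempts):
        query = build_query(name, i + 1)
        with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
            sock.settimeout(timeout)
            try:
                sock.sendto(query, (server, 53))
                reply, _ = sock.recvfrom(512)
                # A reply with a different id does not answer this query.
                if reply[:2] != query[:2]:
                    failures += 1
            except OSError:
                # Timeout or any other socket error counts as a failure.
                failures += 1
    return failures
```

Calling `probe("10.0.1.9", "www.microsoft.fr")` on one of the virtual machines would return the number of failed queries out of 20. Note that once the DNS server has cached a name, later queries for it no longer exercise forwarding, so probing several different external names gives a more honest picture.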
Yesterday I tried these steps, without luck:
- Changed the virtual NIC on the virtual server running DNS. I removed the synthetic Hyper-V NIC, with IP 10.0.1.10, and added a Legacy Network Adapter, with a new address of 10.0.1.9. I also modified the DNS server address on the remaining virtual servers accordingly.
- Upgraded the drivers for both the Intel NICs and the Broadcom NICs.
- Disabled Large Send Offload v2 (IPv4) on the physical Intel NIC
…but same behavior as before.
The setup is this:
Physical setup.
One physical server:
A Dell PowerEdge 2950, running Windows Server 2008 x64 with Hyper-V. No antivirus.
NIC information:
This physical server has 6 NICs (4 Intel Pro/1000 PT Dual Port Server Adapters, and 2 Broadcom BCM5708c NetXtreme II GigE). Only one of the 6 NICs is in use: Intel NIC #4, connected to a hardware firewall/NAT router that connects to the Internet.
IP addresses:
- The Intel NIC that’s in use: 10.0.1.5 (this address has actually been taken over by one of Hyper-V's virtual NICs)
- The hardware firewall/NAT router: 10.0.1.1. This is the only network connection from the physical server.
Virtual environment within Hyper-V:
4 virtual servers:
Running Windows Server 2008 x64. Each has one virtual NIC. All virtual NICs are connected to the only physical NIC that’s in use.
Hyper-V has created two virtual NICs on the host server; only one of these is in use. The one in use is connected to physical NIC #4, which is connected to the gateway router.
IP addresses for virtual machines:
10.0.1.9: Domain Controller, DNS Server, TS License server
10.0.1.11: SQL server
10.0.1.12: Exchange 2007
10.0.1.13: Terminal Server
It seems that people experienced problems like this with the beta releases, but we have the problem in the final version. Does anyone have a clue on how to resolve this problem? We have sent the problem to Microsoft Support, but all they did was tell us that it was not a bug but a configuration error, and they invoiced us. So it seems we are doing something wrong here. But what?
All Replies
-
Wednesday, December 17, 2008 7:55 AM
We had a similar problem with the same server. We started the Wireshark utility (http://www.wireshark.org/) and started dumping the network packets.
We found that disabling TCP/Offload Engine on all network cards fixes the issue.
Do you have packet loss on your network?
Are you sure that your switch is not causing the problem?
- Proposed As Answer by M Fawzi (MVP) Wednesday, December 17, 2008 10:31 PM
- Marked As Answer by Jens Persson Friday, December 19, 2008 3:27 PM
-
Wednesday, December 17, 2008 1:08 PM (Moderator)
Hi,
As Mtrx suggested, this issue may be caused by "TCP/Offload Engine". Please try disabling it on all network cards to test.
On XP/2003, please refer to the steps below.
904946 You experience intermittent communication failure between computers that are running Windows XP or Windows Server 2003
http://support.microsoft.com/default.aspx?scid=kb;EN-US;904946
On Vista/2008, please use the following command instead:
netsh int ip set global taskoffload=disabled
Meanwhile, please disable RSS, auto-tuning, and chimney, then test again:
netsh int tcp set global RSS=disabled
netsh interface tcp set global autotuninglevel=disabled
netsh int tcp set global chimney=disabled
If an Internal Network or Private Network works properly, this issue may also be caused by a switch that is incompatible with Hyper-V.
For example, some switches do not accept two different MAC addresses on the same switch port, which causes problems. To isolate a switch/router issue, I suggest connecting the host server and another computer to a hub and testing again.
Thanks.
- Proposed As Answer by M Fawzi (MVP) Wednesday, December 17, 2008 10:32 PM
- Marked As Answer by Jens Persson Friday, December 19, 2008 3:27 PM
-
Wednesday, December 17, 2008 11:35 PM
Where is this done? On the physical host, and not on the NICs in the virtual machines?
If this setting is only changed on the physical Dell host machine, I see several places to do it: on the Hyper-V NIC, or on the Intel Pro/1000 and Broadcom NICs? More specifically, which of these:
For the Hyper-V NIC, there are several options, at properties -> Configure -> Advanced:
-IP Checksum offload (IPv4)
-TCP Checksum Offload (IPv4)
-TCP Large send offload (IPv4)
For the physical Intel NICs, there are several options, at properties -> Configure -> Advanced:
-Large Send Offload v2 (IPv4)
-Large Send Offload v2 (IPv6) (I don’t use IPv6)
-TCP/IP Offload Options -> IPv4 Checksum Offload, TCP Checksum offload (IPv4), UDP Checksum Offload (IPv4)
For the physical Broadcom NICs, there are several options, at properties -> Configure -> Advanced:
-IPv4 Checksum Offload
-IPv4 Large Send Offload
It seems that offloading can be configured in several places ...
Sorry guys for being 'just' a developer/SQL DBA caught up in some messy MCSA/MCSE work .. Some day I will have to learn to keep my nose to myself :-)
-
Thursday, December 18, 2008 1:55 AM (Moderator)
Hi,
Please disable "TCP/Offload Engine" on the physical machine and on all virtual machines.
1. If the VMs are XP/2003, please refer to the steps below to disable "TCP/Offload Engine".
904946 You experience intermittent communication failure between computers that are running Windows XP or Windows Server 2003
http://support.microsoft.com/default.aspx?scid=kb;EN-US;904946
2. On the physical machine and on virtual machines running Vista/2008, please use the following command instead:
netsh int ip set global taskoffload=disabled
Meanwhile, please disable RSS, auto-tuning, and chimney, then test again:
netsh int tcp set global RSS=disabled
netsh interface tcp set global autotuninglevel=disabled
netsh int tcp set global chimney=disabled
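Since the same four commands have to be repeated on the host and on every Vista/2008 virtual machine, they could be wrapped in a small script. This is only a sketch in Python under stated assumptions: the command strings are copied from this post, while `OFFLOAD_COMMANDS`, `apply_offload_settings`, and the `runner` parameter are hypothetical names introduced for illustration.

```python
import subprocess

# Commands quoted from this post (Vista / Server 2008).
OFFLOAD_COMMANDS = [
    "netsh int ip set global taskoffload=disabled",
    "netsh int tcp set global RSS=disabled",
    "netsh interface tcp set global autotuninglevel=disabled",
    "netsh int tcp set global chimney=disabled",
]


def apply_offload_settings(dry_run=True, runner=subprocess.run):
    """Run each command in order; with dry_run=True only list them."""
    handled = []
    for command in OFFLOAD_COMMANDS:
        if not dry_run:
            # check=True raises CalledProcessError if netsh exits non-zero.
            runner(command.split(), check=True)
        handled.append(command)
    return handled
```

By default nothing is executed and the function just returns the list, so it can be reviewed first. Calling `apply_offload_settings(dry_run=False)` from an elevated prompt would actually run the commands; `netsh int tcp show global` can be used afterwards to check the resulting TCP settings.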
Thanks.
- Proposed As Answer by Mervyn Zhang (Moderator) Thursday, December 18, 2008 2:01 AM
- Marked As Answer by Jens Persson Friday, December 19, 2008 3:27 PM
-
Friday, December 19, 2008 3:33 PM
Thank you both very much for your replies :-) They were spot on and the problem was solved.
Season's greetings to everybody from Denmark!

