Virtual Machines and very odd network issues

  • Thursday, July 05, 2012 6:25 AM
     
     

    We have one server (IBM x3400) running Windows Server 2008 R2 Enterprise, called PDC1.  It is our primary domain controller (AD, DNS, DHCP, file server) and also runs Hyper-V (yes, I know this is not recommended, but we will be upgrading to new servers in the coming months).  It has 2 NICs, but we use only one (the onboard Gigabit port) for all traffic because I couldn't get the second (a ZyXEL 10/100) to work properly.  The host's static IPv4 address is 192.168.1.10.

    My gateway/wireless router is static IPv4 address: 192.168.1.50.

    I have 4 virtual machines up and running in Hyper-V:

    • VM01 - Windows Server 2008 R2 Enterprise/McAfee ePO - static IPv4 address:  192.168.1.11
    • VM02 - Windows Server 2008 R2 Enterprise/SQL Server 2008 - static IPv4 address: 192.168.1.12
    • VM03 - Windows Server 2008 R2 Enterprise/Microsoft Dynamics AX - static IPv4 address: 192.168.1.19
    • VM04 - Windows XP Pro (runs our old accounting program) - static IPv4 address:  192.168.1.20

    Until a couple of days ago there were no problems.  I could Remote Desktop into every machine, and they all talked to each other.

    Three days ago, out of the blue, my virtual machines started having serious network issues.

    I can't RDP into any of the VMs, but I can still open Hyper-V on the host and use all of the virtual machines from there.

    From PDC1,

    1. Cannot ping VM01 (192.168.1.11) - Request timed out
    2. Successfully ping VM02 (192.168.1.12)
    3. Successfully ping VM03 (192.168.1.19)
    4. Cannot ping VM04 (192.168.1.20) - Request timed out
    5. Successfully ping gateway (192.168.1.50)

    From VM01

    1. Successfully ping PDC1 (192.168.1.10)
    2. Successfully ping VM02 (192.168.1.12)
    3. Successfully ping VM03 (192.168.1.19)
    4. Cannot ping VM04 (192.168.1.20) - Request timed out
    5. Cannot ping Gateway (192.168.1.50) - Destination Host Unreachable

    From VM02

    1. Successfully ping PDC1 (192.168.1.10)
    2. Cannot ping VM01 (192.168.1.11) - Request timed out
    3. Successfully ping VM03 (192.168.1.19)
    4. Cannot ping VM04 (192.168.1.20) - Request timed out
    5. Cannot ping Gateway (192.168.1.50) - Destination Host Unreachable

    From VM03

    1. Successfully ping PDC1 (192.168.1.10)
    2. Cannot ping VM01 (192.168.1.11) - Request timed out
    3. Successfully ping VM02 (192.168.1.12)
    4. Cannot ping VM04 (192.168.1.20) - Request timed out
    5. Cannot ping Gateway (192.168.1.50) - Destination Host Unreachable

    From VM04

    1. Successfully ping PDC1 (192.168.1.10)
    2. Cannot ping VM01 (192.168.1.11) - Request timed out
    3. Successfully ping VM02 (192.168.1.12)
    4. Successfully ping VM03 (192.168.1.19)
    5. Cannot ping Gateway (192.168.1.50) - Request timed out

    So, as you can see, here is what I gather:

    None of the virtual machines can reach the gateway, and none of the other machines can reach the XP Pro virtual machine (VM04), even though VM04 can reach 3 of the 4 other machines.
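
To make the pattern in the ping lists easier to read, here is a minimal Python sketch that encodes the results exactly as reported above and summarizes them. The names `RESULTS`, `unreachable_from_all`, `one_way_pairs`, and `gateway_sources` are illustrative and not part of any tool. "GW" stands for the gateway at 192.168.1.50, and item 4 in the VM04 list is assumed to refer to VM03.

```python
# Ping results as reported in the post:
# RESULTS[source][target] is True when the ping succeeded.
RESULTS = {
    "PDC1": {"VM01": False, "VM02": True, "VM03": True, "VM04": False, "GW": True},
    "VM01": {"PDC1": True, "VM02": True, "VM03": True, "VM04": False, "GW": False},
    "VM02": {"PDC1": True, "VM01": False, "VM03": True, "VM04": False, "GW": False},
    "VM03": {"PDC1": True, "VM01": False, "VM02": True, "VM04": False, "GW": False},
    "VM04": {"PDC1": True, "VM01": False, "VM02": True, "VM03": True, "GW": False},
}


def unreachable_from_all(results):
    """Targets that none of the tested sources could ping."""
    targets = {t for row in results.values() for t in row}
    return sorted(
        t for t in targets
        if not any(row.get(t, False) for row in results.values())
    )


def one_way_pairs(results):
    """Pairs (a, b) where a can ping b but b cannot ping a."""
    return sorted(
        (a, b)
        for a, row in results.items()
        for b, ok in row.items()
        if ok and b in results and results[b].get(a) is False
    )


def gateway_sources(results, gateway="GW"):
    """Sources that could ping the gateway."""
    return sorted(s for s, row in results.items() if row.get(gateway))


if __name__ == "__main__":
    print("Never answered a ping:", unreachable_from_all(RESULTS))
    print("One-way pairs:", one_way_pairs(RESULTS))
    print("Can reach gateway:", gateway_sources(RESULTS))
```

Run against the data above, this reports that VM01 and VM04 never answer a ping from any other machine even though both can ping out, and that only PDC1 reaches the gateway.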

    I understand this isn't set up optimally.  If you have suggestions for improving my server setup, please feel free to share them, for example dedicating the second (currently unused) NIC to the virtual machines.

    Any ideas on what could've happened?  Is my onboard NIC going bad, even though the server is running fine on it?

    I am sitting here scratching my head....

All Replies

  • Thursday, July 05, 2012 7:43 AM
     
     
    Same problem here. I lost the connection to my gateway, which is hosted on a VM. I can ping all the VMs from the host machine by their valid IPs, but I cannot access them or Remote Desktop into them.
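
When ping succeeds but Remote Desktop does not, one quick check is whether the RDP port accepts a TCP connection at all. The following is a minimal Python sketch, assuming RDP is listening on its default port 3389; `tcp_port_open` is a hypothetical helper name, and the timeout value is arbitrary.

```python
import socket


def tcp_port_open(host, port=3389, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        # create_connection raises OSError (including timeouts) on failure.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

For example, `tcp_port_open("192.168.1.11")` run from the host would show whether VM01 accepts connections on 3389. A `False` result only says the connection failed; it does not say whether a firewall, the service, or the network path is responsible.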
  • Thursday, July 05, 2012 12:36 PM
     
     
    For the single NIC that you are using on the host, is it defined as an external virtual network?  When you ping, are you using the machine name or the IP address?  You say you have a gateway/wireless router.  Is this wired into your environment or do you have a wireless connection from your server?

    tim

  • Friday, July 06, 2012 2:31 AM
     
     

    Q1: For the single NIC that you are using on the host, is it defined as an external virtual network?

    A: It says "External" in the virtual network manager.  And the primary Gigabit Ethernet card is selected.

    Q2: When you ping, are you using the machine name or the IP address?

    A: I use the IP address.  I just tried again, and it works for some machines and not for others.  I have been having some weird DNS issues lately.  Could this be the root problem?  For example, I could RDP by machine name for a while, then it stopped working for some machines, although I could still log in by IP address.  BUT this was before the current network problem came around.
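
To separate a DNS problem from a connectivity problem, it can help to check whether each machine name still resolves to its static address. Below is a minimal Python sketch. The addresses are the ones listed in the original post; using PDC1, VM01 and so on as the actual hostnames is an assumption, and `check_names` is a hypothetical helper, not an existing tool.

```python
import socket

# Static addresses as listed in the original post.
# Assumption: the labels used in the post are also the DNS hostnames.
EXPECTED = {
    "PDC1": "192.168.1.10",
    "VM01": "192.168.1.11",
    "VM02": "192.168.1.12",
    "VM03": "192.168.1.19",
    "VM04": "192.168.1.20",
}


def check_names(expected, resolve=socket.gethostbyname):
    """Return {name: status}; status is 'ok', 'mismatch -> <ip>' or 'unresolved'."""
    report = {}
    for name, ip in expected.items():
        try:
            got = resolve(name)
        except OSError:  # socket.gaierror is a subclass of OSError
            report[name] = "unresolved"
            continue
        report[name] = "ok" if got == ip else f"mismatch -> {got}"
    return report
```

Calling `check_names(EXPECTED)` on the host and on each VM shows which names fail to resolve or resolve to an unexpected address. The `resolve` parameter exists only so the function can be exercised without a live DNS server.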

    Q3: You say you have a gateway/wireless router.  Is this wired into your environment or do you have a wireless connection from your server?

    A: It is wired.  The server is connected directly to a Cisco switch, which in turn connects directly to the wireless router/gateway.  All other computers and the PDC reach the Internet without problems.  Only the VMs are affected.

    The more I think about this, the more I suspect the hardware is starting to fail.  The server is 2+ years old and, with our power outages here, it has sometimes had hard shutdowns.  Obviously, this isn't good for the computer.  I will probably spend this weekend rebuilding the server just to see if that corrects the issue.

  • Saturday, July 07, 2012 8:01 AM
     
     

    Ok, so today I did the following:

    1. Exported all VM's and removed them from the Hyper-V Host.
    2. Removed the Hyper-V Role from the host machine.
    3. Rebooted (which it did a couple of times)
    4. Re-added the Hyper-V Role to the host machine.
    5. Rebooted (same as above)
    6. Added 1 VM to the server and reset the Network preferences.
    7. Same problem.

    It appears that this may be a problem with the onboard network port itself, since Hyper-V and the virtual machines are behaving exactly the same as before.

    Does anyone have any other advice before I am forced to rebuild the OS?  I really don't have the time or energy to rebuild the server, but it looks like that is what I will have to do. Please advise!

  • Saturday, July 07, 2012 3:42 PM
     
     

    If it is a problem with the physical network port, rebuilding the OS is not going to do anything for you.

    Is the physical host having any network problems accessing resources other than your VMs?


    tim

  • Saturday, July 07, 2012 5:18 PM
     
     
    Actually, it works great! I have few problems with the host network port. This strikes me as odd, because I would expect some issues, but I cannot detect anything problematic.

    I did put a 2nd NIC into the machine (after the initial problems started, to see if I could use it to fix them), and I have had a lot of problems with it. It worked for half an hour, then reported that the cable is unplugged. The cable is fine (I tried 2 good cables with the same result), but no lights come on at the NIC. Windows says the driver is good and the device is working properly.

    I think the hardware is starting to die on this machine because I am getting too many oddball issues. I am hoping a complete rebuild gives me some clues about what could be wrong, but if it doesn't, then it only tells me that we need to buy a new server. Yeah, that would be great news.