2008R2 cluster having problems reaching DCs

    Question

  • I've recently set up a new Server 2008 R2 failover cluster running SQL Server 2008 R2. I've started to see some intermittent networking issues and can't figure out how to fix them. The following are the available NICs and their binding order:
    Lan Team - 172.1.1.x/255.255.255.0
    Lan 0
    Lan 1
    Local Area Connection* 9 - 169.x.x.x/255.255.0.0
    HeartBeat - 192.168.x.x/255.255.255.0
    San 0 - 172.1.2.x/255.255.255.0
    San 1 - 172.1.2.x/255.255.255.0
    San 2 - 172.1.2.x/255.255.255.0
    MPIO is configured to use only the SAN network.
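
    In case it helps, this is roughly how I've been double-checking the per-adapter IP/DNS settings from an elevated prompt (the adapter names are the ones above; adjust if yours differ):

    rem Show IP, gateway, and DNS configuration for every interface
    netsh interface ipv4 show config

    rem DNS servers configured on the LAN team specifically
    netsh interface ipv4 show dnsservers name="Lan Team"

    The idea being that only the LAN team should have a default gateway or register in DNS; the heartbeat and SAN NICs should have neither.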

    On the active node, I'm intermittently losing the connection to the DCs and other LAN servers. If I run nslookup from the command prompt, I get the usual "DNS request timed out" and "DNS Server: Unknown". If I try to ping the IP of the DNS server, or any other server on the LAN network, I get "Destination Unreachable."

    If I try to ping a public website, it works. The passive node doesn't seem to have these symptoms. I haven't tried failing over yet to see if the issues follow to the passive node.
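
    For reference, these are the sort of checks where I see the failures (a sketch only, with our DC's address in place of <dc-ip>):

    rem Query a specific DNS server directly, bypassing the adapter's DNS list
    nslookup <hostname> <dc-ip>

    rem Ping the DC by IP to take name resolution out of the picture
    ping <dc-ip>

    rem Trace which gateway/interface the packets actually leave on
    tracert -d <dc-ip>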

    Anyone have any ideas what could cause this?

    Sunday, March 04, 2012 1:14 AM


All replies

  • Hi,

    Can you please post the ipconfig /all results for both nodes and the IPs of your DCs, so we have more information about the DNS configuration?

    Regards


    MCTS, 2008R2 Server Virtualization http://www.erkanaksoy.com

    Sunday, March 04, 2012 1:35 PM
  • Here are the ipconfig /all results for both nodes. The DC/DNS servers are 172.16.0.101 & 102. The LAN team is configured as failover/standby, so only one NIC is active at a time; I even disabled the port on the switch to make sure, and it didn't make a difference. There are two LAN switches, and each server has a NIC connected to each.

    I just noticed that from other servers the node IPs are pingable, but the cluster IPs are not. Another thing I noticed is that when running route print, the *9 cluster adapter has the lowest metric; not sure if that makes a difference. Lastly, from the switch interface I can ping the cluster IPs. Hope that helps? Any info would be greatly appreciated.
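
    If it's useful, this is roughly how I've been comparing the metrics (interface names and numbers will differ per node):

    rem Full IPv4 route table, including the 169.254.x.x routes from the *9 adapter
    route print -4

    rem Per-interface metrics; I'd expect the LAN team to have the lowest
    netsh interface ipv4 show interfaces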

    Sunday, March 04, 2012 4:54 PM
    Thanks for the information you have provided. I did not notice any problems with your network connections, but having a persistent route on an APIPA adapter does not sound good to me. I have also checked my test cluster's routes and will share them with you. Note that normally I do not have any APIPA-configured adapters; I disable them before creating the cluster. I just enabled that adapter to produce an environment close to yours, but there is still no route with APIPA.

    IPv4 Route Table
    ===========================================================================
    Active Routes:
    Network Destination        Netmask          Gateway       Interface  Metric
              0.0.0.0          0.0.0.0      10.6.88.126       10.6.88.67    261
           10.6.88.64  255.255.255.192         On-link        10.6.88.67    261
           10.6.88.66  255.255.255.255         On-link        10.6.88.67    261
           10.6.88.67  255.255.255.255         On-link        10.6.88.67    261
          10.6.88.127  255.255.255.255         On-link        10.6.88.67    261
            127.0.0.0        255.0.0.0         On-link         127.0.0.1    306
            127.0.0.1  255.255.255.255         On-link         127.0.0.1    306
      127.255.255.255  255.255.255.255         On-link         127.0.0.1    306
          169.254.0.0      255.255.0.0         On-link     169.254.1.141    261
          169.254.0.0      255.255.0.0         On-link    169.254.224.16    261
          169.254.0.0      255.255.0.0         On-link    169.254.72.214    266
        169.254.1.141  255.255.255.255         On-link     169.254.1.141    261
       169.254.72.214  255.255.255.255         On-link    169.254.72.214    266
       169.254.224.16  255.255.255.255         On-link    169.254.224.16    261
      169.254.255.255  255.255.255.255         On-link     169.254.1.141    261
      169.254.255.255  255.255.255.255         On-link    169.254.224.16    261
      169.254.255.255  255.255.255.255         On-link    169.254.72.214    266
          192.168.1.0    255.255.255.0         On-link      192.168.1.67    266
         192.168.1.67  255.255.255.255         On-link      192.168.1.67    266
        192.168.1.255  255.255.255.255         On-link      192.168.1.67    266
            224.0.0.0        240.0.0.0         On-link         127.0.0.1    306
            224.0.0.0        240.0.0.0         On-link      192.168.1.67    266
            224.0.0.0        240.0.0.0         On-link     169.254.1.141    261
            224.0.0.0        240.0.0.0         On-link        10.6.88.67    261
            224.0.0.0        240.0.0.0         On-link    169.254.224.16    261
            224.0.0.0        240.0.0.0         On-link    169.254.72.214    266
      255.255.255.255  255.255.255.255         On-link         127.0.0.1    306
      255.255.255.255  255.255.255.255         On-link      192.168.1.67    266
      255.255.255.255  255.255.255.255         On-link     169.254.1.141    261
      255.255.255.255  255.255.255.255         On-link        10.6.88.67    261
      255.255.255.255  255.255.255.255         On-link    169.254.224.16    261
      255.255.255.255  255.255.255.255         On-link    169.254.72.214    266
    ===========================================================================
    Persistent Routes:
      Network Address          Netmask  Gateway Address  Metric
              0.0.0.0          0.0.0.0      10.6.88.126  Default
              0.0.0.0          0.0.0.0      10.6.88.126     256
    ===========================================================================

    Anyway, I say you should disable *9 and delete the associated persistent route.
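
    For example, a persistent route can be removed from an elevated prompt like this (a sketch only; substitute the destination and gateway from your own route print output):

    rem Delete the stray persistent route (destination, mask, gateway as listed under "Persistent Routes")
    route delete 0.0.0.0 mask 0.0.0.0 10.6.88.126

    rem If it was a legitimate default route, re-add it as persistent with an explicit metric
    route -p add 0.0.0.0 mask 0.0.0.0 10.6.88.126 metric 256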


    MCTS, 2008R2 Server Virtualization http://www.erkanaksoy.com

    Sunday, March 04, 2012 6:28 PM
  • Thanks for the info, but I'm not sure disabling that adapter is a good idea. I found this article that explains what that adapter is and why it's there: http://blogs.technet.com/b/askcore/archive/2009/02/13/what-is-a-microsoft-failover-cluster-virtual-adapter-anyway.aspx

    However, this did lead me to look a little deeper into this adapter. It looks like it has taken the MAC address of one of the SAN NICs, which is on a non-routable switch. None of our SAN switches have any access to the LAN, which might explain why I'm getting errors that the server can't find our DCs and other servers.
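
    A quick way to see which MAC the virtual adapter picked up is to compare it against the physical NICs:

    rem List every adapter with its physical address; the cluster virtual adapter shows up too
    getmac /v /fo list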

    I also found this article, http://support.microsoft.com/kb/973838, which showed me where the MAC address is stored in the registry. On the passive node I changed it to the team adapter's MAC address, and after a reboot the virtual adapter now has the team adapter's MAC. Unfortunately, the active node has some processes running on it and I can't reboot it yet. I'll post my results once I get a chance to reboot.
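
    In case it helps anyone else: as I read that KB, the address lives under the NetFT service key, so the change was roughly the following (path and value name as I understand them from the article; back up the key first and change it at your own risk):

    rem Inspect the current virtual adapter MAC
    reg query HKLM\SYSTEM\CurrentControlSet\Services\NetFT\Parameters /v DatalinkAddress

    rem Point it at the team adapter's MAC (example address shown; use your team NIC's actual MAC)
    reg add HKLM\SYSTEM\CurrentControlSet\Services\NetFT\Parameters /v DatalinkAddress /t REG_SZ /d 02-AA-BB-CC-DD-EE /f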

    thanks for your help.

    Sunday, March 04, 2012 8:59 PM
  • I did not realize *9 was the Failover Cluster Virtual Adapter; I thought it was another physical card after reading your first post. So it should not be disabled, for sure.

    MCTS, 2008R2 Server Virtualization http://www.erkanaksoy.com

    Tuesday, March 06, 2012 6:29 AM
  • Unfortunately, after rebooting both nodes, I'm still having the same issues. I'll probably be calling Microsoft for help soon.
    Saturday, March 10, 2012 3:47 PM