Failover Cluster Validation fails Firewall Configuration for unknown reason

  • Thursday, September 10, 2009 10:14 PM, roitsupport
     
    I have three nodes, all brand new fresh R2 installs with nothing but the Hyper-V role installed and the failover clustering feature enabled. I am trying to run the failover cluster validation tests, and everything passes except the Firewall Configuration section, which fails with a weird error (see below). The firewall hasn't been touched on any of the machines except by installers to enable iSCSI and iSCSI MPIO. If anyone can help I would greatly appreciate it. I haven't a clue in the world what this means. Thanks!

    An error occurred while executing the test.
    There was an error verifying the firewall configuration.
    An item with the same key has already been added.

Answers

  • Tuesday, September 22, 2009 12:35 AM, Elden Christensen (MSFT, Owner)
     
    When running Validate you may see the following error:

    Validate Windows Firewall Configuration
    Validate that the Windows Firewall is properly configured to allow failover cluster network communication.
    Validating that Windows Firewall is properly configured to allow failover cluster network communication.
    An error occurred while executing the test.
    There was an error verifying the firewall configuration.
    An item with the same key has already been added.


    This has been seen where two nodes in the cluster have network adapters with the same GUID.  The validation tool expects globally unique identifiers to be, well, unique.  When the validation tool attempts to create an entry in the table with the same key (adapter GUID) a second time, a duplicate key error is thrown.
     
    If you are seeing this error, please confirm this is the issue you are encountering by running the following PowerShell cmdlet on every node, then compare the output to look for duplicate GUIDs:
              Get-WmiObject Win32_NetworkAdapter | fl Name,GUID
     
    This issue has only been seen on cloned systems, and it appears primarily on blade/chassis-based systems.  If you have a deployment blocked by this issue, as a workaround you can run the normal OS setup process instead of cloning.  Note that Windows Server 2008 setup is image-based anyway, so setup is quite quick these days.  Alternatively, you can open a support case with Microsoft Product Support Services for additional assistance.

    Some questions:
    -  On what type of hardware did you encounter the issue?  This has primarily been seen on blade/chassis-based systems...
    -  How did you clone the OS?  This has been seen with some unsupported cloning techniques...
    -  If you SysPrep'd the image, what switches did you leverage?
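
    The per-node comparison described above can also be done in one pass. The following is only a rough sketch, not part of the validation tool: the node names NODE1, NODE2 and NODE3 are placeholders, and it assumes remote WMI access to every node is allowed for the account running it. It collects Name and GUID from Win32_NetworkAdapter on each node and lists only the GUIDs that show up on more than one node.

```powershell
# Hypothetical node names; replace with the actual cluster nodes.
$nodes = 'NODE1', 'NODE2', 'NODE3'

# Collect Name and GUID for every adapter on every node.
# Adapters that report no GUID are skipped.
$adapters = foreach ($node in $nodes) {
    Get-WmiObject -Class Win32_NetworkAdapter -ComputerName $node |
        Where-Object { $_.GUID } |
        Select-Object @{ Name = 'Node'; Expression = { $node } }, Name, GUID
}

# Group by GUID and keep only GUIDs that appear on more than one node.
# @() forces an array so .Count also works on PowerShell 2.0.
$adapters |
    Group-Object -Property GUID |
    Where-Object { @($_.Group | Select-Object -ExpandProperty Node -Unique).Count -gt 1 } |
    ForEach-Object { $_.Group } |
    Sort-Object GUID, Node |
    Format-Table Node, Name, GUID -AutoSize
```

    If nothing is listed, no adapter GUID is shared between the nodes that were queried, and the cause is probably elsewhere.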

All Replies

  • Friday, September 11, 2009 3:03 AM, David Bermingham
     
    I've had two customers report the same problem to me.  I'm not sure what is causing the problem, but the workaround that seems to fix it every time is to remove the server from the domain and re-add it.  Maybe someone else has a better idea, or maybe it is worth opening a ticket with Microsoft to get to the root cause of the problem.
    David A. Bermingham, Director of Product Management, SteelEye Technology
  • Friday, September 11, 2009 4:37 AM, McKiller
     
    Ditto for me!  I even spent time rebuilding two machines from scratch as a test bed, just in case I had some weird configuration on the ones I had been trying before, but even the simple machines failed with the same error as above.  Two Windows 2008 R2 Enterprise servers, on the same subnet.  Vanilla installation of the OS, with IPv6 disabled, DC/DNS installed and configured on both.  I haven't tried the trick of leaving the domain yet (a little trickier on DCs).  What in the world, man!
  • Friday, September 11, 2009 1:31 PM, David Bermingham
     
    You will want to avoid making a cluster node a DC.
    David A. Bermingham, Director of Product Management, SteelEye Technology
  • Friday, September 11, 2009 3:27 PM, roitsupport
     
    I unjoined all three nodes from the domain, deleted their AD accounts and then rejoined them to the domain. Next I reran the cluster validation wizard, and the result was unchanged. It still stopped with the exact same firewall error. Very frustrating as everything else is passing!
  • Saturday, September 12, 2009 2:40 AM, Sean Halihan
     
    I am also having the same issue. I've installed two 2008 R2 servers with iSCSI networks and 2 network cards. The only feature I added was the failover cluster. I created the cluster and started to install SQL 2008, but it would not install due to a failure in the cluster validation. I ran the validation and got the exact same error you did. Everything else passed.

    I also removed the servers from the domain and rejoined them, but it did not work. I also reinstalled Windows and only added the secondary NIC and the failover cluster feature, with no iSCSI networks this time and no SQL Server install. During the pre-cluster-creation validation I get the same error.
    Very frustrating.

    Please post back if you find a solution.
  • Sunday, September 13, 2009 5:10 AM, Elden Christensen (MSFT, Owner)
     

    How did you install the OS?  Are these clean installs off the Windows DVD?  Or are these SysPrep'd / cloned installs?

  • Monday, September 14, 2009 1:10 AM, Sean Halihan
     

    Elden, I think you're on to something. Both my cluster nodes were sysprepped. I did a reinstall over the weekend from CD and it's working fine now. It seems there might be something wrong with sysprepping and clustering. Thanks for the tip.

  • Monday, September 14, 2009 2:01 PM, David Bermingham
     
    Elden,

    Is there any guidance on what can and cannot be sysprepped in a cluster?
    David A. Bermingham, Director of Product Management, SteelEye Technology
  • Monday, September 14, 2009 3:58 PM, roitsupport
     
    Hello,

    Thanks for all the help. I ended up calling Microsoft on Friday and after a 3 hour and 41 minute troubleshooting session, the problem is still not resolved. However, to answer everyone's questions and also to get more information about the others experiencing this problem, I have the following comments/questions...

    1. All cluster servers are diskless iSCSI clones from a base image that is sysprepped.

    2. We are using Broadcom NetXtreme II 5708C NICs. One pair of the NICs is used for iSCSI w/MPIO and the other two are teamed using the BACS utility. The team type I'm using is 802.3ad link aggregation (LACP).

    3. Is anyone using the TOE on their NICs with this setup?

    4. Is anyone that is having this problem using sysprepped servers?

    5. Is anyone that is having this problem using teamed NICs on their non iSCSI VLAN?

    The MS engineer is under the impression that it is an issue with the TOE, the NIC team and/or some other networking issue, because we are seeing quite a few TCP retransmits and some TCP packet sequencing issues as well.
  • Monday, September 14, 2009 6:02 PM, roitsupport
     
    I've done some additional testing as follows. I disabled all TOE features and receive side scaling on all NICs. This did take care of all of the TCP retransmits that I was seeing. I also broke the NIC team, removed the link aggregation and set the switch back to standard access ports on the DATA VLAN. After doing that and assigning the same IP address that I had on the team back to one of the NICs, the error still persists.

    I'm wondering if this may be a bug or something where the sysprep isn't clearing out a registry key that needs to be cleared out. The error message "An item with the same key has already been added" seems to possibly point to something of that nature.

    Anyone else have any further input on this issue?
  • Thursday, September 17, 2009 11:04 PM, Chuck Timon (MSFT, Moderator)
     
    The biggest issue we see is sysprep being run with the Failover Clustering feature installed. That would be a no-no, as the MAC address for the NETFT driver (cluster network driver) would be the same on all nodes, and that would cause a problem.
    Chuck Timon, Senior Support Escalation Engineer (SEE), Microsoft Corporation
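
    One possible way to check for the condition described above is to compare the MAC address of the cluster virtual adapter on each node. This is a sketch under assumptions, not an official procedure: the node names are placeholders, and it assumes the NETFT adapter is exposed through Win32_NetworkAdapter with a name containing "Failover Cluster Virtual Adapter", which may not hold on every build.

```powershell
# Hypothetical node names; replace with the actual cluster nodes.
$nodes = 'NODE1', 'NODE2', 'NODE3'

# List the cluster virtual adapter and its MAC address on every node.
# The name filter is an assumption about how the NETFT adapter is reported.
foreach ($node in $nodes) {
    Get-WmiObject -Class Win32_NetworkAdapter -ComputerName $node |
        Where-Object { $_.Name -like '*Failover Cluster Virtual Adapter*' } |
        Select-Object @{ Name = 'Node'; Expression = { $node } }, Name, MACAddress
}
```

    If two nodes report the same MACAddress, that would be consistent with an image that was sysprepped after the feature was installed.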
  • Monday, September 21, 2009 5:26 PM, roitsupport
     
    Hello Chuck,

    Thank you for your response. I read your post and agree that doing a sysprep after installing failover clustering would not be a good idea. This is not our process.

    The basic process that I use is as follows (keep in mind all of our servers are diskless and boot from our iSCSI SAN):

    1. Create the base image on our SAN. This image is created by completing the installation process for Server 2008 R2. No features, roles, software, or drivers of any sort are installed.

    2. Shut the server down.

    3. Clone the volume on the SAN and attach to a different server.

    4. Start up the cloned image and immediately sysprep the server.

    5. Install Hyper-V role, restart as necessary.

    6. Install Broadcom NIC drivers and video drivers, restart as necessary.

    7. Install Dell/Equallogic Host Integration Toolkit, restart as necessary.

    8. Configure network adapters, including disabling all TOE features and setting all iSCSI-specific items such as jumbo frames, flow control, MPIO, etc.

    9. Install Failover Cluster feature.


    We are currently using Dell PowerEdge R805 servers and Dell/Equallogic iSCSI SANs.

    Let me know if I can provide any further information that may help with this issue.

    I ended up having to manually install the OS on each server in the cluster rather than using my cloning/sysprep method and everything worked flawlessly.

    Thanks,
    Ken Kropp
  • Monday, September 21, 2009 5:54 PM, David Bermingham
     
    You don't mention exactly when you join the domain.  I assume you do this sometime after you sysprep the image and not before, correct?
    David A. Bermingham, Director of Product Management, SteelEye Technology
  • Monday, September 21, 2009 5:58 PM, roitsupport
     
    That is correct. Sorry, I inadvertently left that out. I join the servers to the domain after step 8 above and before step 9. The base image server that is used for cloning/sysprep is never joined to the domain at any point in time.
  • Wednesday, October 07, 2009 2:00 PM, HSChronic
     
    I don't have cloned systems and I was having this issue. Both are clean installs of Hyper-V Server 2008 R2 that have been upgraded from beta versions. I fixed the problem by uninstalling and reinstalling the virtual switches for those hosts. You said it only happens on cloned installs, but these were fresh installs from the get-go, starting with build 7000, then 7100, then 7600. Was there a bug in the beta (that's when the VMs were created) that was fixed in RC or RTM?



  • Friday, November 06, 2009 1:04 PM, .Hadez
     
    We were getting the same errors on 2 IBM HS22 blade server installations of Windows 2008 R2; no installations were cloned.

    "An error occurred while executing the test.
    There was an error verifying the firewall configuration.
    An item with the same key has already been added."

    After reading your post we found that the Broadcom NICs were using the same GUIDs on both servers.

    Once we knew the error was NIC GUID related, we managed to get around it without too much trouble.

    We did the following:
    1. Removed the 2 network cards via Device Manager (did not delete the driver files).
    2. Rescanned the hardware via Device Manager, which recreated the 2 network cards (this assigns new GUIDs).
    3. Ran the PowerShell command again and confirmed that the GUIDs for the network adapters on both servers were different.
    4. Re-ran the Failover Cluster validation report; the firewall configuration test no longer failed.

    Hope this helps
    Gavin