Failover Cluster Validation fails Firewall Configuration for unknown reason

    Question

  • I have three nodes, all brand new fresh R2 installs with nothing but the Hyper-V role installed and the failover clustering feature enabled. I am trying to run the failover cluster validation tests and everything passes, but it fails at the Firewall Configuration section with a weird error (see below). The firewall hasn't been touched on any of the machines except by installers to enable iSCSI and iSCSI MPIO. If anyone can help I would greatly appreciate it... I haven't a clue in the world what this means. Thanks!

    An error occurred while executing the test.
    There was an error verifying the firewall configuration.
    An item with the same key has already been added.

    Thursday, September 10, 2009 10:14 PM

Answers

  • When running Validate you may see the following error:

    Validate Windows Firewall Configuration
    Validate that the Windows Firewall is properly configured to allow failover cluster network communication.
    Validating that Windows Firewall is properly configured to allow failover cluster network communication.
    An error occurred while executing the test.
    There was an error verifying the firewall configuration.
    An item with the same key has already been added.


    This has been seen where two nodes in the cluster have network adapters with the same GUID.  The validation tool expects globally unique identifiers to be, well... unique.  When the validation tool attempts to create an entry in the table with the same key (adapter GUID) a second time, a duplicate key error is thrown.
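
    As a rough illustration of that failure mode: the wording of the error matches what a .NET Dictionary reports when the same key is added twice. That the validation tool stores adapters in such a dictionary is only an assumption here, and the function name and GUID value below are made up for the example.

    ```powershell
    # Illustrative only: reproduces the duplicate-key message with a plain
    # .NET dictionary. It does not call the cluster validation tool.
    function Get-DuplicateKeyMessage {
        $table = New-Object 'System.Collections.Generic.Dictionary[string,string]'
        $guid  = '{11111111-2222-3333-4444-555555555555}'   # made-up GUID

        $table.Add($guid, 'NODE1 adapter')                  # first insert succeeds
        try {
            $table.Add($guid, 'NODE2 adapter')              # same key a second time
        }
        catch {
            # PowerShell wraps the .NET exception; return the innermost message.
            $_.Exception.GetBaseException().Message
        }
    }

    Get-DuplicateKeyMessage
    ```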
     
    If you are seeing this error, please confirm this is the issue you are encountering by running the following PowerShell cmdlet on every node, then compare the output to look for duplicate GUIDs:
              Get-WmiObject Win32_NetworkAdapter | fl Name,GUID
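
    To avoid comparing the output by eye, the check could be scripted along these lines. This is a minimal sketch, not an official tool: the function names Get-AdapterGuid and Find-DuplicateAdapterGuid and the node names in the usage line are made up for illustration, and querying remote nodes assumes WMI access to them is permitted.

    ```powershell
    # Collect Node/Name/GUID for every adapter that reports a GUID.
    function Get-AdapterGuid {
        param([string[]]$ComputerName = @($env:COMPUTERNAME))
        foreach ($node in $ComputerName) {
            Get-WmiObject Win32_NetworkAdapter -ComputerName $node |
                Where-Object { $_.GUID } |
                Select-Object @{ Name = 'Node'; Expression = { $node } }, Name, GUID
        }
    }

    # Return the rows whose GUID appears on more than one adapter.
    function Find-DuplicateAdapterGuid {
        param([object[]]$Adapter)
        $Adapter | Where-Object { $_.GUID } | Group-Object GUID |
            Where-Object { $_.Count -gt 1 } |
            ForEach-Object { $_.Group }
    }
    ```

    Example usage from one machine (NODE1, NODE2 and NODE3 are placeholders); no output means no duplicates were found:
              Find-DuplicateAdapterGuid (Get-AdapterGuid 'NODE1','NODE2','NODE3') | ft Node,Name,GUID -AutoSize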
     
    This issue has only been seen on cloned systems, and it appears primarily on blade/chassis-based systems.  If you have a deployment blocked by this issue, as a workaround you can run the normal OS setup process instead of cloning.  Note that Win2008 setup is image-based anyway, so setup is quite quick these days.  Alternatively, you can open a support case with Microsoft Product Support Services for additional assistance.

    Some questions:
    -  On what type of hardware did you encounter the issue?  This has primarily been seen on blade/chassis-based systems...
    -  How did you clone the OS?  This has been seen with some unsupported cloning techniques...
    -  If you SysPrep'd the image, what switches did you leverage?
    Tuesday, September 22, 2009 12:35 AM
    Owner

All replies

  • I've had two customers report the same problem to me.  I'm not sure what is causing the problem, but the workaround that seems to fix it every time is to remove the server from the domain and re-add it.  Maybe someone else has a better idea, or maybe it is worth opening a ticket with Microsoft to get to the root cause of the problem.
    David A. Bermingham, Director of Product Management, SteelEye Technology
    Friday, September 11, 2009 3:03 AM
  • Ditto for me!  I even spent time restarting two machines back from scratch as a test bed, just in case I had some weird configuration on the ones I had been trying before, but even the simple machines failed with the same error as above.  Two Windows 2008 R2 Enterprise, on the same subnet.  Vanilla installation of the OS, with IPv6 disabled, DC/DNS installed and configured on both.  I haven't tried the trick of leaving the domain yet (a little trickier on DCs).  What in the world man!
    Friday, September 11, 2009 4:37 AM
  • You will want to avoid making a cluster node a DC.
    David A. Bermingham, Director of Product Management, SteelEye Technology
    Friday, September 11, 2009 1:31 PM
  • I unjoined all three nodes from the domain, deleted their AD accounts and then rejoined them to the domain. Next I reran the cluster validation wizard and it had no effect whatsoever on the success of the validation wizard. It still stopped with the exact same firewall error. Very frustrating as everything else is passing!
    Friday, September 11, 2009 3:27 PM
  • I am also having the same issue. I've installed two 2008 R2 servers with iSCSI networks and 2 network cards. The only feature I added was the failover cluster. I created the cluster and started to install SQL 2008, but it would not install due to a failure in the cluster validation. I ran the validation and got the exact same error you did. Everything else passed.

    I also removed the servers from the domain and rejoined them, but it did not work. I also reinstalled Windows and only added the secondary NIC and the failover cluster feature, with no iSCSI networks this time and no SQL Server install. During the pre-cluster creation I get the same error.
    Very frustrating.

    Please post back if you find a solution.
    Saturday, September 12, 2009 2:40 AM
  • How did you install the OS?  Are these clean installs off the Windows DVD, or are these SysPrep'd / cloned installs?

    Sunday, September 13, 2009 5:10 AM
    Owner
  • Elden, I think you're on to something. Both my cluster nodes were sysprep'd. I did a reinstall over the weekend from CD and it's working fine now. It seems there might be something wrong with sysprepping and clustering. Thanks for the tip.

    Monday, September 14, 2009 1:10 AM
  • Elden,

    Is there any guidance on what can and cannot be sysprepped in a cluster?
    David A. Bermingham, Director of Product Management, SteelEye Technology
    Monday, September 14, 2009 2:01 PM
  • Hello,

    Thanks for all the help. I ended up calling Microsoft on Friday and after a 3 hour and 41 minute troubleshooting session, the problem is still not resolved. However, to answer everyone's questions and also to get more information about the others experiencing this problem, I have the following comments/questions...

    1. All cluster servers are diskless iSCSI clones from a base image that is sysprepped.

    2. We are using Broadcom NetXtreme II 5708C NICs. One pair of the NICs is used for iSCSI w/MPIO and the other two are teamed using the BACS utility. The team type I'm using is 802.3ad link aggregation (LACP).

    3. Is anyone using the TOE on their NICs with this setup?

    4. Is anyone that is having this problem using sysprepped servers?

    5. Is anyone that is having this problem using teamed NICs on their non iSCSI VLAN?

    The MS engineer is under the impression that it is an issue with the TOE, NIC team and/or some sort of networking issue, because we are seeing quite a few TCP retransmits and some TCP packet sequencing issues as well.
    Monday, September 14, 2009 3:58 PM
  • I've done some additional testing as follows. I disabled all TOE features and receive side scaling on all NICs. This did take care of all of the TCP retransmits that I was seeing. I also broke the NIC team, broke the link aggregation, and set the switch back to standard access ports on the DATA VLAN. After doing that and assigning the same IP address that I had on the team back to one of the NICs, the error still persists.

    I'm wondering if this may be a bug or something where the sysprep isn't clearing out a registry key that needs to be cleared out. The error message "An item with the same key has already been added" seems to possibly point to something of that nature.

    Anyone else have any further input on this issue?
    Monday, September 14, 2009 6:02 PM
  • The biggest issue we see is sysprep being run with the Failover Cluster feature installed... that would be a no-no, as the MAC address for the NETFT driver (cluster network driver) would be the same on all nodes and that would cause a problem.
    Chuck Timon, Senior Support Escalation Engineer (SEE), Microsoft Corporation
    Thursday, September 17, 2009 11:04 PM
    Moderator
  • Hello Chuck,

    Thank you for your response. I read your post and agree that doing a sysprep after installing failover clustering would not be a good idea. This is not our process.

    The basic process that I use is as follows (keep in mind all of our servers are diskless and boot from our iSCSI SAN):

    1. Create the base image on our SAN. This image is created by completing the installation process for Server 2008 R2. No features, roles, software, or drivers of any sort are installed.

    2. Shut the server down.

    3. Clone the volume on the SAN and attach to a different server.

    4. Start up the cloned image and immediately sysprep the server.

    5. Install Hyper-V role, restart as necessary.

    6. Install Broadcom NIC drivers and video drivers, restart as necessary.

    7. Install Dell/EqualLogic Host Integration Toolkit, restart as necessary.

    8. Configure network adapters, including disabling all TOE features and set all specific iSCSI items such as jumbo frames, flow control, MPIO, etc.

    9. Install Failover Cluster feature.


    We are currently using Dell PowerEdge R805 servers and Dell/EqualLogic iSCSI SANs.

    Let me know if I can provide any further information that may help with this issue.

    I ended up having to manually install the OS on each server in the cluster rather than using my cloning/sysprep method and everything worked flawlessly.

    Thanks,
    Ken Kropp
    Monday, September 21, 2009 5:26 PM
  • You don't mention exactly when you join the domain.  I assume you do this sometime after you sysprep the image and not before, correct?
    David A. Bermingham, Director of Product Management, SteelEye Technology
    Monday, September 21, 2009 5:54 PM
  • That is correct. Sorry, I inadvertently left that out. I join the servers to the domain after step 8 above and before step 9. The base image server that is used for cloning/sysprep is never joined to the domain at any point in time.
    Monday, September 21, 2009 5:58 PM
  • I don't have cloned systems and I was having this issue. Both are clean installs of Hyper-V Server 2008 R2 that have been upgraded from beta versions (build 7000, then 7100, then 7600). I fixed the problem by uninstalling and reinstalling the virtual switches for those hosts. You said it only happens on cloned installs, though these were fresh installs from the get-go. Was there a bug in the beta (that's when the VMs were created) that was fixed in RC or RTM?



    Wednesday, October 07, 2009 2:00 PM
  • We were getting the same errors on 2 IBM HS22 blade server installations on Windows 2008 R2; no installations were cloned.

    "An error occurred while executing the test.
    There was an error verifying the firewall configuration.
    An item with the same key has already been added."

    After reading your post we found that the Broadcom NICs were using the same GUIDs on both servers.

    Once we knew the error was NIC GUID related, we managed to get around it without too much trouble.

    We did the following:
    1. Removed the 2 network cards via Device Manager (did not delete the driver files)
    2. Rescanned the hardware via Device Manager, which recreated the 2 network cards (this assigns new GUIDs)
    3. Ran the PowerShell command again and confirmed that the GUIDs for the network adapters on both servers were different
    4. Re-ran the Failover Cluster validation report; the firewall verification no longer failed.
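
    The check in step 3 can be made a little more mechanical by capturing the GUIDs before removing the adapters and confirming afterwards that none of them survived. This is only a sketch under that assumption; Get-UnchangedGuid is a made-up helper name, and the usage lines in the comments assume both snapshots are taken in the same PowerShell session.

    ```powershell
    # Return every GUID from the "after" snapshot that was already present
    # in the "before" snapshot, i.e. adapters that did not get a new GUID.
    function Get-UnchangedGuid {
        param([string[]]$Before, [string[]]$After)
        $After | Where-Object { $_ -and ($Before -contains $_) }
    }

    # Possible usage on the node being fixed:
    #   $before = Get-WmiObject Win32_NetworkAdapter | ForEach-Object { $_.GUID }
    #   ... remove the adapters in Device Manager and rescan ...
    #   $after  = Get-WmiObject Win32_NetworkAdapter | ForEach-Object { $_.GUID }
    #   Get-UnchangedGuid -Before $before -After $after
    ```

    Other adapters that were not removed will still show up in the result, so only the GUIDs of the re-detected NICs need to be absent.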

    Hope this helps
    Gavin

    • Proposed as answer by atcnf2008 Wednesday, December 30, 2009 9:17 PM
    Friday, November 06, 2009 1:04 PM
  • thanks Gavin,

    I can confirm Gavin's solution works, at least for me. The solution is easy and clean.
    Wednesday, December 30, 2009 9:18 PM
  • If you encounter this issue, can you please post the make / model of the NICs you are using so that we can follow up with the manufacturers.

    Thanks!
    Elden
    Thursday, December 31, 2009 4:07 PM
    Owner
  • This worked perfectly for me!

    I was using clones from a template and following VMware's documentation to the letter, but still had the error.

     

    Thanks .Hadez!!!

    Tuesday, April 20, 2010 6:55 PM
  • Had 2 VMs built for me, which ended up being cloned. Same GUIDs confirmed after the Firewall validation tests failed. On the 2nd node I simply removed the primary NIC, re-added it, and confirmed the GUID changed. All up and running in my POC lab.
    Wednesday, May 12, 2010 10:50 PM
  • Like Elden writes, "This issue has only been seen on cloned systems..." is indeed true for me as well. I thought I'd cut some installation time, so I pulled one of my R1 drives from the first installed server and swapped it against a drive in the other, identical server.

    Everything seemed OK until the Cluster Validation Report gave me the nasty error below:

    "An error occurred while executing the test.
    There was an error verifying the firewall configuration.
    An item with the same key has already been added.
    "

    After I ran GWMI Win32_NetworkAdapter | fl Name,GUID on both servers, of course all 4 NICs in both systems had the exact same GUIDs! The solution was to disable/uninstall all 4 NICs in the secondary server; after a reboot and PnP scan, all 4 NICs got new GUIDs, and after that the Validation Report was OK.

    Now, I've just finished installing the first Cluster Node without any issues or glitches... :)

    Hope this helps someone else out there!

    Cheers,

    George

     

     

    Wednesday, December 01, 2010 10:11 AM
  • That works for me as well George!
    David A. Bermingham, MVP Senior Technical Evangelist, SteelEye Technology
    Wednesday, December 01, 2010 1:34 PM
  • So while installing a fresh Win2008 R2 SP1 cluster, the same "key has already been added" issue popped up for me.

    I am running under a VMware cluster, and of course using the templating functionality therein.

    However, it seems that one must also, as pointed out here, sysprep.

    Unfortunately, as I have an install in 40 minutes and the problem just cropped up, I will probably just have to remove and add new NICs for the affected network, which is the one NIC present in the base image.

    I believe, based upon comments here and several MS support articles, that one MUST sysprep in addition to other cloning technologies. Logically, if they've missed the NIC GUID thing in their cloning functionality without sysprep, then perhaps other things too. On the other hand, VMware might very well have stated that one should sysprep as well. I haven't had time since version 2 of ESX to read all their documentation.

    I will test sysprep later; NIC removal and new NIC addition helped with this particular problem.

    http://support.microsoft.com/kb/2012835

    http://support.microsoft.com/kb/969256


    All the good names were idd in use.

    Thursday, April 19, 2012 4:55 AM