802.1x - Onex Auth Timeout / Unusual XP SP3 Client Behavior

    Question

  • Hi all,

    I've got an 802.1x environment. Here are the brief details:

    Windows Server 2008 NPS/DHCP/RADIUS/Enterprise Root CA (NAP GPO, Cert AE)
    Windows XP SP3 Workstation
           PEAP and Secured Password (EAP-MSCHAP v2)
           "Validate Server Certificate" and "Enable Quarantine Checks" Ticked
           "Enable Fast Reconnect" NOT Ticked
    Cisco 2800 (Router on a Stick)
    Allied Telesis AT-8000S L2 Switch (3 VLANs: 1, 2, 3)

    I'm fairly confident that my network environment is configured correctly; I believe the issue lies with the Windows XP workstation. When I boot the machine without logging in, I can see my machine account authenticated on the switch and in the correct VLAN. Upon logging in, I am placed in a different (correct) VLAN (new IP subnet verified) and my user account is now visible on the switch. The Windows network status balloon shows "Connected" for about 10 seconds. The status then changes to "Attempting Authentication" before going to "Authentication Failed". Once the workstation was in the "Authentication Failed" state, it stopped responding to any requests from the switch for 20 minutes. I installed a hotfix that allows this interval to be set manually and set it to 1 minute. Restarting the dot1x service seems to trigger a refresh as well.

    What's bizarre is that during the 60 seconds when the status is "Attempting Authentication", I can toggle compliance back and forth successfully. Example: while "Attempting Authentication" is displayed, I turn the firewall off and trigger a re-authentication from the switch on the workstation's port (dot1x re-authenticate ethernet e16). Immediately the IP drops to 0.0.0.0 and I get a new IP from the non-compliant VLAN. The status goes to "Connected" and then to "Attempting Authentication" within a few seconds. I can then turn the firewall back on, and the reverse also succeeds. If I leave it alone, the status toggles between "Attempting Authentication" and "Authentication Failed" every minute.

    The switch port shows Authenticated; however, the client is unable to keep the "Connected" status. While the status is fluctuating between "Attempting Authentication" and "Authentication Failed", network communication works: I can ping across the different subnets without issue. It's almost as if everything is working as expected except that the status is never a solid "Connected". Even when the status is "Authentication Failed", I can ping through to all the subnets, because the switch port is always in an Authenticated state; only the client is not. The switch port is set to Auto, so initially it is unauthenticated. The switch then authenticates the port as expected, and the Windows side of the setup just refuses to stay "Connected".

    The client has a valid computer certificate from the 2008 certificate server via autoenrollment (AE). The server has a valid Domain Controller certificate issued by itself. The user account I'm using also has a valid user certificate and is NOT using a roaming user profile.

    The Event Viewer messages I'm getting are:

    15504 - Restart Reason: Peer Initiated (When it first boots up and then initially goes into "Attempting Authentication" after "Connected")
    15504 (Different Error) - Restart Reason : Onex Auth Timeout (Every repeated attempt after the initial)
    15596 - Length of block timer (seconds): 60 (Another attempt will be made in 60 seconds )
    15514 - There was no response to the EAP Response Identity Packet
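
    To make the pattern in these events easier to see, here is a minimal Python sketch that tallies restart reasons from pasted log lines. It assumes the lines look like the ones listed above (event ID first, then free text); that is not the real Event Log export format, and `parse_event` and `count_restart_reasons` are hypothetical helper names.

```python
import re
from collections import Counter

# Assumption: each line starts with the numeric event ID, as in the list above.
EVENT_ID_RE = re.compile(r"^\s*(\d+)")
# Captures the text after "Restart Reason:" up to a "(" or the end of the line.
RESTART_RE = re.compile(r"Restart Reason\s*:\s*([A-Za-z ]+?)\s*(?:\(|$)")


def parse_event(line):
    """Return (event_id, restart_reason or None), or None if no ID is found."""
    id_match = EVENT_ID_RE.match(line)
    if id_match is None:
        return None
    reason_match = RESTART_RE.search(line)
    reason = reason_match.group(1) if reason_match else None
    return int(id_match.group(1)), reason


def count_restart_reasons(lines):
    """Count how often each restart reason appears in the given lines."""
    counts = Counter()
    for line in lines:
        parsed = parse_event(line)
        if parsed is not None and parsed[1] is not None:
            counts[parsed[1]] += 1
    return counts


# Sample lines copied from the list above (trailing remarks shortened).
SAMPLE = [
    "15504 - Restart Reason: Peer Initiated (first boot)",
    "15504 (Different Error) - Restart Reason : Onex Auth Timeout",
    "15596 - Length of block timer (seconds): 60",
    "15514 - There was no response to the EAP Response Identity Packet",
]
```

    A run of "Onex Auth Timeout" entries after a single "Peer Initiated" entry would match the behavior described: one restart after the initial connection, then a timeout on every retry.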

    I have attached a screenshot of the Event Log history (with times) to show the correlation of events.

    I also enabled tracing for RAS, and the only log file (NETMAN.log) that shows anything indicative of a problem contains the following:

    "Not sending status to WZC because fZeroConfEnabled is FALSE and ncs is NCS_AUTHENTICATION_FAILED"

    I have confirmed that both the Wired Auto Config Service & Wireless Zero Config Service are Started

    Why does the status go from "Connected" (in the correct VLAN, with the correct IP) to "Authentication Failed" (still in the correct VLAN, with the correct IP and working communication)?

    Update: I have successfully configured a Windows 7 client in the same lab environment. This issue is specific to Windows XP SP3, and we have thousands of XP clients, so it still needs to be addressed. :) Thanks!
    Wednesday, December 16, 2009 03:39

Answers

  • Hi,

    XP clients typically have more trouble switching VLANs between computer and user authentication than what you have described, particularly if the VLANs use different subnets. I would start by eliminating a few variables. If possible, please repeat the test with a client using a static IP address and a health policy that doesn't require firewall settings (use Windows Update as the requirement), and also verify that the switch does not have a periodic reauthentication timer running.

    On the switch, it might also help to determine when line protocol goes down on the client port. If line protocol is down, this would explain the client error message (no response). Line protocol can drop due to an authentication failure, and the reason for that failure should be reported on NPS (in Event Viewer).

    -Greg
    Sunday, January 17, 2010 08:05
    Owner

All replies

  • Hi MikeFi

    I have the same problem as you. What is its current status?

    Anyone, please help.

    Thanks.

    Wednesday, May 12, 2010 08:45
  • I had the same issue with failing authentications. Computer accounts authenticated successfully, but one new user kept having issues (as described by MikeFi) after logging in. I found that there was a clock skew error between the ACS appliance and the AD authenticator. Configuring the ACS appliance to use the same NTP servers as the AD servers fixed the issue.
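
    As a rough illustration of the check involved, the Python sketch below compares two clock readings against a tolerance. The 5-minute limit is an assumption (the usual Kerberos default); how you read the time from the appliance or the AD server is not covered here, and `clock_skew` and `skew_too_large` are hypothetical helper names.

```python
from datetime import datetime, timedelta, timezone

# Assumption: 5 minutes, the common Kerberos default tolerance.
MAX_SKEW = timedelta(minutes=5)


def clock_skew(reference, other):
    """Return the absolute difference between two timezone-aware datetimes."""
    return abs(other - reference)


def skew_too_large(reference, other, max_skew=MAX_SKEW):
    """True if the two clocks differ by more than the allowed tolerance."""
    return clock_skew(reference, other) > max_skew


# Example readings (made-up values for illustration only).
ad_time = datetime(2011, 6, 7, 14, 0, 0, tzinfo=timezone.utc)
appliance_time = datetime(2011, 6, 7, 14, 7, 30, tzinfo=timezone.utc)
```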

     

    Tuesday, June 7, 2011 14:08
  • Hi all!

     

    I had the same problem in a similar environment and have traced the problem to my switch, which is an Allied Telesis AT-8000GS series, very similar to Mike's AT-8000S.

    It turns out the problem is related to the switch firmware version. Firmware 2.0.0.22 or above for the AT-8000GS and 3.0.0.40 or above for the AT-8000S series introduced a new interface-scope command named "dot1x legacy-supp-mode", which is meant to help with some old, broken supplicants (Windows XP SP2). It periodically sends EAPOL Request Identity frames according to the tx timeout period. The problem is that this option is on by default, and it is the culprit for these problems with all the newer supplicants (Windows XP SP3, Windows 7, ...).

    So the simple solution is to turn it off by issuing "no dot1x legacy-supp-mode" while in the interface scope of your switch configuration.
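
    If you need to apply this to many ports, a small script can build the command list. The Python sketch below is only an illustration: "no dot1x legacy-supp-mode" comes from the description above, while `configure`, `interface ethernet <port>`, and `exit` are assumptions based on the "ethernet e16" port naming used earlier in this thread, so check them against your firmware's CLI guide. `legacy_mode_off_commands` is a hypothetical helper name.

```python
def legacy_mode_off_commands(ports):
    """Build CLI lines that disable legacy supplicant mode on each port.

    Only "no dot1x legacy-supp-mode" is taken from the forum post; the
    surrounding "configure", "interface ethernet" and "exit" lines are
    assumed syntax and may differ on your switch.
    """
    commands = ["configure"]  # assumed: enter global configuration mode
    for port in ports:
        commands.append(f"interface ethernet {port}")  # assumed syntax
        commands.append("no dot1x legacy-supp-mode")   # from the post
        commands.append("exit")                        # leave interface scope
    commands.append("exit")  # leave configuration mode
    return commands


if __name__ == "__main__":
    # Print the lines so they can be reviewed before pasting into a session.
    for line in legacy_mode_off_commands(["e15", "e16"]):
        print(line)
```

    Printing the lines instead of sending them keeps the sketch free of any connection library and lets you review the commands first.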

     

    Best regards,

    Aleš Golob

    Wednesday, July 6, 2011 09:04