802.1x with dynamic vlan switching - Problems with Roaming Profiles

  • Question

  • Hi,

    while trying to get my NAP tests up and running I ran into trouble, and I hope somebody here has a solution.

    What I am trying to set up is the following:

    - NAP with 802.1x enforcement
    - wireless and wired connection
    - Clients (all Windows XP SP3) have roaming profiles
    - dynamic VLAN switching

    In my test lab I only do 802.1x in a wireless environment. The wireless setup is based on a Trapeze MX-400 wireless controller.
    I use VLAN assignment on a per-user basis, which seems to get me into trouble. Right after the user authenticates, the VLAN switch occurs (which goes pretty fast in my setup) and Windows seems to hit a timeout (no network connection), which prevents XP from loading the roaming profile. Further problems are errors in GPO processing and logon script processing. All these issues seem to have their root cause in the network timeout right after the VLAN change.
    Deeper investigation shows that right after the topology change Windows sends a DHCP Request for the old address (the address from the wrong VLAN, which can't work) a few times. After 8 to 15 seconds XP sends a DHCP Discover and the DHCP server issues the right IP address to the client. Between the DHCP Request and the successful DHCP Discover there is no network connection, and this seems to be my problem. I think that in this setup sending a DHCP Request after the topology change is wrong behavior on XP's part and leads to trouble.

    Is there a way to get rid of this problem?

    Any help is appreciated

    Regards

    Daniel
    Tuesday, May 13, 2008 7:10 AM

All replies

  • Hey Daniel. I feel your pain. These delays can be a huge pain and make it tough to deploy. Can you give this a try and see if it changes the behavior?

     

    For Ethernet:

    [HKLM\SYSTEM\CurrentControlSet\Services\Dhcp\Parameters\DhcpGlobalForceBroadcastFlag\0] "0"=dword:00000001

    For Wireless adapter:

    [HKLM\SYSTEM\CurrentControlSet\Services\Dhcp\Parameters\DhcpGlobalForceBroadcastFlag\0] "1"=dword:00000001

     

    Also, what OS/Version/SP is your DHCP Server?

     

     

    {Jeff Sigman}{Senior Program Manager & NAP Hero}{Enterprise Security Group}

    {NAP Blog, FAQ, Forum, MSDN, Site and my blog}

    Tuesday, May 13, 2008 7:05 PM
  • Hi Jeff,

    I tried the registry settings this morning but with no success. I still have about 7 seconds during the logon process with no network connectivity. Is there maybe a way to pause Winlogon until there is network connectivity again?

    Here is the log from my DHCP server (there are two routers with ip helper configured between the client and the server, so you see double entries):

    May 14 09:07:10 sv-65 dhcpd: DHCPREQUEST for XX.1.39.153 from 00:18:f3:a2:02:35 (nb-307) via XX.100.8.2: wrong network.
    May 14 09:07:10 sv-65 dhcpd: DHCPNAK on XX.1.39.153 to 00:18:f3:a2:02:35 via XX.100.8.2
    May 14 09:07:10 sv-65 dhcpd: DHCPREQUEST for XX.1.39.153 from 00:18:f3:a2:02:35 (nb-307) via XX.100.8.1: wrong network.
    May 14 09:07:10 sv-65 dhcpd: DHCPNAK on XX.1.39.153 to 00:18:f3:a2:02:35 via XX.100.8.1
    May 14 09:07:16 sv-65 dhcpd: DHCPDISCOVER from 00:18:f3:a2:02:35 (nb-307) via XX.100.8.1
    May 14 09:07:16 sv-65 dhcpd: DHCPDISCOVER from 00:18:f3:a2:02:35 via 10.100.8.2
    May 14 09:07:17 sv-65 dhcpd: DHCPOFFER on XX.100.8.249 to 00:18:f3:a2:02:35 (nb307) via XX.100.8.1
    May 14 09:07:17 sv-65 dhcpd: DHCPOFFER on XX.100.8.245 to 00:18:f3:a2:02:35 (nb-307) via XX.100.8.2
    May 14 09:07:17 sv-65 dhcpd: DHCPREQUEST for XX.100.8.249 (10.1.1.229) from 00:18:f3:a2:02:35 (nb-307) via XX.100.8.1
    May 14 09:07:17 sv-65 dhcpd: DHCPACK on XX.100.8.249 to 00:18:f3:a2:02:35 (nb-307) via XX.100.8.1
    May 14 09:07:17 sv-65 dhcpd: DHCPREQUEST for XX.100.8.249 (10.1.1.229) from 00:18:f3:a2:02:35 (nb-307) via XX.100.8.2
    May 14 09:07:17 sv-65 dhcpd: DHCPACK on XX.100.8.249 to 00:18:f3:a2:02:35 (nb-307) via XX.100.8.2

    As DHCP server we use ISC's dhcpd 3.0pl2 running on a
    Red Hat Enterprise Linux AS v. 3 server


    Daniel
    Wednesday, May 14, 2008 7:32 AM
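
    Editor's sketch: the size of the connectivity gap can be read straight off the dhcpd log above. The following is a minimal Python sketch, not part of the original post, that extracts the first timestamp of each DHCP message type and computes the time between the first DHCPNAK and the first DHCPDISCOVER. The regular expression, the function names and the assumed year (syslog lines carry none) are illustrative assumptions; the sample lines are copied from the log above.

```python
import re
from datetime import datetime

# A few lines copied verbatim from the dhcpd log in the post above.
LOG = """\
May 14 09:07:10 sv-65 dhcpd: DHCPREQUEST for XX.1.39.153 from 00:18:f3:a2:02:35 (nb-307) via XX.100.8.2: wrong network.
May 14 09:07:10 sv-65 dhcpd: DHCPNAK on XX.1.39.153 to 00:18:f3:a2:02:35 via XX.100.8.2
May 14 09:07:16 sv-65 dhcpd: DHCPDISCOVER from 00:18:f3:a2:02:35 (nb-307) via XX.100.8.1
May 14 09:07:17 sv-65 dhcpd: DHCPACK on XX.100.8.249 to 00:18:f3:a2:02:35 (nb-307) via XX.100.8.1
"""

# Syslog timestamp, host name, then the dhcpd message type.
LINE_RE = re.compile(r"^(\w{3}\s+\d+\s+\d\d:\d\d:\d\d)\s+\S+\s+dhcpd:\s+(DHCP[A-Z]+)")


def first_seen(log_text, year=2008):
    """Map each DHCP message type to the time it first appears in the log.

    Syslog timestamps have no year, so one is assumed (hypothetical default).
    """
    seen = {}
    for line in log_text.splitlines():
        match = LINE_RE.match(line.strip())
        if not match:
            continue
        stamp = datetime.strptime(f"{year} {match.group(1)}", "%Y %b %d %H:%M:%S")
        seen.setdefault(match.group(2), stamp)
    return seen


def gap_seconds(log_text, start, end):
    """Seconds between the first `start` message and the first `end` message."""
    seen = first_seen(log_text)
    return (seen[end] - seen[start]).total_seconds()


if __name__ == "__main__":
    print("NAK -> DISCOVER:", gap_seconds(LOG, "DHCPNAK", "DHCPDISCOVER"), "s")
    print("NAK -> ACK:     ", gap_seconds(LOG, "DHCPNAK", "DHCPACK"), "s")
```

    For this log the client stays silent for about 6 seconds after the NAK and has a valid lease after about 7 seconds, which matches the outage reported in the post.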
  • Hey again Daniel. It sounds like the reg_key did make a change, just not enough - is that correct?

     

    If I back up for a second, I must ask why it makes sense to have logged-on computers heading to different VLANs from the successful machine authentication? Is it that you have ONLY deployed user authentication? Usually, a NAP + 802.1X deployment doesn't see the problem you are hitting because it is VERY RARE to have an actual NAP caused VLAN switch when the user logs on. It would happen if NAP realized a security patch was missing at the exact moment a user logs on...

     

    We have no way at the moment of "pausing" Winlogon. If you must deploy like this, we will need to find a way to adjust the timing in DHCP / APs / etc.

     

     

    {Jeff Sigman}{Senior Program Manager & NAP Hero}{Enterprise Security Group}

    Wednesday, May 14, 2008 6:13 PM
  • Hey Jeff,

    I only had 30 minutes for testing this morning, but no, it doesn't seem that the reg_key makes a difference.

    It might not be the most common way to deploy 802.1x and NAP, but the deployment I have in mind is actually split into two parts: one part is NAP, the other part is to implement user-based VLAN switching.
    Regardless of which workstation a user logs on to, he should always be placed in the correct VLAN (e.g. Human Resources). So my setup does both machine and user authentication. When a workstation boots, it is placed in a guest VLAN, and after user logon the workstation is placed in the user's VLAN. Based on the VLANs we want to implement access lists which grant or deny access to various services.
    Next week I will expand my test setup to see whether roaming profiles are the only real problem during logon. My company has used roaming profiles for a while, but I don't think they are commonly deployed in other companies, and I will evaluate whether the use of roaming profiles is essential.

    Daniel
    Wednesday, May 14, 2008 8:19 PM
  •  

    Hello Daniel,

     

       Unfortunately, you are going to have a problem with the scenario as you have described it.  With XP, we do not support dynamic VLAN switching (between the machine and user authentications) because of the problem that you have described.  Also, in certain scenarios, such as EAP-TLS authentication, you will potentially have problems because the credentials (a certificate) used to authenticate the user reside in the profile itself.  If the profile is too large, authentication could time out while the profile is still downloading to the machine.

     

       See http://support.microsoft.com/?kbid=935638 for a more detailed explanation.

     

       Now this does not mean that NAP and 802.1x will not work on XP SP3, but you will have to reconsider your VLAN assignments.  With NAP, we make the distinction between compliant and non-compliant machines and assign the VLAN accordingly.  So rather than having one VLAN for machine authentication and one for the user, we would have a compliant VLAN that both a successful machine authentication and/or a user authentication would be placed into should the client be deemed healthy.  The same goes for a non-compliant machine.

     

    I hope this is helpful.

    Thursday, May 15, 2008 6:16 PM
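
    Editor's sketch: the point of the health-based assignment described above is that the VLAN depends only on compliance, not on whether the machine or the user authenticated, so a healthy client sees no VLAN change at logon. A minimal Python illustration follows; the VLAN IDs and function names are hypothetical and do not come from any NAP or RADIUS API.

```python
# Hypothetical VLAN IDs, chosen only for illustration.
COMPLIANT_VLAN = 10
NONCOMPLIANT_VLAN = 20


def vlan_for(healthy: bool) -> int:
    """Pick the VLAN from the health state alone, ignoring who authenticated."""
    return COMPLIANT_VLAN if healthy else NONCOMPLIANT_VLAN


def vlan_changes_at_logon(machine_healthy: bool, user_healthy: bool) -> bool:
    """True if the user authentication lands in a different VLAN than the machine one."""
    return vlan_for(machine_healthy) != vlan_for(user_healthy)


if __name__ == "__main__":
    # Healthy at boot and healthy at logon: no switch, so no logon-time outage.
    print(vlan_changes_at_logon(True, True))
    # Health state changes between boot and logon: the rare case with a switch.
    print(vlan_changes_at_logon(True, False))
```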
  • Hi Clay, Hi Jeff

    as Daniel mentioned before, the big advantage we (and I guess a lot of other companies too) see in 802.1x is the possibility to place a department into a certain VLAN where special rights are enforced depending on the L2/L3 network. We would no longer have to configure every switch port to a fixed VLAN, because the workstation would be assigned to the designated VLAN based on the user's group membership.

    Windows handles the EAP reauthentication (from machine-based to user-based credentials) flawlessly in under a second. But after the reauthentication (including the VLAN change), Windows sends a DHCPREQUEST to the DHCP server containing the IP it had before (which is not valid in the new VLAN). Because of that, the DHCP server sends a NAK to the Windows workstation. After Windows has received the NAK, it waits about 5 seconds before it does what it should have done directly after the EAP reauthentication: send a DHCPDISCOVER instead of a DHCPREQUEST!

    This issue wouldn't be too bad if the Netlogon process (which loads the profiles, executes the logon scripts, ...) were aware of the connection/DHCP status and would just wait until the network device has an IP assigned and is ready for service.

    I've read that Vista has a GPO setting like "This network uses different VLAN for authentication with machine and user credentials". Would that solve the issue? Is there something comparable for Windows XP SP2/3?

    We would really appreciate it if we could solve this issue without installing third-party supplicants like Juniper Odyssey or others, which would also replace the GINA to accomplish that...

    Best regards!

    Sascha
    • Proposed as answer by d_m_lee Monday, March 18, 2013 3:16 PM
    Thursday, May 15, 2008 7:28 PM
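
    Editor's sketch: the message sequence described above matches the general DHCP pattern in which a client that remembers a lease first asks for its old address and only falls back to a full discovery after a NAK. The following Python model is a simplification under stated assumptions: the server accepts a remembered address only if it lies inside the subnet of the current VLAN, timers and retransmissions are ignored, and the addresses are documentation ranges, not the ones from the thread.

```python
import ipaddress


def server_reply(requested_ip: str, vlan_subnet: str) -> str:
    """Simplified server decision for a DHCPREQUEST that names a remembered address."""
    if ipaddress.ip_address(requested_ip) in ipaddress.ip_network(vlan_subnet):
        return "DHCPACK"
    return "DHCPNAK"  # corresponds to "wrong network" in the dhcpd log


def exchange_after_vlan_change(old_ip: str, new_subnet: str) -> list:
    """Ordered list of DHCP messages after the port moves to a new VLAN."""
    messages = ["DHCPREQUEST"]  # client retries its old address first
    reply = server_reply(old_ip, new_subnet)
    messages.append(reply)
    if reply == "DHCPNAK":
        # Only after the NAK does the client start over with a full discovery.
        messages += ["DHCPDISCOVER", "DHCPOFFER", "DHCPREQUEST", "DHCPACK"]
    return messages


if __name__ == "__main__":
    # Old lease from 192.0.2.0/24, new VLAN uses 198.51.100.0/24 (hypothetical).
    print(exchange_after_vlan_change("192.0.2.153", "198.51.100.0/24"))
    # Same addressing on both VLANs: the old address is still acceptable.
    print(exchange_after_vlan_change("192.0.2.153", "192.0.2.0/24"))
```

    The delay complained about in the thread sits between the NAK and the DISCOVER; the model only shows the order of messages, not that wait.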
  • My understanding is that you can force DHCP back to a broadcast discover using the reg_key I mention above (and this KB). Could you double-verify and let me know?? I will ask the DHCP dev team if I need to…

     

    {Jeff Sigman}{Senior Program Manager & NAP Hero}{Enterprise Security Group}

    Friday, May 16, 2008 12:11 AM
  • Hello Sascha,

     

    I hear your concern and understand what you would like to do.  With Vista, this is not a problem at all thanks to the SSO feature. However, this feature does not exist in XP, and unfortunately we do not have any plans to backport it to XP, as it is just too complex a change.

     

    Please understand that DHCP is not your problem.  Even if you were able to obtain your new address within a second, this would still break.  The problem has to do with the Netlogon service not waiting.  When the machine gets an IP address, Netlogon is aware of that, and therefore as soon as the user logs in, Netlogon begins to communicate with the DC that is available.  At the same time, the connection is being reauthenticated, and after that succeeds, the client switches VLANs.  This is what breaks the GPO/login script processing.  Even though the client is sending out DHCP requests, it still has an IP address that could be used if it were valid on the new VLAN.  Once communication with the DC is lost, the client does not re-enumerate the DC's address on the new VLAN and start the GPO processing again.

     

    What some companies have done is keep the IP addressing the same for both VLANs, but apply restrictive ACLs on the machine VLAN to prevent machines from accessing the whole network.

     

    I hope this clears things up.

    Friday, May 16, 2008 2:55 PM
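
    Editor's sketch: the workaround above (same addressing on both VLANs, different ACLs) can be pictured as two first-match rule lists applied to the same address space. The Python below is only a model of that idea; the VLAN names, the permitted server addresses and the rule format are hypothetical and are not switch configuration syntax.

```python
import ipaddress

# Hypothetical ACLs: both VLANs share one subnet, only the filtering differs.
# The machine VLAN may reach just two assumed infrastructure hosts (e.g. DC, DHCP).
ACLS = {
    "machine": [
        ("permit", "192.0.2.10/32"),
        ("permit", "192.0.2.11/32"),
        ("deny", "0.0.0.0/0"),
    ],
    "user": [
        ("permit", "0.0.0.0/0"),
    ],
}


def allowed(vlan: str, dest_ip: str) -> bool:
    """Evaluate the VLAN's rule list top-down; the first matching rule wins."""
    dest = ipaddress.ip_address(dest_ip)
    for action, prefix in ACLS[vlan]:
        if dest in ipaddress.ip_network(prefix):
            return action == "permit"
    return False  # implicit deny when no rule matches


if __name__ == "__main__":
    print(allowed("machine", "192.0.2.10"))   # infrastructure host
    print(allowed("machine", "203.0.113.5"))  # rest of the network
    print(allowed("user", "203.0.113.5"))     # after user authentication
```

    Because the client keeps its address across the switch, no DHCP renewal is needed at logon; only the set of reachable destinations changes.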
  • Hi Clay,

    that is bad news for the NAP deployment Sascha and I have in mind. With migration to Vista not in sight, we will see in which direction we push this project.
    Thank you, Jeff and Clay, for your help and clarification so far. We will come back to you if we need further assistance.

    Best Regards

    Daniel
    Friday, May 16, 2008 4:04 PM
  • Bummer. Please let us know if you need other help. Also, if you figure out anything on the timing I'd love to hear about it!

     

    {Jeff Sigman}{Senior Program Manager & NAP Hero}{Enterprise Security Group}

    Friday, May 16, 2008 4:28 PM
  • Hi Clay,

     

    You mean that on boot the computers are in, say, VLAN 999 (10.10.22.0/24), and upon user logon the switch port changes to VLAN 38 (10.10.22.0/24), with the same IP addressing but a different VLAN?

     

    I wonder whether this can be done on Cisco 6506 core switches: different VLANs with the same IP range? How would the ACLs be done? What about routing?

     

    -lpeng  

    Thursday, May 22, 2008 4:03 AM
  • Hey Jeff, Hey Clay,

    Sascha and I decided to give Vista a try and see whether, as Clay stated, our deployment plans are no problem with Vista.

    Sorry to say, but there are also big problems with Vista. I tried to configure Vista with SSO and VLAN switching via netsh. So far no problem at all, but prelogon doesn't work. In the logs I see that the WLAN AutoConfig service (wlansvc) tries to connect to the network without an SSID. This gives an error, and Vista abandons the prelogon, goes on with the Windows logon and after that with wireless authentication, and this time it works.

    I'm very confused now - it doesn't seem to be that simple with Vista.


    Regards

    Daniel

    Monday, May 26, 2008 8:46 AM
  • Daniel,

    We are also testing 802.1x with dynamic VLAN switching from client VLAN to user VLAN and back again.

     

    We are experiencing similar problems to yours. We have got it to work 100% if we use our security switch as the DHCP server instead of Server 2003. This is only a temporary fix to allow us to get the lab up and running. We have time scheduled to revisit this issue mid/end June 08. I will update this post if we find anything.

     

    Using Server 2003 and a sniffer (Wireshark) we have noticed that the clients that fail and the server do not initiate a Discover Offer Request Acknowledge (DORA) conversation. Instead we see three NAK packets and the client then gives up. We have about a 50/50 success rate getting the clients to switch. I think when we revisit this issue we need to determine if the client is ever receiving the NAK or receiving the NAK in full.

     

    The security switch DHCP forces the client to obtain a new address every time it switches VLAN (and starts from the beginning of the range when it runs out).

     

     

    Lab kit:

     

    Network:

    Cisco Router Switch: 3560

    Nortel Wireless Security Switch: 2360 (We now use this to dish out DHCP)

     

    Servers (We have tried both as DHCP server):

    Windows Server 2003 Enterprise R2 SP2

    Windows Server 2003 Enterprise R2 SP1

     

    Clients:

    Windows XP SP2 (x6)

    Windows 2000 SP4 (x3)

     

    Jamie...

    Thursday, May 29, 2008 11:18 AM
  • Hi ,

    I have the same problems as you, with Dot1x on wired Ethernet + XP SP3 + Vista SP1.

    The big problem is the dynamic VLAN assignment (per user, which is different from the "machine" VLAN) combined with the application of the domain policies.

    The policies are truncated after the user logs on interactively ..... :-(

    I also opened a case ticket with Microsoft directly (my company of 1500 employees has Premier Support), but no solution has arrived.

     

    For now I must use PEAP with MSCHAPv2 authentication. I would like to use EAP-TLS with certificate authentication (first machine, then user), but the big problem with EAP-TLS is the auto-enrollment of the certificate when the user first logs on to a machine.

    Yes, I know that I should use smart cards, but that is not possible for 1500 employees .... :-(

    However, the problem with dynamic VLANs is the same for PEAP and EAP-TLS: the "********" Netlogon service that cannot WAIT.
     
    My Lab Kit:
     
    Cisco 3560 Switch
     
    Client XP SP3
    Client VISTA SP1
     
    AD Win2003
    ACS Cisco Solution Engine 4.1 - RADIUS Server
     
    Thanks Marcom22
     
     

     

    Thursday, May 29, 2008 4:53 PM
  • Hey everyone...

    I'm sad to say that the 802.1x implementation in Windows is awful.. but that's how it is.. A lot of people have already proved that with a bunch of traces on different layers (network, client and server based), and even Microsoft admits it in knowledge base articles like http://support.microsoft.com/kb/935638/en-us/ ...

    Let's face it again:

    - The Netlogon process is not aware of the 802.1x and DHCP client status.. it starts even before the EAP reauthentication (machine to user VLAN change) is done and the DHCP client has an IP on the new network...

    - The DHCP client wastes about 10 seconds after the VLAN change stubbornly trying to REQUEST the old IP address on the new VLAN instead of sending a DISCOVER for a new one right after the change.

    - The Windows Vista prelogon doesn't seem to work, as Daniel mentioned before.. Maybe it does when you deploy the policy with Windows Server 2008, but not if you set it locally via netsh!

    So what can we do? Wait until Vista SP4 or Windows 7 and hope that one day all services (802.1x/EAP, DHCP, Netlogon) are aware of each other and work properly together?!

    Or try to solve these problems using third-party products? We already tried that with Netscreen Odyssey and Cisco Secure Client... and yes, it works more often, but sometimes there are still problems where the supplicant doesn't wait for the whole process to finish (EAP reauth incl. VLAN change... getting a new IP... pinging the gateway) and lets Netlogon go long before that...

    Right now we are clueless and don't even see any reason to upgrade thousands of clients from XP to Vista anymore... (it would have been the first and only real reason for us to do that)...

    Cheers,

    Sascha


    Thursday, May 29, 2008 5:50 PM
  •  

    Hello Sascha

     

       It sounds like this thread is covering a lot of different things that may not necessarily be related.

     

    1.  XP's ability to transition between machine and user authenticated VLANs

    2.  XP's ability to use NAP which makes use of compliant vs non-compliant VLANs

    3.  SSO on a Vista wireless connection.

     

       So point 1, I think we have this covered, but if not, I have no problem revisiting this.  Unfortunately, this simply will not work on a consistent basis using the native components.

     

       Point 2 should not be an issue with XP because we are transitioning between a compliant and a non-compliant VLAN.  If the machine boots and is deemed healthy by NAP, it will transition to a compliant VLAN.  When the user logs in, if he too is deemed compliant, no VLAN switching will occur and the problems described previously will not happen.

     

       Point 3 is new.  Are you doing your VLAN switching on a wireless AP?  I have personally tested this on Vista with a wired connection and it does work.  Have you confirmed your XML profile?

     

     

       <OneX xmlns="http://www.microsoft.com/networking/OneX/v1">
        <singleSignOn>
         <type>prelogon</type>
         <maxDelay>20</maxDelay>
        </singleSignOn>
        <EAPConfig...</EAPConfig>
       </OneX>

    Thursday, May 29, 2008 9:14 PM
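
    Editor's sketch: one way to "confirm the XML profile" as asked above is to parse it and read back the single sign-on settings. The Python below does this with the standard library. The sample profile repeats only the singleSignOn part of the fragment in the post; the EAPConfig element is left out because the post elides its content. The function name is hypothetical, and no claim is made here about which spellings of the type value the schema accepts.

```python
import xml.etree.ElementTree as ET

# Namespace taken from the profile fragment quoted in the post.
NS = {"onex": "http://www.microsoft.com/networking/OneX/v1"}

# Only the singleSignOn part of the quoted fragment; EAPConfig is omitted
# because its content is elided in the original post.
PROFILE = """\
<OneX xmlns="http://www.microsoft.com/networking/OneX/v1">
 <singleSignOn>
  <type>prelogon</type>
  <maxDelay>20</maxDelay>
 </singleSignOn>
</OneX>
"""


def sso_settings(xml_text: str):
    """Return (type, maxDelay) from a OneX fragment, or None if SSO is absent."""
    root = ET.fromstring(xml_text)
    sso = root.find("onex:singleSignOn", NS)
    if sso is None:
        return None
    sso_type = sso.findtext("onex:type", default="", namespaces=NS).strip()
    delay_text = sso.findtext("onex:maxDelay", default="", namespaces=NS).strip()
    max_delay = int(delay_text) if delay_text else None
    return (sso_type, max_delay)


if __name__ == "__main__":
    print(sso_settings(PROFILE))
```

    A profile exported from the client could be fed to the same function to check that the singleSignOn block is actually present in what was deployed.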
  • Hi all,

    We use 802.1x and dynamic VLANs with XP a lot, and you can get around the problem of Netlogon not waiting for the VLAN switch by using a third-party supplicant which holds the logon service until the VLAN switch and the new DHCP lease have completed.

    We have found the Cisco Secure Services Client very useful for this.
    Version 4.1.2 is the best in our experience; a lot of the newer versions don't work.

    cheers,
    Tony
    • Proposed as answer by tjlavelle7 Tuesday, December 16, 2008 4:24 PM
    Tuesday, December 16, 2008 4:23 PM
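
    Editor's sketch: the "hold until the network is ready" behavior described above boils down to polling until a known server answers or a deadline passes. The Python below shows only that wait logic; it does not hook into Winlogon, Netlogon or any supplicant, and the function name, host and port in the usage line are assumptions for illustration.

```python
import socket
import time


def wait_until_reachable(host, port, timeout=30.0, interval=0.5):
    """Poll a TCP port until it accepts a connection or `timeout` seconds pass.

    Returns True as soon as one connection succeeds, False on timeout.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful connect means routing and addressing work again.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            pass  # not reachable yet (no address, wrong VLAN, refused, ...)
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


if __name__ == "__main__":
    # Hypothetical target: a domain controller's LDAP port.
    print(wait_until_reachable("dc.example.test", 389, timeout=5.0))
```

    A check like this at the start of a logon script could delay the script itself, but as noted earlier in the thread it cannot make profile loading or GPO processing wait.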
  • In our case authentication fails entirely:

    XP SP3 (with roaming profiles)
    NAP (w2008)
    w2003 DC

    Does anyone have a setup like this where roaming profiles work and clients can authenticate (I am not talking about GPO or script delays now)?
     
    Tuesday, December 23, 2008 11:48 AM
  • Hi tjlavelle7,

    I know it's been a long time since anybody wrote in this thread but maybe you're still reading this.

    How did you set up your dynamic VLANs with 802.1x and XP, if I may ask?

    We use Cisco hardware in our network, if that's helpful.

     

    Thanks a lot

     

    Best regards

    Thorsten

    Friday, August 26, 2011 6:51 AM
  • Is this still a problem in a Windows 2008 DC environment?
    Tuesday, April 3, 2012 10:29 AM