SCE 2010 - Some Agents Could Not Be Authenticated & Rejected

    Question

  • I have a Windows 2008 R2 Server with SCE 2010 installed.  Things appear to be running normally for the most part.

    Approximately 250 clients with a mixed environment of XP SP3 and Windows 7.

    Some Windows 7 clients (SCE agents) fail to connect to the SCE server. So far, the problem clients are all Windows 7 laptop computers (Dell Latitude E4200 or E6420 models). Not all of the laptops are affected; on many more, the agent installs normally and functions properly with SCE 2010.  It makes no difference whether the laptop is docked or undocked and connected directly to the network, and a clean Windows installation does not guarantee success.

    I can run the discovery successfully, and SCE reports the agent is installed successfully. However, in these problem cases, the computer appears stuck in the "Pending Management" section of the Administration Console, despite the agent being partially or completely installed on the client.

    On the client, Event 20070 "The Connection was closed immediately after authentication" and Event 21016 "OpsMgr was unable to set up a communications channel to server.domain.net" are logged in the  Operations Manager Event Log.

    On the server, event 20002 "A device at IP xxx.xxx.xxx.xxx attempted to connect but could not be authenticated, and was rejected" is logged.

    DNS entries appear normal for the clients in question and the server. The client and server can resolve one another.  Group Policy is applied and up to date on the problem machines as well.  Firewall is off, as it is with all other clients.

    I have viewed 3 or 4 related topics/threads, none of which seem directly applicable. Some of the links in past threads are no longer valid, so I could not follow every suggestion. I cannot figure out why these clients/agents cannot be authenticated, and are being rejected.

    Friday, March 30, 2012 7:08 PM

Answers

  • I install the operating system, join it to our domain, ensure the computer account is in the proper AD containers and receiving the proper Group Policies. Then I run SCE discovery, and install the agent.  For the computers where it does not work, the "error" or "result" is the same one I have described throughout this thread.

    All of that said, I think I have figured out the problem.  I believe the problem may be with a third-party application - the Cisco VPN Client. During installation, it disrupts network service momentarily. Once installed, it creates a virtual LAN adapter which is in a disabled state when the software is not in use. I have also run into instances where this VPN Client application becomes corrupt and impedes Windows' network functionality.   During all of my rounds of testing, the SCE Agent installs normally on all systems without the Cisco VPN Client. In all of my cases where the agent hangs in a "Pending" state and/or reports those authentication errors (described in previous posts of this thread), the Cisco VPN Client is present.

    I re-installed Windows several times and added the Cisco VPN Client at different points. For example: first attempt to install the agent, then install the VPN Client software; install the VPN Client software first, then discover and install the SCE Agent; install & uninstall the VPN Client, then the SCE Agent, and so on. In these trials, sometimes the agent returns those authentication errors (as per above posts), and sometimes it installs and communicates properly.  I have not determined whether this is a timing issue (i.e. when the Cisco Client is installed in relation to the SCE Agent being installed or the computer discovered), or if it is a matter of some random, occasional corruption as a result of the Cisco VPN Client installation.

    Since I've been able to isolate the problem in this instance - apparently related to a third-party application - and have found a method or order of operations where I get consistent, complete installations of the SCE Agent, I consider the matter resolved, for now.

    Thanks to everyone who offered suggestions and help, much appreciated.

    Friday, May 25, 2012 7:12 PM

All replies

  • Hello,

    This seems to be a common issue that has been discussed several times on this forum. You can refer to these threads to see whether the steps help:

    A device which is not part of this management group has attempted to access this Health Service.
    http://social.technet.microsoft.com/Forums/en-US/systemcenter/thread/a7a21192-92f2-477f-a268-e1c355df0e98

    Agent installed but still pending 
    http://social.technet.microsoft.com/Forums/en-US/systemcenterdeployment/thread/e8d5391a-579b-4a46-9128-9ee2f3c2c71d

    SCE Agent installs but still pending in console 
    http://social.technet.microsoft.com/Forums/en-US/systemcenter/thread/393c450d-8899-4de8-b360-75ce2d0a9afd

    SCE 2010 Clients stuck in Pending Management - Unable to Approve - Licensing Issue? 
    http://social.technet.microsoft.com/Forums/en-US/systemcenteressentials/thread/b33dbd8e-34d4-4cd4-8321-35f994941667

    If the issue still exists, please tell us what you have tried and what the results were.

    Thanks,


    Yog Li

    TechNet Community Support

    Monday, April 02, 2012 9:27 AM
    Moderator
  • Thanks for the reply.

    Link #1 does not appear to apply to my environment. Neither the client nor the server is logging the errors/events mentioned in that post, and it appears it was meant for SCOM 2007, not SCE 2010.  However, I attempted to run the script to 'check' our SCE 2010 DB, but the script returns a text error and does not run properly, at least not in our environment.

    Link #2: the suggestions did not prove helpful for my environment. Firewalls are off on both the client and server, and DNS records and tests all check out normally.

    Link #3 does not apply to my environment; DNS appears to be working normally. Resolution of the clients and server works on both ends, returning the proper results. Proper A and AAAA records are recorded on the DNS server, as well as proper entries in the reverse lookup zone.

    Link #4: the suggestions did not prove effective, either. Some of the laptops in question were fresh Windows installations, never having had an SCE agent on them previously.  I tried installing the agent manually, copying the certificates from the server, ensuring the registry entries were correct, the Health Service was running, etc. The agents install OK, but they do not show up on the server (the admin console), and if I try to add them from the console, I get the same symptoms: the discovery is a success, it reports the agent installation is a success, but the computer is stuck in "Pending Management" and I only have the option to reject it.

    To Recap:

    - DNS looks OK.

    - Firewalls are off.

    - The only computers I am having these problems with are laptop models. It makes no difference whether they are docked and connected to the network, or are connected with their on-board NIC. Approximately 5 laptops out of 28 are having this problem. None of my desktop models experience this (200 desktop clients).

    - The problem computers are members of the SCE Managed Computers group.

    - Group Policies are all being applied properly.

    - Previous agent installations do not appear to be a problem, since I am experiencing the same behaviour even with a clean Windows installation "out of the box."

    - I have installed agents manually and copied the certificates from the server to the appropriate location on the client. I have uninstalled agents installed by the server, deleted relevant registry entries, and re-installed agents manually, to no effect.
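
    The DNS item in this recap (forward and reverse records agreeing) can be spot-checked with a short script. This is only a minimal sketch, not something from the thread: the function name is made up for illustration, it covers IPv4 (A/PTR) only, and it uses the resolver of whatever machine it runs on, so it should be run on both the client and the server.

```python
import socket


def forward_reverse_check(hostname,
                          forward=socket.gethostbyname_ex,
                          reverse=socket.gethostbyaddr):
    """Resolve hostname to IPv4 addresses, then reverse-resolve each one.

    Returns a list of (address, reverse_name, matches) tuples, where
    matches is True when the PTR name equals the name that was queried
    (case-insensitive, ignoring a trailing dot).
    The forward/reverse parameters exist only so the lookups can be
    swapped out; by default the standard socket functions are used.
    """
    _, _, addresses = forward(hostname)
    wanted = hostname.lower().rstrip(".")
    results = []
    for address in addresses:
        try:
            reverse_name = reverse(address)[0]
        except (socket.herror, socket.gaierror):
            # No PTR record (or lookup failure) for this address.
            reverse_name = None
        matches = (reverse_name is not None
                   and reverse_name.lower().rstrip(".") == wanted)
        results.append((address, reverse_name, matches))
    return results
```

    For example, forward_reverse_check("server.domain.net") run from a problem laptop, and the laptop's own FQDN checked from the server, should both report matches as True if the records are as described above.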

    In addition to the above posts, I have also tried the suggestions in the following link:

    http://social.technet.microsoft.com/Forums/en-US/systemcenterdeployment/thread/b28b1c29-a3bc-4c84-8c55-208492231cdd/


    • Edited by Inside Edge Tuesday, April 03, 2012 4:19 PM
    Tuesday, April 03, 2012 4:18 PM
  • Hi Inside Edge,

    I see that you have covered a lot of items already. That's good because it helps us give you more specific feedback. The root cause of this problem is usually either connectivity or Kerberos authentication. You told Yog Li that DNS resolution works. Some questions:

     - What is the DNS suffix of a working computer vs. a broken one?

     - Are you able to telnet fqdn:5723 to the SCE server from either a broken or a working computer?

     - Did you check that the SPNs for the SCE server are registered properly? You can use http://blogs.technet.com/b/jonathanalmquist/archive/2008/08/14/operations-manager-2007-spn-s.aspx as a reference. This article is for OpsMgr 2007, but it should apply to SCE.

     - You can also check for duplicate SPNs yourself by running ldifde -m -f output.ldf, then opening the output.ldf file with Notepad and checking that the SPNs are correct, i.e. as in the link above:

    servicePrincipalName: MSOMSdkSvc/<RMS fqdn>
    servicePrincipalName: MSOMSdkSvc/<RMS netbios name>
    servicePrincipalName: MSOMHSvc/<RMS fqdn>
    servicePrincipalName: MSOMHSvc/<RMS netbios name>

    Please let me know if you have any questions.

    Thanks

    Jose

    Tuesday, April 10, 2012 6:36 PM
  • Hi Jose,

    Thanks for the response.

    - the dns suffix of a working computer matches that of a non-working computer (domainname.net)

    - I am unable to telnet to the SCE server from either a working computer or a non-working one. I used the syntax:  telnet server.domain.net:5723

    The command prompt returns the failure:  Connecting To server.domain.net:5723...Could not open connection to the host, on port 23: Connect failed

    What exactly am I looking for in the SPN output?  I ran the SPN check (SETSPN -L servername) on clients that work and do not work, and got the same results, which appear correct. When I use the ldifde command, there is a lot of information in the output file. I searched the content for MSOMH, and only found entries referencing the proper (the only) SCE server.

    Thanks again,


    • Edited by Inside Edge Tuesday, April 10, 2012 9:09 PM
    Tuesday, April 10, 2012 9:09 PM
  • When you have a chance, please copy/paste the output of setspn.

    Now, regarding telnet: the SCE server should be listening on port 5723. The syntax you need to use is

    telnet fqdn 5723

    Also, you may want to try locally from the SCE server (as a quick test; this one should be OK):

    telnet localhost 5723

    As I've said before, it looks like the agent is unable to communicate properly. That usually comes down to one of two things: connectivity or Kerberos. It may be something else, though, but let's make sure this is covered and we'll take it from there.
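
    Where the telnet client is not installed, the same connectivity test can be approximated with Python's standard socket module. This is a minimal sketch, not part of SCE: the function name is made up, and a successful connect only shows that something is listening on the port. It says nothing about whether authentication will succeed.

```python
import socket


def port_is_open(host, port=5723, timeout=3.0):
    """Return True if a TCP connection to host:port can be established.

    5723 is the port mentioned in this thread for the SCE server.
    """
    try:
        # create_connection resolves the name and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False
```

    Calling port_is_open("server.domain.net") from a working client, a problem client and the server itself gives the same three data points as the telnet tests above.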

    Thanks

    Jose.

    Wednesday, April 11, 2012 1:47 PM
  • Ok, here are the results.

    The server appears to be listening on port 5723 normally. Telnet from a working and non-working computer connects, with the same result as a telnet localhost from the SCE server itself.

    The SETSPN -L output is as follows. It is the same on a working or non-working computer. I have replaced our local domain name with the generic term "domain" for the sake of privacy:

    C:\Users\administrator>SETSPN -L HSDEPLOY
    Registered ServicePrincipalNames for CN=HSDEPLOY,OU=Servers,DC=domain,DC=net:
            MSOMHSvc/HSDEPLOY.domain.net
            MSOMHSvc/HSDEPLOY
            MSOMSdkSvc/HSDEPLOY.domain.net
            MSOMSdkSvc/HSDEPLOY
            MSSQLSvc/HSDEPLOY.domain.net:ESSENTIALS
            MSSQLSvc/HSDEPLOY.domain.net:51608
            TERMSRV/HSDEPLOY.domain.net
            TERMSRV/HSDEPLOY
            WSMAN/HSDEPLOY.domain.net
            WSMAN/HSDEPLOY
            RestrictedKrbHost/HSDEPLOY
            HOST/HSDEPLOY
            RestrictedKrbHost/HSDEPLOY.domain.net
            HOST/HSDEPLOY.domain.net
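
    To answer "what am I looking for" mechanically, the listing can be compared against the four SPNs Jose quoted earlier. The following is only a sketch under assumptions: the function name is hypothetical, and it expects the text of a SETSPN -L listing like the one above to have been captured into a string.

```python
def missing_opsmgr_spns(setspn_output, netbios, fqdn):
    """Return the expected OpsMgr SPNs that do not appear in the listing.

    The expected set is the four entries quoted earlier in the thread:
    MSOMHSvc and MSOMSdkSvc, each for the FQDN and the NetBIOS name.
    """
    # One SPN per line in the listing; compare case-insensitively.
    registered = {line.strip().lower() for line in setspn_output.splitlines()}
    expected = [
        f"{service}/{name}"
        for service in ("MSOMHSvc", "MSOMSdkSvc")
        for name in (fqdn, netbios)
    ]
    return [spn for spn in expected if spn.lower() not in registered]
```

    With the listing above, missing_opsmgr_spns(text, "HSDEPLOY", "HSDEPLOY.domain.net") would be expected to return an empty list, since all four entries are present.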

    Thanks again for your help so far.

    Thursday, April 12, 2012 12:30 AM
  • To add, in case anyone might be able to help, it really does appear to be an authentication issue.  The problem computers all show up in the SCE 2010 Server's Operations Manager Event log with Event 20002, "A device at IP [ip address] attempted to connect but could not be authenticated, and was rejected."

    I can re-install Windows (from scratch) on these computers, and it appears to be random whether the agent will install properly (install, authenticate, and communicate fully) with the SCE server or not.  I have performed 4 from-scratch installations of Windows 7 on one of the computers that has this issue. Two out of the 4 times, the agent did not install correctly.  I have been quite methodical and focused about the testing, and as far as I can tell I am following the same procedure and flow of events (install Windows, update, join to the domain, ensure the account is prestaged in AD, etc).  I do not install any 3rd party applications; I'm just working with a basic Windows 7 installation, fully updated, and so far it's 50/50.  When it doesn't work, the symptoms are the same as outlined in this thread:  the computer is discovered, the agent is installed (all the certs, files, and services are present and running on the client), but on the SCE 2010 server, it hangs in the "Pending" category of the Administration section, and the server logs event 20002 (and the client logs event 20070).
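
    To see at a glance which machines the server is rejecting, the event 20002 messages can be tallied by address. This is a hedged sketch only: it assumes the message texts have already been exported from the Operations Manager event log into a list of strings (how they are exported is left open), the function name is hypothetical, and the pattern handles IPv4 addresses with an optional ":port" suffix.

```python
import re
from collections import Counter

# Matches the event 20002 wording quoted in this thread.
REJECT_RE = re.compile(
    r"A device at IP (\d{1,3}(?:\.\d{1,3}){3})(?::\d+)? attempted to connect "
    r"but could not be authenticated"
)


def rejected_ips(messages):
    """Count rejection messages per source IP address."""
    counts = Counter()
    for message in messages:
        match = REJECT_RE.search(message)
        if match:
            counts[match.group(1)] += 1
    return counts
```

    Mapping the resulting addresses back to computer names would show whether the rejected devices are exactly the laptops stuck in "Pending".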

    Friday, April 20, 2012 3:54 PM
  • What procedure do you use to add/remove computers? What do you mean when you say the agents did not install properly? Do you get an error message?

    Now, regarding the event 20002 you got: it tells me that the SCE server "doesn't know" about that object, so it drops any data coming from it. However, the object does not know this either, so you may see events like 20070, 21016, etc. I don't think that reinstalling the Windows 7 computer is going to make any difference from an SCE standpoint. Also, why are you mentioning certs? Are you using certificate-based authentication instead of Kerberos? Did you compare the DNS configuration between a working computer and one that is not working?

    What I would do at this point is take a step back and check the overall health of the SCE server. Make sure it builds config fine (look for event 1210 and, in the SCE Health Service State folder, check that the updated XML file is generated properly).

    Also, I would focus on fixing one agent at a time rather than trying multiple installations. Why? Because SCE works like SCOM, communicating back and forth between the SCE server and the agent and keeping track of config. If too many changes are made at once, you may be unable to keep track of them.

    Please let me know if you have any questions.

    Thanks!

    Jose

    Thursday, May 10, 2012 6:03 AM