High CPU and Network usage on ADFS servers

  • Question

  • Hello,

    Recently we noticed that ADFS (Microsoft.IdentityServer.ServiceHost) is consuming a large amount of CPU (an average of about 60% of a dual-core 2.4 GHz VM). It is also very chatty with the DCs, exchanging a constant 50-60 Mbps of traffic with them.  This traffic appears to be LDAP related (port 389), but we don't see any errors or issues being reported on any of the servers.  No accounts have been locked out and there is no large spike of failed login attempts.

    This appears to be related to our Office 365 federation: if the federation is paused, processor and network usage drop to nothing.

    We have checked through all of the events on the farm and the DCs, and there are no repetitive errors that would seem to point to any issues.  Additionally, everything is working as expected and the logins are fast, with redirects to the Office 365 portal taking less than a second.

    Any suggestions?  Should we even be concerned with this?  It doesn't seem like ADFS used to take up this many resources but it doesn't seem to be hurting anything either.  I'm just curious as to what it is doing and why it is working so hard at it. 

    Thanks!

    Thursday, September 15, 2016 4:50 AM

Answers

  • So, this ended up being magically fixed by the latest November Windows Update roll up that came out. 

    Not sure exactly what was causing it still, but something in that update took care of it. 

    One thing to note though, the update had to be installed across all nodes in the farm before the resolution happened.  I had already updated and rebooted all the nodes but one in the cluster with the roll up and nothing changed until I updated the final node.  Once that update was applied (even before it was rebooted) the entire farm just calmed down.  Processors dropped from 90% (it had been getting worse) down to 2% across the board.
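
Since nothing changed for us until the last node was patched, it may help to confirm that no node in the farm was skipped. Below is a minimal Python sketch of that check. It assumes you have already collected the installed hotfix IDs per node yourself (for example from Get-HotFix output); the node names, the inventory dictionary, and the second KB number are made up for illustration. A later cumulative rollup may supersede this KB, so a node flagged here is not necessarily unpatched.

```python
# Hypothetical helper: find farm nodes that do not list the rollup.
# KB3197874 is the update named in this thread; everything else here
# (node names, the inventory format) is an assumption for illustration.
REQUIRED_KB = "KB3197874"


def nodes_missing_update(installed_by_node, required_kb=REQUIRED_KB):
    """Return the sorted names of nodes whose hotfix list lacks required_kb."""
    wanted = required_kb.strip().upper()
    return sorted(
        node
        for node, hotfixes in installed_by_node.items()
        if wanted not in {kb.strip().upper() for kb in hotfixes}
    )


if __name__ == "__main__":
    # Made-up inventory: node name -> hotfix IDs reported as installed.
    farm = {
        "adfs01": ["KB3197874", "KB0000001"],
        "adfs02": ["kb3197874"],
        "adfs03": ["KB0000001"],
    }
    missing = nodes_missing_update(farm)
    if missing:
        print("Nodes still missing " + REQUIRED_KB + ": " + ", ".join(missing))
    else:
        print("All nodes list " + REQUIRED_KB)
```

The comparison is case-insensitive and ignores surrounding whitespace, so inventories pasted from different tools should still match.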

    After rebooting everything the fix still seems to have stuck as all nodes in the farm are now behaving themselves.

    To anyone who experiences this, I wish I could give you something more specific but for our situation this update resolved the issue:

    https://support.microsoft.com/en-us/kb/3197874

    My guess is that it was the security bulletin mentioned here: https://technet.microsoft.com/library/security/ms16-137

    Of particular interest is this section:

    • Changing the way that LSASS handles specially crafted requests.
    • Correcting how Windows Virtual Secure Mode handles objects in memory.

    Could have been that a memory leak was fixed somewhere by this.

    • Marked as answer by BLSL Monday, November 21, 2016 7:41 AM
    Monday, November 21, 2016 7:39 AM

All replies

  • Can you show us the rules you have on the AD Claim Provider and the rules on the Relying party trusts?

    Note: Posts are provided “AS IS” without warranty of any kind, either expressed or implied, including but not limited to the implied warranties of merchantability and/or fitness for a particular purpose.

    Thursday, September 15, 2016 4:43 PM
  • Hello Pierre,

    I am attaching screenshots of the AD Claims provider trust rules and the Relying party trust claims rules below.  I believe this was all auto-configured during the Hybrid installation process, and none of these have been modified to my knowledge.

    AD Claims provider trust rules:

    Claims rules for Relying party trust:

    Please let me know if this is what you were looking for.

    Thanks!


    • Edited by BLSL Thursday, September 15, 2016 5:53 PM
    Thursday, September 15, 2016 5:50 PM
  • *bump*

    Any thoughts?

    Tuesday, September 20, 2016 6:01 PM
  • So I guess we are assuming that the perf issue is connected to LDAP because we see a lot of LDAP traffic.

    Well, are you using an alternate login ID with a list of forests for lookup? You can also check this Perfmon counter: \AD FS\Average Search Latency for Alternate Login ID

    You can also run some LDAP tracing... Give it a shot...
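
One way to look at the Perfmon counter mentioned above is to sample it with typeperf and summarize the CSV it writes. Here is a rough Python sketch of the summarizing part. The function names are hypothetical, the sample text is made up (server name, timestamps and values are not real measurements), and the exact layout of typeperf output on your system may differ, so treat it as a starting point only.

```python
import csv
import io
from statistics import mean

# The CSV could be captured on an ADFS node with something like:
#   typeperf "\AD FS\Average Search Latency for Alternate Login ID" -si 5 -sc 12
# (-si is the sample interval in seconds, -sc the number of samples).


def parse_typeperf_samples(csv_text):
    """Extract numeric samples from the second column of typeperf CSV text.

    The header row, blank lines and status messages are skipped because
    their second field is missing or is not a number.
    """
    samples = []
    for row in csv.reader(io.StringIO(csv_text)):
        if len(row) < 2:
            continue
        try:
            samples.append(float(row[1]))
        except ValueError:
            continue
    return samples


def summarize(samples):
    """Return count, mean and max of the samples, or None if there are none."""
    if not samples:
        return None
    return {"count": len(samples), "mean": mean(samples), "max": max(samples)}


# Made-up sample output for illustration only.
SAMPLE_OUTPUT = r'''
"(PDH-CSV 4.0)","\\ADFS01\AD FS\Average Search Latency for Alternate Login ID"
"09/20/2016 23:05:00.000","2.000000"
"09/20/2016 23:05:05.000","4.000000"
"09/20/2016 23:05:10.000"," "
"09/20/2016 23:05:15.000","6.000000"
Exiting, please wait...
The command completed successfully.
'''

if __name__ == "__main__":
    print(summarize(parse_typeperf_samples(SAMPLE_OUTPUT)))
```

If alternate login ID is not configured, the counter will probably just sit at zero, which at least rules that path out.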

    Tuesday, September 20, 2016 11:05 PM
  • Hi, I just noticed the same issue.

    I installed the latest updates and restarted the server twice, but CPU and network usage are still high.

    This also puts a high load on the DC servers.

    Thursday, February 23, 2017 9:42 AM
  • Hello Aas,

    What OS version are you running?  I am running Server 2012 R2 on all the systems in question.

    Have you verified that the latest updates are pulled from Microsoft and not a local WSUS server?

    Thanks!

    Monday, March 6, 2017 10:07 PM
  • Also running Server 2012 R2, with the latest updates downloaded and installed from Microsoft, not a local WSUS.

    Tuesday, March 7, 2017 8:34 AM