SCOM Agent crashing

  • Question

  • Hi,

    We have an issue on one Windows Server 2008 R2 SP1 server where the SCOM agent keeps crashing with Event ID 4000 and the error message: "A monitoring host is unresponsive or has crashed.  The status code for the host failure was 2164195371". Event ID 1103 is also logged, reporting that rules/monitors failed and were unloaded. I uninstalled and reinstalled the SCOM agent, but the issue persists. Could someone please advise how to fix it?

    Regards,

    Daya Ram

    Wednesday, August 24, 2016 10:51 AM

All replies

  • Do you see a .NET crash in the Application log? I think it would be an event 1023 or something like that. This could be an agent that is overloaded with workflow data, so it runs out of memory and crashes. However, I do remember getting a hotfix for an issue on W2K8 where the agent was crashing because of how SCOM was collecting performance data. It was a while ago, so I forget exactly what it was, but I would see 4000 events, along with 1023s from .NET, in my Application log.

    Regards, Blake

    Wednesday, August 24, 2016 3:56 PM
  • First of all, check whether all applicable fixes are installed (including the latest OpsMgr UR): https://blogs.technet.microsoft.com/kevinholman/2009/01/27/which-hotfixes-should-i-apply/


    http://OpsMgr.ru/

    Thursday, August 25, 2016 5:53 AM
  • 1) Update to the latest CU.

    2) Also provide any related logs with their detailed descriptions.

    Roger

    Thursday, August 25, 2016 7:32 AM
  • Hi Blake,

    I didn't see any Event ID 1023 in the Application log. Event IDs 1103 and 4000 are being generated in the Operations Manager log with the descriptions below:

    Event ID 1103: "Summary: 223 rule(s)/monitor(s) failed and got unloaded, 223 of them reached the failure limit that prevents automatic reload. Management group "SCOM_AEGIS_PROD". This is summary only event, please see other events with descriptions of unloaded rule(s)/monitor(s)."

    Event ID 4000: "A monitoring host is unresponsive or has crashed.  The status code for the host failure was 2164195371."

    Thursday, August 25, 2016 8:45 AM
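
A side note on the status code quoted above: host-failure codes like this are often easier to search for in hexadecimal than in decimal. The snippet below is only a minimal sketch, assuming the value in Event ID 4000 is an unsigned 32-bit code; the helper name `to_hex_status` is hypothetical and not part of any SCOM tooling.

```python
def to_hex_status(code: int) -> str:
    """Format a decimal status code as an unsigned 32-bit hex string.

    Assumption: the code from the event description is a 32-bit value.
    """
    # Mask to 32 bits so a signed (negative) rendering of the same code
    # produces the same hex string as the unsigned one.
    return f"0x{code & 0xFFFFFFFF:08X}"


# Status code taken from the Event ID 4000 description in this thread.
print(to_hex_status(2164195371))  # 0x80FF002B
```

Searching for the hex form alongside the decimal one may turn up additional reports of the same failure.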
  • Hi Sir,

    Have you checked the following article, which covers Event ID 1103/4000 in its scenario 2?

    http://michaelmyhre.blogspot.sg/2014/07/monitoringhostexe-crashing-after.html

    Best Regards,

    Elton



    Friday, August 26, 2016 6:54 AM