Microsoft.SystemCenter.Agent.MaintenanceMode rule error

  • Question

  • We are seeing a huge increase of errors on monitored servers in the Operations Manager log for the rule Microsoft.SystemCenter.Agent.MaintenanceMode

    Event ID 1102

    Error: Rule/Monitor "Microsoft.SystemCenter.Agent.MaintenanceMode" running for instance "Servername" with id:"{4E332338-EC16-ED43-60DD-929682495E1F}" cannot be initialized and will not be loaded. Management group "Management Group Name"

    This happens occasionally, at different times and on different servers.

    Anyone else seeing this?


    • Edited by sweprs Thursday, January 23, 2020 8:22 AM mistype
    Thursday, January 23, 2020 8:22 AM

Answers

  • So Microsoft support found the solution.

    It was the old Override set for SCOM 1801/1807 enabled for the rule Agent Initiated Maintenance Mode that was the problem.

    Related thread: "Is Start-SCOMAgentMaintenanceMode disabled in 1801?"

    Despite removing the override in the GUI, it was still present in the override XML file. We had changed IntervalSeconds on the rule, and this parameter does not exist on the new rule in SCOM 2019. Hence the rule didn't load.

    After we removed the override from the XML file and imported the file again without it, everything started to work.
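
    For anyone wanting to check an exported override file for the same leftover, here is a minimal Python sketch. It assumes the file stores overrides as elements carrying a Rule attribute (possibly alias-qualified, such as "SC!...") and a Parameter attribute, as in a RuleConfigurationOverride element. The sample XML, its IDs, and the function name are made up for illustration and are not taken from our actual file.

```python
import xml.etree.ElementTree as ET

RULE_ID = "Microsoft.SystemCenter.Agent.MaintenanceMode"

def find_stale_overrides(xml_text, rule_id=RULE_ID, parameter="IntervalSeconds"):
    """Return the ID of every element that overrides `parameter` on `rule_id`."""
    root = ET.fromstring(xml_text)
    hits = []
    for elem in root.iter():
        rule_ref = elem.get("Rule", "")
        # Assumption: rule references may carry an alias prefix, e.g. "SC!<rule id>".
        if rule_ref.split("!")[-1] == rule_id and elem.get("Parameter") == parameter:
            hits.append(elem.get("ID"))
    return hits

# Hypothetical override file content, for illustration only.
SAMPLE = """<ManagementPack>
  <Monitoring>
    <Overrides>
      <RuleConfigurationOverride ID="Example.Override.Interval" Context="Example!Context"
          Rule="SC!Microsoft.SystemCenter.Agent.MaintenanceMode" Parameter="IntervalSeconds">
        <Value>600</Value>
      </RuleConfigurationOverride>
      <RulePropertyOverride ID="Example.Override.Enabled" Context="Example!Context"
          Rule="SC!Microsoft.SystemCenter.Agent.MaintenanceMode" Property="Enabled">
        <Value>true</Value>
      </RulePropertyOverride>
    </Overrides>
  </Monitoring>
</ManagementPack>"""

print(find_stale_overrides(SAMPLE))
```

    Only the IntervalSeconds override is reported. The Enabled override is left alone because it sets a property rather than a configuration parameter.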

    • Marked as answer by sweprs Wednesday, February 5, 2020 12:07 PM
    Wednesday, February 5, 2020 12:07 PM

All replies

  • Make sure that your agent's default action account has the following minimum privileges:

    •  Member of the local Users group
    •  Member of the local Performance Monitor Users group
    • “Allow log on locally” permission (SeInteractiveLogonRight)

      Roger
    Thursday, January 23, 2020 9:15 AM
  • Hi,

    I would also check what happens if the agent cache of the affected system is emptied and recreated.

    Regards,


    Blog: https://blog.pohn.ch/ Twitter: @StoyanChalakov

    Thursday, January 23, 2020 9:24 AM
    Moderator
  • Could be that the server is overloaded. 

    Just curious if anyone else had this behaviour? Or any alerts at all?


    Thursday, January 23, 2020 9:36 AM
  • Hi,

    I can't remember having seen this exact error. This is the Agent Initiated Maintenance Mode Rule, which runs every 5 minutes by default on the agent computers and detects agent-initiated maintenance mode requests.

    If you think the server might be overloaded, do you receive any other errors or similar errors?

    How big is your SCOM environment?

    How many monitored agents, and SCOM management servers do you have?

    Best regards,
    Leon


    Blog: https://thesystemcenterblog.com

    Thursday, January 23, 2020 9:55 AM
  • Actually, I dug some more into this, and I was also alerted by my server team.

    It seems like the cmdlet Start-SCOMAgentMaintenanceMode no longer sets the registry value on the server when running SCOM 2019 agents.

    I just tried with an old 1807 agent that hasn't been updated yet, and the value is written to the registry.

    Can anyone try the same on a SCOM 2019 agent to see whether you get the same result?

    Thursday, January 23, 2020 11:50 AM
  • That's because the method changed with SCOM 2019. It now generates an event (ID 19999 in the OperationsManager event log) instead of setting a registry key:

    Maintenance mode can be enabled directly from the monitored Windows computer by a server administrator using the PowerShell cmdlet Start-SCOMAgentMaintenanceMode. When server administrator or operator runs this PowerShell cmdlet on the computer, the command logs an event, which stores arguments for the maintenance mode, such as duration, reason, comment, and information like time of invocation of cmdlet.

    https://docs.microsoft.com/en-us/system-center/scom/manage-maintenance-mode-overview?view=sc-om-2019#enable-from-target-system-1

    Coincidentally, the rule Microsoft.SystemCenter.Agent.MaintenanceMode changed so it now traps that event instead of running a script that looks inside the registry.

    Also, the older version of the rule relied on System.PrivilegedMonitoringAccount RunAs profile to access the registry while the new one doesn't need a RunAs, since it just reads the event log.

    So maybe you still have the old version of the rule cached on some agents, and it tries to use the RunAs profile but fails for some reason (the account is no longer available, or it is not configured with "Log on as a service", which is required for RunAs accounts starting with SCOM 2019...).

    Did you simply try clearing agent cache?

     

    Alternatively, you can have a look at Kevin Holman's solution, even if you don't need the SCCM part: https://kevinholman.com/2019/07/28/scom-agent-initiated-maintenance-mode-with-sccm-maintenance-windows/



    Thursday, January 23, 2020 12:44 PM
  • This has changed indeed in SCOM 2019, you'll notice the changes here:

    SCOM 1807
    https://docs.microsoft.com/en-us/system-center/scom/manage-maintenance-mode-overview?view=sc-om-1807#enable-from-target-system

    SCOM 2019
    https://docs.microsoft.com/en-us/system-center/scom/manage-maintenance-mode-overview?view=sc-om-2019#enable-from-target-system

    As CyrAz said, try clearing the cache first to see if that helps. If not, you could try a repair install of the agent.



    Thursday, January 23, 2020 12:51 PM
  • Totally missed this one :-)

    I tried clearing the cache and repairing the agent. Still no luck. I can see the 19999 event in the System log, but it doesn't put the server into maintenance mode.

    According to the documentation:

    To confirm that maintenance mode request is successful, look in the Operations Manager event log for event ID 2222 followed by an event with event ID 2223. In case event ID 2223 is not available, submit the maintenance mode request again.

    I can see the Event ID 2222 written to the Management server but not 2223.
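
    The documented check (a 2222 followed by a 2223) can be written down as a small Python sketch. It assumes you have already exported the relevant event IDs from the Operations Manager log in chronological order; the function name is made up for illustration.

```python
def request_confirmed(event_ids):
    """Return True if a 2222 event is later followed by a 2223 event.

    `event_ids` is assumed to be in chronological order.
    """
    seen_2222 = False
    for event_id in event_ids:
        if event_id == 2222:
            seen_2222 = True
        elif event_id == 2223 and seen_2222:
            return True
    return False

# Only 2222 events, as in the case described here: not confirmed.
print(request_confirmed([2222, 2222, 2222]))
# A 2222 followed by a 2223: confirmed.
print(request_confirmed([2222, 2223]))
```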

    Thursday, January 23, 2020 1:20 PM
  • Have you enabled the rule Agent Initiated Maintenance Mode Rule by creating an override?

    Similar post:
    Is Start-SCOMAgentMaintenanceMode disabled in 1801?



    Thursday, January 23, 2020 1:35 PM
  • The documentation is a bit misleading here.

    What really happens is that when Microsoft.SystemCenter.Agent.MaintenanceMode runs properly, it collects the 19999 events and stores them in the SCOM database and data warehouse.

    Then another rule, Microsoft.SystemCenter.Agent.MaintenanceMode.Trigger.Rule, retrieves all events stored by Microsoft.SystemCenter.Agent.MaintenanceMode in the past 240 seconds (by default). That rule generates an event 2222 whenever it runs, but that doesn't mean it found any event.

    If it finds events, it will initiate Maintenance Mode and generate events 2223.

    Since Microsoft.SystemCenter.Agent.MaintenanceMode is failing in your environment, no events are stored in DB/DW so Microsoft.SystemCenter.Agent.MaintenanceMode.Trigger.Rule doesn't find anything to process when it runs, which explains why you see events 2222 but not 2223.
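
    To make the two-stage flow concrete, here is a toy Python model of it. This is only a sketch of the logic described above, not the actual rule implementation: the Store class, the function names, and the choice to log a single 2223 per run are simplifying assumptions.

```python
from dataclasses import dataclass, field

LOOKBACK_SECONDS = 240  # default lookback window mentioned above

@dataclass
class Store:
    """Stand-in for the SCOM database: timestamps of collected 19999 events."""
    collected: list = field(default_factory=list)

def collector_rule(store, event_times, rule_loaded):
    """Model of Microsoft.SystemCenter.Agent.MaintenanceMode.

    Stores 19999 events only if the rule actually loaded on the agent.
    """
    if rule_loaded:
        store.collected.extend(event_times)

def trigger_rule(store, now):
    """Model of Microsoft.SystemCenter.Agent.MaintenanceMode.Trigger.Rule.

    Always logs 2222 when it runs; logs 2223 only if a stored event
    falls inside the lookback window (simplified to one 2223 per run).
    """
    logged = [2222]
    recent = [t for t in store.collected if 0 <= now - t <= LOOKBACK_SECONDS]
    if recent:
        logged.append(2223)
    return logged

healthy = Store()
collector_rule(healthy, [100], rule_loaded=True)
print(trigger_rule(healthy, now=200))

broken = Store()
collector_rule(broken, [100], rule_loaded=False)  # the rule failed to load
print(trigger_rule(broken, now=200))
```

    In the second case the collector never stored anything, so the trigger rule logs 2222 alone, which matches what is being observed here.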

    So back to square one: let's try to find out why Microsoft.SystemCenter.Agent.MaintenanceMode is failing...



    • Edited by CyrAz Thursday, January 23, 2020 1:45 PM
    Thursday, January 23, 2020 1:44 PM
  • The rule is enabled by default with the target Agent.

    We did have it overridden before against the Windows Computer class, but I believe the rule was off by default back then. I will try to enable it against the All Windows Computers class according to the article.

    Thursday, January 23, 2020 2:22 PM
  • I enabled logging on the  Microsoft.SystemCenter.Agent.MaintenanceMode.Trigger.Rule but I only get:

     [ReadAgentMaintenanceModeSystemEvents.ps1] : Number of Events to process: 0

    Thursday, January 23, 2020 2:39 PM
  • Yep, I explained that above : 

    "Since Microsoft.SystemCenter.Agent.MaintenanceMode is failing in your environment, no events are stored in DB/DW so Microsoft.SystemCenter.Agent.MaintenanceMode.Trigger.Rule doesn't find anything to process when it runs"

    Thursday, January 23, 2020 2:49 PM
  • But still I get :-(

    Rule/Monitor "Microsoft.SystemCenter.Agent.MaintenanceMode" running for instance "XXXXXXXXXX" with id:"{68B2440A-7587-9A6A-4F7A-AD715E5A38FB}" cannot be initialized and will not be loaded. Management group "XXXXXXXXXXXXXX"

    Thursday, January 23, 2020 2:51 PM
  • What operating system is this agent running?

    And you are running SCOM 2019, correct? Was this agent upgraded from your previous SCOM version?



    Thursday, January 23, 2020 2:52 PM
  • Tested on two servers. Both Win 2016

    The agents were upgraded from 1807.

    Maybe the next step is to open a MS case...

    Thursday, January 23, 2020 2:58 PM
  • Before doing so, also try a complete uninstall of the SCOM agent, and remove any folders and registry entries left behind (if there are any).


    Thursday, January 23, 2020 3:00 PM
  • Same thing unfortunately.
    Thursday, January 23, 2020 3:16 PM
  • Is it the only rule failing?

    No other error in the logs? Especially right after the agent (service) starts

    You could try collecting an agent diagnostic trace : https://support.microsoft.com/en-us/help/942864/how-to-use-diagnostic-tracing-in-system-center-operations-manager-2007


    • Edited by CyrAz Thursday, January 23, 2020 3:28 PM
    Thursday, January 23, 2020 3:27 PM
  • Hi,
     
    After going through the thread, I see that most of the steps have been tried but the issue is still not resolved. It seems that this issue is a tough one. I suggest opening a support case to work on it.
     
    Best regards.
    Crystal

    Please remember to mark the replies as answers if they help.
    If you have feedback for TechNet Subscriber Support, contact tnmff@microsoft.com.

    Friday, January 24, 2020 6:54 AM
  • I will open a case and will try to update this thread if we find a solution
    Friday, January 24, 2020 8:29 AM
  • Hi,

    Thanks for your understanding. We look forward to your update, which will help more people.

    Have a nice day!

    Best regards.
    Crystal



    Monday, January 27, 2020 5:22 AM
  • Hi,
     
    Thanks for sharing. I am glad to hear that the issue is resolved. To make the thread easier to read, here is a summary:
     
    Issue Definition:
    ===============
    A huge increase of errors on monitored servers in the Operations Manager log for the rule Microsoft.SystemCenter.Agent.MaintenanceMode
    Event ID 1102
    Error: Rule/Monitor "Microsoft.SystemCenter.Agent.MaintenanceMode" running for instance "Servername" with id:"{4E332338-EC16-ED43-60DD-929682495E1F}" cannot be initialized and will not be loaded. Management group "Management Group Name"
    This happens occasionally, at different times and on different servers.
     
    Cause:
    ==========
    It was the old Override set for SCOM 1801/1807 enabled for the rule Agent Initiated Maintenance Mode that was the problem.
     
    Resolution:
    ===============
    Despite removing the override in the GUI, it was still present in the override XML file. IntervalSeconds had been changed on the rule, and this parameter does not exist on the new rule in SCOM 2019. Hence the rule didn't load. After the override was removed from the XML file and the file was imported again without it, everything started to work.
     
    Thanks for your time and have a nice day.
     
    Best regards.
    Crystal


    Thursday, February 6, 2020 3:20 AM