How To Monitor SCORCH Runbook Failures Using SCOM

  • Question

  • Hi All,

    How can we monitor runbook failures using SCOM? I am aware of the Orchestrator MP, but it monitors only the runbook services, not the runbooks as such. Is there an MP to monitor the runbooks and their activities?

    Kindly advise.


    raj

    Monday, January 28, 2019 3:40 PM

Answers

  • The issue with this SQL approach is that an Orchestrator runbook can very well "successfully fail": the runbook state can be green even if one or more activities did not run as expected but did not actually crash.

    So an approach similar to the one Leon describes is better, since it lets you decide which conditions should trigger an alert.

    The one I used for one of my customers is actually a mix of these solutions:

    Instead of using the event log, it relies on a database. I created an "error handling runbook" that is called by every other runbook and creates a row in the database with the runbook ID, status (success/error), error message and whatever else can be useful.

    Then SCOM queries this database in a way very similar to what is done in Xin's link.
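
    As an illustration of the logging step described above, here is a minimal sketch of how an "error handling runbook" might build the insert from a Run .Net Script activity. The table name dbo.RunbookLog, its columns, and the function name New-RunbookLogCommand are assumptions made for this example; they are not taken from the original setup.

```powershell
# Sketch only. Hypothetical table:
#   dbo.RunbookLog (RunbookId, Status, ErrorMessage, LoggedAt)
function New-RunbookLogCommand {
    param(
        [Parameter(Mandatory = $true)][string]$RunbookId,
        [Parameter(Mandatory = $true)][ValidateSet('success', 'error')][string]$Status,
        [string]$ErrorMessage = ''
    )
    # Parameterized insert, so error messages with quotes cannot break the statement.
    $cmd = New-Object System.Data.SqlClient.SqlCommand
    $cmd.CommandText = 'INSERT INTO dbo.RunbookLog (RunbookId, Status, ErrorMessage, LoggedAt) ' +
                       'VALUES (@RunbookId, @Status, @ErrorMessage, SYSUTCDATETIME())'
    $cmd.Parameters.Add('@RunbookId', [System.Data.SqlDbType]::NVarChar, 64).Value = $RunbookId
    $cmd.Parameters.Add('@Status', [System.Data.SqlDbType]::NVarChar, 16).Value = $Status
    $cmd.Parameters.Add('@ErrorMessage', [System.Data.SqlDbType]::NVarChar, 4000).Value = $ErrorMessage
    return $cmd
}

# Usage (the connection string is a placeholder for your own logging database):
#   $conn = New-Object System.Data.SqlClient.SqlConnection '<connection string>'
#   $cmd  = New-RunbookLogCommand -RunbookId '25' -Status error -ErrorMessage 'Copy failed'
#   $cmd.Connection = $conn
#   $conn.Open(); [void]$cmd.ExecuteNonQuery(); $conn.Close()
```

    Building the command separately from executing it keeps the logging logic easy to check without a live database; SCOM would then only need to query the same table for rows with the error status.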

    Tuesday, January 29, 2019 7:33 AM
  • Hello!

    I can share one way I did this: I connected every runbook activity to a Run .Net Script activity that wrote an event to the Windows Event Log, whether the runbook activity succeeded or failed.

    The event would look something like this:

    Write-EventLog -LogName $LogName -Source $Source -EntryType $EntryType -EventId $EventIDStart -Message $Message -ComputerName $LogServer

    Then I created various rules/monitors in SCOM (you can also create a management pack) that checked for these events.

    I created a custom event log in Windows for the runbook alerts:

    Event log name: "Runbook Alerts"
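
    A custom log like this can be created once per server with Windows PowerShell. The following is a minimal sketch; the source name OrchestratorRunbooks and the function name Initialize-RunbookAlertLog are assumptions made for illustration, and the call needs an elevated session. New-EventLog is a Windows PowerShell cmdlet, which is the engine this sketch assumes.

```powershell
# Sketch only: the function name and the default source name are hypothetical.
function Initialize-RunbookAlertLog {
    param(
        [string]$LogName = 'Runbook Alerts',
        [string]$Source  = 'OrchestratorRunbooks'
    )
    # Register the log and source only if the source is not known yet,
    # so the function can safely be called more than once.
    if (-not [System.Diagnostics.EventLog]::SourceExists($Source)) {
        New-EventLog -LogName $LogName -Source $Source
    }
}

# Usage (run elevated on each server that writes runbook events):
#   Initialize-RunbookAlertLog
```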

    Then I came up with different event IDs. Each runbook had its own ID, so that it could easily be identified in the "Runbook Alerts" event log.

    • EventIDOK = 2XXXX
    • EventIDError = 3XXXX

    On some occasions I even had an event ID that was generated when the runbook started:

    • EventIDStart = 1XXXX

    So let's say we have a runbook called Runbook1 whose ID is 25. That would make the event IDs as follows:

    • EventIDStart = 10025
    • EventIDOK = 20025
    • EventIDError = 30025


    So if each runbook activity finishes successfully, event ID 20025 is written to the "Runbook Alerts" event log; if there was an error, event ID 30025 is written instead.
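
    The numbering scheme above can be sketched as a small helper, so that each Run .Net Script activity does not have to hard-code its IDs. The function names, the default source name, and the mapping of phases to entry types are assumptions made for this example; only the 1XXXX/2XXXX/3XXXX scheme and the "Runbook Alerts" log name come from the description above.

```powershell
# Sketch only: function names and the source name are hypothetical.
function Get-RunbookEventId {
    param(
        [Parameter(Mandatory = $true)]
        [ValidateSet('Start', 'OK', 'Error')]
        [string]$Phase,

        [Parameter(Mandatory = $true)]
        [ValidateRange(0, 9999)]
        [int]$RunbookId
    )
    # Leading digit per phase, as in the scheme above: 1 = start, 2 = OK, 3 = error.
    $prefix = @{ Start = 1; OK = 2; Error = 3 }
    return ($prefix[$Phase] * 10000) + $RunbookId
}

function Write-RunbookAlert {
    param(
        [Parameter(Mandatory = $true)][string]$Phase,
        [Parameter(Mandatory = $true)][int]$RunbookId,
        [Parameter(Mandatory = $true)][string]$Message,
        [string]$Source = 'OrchestratorRunbooks',
        [string]$LogServer = $env:COMPUTERNAME
    )
    # Assumption: only the error phase is written with the Error entry type.
    $entryType = if ($Phase -eq 'Error') { 'Error' } else { 'Information' }
    Write-EventLog -LogName 'Runbook Alerts' -Source $Source -EntryType $entryType `
        -EventId (Get-RunbookEventId -Phase $Phase -RunbookId $RunbookId) `
        -Message $Message -ComputerName $LogServer
}
```

    For Runbook1 with ID 25, Get-RunbookEventId -Phase Error -RunbookId 25 returns 30025. Runbook IDs are limited to four digits here so that every event ID stays within the 0 to 65535 range that Write-EventLog accepts.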

    Since I had multiple runbook servers, I set up event log forwarding, so that all "Runbook Alerts" events were forwarded to a dedicated "log server" that gathered them, and I had SCOM look for the events on that "log server".

    Best regards,
    Leon


    Blog: https://thesystemcenterblog.com

    Monday, January 28, 2019 4:03 PM
  • Hi,

    I found a document that seems helpful for this question; please refer to the link below.

    The script uses a SQL query to fetch data about runbook jobs that failed in Orchestrator. It receives the ID of the runbook to monitor as an argument and searches for failures of that runbook's jobs. When it finds any, it builds an array that presents the failure data in a readable format.

    Orchestrator Runbook Jobs Failures Rule

    https://community.squaredup.com/answers/question/orchestrator-runbook-jobs-failures-rule/




    Tuesday, January 29, 2019 3:09 AM
    Moderator

All replies

  • Hi

    You could monitor the events that are generated on the SCO server to identify the failures.

    Cheers


    Sam (Please take a moment to "Vote as Helpful" and/or "Mark as Answer" wherever applicable. Thanks!) Blog:AnalyticOps Insights Twitter:Sameer Mhaisekar

    Monday, January 28, 2019 3:49 PM
  • Hi Raj,

    Were you able to accomplish the task? All of the answers are helpful, because there isn't a standard way of doing this. I like Leon's approach the most, as I have done something similar in the past.

    We will appreciate your feedback here!

    Regards,


    (Please take a moment to "Vote as Helpful" and/or "Mark as Answer" where applicable. This helps the community, keeps the forums tidy, and recognizes useful contributions. Thanks!) Blog: https://blog.pohn.ch/ Twitter: @StoyanChalakov

    Thursday, February 21, 2019 9:54 AM
    Moderator