Unix/Linux Shell Command Two State Monitor Stays Not Monitored

    Question

  • Hi,

    I could really use some help on this one because I have been struggling with it for some days now.
    At the site where I am working we are running SCOM 2012 R2 RU4. I have installed the SCOM agent on three Solaris 10 machines using the root account.

    The data comes in perfectly and several checks are done as expected.
    Now I need to create some custom monitoring, for which I started to use the template "Unix/Linux Shell Command Two State Monitor".
    To make sure that I am able to run shell commands, I built a test monitor as described in http://technet.microsoft.com/en-us/library/jj126087.aspx

    I configured a Unix Run As account and placed it in the Unix action and agent profile. I had a Unix admin test the command on the Unix machine under the configured account. It gives the expected output, so the permissions seem OK.

    Now in SCOM, unfortunately, the newly created monitor stays "Not Monitored". Yes, it is enabled and targeted to the Unix computer class.

    I have enabled verbose logging on the Unix machine. Unfortunately this does not give any useful info.

    Does anyone have experience with this?

    Regards,

    Marthijn.

    Friday, November 28, 2014 12:08 PM

All replies

  • Marthijn,

    See if you can run your monitor via a winrm script. Follow the steps below and see if it works.

    Create a test xml file [c:\temp\shellCmd.xml] with the following:

    <p:ExecuteShellCommand_INPUT xmlns:p="http://schemas.microsoft.com/wbem/wscim/1/cim-schema/2/SCX_OperatingSystem"> 
      <p:Command>#!/bin/sh
    enter your shell script here...
    exit 0
      </p:Command>
      <p:timeout>10</p:timeout>
    </p:ExecuteShellCommand_INPUT>

    Save the file, then run the following winrm command, modified with your settings.

    winrm invoke ExecuteShellCommand http://schemas.microsoft.com/wbem/wscim/1/cim-schema/2/SCX_OperatingSystem?__cimnamespace=root/scx -remote:https://<UNIX/Linux system>:1270 -auth:basic -username:<UNIX/Linux user> -password:<UNIX/Linux password> -skipCAcheck -skipCNcheck -skiprevocationcheck -encoding:utf-8 -file:c:\temp\shellCmd.xml

    If all goes well, the command will return with your results. If it fails, post the error message. If you have trace logging enabled on the agent, you should see this call being processed. If this works and you still cannot get it to work through SCOM, I would recheck all settings in the monitor.

    Regards,

    -Steve

    Friday, November 28, 2014 4:59 PM
    Moderator
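
The manual test above can also be prepared from a script. The following is a minimal sketch in Python, assuming nothing beyond the XML layout and the winrm flags quoted in the post; the function names `build_shell_command_input` and `build_winrm_args` are hypothetical helpers, not part of SCOM or winrm. It escapes the shell script before embedding it, since characters such as `&` or `<` would otherwise make the input file invalid XML, and it only builds the argument list without running winrm.

```python
from xml.sax.saxutils import escape

# Namespace taken verbatim from the sample input file in the post.
SCX_NS = "http://schemas.microsoft.com/wbem/wscim/1/cim-schema/2/SCX_OperatingSystem"


def build_shell_command_input(script, timeout_seconds=10):
    """Return the ExecuteShellCommand_INPUT document for a shell script (hypothetical helper)."""
    # escape() replaces &, < and > so shell operators keep the XML well-formed.
    return (
        f'<p:ExecuteShellCommand_INPUT xmlns:p="{SCX_NS}">\n'
        f"  <p:Command>{escape(script)}</p:Command>\n"
        f"  <p:timeout>{timeout_seconds}</p:timeout>\n"
        "</p:ExecuteShellCommand_INPUT>\n"
    )


def build_winrm_args(host, user, password, xml_path):
    """Return the winrm command line from the post as an argument list (hypothetical helper)."""
    return [
        "winrm", "invoke", "ExecuteShellCommand",
        f"{SCX_NS}?__cimnamespace=root/scx",
        f"-remote:https://{host}:1270",
        "-auth:basic",
        f"-username:{user}",
        f"-password:{password}",
        "-skipCAcheck", "-skipCNcheck", "-skiprevocationcheck",
        "-encoding:utf-8",
        f"-file:{xml_path}",
    ]
```

Writing the returned document to c:\temp\shellCmd.xml and passing the argument list to `subprocess.run` on the management server would reproduce the manual test; host, user, and password are placeholders to fill in.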
  • Hi Steve,

    Thank you for the quick reply and this tip.
    I have a script on the SUN server which simply tests for the existence of a folder. Using winrm I get the expected output. This tells me that the account I use is configured correctly.

    When I put this same script in a monitor using the "Unix/Linux Shell Command Two State Monitor", I get nothing.

    I am trying to figure out how SCOM will execute this specific script. I would expect it to run via the installed agent.

    Best regards,

    Marthijn.

    Monday, December 01, 2014 8:18 AM
  • Marthijn,

    The script is run by the SCOM server against the agent. Communication with the UNIX/Linux agents is one-way only: SCOM polls them for their data. So when SCOM runs the 'Two State Monitor' you created, it makes the 'winrm invoke ExecuteShellCommand' call from my first post, passing in your script. If you do not see this call in the TRACE logging of the agent, SCOM is not sending it and something is wrong on the SCOM side. Check the Windows event logs under 'Operations Manager' on the SCOM server and see if there are any events related to your monitor.

    Try creating a 'Task' in SCOM using your script and target it against the SUN server. You can run this task manually; it runs the script through SCOM, which will tell us whether your script is being processed properly. If the task works, then I still think you have something misconfigured in the 'Two State Monitor'.

    Regards,

    -Steve

    Monday, December 01, 2014 3:11 PM
    Moderator
  • Hi Steve,

    That certainly clears things up for me. Thanks.

    I found that when I did the same test in another SCOM management group, the first two state monitor I created actually worked. Every one after that didn't. I had the same issue in the environment I am working in right now.

    I will jump on this first thing tomorrow and check the logs. It would be great if I can get this to work and do some cool Unix monitoring.

    Best regards,

    Marthijn.


    Monday, December 01, 2014 9:04 PM
  • Found the cause....

    Thanks to Steve I took a closer look at the Operations Manager event log and found two events:

    Warning EventID 5409:

    Failed to replace parameter while creating the alert for monitor state change. It was possibly caused by incorrect XPATH and will result in monitor unload.

    Failing replacement: $Data/Context///*[local-name()=”StdOut”]$

    Error EventID 4503:

    A module reported an error 0x80FF0004 from a callback which was running as part of rule .............

    I would have expected SCOM to alert about these failures (workflow alerts), but this is not the case.

    As the warning shows, I hadn't configured the alert description correctly. As soon as I removed the failing string, the monitor changed from the Not Monitored status.

    Thanks again, Steve!

    Tuesday, December 02, 2014 8:12 AM
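
When several monitors are affected, it can help to pull the failing replacement string out of the event 5409 description rather than reading it by eye. A minimal sketch in Python, assuming the event description has been copied or exported as plain text and contains a "Failing replacement:" line as quoted in the post; `extract_failing_replacement` is a hypothetical helper, not a SCOM API.

```python
import re


def extract_failing_replacement(event_text):
    """Return the expression after 'Failing replacement:', or None if absent (hypothetical helper)."""
    # '.+' stops at the end of the line, so only the expression itself is captured.
    match = re.search(r"Failing replacement:\s*(.+)", event_text)
    return match.group(1).strip() if match else None
```

The returned string is the exact text SCOM could not resolve, which can then be compared character by character with the alert description of the monitor.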
  • Same issue here, apparently

    $Data/Context///*[local-name()="StdOut"]$

    is not accepted in the alert description of a rule?

    Thursday, September 15, 2016 7:05 AM
  • For that purpose you should use the <AlertParameters> configuration:

    <AlertParameters>
       <AlertParameter1>$Data/Context///*[local-name()="StdOut"]$</AlertParameter1>
    </AlertParameters>

    You can then refer to it in the alert description, something like this:

    <DisplayString ElementID="YourMonitorID_AlertMessage">
      <Name>Your Monitor Name</Name>
      <Description>The Output of StdOut is: {0}</Description>
    </DisplayString>

    This should work the same for Rules.

    Friday, September 16, 2016 10:11 AM
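
The relationship between the two fragments above is positional: AlertParameter1 fills {0}, AlertParameter2 would fill {1}, and so on. The following Python sketch only illustrates that index mapping; it is not how SCOM resolves alerts internally, and `render_alert_description` is a hypothetical name.

```python
PREFIX = "AlertParameter"


def render_alert_description(template, alert_parameters):
    """Fill {0}, {1}, ... from AlertParameter1, AlertParameter2, ... (illustration only)."""
    # Order the parameters by their numeric suffix: AlertParameter1 comes first.
    ordered_names = sorted(alert_parameters, key=lambda name: int(name[len(PREFIX):]))
    values = [alert_parameters[name] for name in ordered_names]
    # AlertParameterN lands in placeholder {N-1}.
    return template.format(*values)


print(render_alert_description(
    "The Output of StdOut is: {0}",
    {"AlertParameter1": "folder missing"},
))
```

The point of the sketch is the off-by-one: the parameter elements are numbered from 1, while the placeholders in the description are numbered from 0.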
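  • An old post, but I ran into the same issue yesterday and found yet another root cause, which might help someone else:

    Your faulty replacement, and mine: $Data/Context///*[local-name()=”StdOut”]$

    The one that worked for me: $Data/Context///*[local-name()="StdOut"]$

    See the difference? Here it is: ” versus " (the quotation marks). The first is a typographic (curly) quote, typically picked up when copying from a web page or a word processor; the second is the plain ASCII quote that the expression requires.

    Best regards, Alex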

    Wednesday, February 14, 2018 12:11 PM
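
Curly quotes are hard to spot by eye, so a quick check before pasting an expression into an alert description may save some time. A minimal sketch in Python, assuming the expression is available as a string; the helper names and the set of characters checked are illustrative choices, not part of any SCOM tooling.

```python
# Typographic quotes that often sneak in via copy and paste, mapped to ASCII.
TYPOGRAPHIC_QUOTES = {
    "\u201c": '"',  # left double quotation mark
    "\u201d": '"',  # right double quotation mark
    "\u2018": "'",  # left single quotation mark
    "\u2019": "'",  # right single quotation mark
}


def find_typographic_quotes(expression):
    """Return (index, character) pairs for every curly quote in the expression."""
    return [(i, ch) for i, ch in enumerate(expression) if ch in TYPOGRAPHIC_QUOTES]


def normalize_quotes(expression):
    """Return the expression with curly quotes replaced by their ASCII equivalents."""
    return expression.translate(str.maketrans(TYPOGRAPHIC_QUOTES))
```

Running `find_typographic_quotes` on the failing replacement from the event log reports the two offending characters, and `normalize_quotes` yields the form that worked in the post above.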