SCOM 1801: False Alerts for Service Monitoring

    Question

  • Hi,

    We create classes and groups for custom servers and monitor some services on one or more of them.

    Example:

    Group name: Custom.Windows.Group

    Members: Server1, Server2, Server3

    Only one server (Server1) runs a service named, for example, "MyService". The other servers (Server2 and Server3) do not have this service.

    We monitor "MyService" on the group Custom.Windows.Group for availability.

    Now, the issue....

    SCOM generates alerts saying that "MyService" is not running on Server2 or Server3, although "MyService" exists only on Server1.

    Managed Services Host - (MyService) service is not running
    
    Source: SERVER2
    Path: 
    Last modified by: System
    Last modified time: 9/20/2018 12:39:53 PM
    Alert description: Service Managed Services Host - (MyService) failure on SERVER2. Please see the alert context for details.
    Notification subscription ID: {E65B89F4-7264-973F-9BFF-FBD3E76EC6BE}

    How can we avoid this kind of "false positive"?

    Bug?

    MP problem?

    Best regards

    Birdal



    • Edited by _Birdal Thursday, September 20, 2018 3:56 PM
    Thursday, September 20, 2018 3:34 PM

All replies

  • Hey Birdal,

    This is probably caused by a slightly off discovery definition. If a service exists only on particular servers, your custom class discovery should be defined so that only the services on those servers are discovered as its instances. Alternatively, you have the good old option of restricting the monitor using overrides.

    Hope this helps

    Cheers


    Sam (Please take a moment to "Vote as Helpful" and/or "Mark as Answer" wherever applicable. Thanks!) Blog:AnalyticOps Insights Twitter:Sameer Mhaisekar

    Friday, September 21, 2018 4:50 AM
  • Hi,

    It looks like these are the default health states for service monitoring.

    SCOM 2012 R2 – Monitor a Windows Service


    Please remember to mark the replies as answers if they help.
    If you have feedback for TechNet Subscriber Support, contact tnmff@microsoft.com.

    Friday, September 21, 2018 9:29 AM
    Moderator
  • Hi Birdal,

    I agree with Sameer: why would you target a service check at servers that do not have the service installed?
    Or do you mean this is like a failover setup where only one of the three servers has the service in a running state? If so, you have an option available in the template that says Monitor Only Automatic Service. This also checks the startup type.

    Friday, September 21, 2018 11:16 AM
  • Hi Birdal,

    I join Sam here... you need to ensure that you group only the servers that have the service, or make sure the service discovery is written properly and discovers only existing instances.

    Do you use the SCOM template for Service monitoring? If yes, you need to target only servers that have the service on them. 

    Regards,


    (Please take a moment to "Vote as Helpful" and/or "Mark as Answer" where applicable. This helps the community, keeps the forums tidy, and recognizes useful contributions. Thanks!) Blog: https://blog.pohn.ch/ Twitter: @StoyanChalakov


    Friday, September 21, 2018 2:20 PM
    Moderator
    However, according to Kevin Holman, a non-existent service should show up as healthy by default: https://blogs.technet.microsoft.com/kevinholman/2010/11/07/monitoring-windows-services-automatic-manual-and-disabled-using-checkstartuptype/

    There is an overridable parameter (Check Startup Type) that can make the monitor act differently, everything is detailed in the article.

    I must say I've already witnessed the same behavior as Birdal, despite what is explained in the blog, and I've even sometimes witnessed a different behavior on different servers targeted by the same monitor.

    In the end I agree with you guys: it's always better to have a properly designed discovery.


    • Edited by CyrAz Friday, September 21, 2018 3:55 PM
    Friday, September 21, 2018 3:05 PM
  • Hi Sam,

    I don't agree with you.

    If a service does not exist on a server, the state of that server should be displayed as HEALTHY.

    In my opinion, that is a bug.

    Best regards

    Birdal

    Monday, September 24, 2018 6:40 AM
  • Hi Stoyan,

    I do not agree with Sam, as I answered.

    No, we don't use the SCOM template. We use a discovery process that collects all Windows servers of a division into a group. Then we define a service monitor for this group.

    The expected behaviour is that SCOM shows all servers as HEALTHY if the monitored service does not exist on them.

    Consider this example: if we had to define a separate service monitor for each server with 5 services, we would have to define a total of 10,000 service monitors for 2,000 servers... Is that logical? No, it isn't...

    Best regards

    Birdal

    Monday, September 24, 2018 6:46 AM
  • Hi CyrAz,

    I absolutely agree with you, as Kevin Holman also described:

    "Probably the biggest complaint about this behavior – is the UI.  Health explorer will show “Healthy” for the service monitor, EVEN if the service is not running, or doesn’t exist.  Let me explain.  If the service is set to Manual or Disabled, and not running – the monitor will initialize, ignore the service, and show healthy.  This is probably not the best behavior and it would be nice if we could control this to show warning state or unmonitored state, but that is another topic.  Additionally, if the service does not exist – the monitor will also show as healthy.  It is simply ignored."

    OK, Kevin Holman writes...

    "Here – you can clearly see why we treat disabled or non-existent services as healthy, when CheckStartupType = True (which is the default)"

    In our MPs we used the following for "CheckStartupType":

    <CheckStartupType />

    For the following parameters in XML, Kevin Holman writes...

    <CheckStartupType>true</CheckStartupType>
    <CheckStartupType />

    "Note that BOTH uses a slightly different method to set CheckStartupType value, but both have the same effect – setting it to true."

    And although the parameter "CheckStartupType" has the value "true", we still get false alerts.
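
    For reference, a minimal sketch of how a basic service unit monitor with an explicit CheckStartupType value is commonly written. The IDs Custom.MyService.Monitor and Custom.MyService.Class are hypothetical, the Windows and Health aliases are assumed to reference Microsoft.Windows.Library and System.Health.Library, and the $Target/Host/...$ path assumes the target class is hosted by Windows Computer. This is not the MP used in this thread.

    ```xml
    <!-- Hypothetical unit monitor based on the standard NT service state monitor type -->
    <UnitMonitor ID="Custom.MyService.Monitor" Accessibility="Public" Enabled="true"
                 Target="Custom.MyService.Class"
                 ParentMonitorID="Health!System.Health.AvailabilityState"
                 Remotable="true" Priority="Normal"
                 TypeID="Windows!Microsoft.Windows.CheckNTServiceStateMonitorType"
                 ConfirmDelivery="false">
      <Category>AvailabilityHealth</Category>
      <OperationalStates>
        <OperationalState ID="Running" MonitorTypeStateID="Running" HealthState="Success" />
        <OperationalState ID="NotRunning" MonitorTypeStateID="NotRunning" HealthState="Error" />
      </OperationalStates>
      <Configuration>
        <ComputerName>$Target/Host/Property[Type="Windows!Microsoft.Windows.Computer"]/NetworkName$</ComputerName>
        <ServiceName>MyService</ServiceName>
        <!-- Written out explicitly instead of the empty element form -->
        <CheckStartupType>true</CheckStartupType>
      </Configuration>
    </UnitMonitor>
    ```

    According to the quote above, the empty element form and the explicit "true" form are meant to behave the same way; writing the value out only removes ambiguity when reading the MP.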

    Best regards

    Birdal


    • Edited by _Birdal Monday, September 24, 2018 7:23 AM
    Monday, September 24, 2018 7:08 AM
  • Hi Birdal,

    What you say is absolutely right. I agree with you that the monitor should say "healthy" on the server where the service does not exist - ideally, the monitor should not even exist there.

    Having said that, I have seen many anomalies in this behaviour personally as well as on the forums. The service monitor goes critical on servers where the service does not exist. CyrAz also backs me up here:

    "I must say I've already witnessed the same behavior as Birdal, despite what is explained in the blog, and I've even sometimes witnessed a different behavior on different servers targeted by the same monitor."

    In that sense, yes, I agree with you that this is a bug. I will try to have this addressed by the product team when I get a chance.

    Regarding

    >>Consider this example: if we had to define a separate service monitor for each server with 5 services, we would have to define a total of 10,000 service monitors for 2,000 servers... Is that logical? No, it isn't...<<

    Nope, of course it isn't logical. And fortunately you don't have to do that. In that case, what you do is create 5 custom classes with a way to discover every service (with whatever logic) and then you create only 5 service monitors, targeted at the relevant class. That way you don't get false alarms, as each service monitor only acts on the servers which have an instance of that service.
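
    A minimal sketch of what one such class and its discovery could look like, assuming the service can be detected by its registry key under HKLM\SYSTEM\CurrentControlSet\Services. All Custom.* IDs and the service name MyService are hypothetical, and the Windows and System aliases are assumed to reference Microsoft.Windows.Library and System.Library. Treat it as a starting point, not a tested MP.

    ```xml
    <!-- Fragment 1 (TypeDefinitions/EntityTypes/ClassTypes): one class per service -->
    <ClassType ID="Custom.MyService.Class" Accessibility="Public" Abstract="false"
               Base="Windows!Microsoft.Windows.LocalApplication" Hosted="true" Singleton="false" />

    <!-- Fragment 2 (Monitoring/Discoveries): create an instance only where the service key exists -->
    <Discovery ID="Custom.MyService.Discovery" Enabled="true"
               Target="Windows!Microsoft.Windows.Server.Computer"
               ConfirmDelivery="false" Remotable="true" Priority="Normal">
      <Category>Discovery</Category>
      <DiscoveryTypes>
        <DiscoveryClass TypeID="Custom.MyService.Class" />
      </DiscoveryTypes>
      <DataSource ID="DS" TypeID="Windows!Microsoft.Windows.FilteredRegistryDiscoveryProvider">
        <ComputerName>$Target/Property[Type="Windows!Microsoft.Windows.Computer"]/NetworkName$</ComputerName>
        <RegistryAttributeDefinitions>
          <RegistryAttributeDefinition>
            <AttributeName>MyServiceExists</AttributeName>
            <Path>SYSTEM\CurrentControlSet\Services\MyService</Path>
            <!-- PathType 0 = check that the key exists, AttributeType 0 = Boolean result -->
            <PathType>0</PathType>
            <AttributeType>0</AttributeType>
          </RegistryAttributeDefinition>
        </RegistryAttributeDefinitions>
        <Frequency>86400</Frequency>
        <ClassId>$MPElement[Name="Custom.MyService.Class"]$</ClassId>
        <InstanceSettings>
          <Settings>
            <Setting>
              <Name>$MPElement[Name="Windows!Microsoft.Windows.Computer"]/PrincipalName$</Name>
              <Value>$Target/Property[Type="Windows!Microsoft.Windows.Computer"]/PrincipalName$</Value>
            </Setting>
            <Setting>
              <Name>$MPElement[Name="System!System.Entity"]/DisplayName$</Name>
              <Value>MyService</Value>
            </Setting>
          </Settings>
        </InstanceSettings>
        <Expression>
          <SimpleExpression>
            <ValueExpression>
              <XPathQuery Type="Boolean">Values/MyServiceExists</XPathQuery>
            </ValueExpression>
            <Operator>Equal</Operator>
            <ValueExpression>
              <Value Type="Boolean">true</Value>
            </ValueExpression>
          </SimpleExpression>
        </Expression>
      </DataSource>
    </Discovery>
    ```

    A service monitor targeted at Custom.MyService.Class then only runs on servers where the discovery found the key, so servers without the service never get the monitor at all.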

    Hope this helps clear up things for you a little, please let us know if you still have questions.

    I suggest you give this a read as well:

    SCOM BASIC SERVICE MONITOR VS. WINDOWS SERVICE TEMPLATE

    Cheers 


    Sam


    • Edited by Sameer Mhaisekar Monday, September 24, 2018 11:26 AM
    • Marked as answer by _Birdal Thursday, October 4, 2018 7:49 AM
    Monday, September 24, 2018 7:23 AM
  • Hi Sam,

    Thank you for your detailed feedback...

    >>> "In that sense, yes, I agree with you that this is a bug. I will try to have this addressed by the product team when I get a chance."

    I and others would be very glad to get a solution for this bug.

    >>> "Nope, of course it isn't logical. And fortunately you don't have to do that. In that case, what you do is create 5 custom classes with a way to discover every service (with whatever logic) and then you create only 5 service monitors, targeted at the relevant class."

    That is perhaps a workaround, but as you can guess, this workaround is also an enormous task in a large IT infrastructure.

    Instead of this workaround, we could also work with overrides for the monitored service: disable the service monitor for a specific server if the service does not exist there?!

    I don't know which one is better...

    Best regards

    Birdal


    • Edited by _Birdal Monday, September 24, 2018 10:26 AM
    Monday, September 24, 2018 10:24 AM
  • >>Instead of this workaround, we could also work with overrides for the monitored service: disable the service monitor for a specific server if the service does not exist there?!<<

    Of course you can do it.

    Now if we're talking about *technically the most proper ideal solution*, then the first option of creating the exact discovery for your class is the right one. The override option is a little less optimized, but is generally easier and requires less authoring effort than the first one.
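
    As an illustration of the override option, a minimal sketch of a monitor property override that disables a service monitor for a single instance. All IDs are hypothetical, the GUID is a placeholder for the Id of the monitored object on the server in question, and Context must be the class the monitor actually targets.

    ```xml
    <Overrides>
      <!-- Hypothetical override: turn off Custom.MyService.Monitor for one instance only -->
      <MonitorPropertyOverride ID="Custom.MyService.Monitor.Disable.Server2"
                               Context="Custom.MyService.Class"
                               ContextInstance="00000000-0000-0000-0000-000000000000"
                               Enforced="false"
                               Monitor="Custom.MyService.Monitor"
                               Property="Enabled">
        <Value>false</Value>
      </MonitorPropertyOverride>
    </Overrides>
    ```

    One such override is needed per server (or per group) where the monitor should not run, which is why this approach scales worse than a precise discovery.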

    Another option is the "windows service template", which does all the authoring for you and applies the monitor only where the service is actually present.

    >>That is perhaps a workaround, but as you can guess, this workaround is also an enormous task in a large IT infrastructure.<<

    Well, that is the base working model of SCOM, so I think we're gonna have to live with it for a little longer. However, like I said, I will bring this to the product team's attention (along with some other things) when I get the chance.

    Hope this helps.

    Cheers


    Sam


    • Edited by Sameer Mhaisekar Monday, September 24, 2018 11:28 AM
    • Marked as answer by _Birdal Thursday, October 4, 2018 7:50 AM
    Monday, September 24, 2018 11:25 AM
  • I agree with Sam: for me the best practice would be to design a different MP for each different application that contains services, and to discover/monitor these services in the dedicated MP.

    If you have a requirement to only monitor some services and nothing more, you could use the service template, even though I don't really like it.

    Monday, September 24, 2018 2:03 PM
  • Hi Sam,

    I read about both possibilities and decided not to use the "Windows Service Template". Instead, I prefer to create groups for each service running on servers.

    But I am not good at XML.

    Can you give me some XML examples of how I can create a group of Windows servers (not clients) on which the same service runs?

    Please note that these are not Windows services but custom application services.

    Thank you in advance.

    Best regards

    Birdal


    • Edited by _Birdal Tuesday, September 25, 2018 11:41 AM
    Tuesday, September 25, 2018 11:41 AM
  • Hi Birdal

    It's the discovery that counts.  You can create custom discoveries for each service type and let 'em rip.  If your needs are not greater than what the Microsoft.SystemCenter.NTService class provides, then you can even create custom discoveries which discover instances of NTService.

    <!-- Discovery data source: creates an NTService instance only where MyCustomService exists.
         The Windows, MSNL and System aliases must be declared in the MP's References section. -->
    <DataSource ID="DS" TypeID="Windows!Microsoft.Windows.Win32ServiceInformationProviderWithClassSnapshotDataMapper">
      <ComputerName>$Target/Property[Type="Windows!Microsoft.Windows.Computer"]/NetworkName$</ComputerName>
      <ServiceName>MyCustomService</ServiceName>
      <Frequency>60</Frequency>
      <ClassId>$MPElement[Name="MSNL!Microsoft.SystemCenter.NTService"]$</ClassId>
      <InstanceSettings>
        <Settings>
          <!-- Key property of the hosting computer -->
          <Setting>
            <Name>$MPElement[Name="Windows!Microsoft.Windows.Computer"]/PrincipalName$</Name>
            <Value>$Target/Property[Type="Windows!Microsoft.Windows.Computer"]/PrincipalName$</Value>
          </Setting>
          <!-- NTService properties, filled from the service information returned by the provider -->
          <Setting>
            <Name>$MPElement[Name="MSNL!Microsoft.SystemCenter.NTService"]/ServiceName$</Name>
            <Value>$Data/Property[@Name='Name']$</Value>
          </Setting>
          <Setting>
            <Name>$MPElement[Name="MSNL!Microsoft.SystemCenter.NTService"]/ServiceProcessName$</Name>
            <Value>$Data/Property[@Name='BinaryPathName']$</Value>
          </Setting>
          <Setting>
            <Name>$MPElement[Name="MSNL!Microsoft.SystemCenter.NTService"]/DisplayName$</Name>
            <Value>$Data/Property[@Name='DisplayName']$</Value>
          </Setting>
          <Setting>
            <Name>$MPElement[Name="MSNL!Microsoft.SystemCenter.NTService"]/Description$</Name>
            <Value>$Data/Property[@Name='Description']$</Value>
          </Setting>
          <Setting>
            <Name>$MPElement[Name="System!System.Entity"]/DisplayName$</Name>
            <Value>$Data/Property[@Name='DisplayName']$</Value>
          </Setting>
        </Settings>
      </InstanceSettings>
    </DataSource>

    I need special settings for each application I discover, so I built a "template" Visual Studio project with known placeholders. When I have a new app, I copy the template and do a quick search & replace (easily done with PowerShell; heck, I created a function that does it all).

    That said, looking back at the thread (your first message), I understand that you've created a group and then targeted a monitor at the group.  If that's the case, it's a design bug on your side, as groups reside at the Management Server level, not the agent level.

    Now if what you want is some high level monitor that is healthy as long as the service is running on either server A, B or C, then you'll need to first create a distributed app (DA) class (based off System!System.Service) and a custom class for your service, along with a containment relationship between the two.  If your custom class is NOT based off the NTService class, then you'll need to create a custom service monitor; however, you will have the option for the custom class not to roll up at the server level (so it can be unhealthy while the server is healthy).  At the DA, you will need to create a "best of" dependency monitor using the containment relationship.  Then you'll need to create discoveries which link the DA to the custom class (most of the time I use PowerShell, as I am doing other stuff in my discoveries, but I would assume that the Microsoft.Windows.FilteredRegistryClassAndRelationshipDiscoveryProvider would suffice).  Note, however, that the dependency monitor will not have detailed info from the individual agents, so your custom alert will need to be generic.
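
    A rough sketch of the type definitions and the "best of" dependency monitor described above. All Custom.* IDs are hypothetical, Custom.MyService.Class stands for whatever custom service class is discovered on the agents, the System and Health aliases are assumed to reference System.Library and System.Health.Library, and the RelationshipType syntax assumes the 2.0 management pack schema. The discovery that creates the relationship instances is not shown.

    ```xml
    <!-- Fragment 1 (ClassTypes): the distributed application object -->
    <ClassType ID="Custom.MyApp.DA" Accessibility="Public" Abstract="false"
               Base="System!System.Service" Hosted="false" Singleton="true" />

    <!-- Fragment 2 (RelationshipTypes): the DA contains the per-server service instances -->
    <RelationshipType ID="Custom.MyApp.DAContainsMyService" Accessibility="Public"
                      Abstract="false" Base="System!System.Containment">
      <Source ID="Source" Type="Custom.MyApp.DA" />
      <Target ID="Target" Type="Custom.MyService.Class" />
    </RelationshipType>

    <!-- Fragment 3 (Monitoring/Monitors): healthy as long as at least one contained instance is healthy -->
    <DependencyMonitor ID="Custom.MyApp.DA.ServiceAvailability" Accessibility="Public" Enabled="true"
                       Target="Custom.MyApp.DA"
                       ParentMonitorID="Health!System.Health.AvailabilityState"
                       Remotable="true" Priority="Normal"
                       RelationshipType="Custom.MyApp.DAContainsMyService"
                       MemberMonitor="Health!System.Health.AvailabilityState">
      <Category>AvailabilityHealth</Category>
      <Algorithm>BestOf</Algorithm>
      <MemberUnAvailable>Error</MemberUnAvailable>
    </DependencyMonitor>
    ```

    With BestOf, the DA only turns critical when every contained service instance is critical, which matches the "running on either server A, B or C" requirement.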

    I noticed that you seem to write a lot of MPs, so I would strongly suggest that you invest the time in the MP Authoring class found on MS Virtual Academy (or Channel 9).  It will clear up so many things, and you will save sooo much time with a heck of a lot less head banging! :)

    Finally, apparently with Squared-Up v4 coming soon you will be able to create distributed apps more easily using point and click (I do not work for them, just a happy customer).  You may want to look into it.

    Sorry for the long post and HTH

    • Marked as answer by _Birdal Wednesday, September 26, 2018 9:03 AM
    • Unmarked as answer by _Birdal Wednesday, September 26, 2018 9:03 AM
    Tuesday, September 25, 2018 6:22 PM
  • Hi,

    My question is about building a group of server computers on which a specific service runs.
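
    For readers looking for this kind of example, a minimal sketch of a dynamic group that contains every Windows computer hosting an instance of a given service class. It assumes such a class has already been discovered on the relevant servers; Custom.MyService.Class and the group IDs are hypothetical, and the SC, SCIG and Windows aliases are assumed to reference Microsoft.SystemCenter.Library, Microsoft.SystemCenter.InstanceGroup.Library and Microsoft.Windows.Library.

    ```xml
    <!-- Fragment 1 (ClassTypes): the group itself -->
    <ClassType ID="Custom.MyService.ComputerGroup" Accessibility="Public" Abstract="false"
               Base="SCIG!Microsoft.SystemCenter.InstanceGroup" Hosted="false" Singleton="true" />

    <!-- Fragment 2 (Monitoring/Discoveries): populate the group dynamically -->
    <Discovery ID="Custom.MyService.ComputerGroup.Discovery" Enabled="true"
               Target="Custom.MyService.ComputerGroup"
               ConfirmDelivery="false" Remotable="true" Priority="Normal">
      <Category>Discovery</Category>
      <DiscoveryTypes>
        <DiscoveryRelationship TypeID="SCIG!Microsoft.SystemCenter.InstanceGroupContainsEntities" />
      </DiscoveryTypes>
      <DataSource ID="GroupPopulationDataSource" TypeID="SC!Microsoft.SystemCenter.GroupPopulator">
        <RuleId>$MPElement$</RuleId>
        <GroupInstanceId>$MPElement[Name="Custom.MyService.ComputerGroup"]$</GroupInstanceId>
        <MembershipRules>
          <MembershipRule>
            <!-- Members are Windows computers ... -->
            <MonitoringClass>$MPElement[Name="Windows!Microsoft.Windows.Computer"]$</MonitoringClass>
            <RelationshipClass>$MPElement[Name="SCIG!Microsoft.SystemCenter.InstanceGroupContainsEntities"]$</RelationshipClass>
            <!-- ... that contain at least one instance of the service class -->
            <Expression>
              <Contains>
                <MonitoringClass>$MPElement[Name="Custom.MyService.Class"]$</MonitoringClass>
              </Contains>
            </Expression>
          </MembershipRule>
        </MembershipRules>
      </DataSource>
    </Discovery>
    ```

    Such a group is useful for scoping views, overrides and notifications; the service monitor itself should still target the service class rather than the group.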

    Best regards

    Birdal

    Wednesday, September 26, 2018 9:03 AM
  • Hi,

    What is the best method to create SCOM dynamic groups?

    - Powershell

    - WMI

    - etc...

    ?

    Best regards

    Birdal

    Wednesday, September 26, 2018 9:22 AM
  • Hi Birdal,

    There is also a community MP for service monitoring. I have not tried it myself, but it looks promising. Kevin Greene has a good walk-through of it. Please try it and let us know your feedback as well. :)

    SCOM - New Community MP for Monitoring Windows Services

    Hope this helps

    Cheers


    Sam

    Wednesday, September 26, 2018 11:53 AM