Maintenance Mode for all Objects

  • Question

  • Hi, any help appreciated.

    SCOM raises numerous service-based alerts for different applications. Suppose someone puts into maintenance mode (MM) every instance of the computer class that could host the application (for example, every server in a TMG array) and selects "and all contained objects" in the UI. Will this definitely put the application completely into maintenance? In other words, are all the service-based alerts ultimately related to the computer they exist on, through being hosted there, so that putting every computer hosting the app into MM is a sure way to cover all of a particular management pack's alerts?

    Logically I think this is not the case. But then how could you ever run maintenance reliably unless you knew in advance every specific class you had to put into MM, along with the computers, to cover all alerts?

     

    I realise that for health service heartbeats you need to handle the watchers; that's a separate issue.  This is more about the other MPs such as Exchange, TMG and SharePoint, where we have service-based alerting alongside traditional host-based alerting.

     

    thanks

    Mike

    Thursday, October 6, 2011 12:14 PM

Answers

  • The first goal of MM is to stop alerting.  When you put a computer into MM, all contained classes go into MM as well, and any monitors targeting those types are unloaded.  If the containment relationship is defined three levels of derivation up from the concrete type, that is why you see the relationship at that level.

    "Not monitored" means that no monitors are running on those types, which is the correct outcome: the MM operation instructs the agent to unload all running workflows that target the object in MM.

    For application availability, use the SLM feature and distributed applications.  This is going to be more reliable than trying to track it at the granular component level.


    Microsoft Corporation
    • Marked as answer by Mike261272 Monday, October 10, 2011 8:00 PM
    Monday, October 10, 2011 7:41 PM

All replies

  • In most cases, yes.  When you put the computer into MM, all contained/hosted instances also go into MM (which means their workflows are unloaded by the agent).

    In special cases, where a management pack uses a relationship that should not be used (AgentShouldManage), you can get an incomplete MM, but this is, as I said, the "bad management pack" case.

    There is one other case: where a watcher node is doing web monitoring of an application from a different computer (such as a management server), you may still get alerts for the application even though you have put the computer(s) hosting the web app into MM.  This is because the agent on the watcher node doesn't know you took the web app down.
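
The behaviour described in this reply can be sketched with a small toy model. This is not the SCOM SDK or any real management pack: the `Obj` class, the `start_maintenance` function, and every object name below are made up for illustration, and the only assumption is the one stated in the thread, namely that MM flows down from an object to whatever it hosts or contains, never upward or sideways.

```python
from dataclasses import dataclass, field


@dataclass
class Obj:
    """Hypothetical monitored object; children are hosted/contained objects."""
    name: str
    children: list = field(default_factory=list)
    in_mm: bool = False

    def add(self, child):
        self.children.append(child)
        return child


def start_maintenance(obj):
    """Put obj and everything it hosts or contains into maintenance mode."""
    obj.in_mm = True
    for child in obj.children:
        start_maintenance(child)


# Two array members, each hosting a firewall service.
server1 = Obj("TMG01")
server2 = Obj("TMG02")
fw1 = server1.add(Obj("TMG01 firewall service"))
fw2 = server2.add(Obj("TMG02 firewall service"))

# Rollup object that sits "above" the per-server components.
array = Obj("TMG array", children=[fw1, fw2])

# Watcher node: a web test hosted on a different computer.
mgmt = Obj("MS01")
web_test = mgmt.add(Obj("web test against the published app"))

# Maintenance is started on the hosting computers only.
for server in (server1, server2):
    start_maintenance(server)

for o in (server1, fw1, server2, fw2, array, web_test):
    print(f"{o.name}: {'in MM' if o.in_mm else 'not in MM'}")
```

In this sketch the computers and their firewall services end up in MM, while the array rollup and the watcher's web test do not, because neither is hosted or contained by the computers that were put into MM.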


    Microsoft Corporation
    • Proposed as answer by Nicholas Li Monday, October 10, 2011 3:32 AM
    Friday, October 7, 2011 3:57 PM
  • Hi Dan,

     

    Thanks for the reply.  Take the MS TMG MP as an example.  It contains various server-centric objects, such as the firewall service, and then there is the concept of a firewall Array.  The Array is defined as a rollup of the various components that make up the firewall services across multiple computer objects.

    If I put all computer objects within the array into MM, then the array object in the SCOM UI moves not to maintenance mode but to "not monitored".  If you open Health Explorer, the child objects eventually get down to the server-based classes, which are in maintenance, but that is a good three or so levels down.  So although putting the computer-related objects into MM has stopped the alerting, from an availability perspective the array object would not show as 'in maintenance' during this period, and availability reports for the array would therefore be incorrect.  Is my thinking correct here?

    If we wish to track the true application availability, we have to put into MM the computer object, all its children, plus any objects that are "above" it in the sense that they represent an entity that spans multiple machines.  If so, we are currently a bit lost as to how to do this in a large (12000+ agents) environment when the scheduled maintenance mode resource kit utility does not allow more than 20 jobs.

    Is my understanding of this correct?

    Any help very much appreciated.

     

    Mike

    Monday, October 10, 2011 5:39 PM