SCOM Maintenance Mode, where is IT?

  • Question

  • I have reviewed everyone's tools to accommodate maintenance mode, and each one has great value, no doubt.  But all are lacking in one area or another.  My initial requirements for maintenance mode were to accommodate our monthly server maintenance, where our operators put "waves" of servers in maintenance mode during our scheduled maintenance windows.  Each wave could consist of 50 to 300 servers or more.  So what I did was build a console view and task for our operators.  The task executes a PowerShell script that opens and reads an Excel spreadsheet (column 1 = FQDN of servers):

     

    #########################################
    # Setup Connection to Excel SpreadSheet #
    #########################################
    $excelApp = New-Object -COM Excel.Application
    $TheMasterDeviceList = "c:\localdocs\mm\Load-It-Up\Loader.xls"
    $book = $excelApp.Workbooks.Open($TheMasterDeviceList)
    $sheet = $book.Worksheets.Item(1)

     

    then the script makes the connection to the RMS

     

    #######################################
    # Setup the Connection to the RMS SDK #
    #######################################
    Add-PSSnapin "Microsoft.EnterpriseManagement.OperationsManager.Client" -ErrorVariable errSnapin;
    Set-Location "OperationsManagerMonitoring::" -ErrorVariable errSnapin;
    new-managementGroupConnection -ConnectionString:server.domain.com -ErrorVariable errSnapin;
    set-location server.domain.com -ErrorVariable errSnapin;

     

    and places the servers in maintenance.  Within the task I write the output so the operator sees each server as it is placed in maintenance.  Of course, I exclude all of our SCOM infrastructure servers:

     

    # No spaces around "|": inside the pattern they would be matched literally
    if ($ServerName -match '(RMS|OPSdb|OPSdw)')   # etc.
             { write-host "Skip - SCOM Infrastructure Server: " $ServerName } #skip it
             else { .....

                       .....

                      # Backtick continues the command onto the next line
                      $GetTheAgent.HostComputer | New-MaintenanceWindow -StartTime:$StartTime -EndTime:$EndTime `
                          -Comment "Monthly Maintenance for server:$ServerName"

     

    The problems with this solution in general are: (1) it's TOO slow.  The maintenance window starts at 10:00am, so what time do we execute the task to get them all in by 10:00am?  My solution here is probably to use a DelayedStartTime:

     

    $StartTime = [System.DateTime]::Now
    $DelayedStartTime = $StartTime.AddMinutes(X)
    ......
    ......
    New-MaintenanceWindow -startTime:$DelayedStartTime -endTime:$EndTime -monitoringObject:$healthService -comment "Monthly Maintenance for server:$ServerName"

     

    Another problem is what about the Proxy monitors, web, port, etc.  If I put the watcher node/Proxy server in maintenance, everything (all port/web monitors) is in maintenance, not good, so now the challenge is to figure out, for SAP server1, what web and/or port objects do I place in maintenance during the maintenance of SAP server1, sheez, big challenge.

     

    Also, why is maintenance mode via PowerShell so SLOW when compared to the console action to place a server in maintenance mode?  Obviously the connection to the RMS, etc. all add latency, but it SHOULD NOT take as long as it does.

     

    So, is there a question here somewhere?  Yes: I want to see maintenance mode in the CONSOLE.  Something like a web view where we could "check box" the servers we want, then "apply", done.

     

    Does anyone see a way this could be done, similar to the way OpsLogix built their console configuration view (awesome stuff)?

     

    FYI:  I don't want to have to create dynamic groups every time I run our monthly maintenance; it would be TOO slow.  So the group option is out for me.

     

    Would be very interested in all of your thoughts on this subject…yet again…

     

    Many Thanks in advance…

    Tuesday, June 15, 2010 2:10 PM

Answers

  • hi

    Yep - agree on all counts. Maintenance mode isn't pretty at the moment.

    In regard to this question - "Another problem is what about the Proxy monitors, web, port, etc.  If I put the watcher node/Proxy server in maintenance, everything (all port/web monitors) is in maintenance, not good, so now the challenge is to figure out, for SAP server1, what web and/or port objects do I place in maintenance during the maintenance of SAP server1, sheez, big challenge."

    For process monitors, and a link that helps with web monitors, have a look here: http://systemcentersolutions.wordpress.com/category/maintenance-mode/

    But no, it most certainly isn't a great way to have to do things.

    Good Luck

    Graham


    View OpsMgr tips and tricks at http://systemcentersolutions.wordpress.com/
    • Proposed as answer by Vivian Xing Wednesday, June 16, 2010 7:24 AM
    • Marked as answer by Vivian Xing Friday, June 25, 2010 8:09 AM
    Tuesday, June 15, 2010 5:00 PM
  • As far as being slower in Command Shell vs Console, we make the same calls to the SDK in Command Shell as we do in the console, so I've got a hunch that maybe your script needs a bit of performance tuning.  You didn't post your full script, so it's hard to say.  I'll make one blind suggestion.  Build an array of agents one time, and then use that array for your filtering during each pass of putting an agent into MM.  In other words, use the get-Agent cmdlet only one time in your script...and do not loop it...because this returns a lot of data and requires a lot of processing on the client.
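    A minimal sketch of that suggestion, not the original poster's script: index the agents once, then resolve each spreadsheet name against the index instead of calling get-Agent per server.  The helper names New-AgentIndex and Select-AgentByName are hypothetical, and it assumes each agent object exposes its FQDN as PrincipalName.

```powershell
# Hypothetical helper: build a name -> agent lookup from one Get-Agent result.
function New-AgentIndex {
    param([object[]]$Agents)
    $index = @{}   # @{} hashtables compare string keys case-insensitively
    foreach ($agent in $Agents) {
        # Assumption: PrincipalName holds the agent's FQDN
        $index[$agent.PrincipalName] = $agent
    }
    return $index
}

# Hypothetical helper: emit the agent for each requested name, warn on misses.
function Select-AgentByName {
    param([hashtable]$Index, [string[]]$ServerNames)
    foreach ($name in $ServerNames) {
        if ($Index.ContainsKey($name)) { $Index[$name] }
        else { Write-Warning "No agent found for $name" }
    }
}

# Usage inside the Command Shell (needs the OpsMgr snap-in and a connection):
# $index = New-AgentIndex -Agents @(Get-Agent)    # one call, outside any loop
# foreach ($agent in Select-AgentByName -Index $index -ServerNames $serverNames) {
#     $agent.HostComputer | New-MaintenanceWindow -StartTime:$StartTime -EndTime:$EndTime `
#         -Comment "Monthly Maintenance for server:$($agent.PrincipalName)"
# }
```

    Here $serverNames stands in for the names read from column 1 of the spreadsheet.  Each lookup is then a hashtable hit rather than a fresh query and client-side filter.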

    Incidentally, when an agent is put into MM, this is in effect immediately.  Some have the notion that we need to wait for the Config update to suspend workflows on the agent computer, but this is not true.  When the RMS receives the MM command about an agent, the SDK will immediately proceed to resolve any alerts that were generated by that health service during the interim of the health service actually suspending its running workflows.


    HTH, Jonathan Almquist - MSFT

    Friday, June 18, 2010 6:52 AM

All replies

  • Thanks Graham.  Somehow I am going to find the time in my 26 hour day to build something to fill this gap.

    Wednesday, June 16, 2010 1:47 PM