SCCM 2012 R2 - Maintenance Windows - a few questions

    Question

  • Let's say I configure a maintenance window (MW) for the 2nd Saturday of every month from 6p-10p and I have 100 servers in this maintenance window. Are they all going to try to patch and reboot at the same time? (Assuming they are all online.)

    This brings up concerns when trying to plan maintenance windows for our servers - especially when we are 80%+ virtualized. I can't find anything that states the sequence of events that happen when a machine goes to install patches based on a MW. Do they wait on a heartbeat? Do they typically kick off right at the front of the MW? Does SCCM spread the installs out?

    We currently use WSUS (working on migrating to SCCM) and in WSUS we have a ton of groups based on the hour so we spread out the machines being patched. I am trying to avoid having so many different options and times and instead just using MWs.

    Another issue we are going to run into is that certain machines need to be patched/rebooted in a certain order, and with others we just can't have them all down at the same time (e.g. domain controllers). I suppose I could create 2-3 phases per patch weekend to spread those machines out - is that the best option? (For instance, on the 2nd Saturday we have MW Phase 1 from 6p-8p, MW Phase 2 from 8p-10p, and MW Phase 3 from 10p-midnight.)


    • Edited by atom_acres Tuesday, September 09, 2014 7:47 PM
    Tuesday, September 09, 2014 7:47 PM

Answers

  • Hi,

    I often help customers with this scenario that you are facing. I'll try to answer your questions as clearly as possible.

    A) Yes, if you have 100 servers in one MW during 6p-10p, they will all start at 6p, provided they have received the policy before then. What you can do is enable the deadline randomization that was introduced in ConfigMgr 2012 SP1. You can read more on that here: http://technet.microsoft.com/en-us/library/gg682067.aspx

    B) ConfigMgr does not wait for a heartbeat or anything like that. You need to manage the logic to move your workload off the hosts before you take them down for a reboot. VMM could help you with this IIRC, or at least SCSM SMA and PowerShell. The sequence is basically that they install the patches and, if a reboot is required, reboot once the patches have been installed.

    C) Yes, you need to create different MWs and place those servers in the MWs in the order in which they need to be rebooted. Be careful, though. Say you have ServerA, ServerB and ServerC that need to reboot in that order, and you place them in MWs in that order. Some or all of those servers might get patches from a Patch Tuesday, but perhaps only ServerB will need a reboot, so only ServerB will reboot, which puts you in a place you don't want to be. To solve this, create a recurring package/program deployment that reboots ALL sensitive servers in each and EVERY MW.


    Tim Nilimaa | Blog: http://infoworks.tv | Twitter: @timnilimaa

    • Proposed as answer by Jason SandysMVP Tuesday, September 09, 2014 8:15 PM
    • Marked as answer by atom_acres Wednesday, September 10, 2014 11:52 AM
    • Unmarked as answer by atom_acres Wednesday, September 10, 2014 2:40 PM
    • Marked as answer by atom_acres Wednesday, September 10, 2014 4:36 PM
    Tuesday, September 09, 2014 8:06 PM

All replies

  • Like you already suggested, I would advise you to use update groups and spread the servers out over a couple of weekends.

    Besides giving you a bit more control, this also addresses a security risk you should consider. If a patch causes a problem, the impact is limited to the servers in the update group you are patching, rather than all of your servers at once.


    • Edited by JohnvLeeuwen Tuesday, September 09, 2014 8:02 PM
    Tuesday, September 09, 2014 8:01 PM
  • Like you already suggested, I would advise you to use update groups and spread the servers out over a couple of weekends.

    Besides giving you a bit more control, this also addresses a security risk you should consider. If a patch causes a problem, the impact is limited to the servers in the update group you are patching, rather than all of your servers at once.



    First you do TEST, then production. But remember, Security and Critical patches can be more of a lifesaver than antimalware products. Don't wait too long (like a week or two), or that can be a security risk as well. So spreading it out over weekends(!) might not be ideal. It would be better for Test to go at Release+1h or so and Production at Release+24h (after one day of testing). Now, this isn't a perfect solution; I just wanted to shed some light on the fact that waiting a "long" time can be an issue.

    Tim Nilimaa | Blog: http://infoworks.tv | Twitter: @timnilimaa

    Tuesday, September 09, 2014 8:21 PM
  • Yes, I agree, waiting too long can cause a whole lot of different problems. But then again, a bit of control could never hurt! :)

    But my experience so far is that patch management differs per customer. Some customers adopt patches very quickly; others are more conservative. There is nothing wrong with either choice, but these requirements affect the way you apply the patches.

    Also a good point regarding the test servers, test your patches before applying them! Best advice ever! :)

    Tuesday, September 09, 2014 8:48 PM

  • A) Yes, if you have 100 servers in one MW during 6p-10p, they will all start at 6p, provided they have received the policy before then. What you can do is enable the deadline randomization that was introduced in ConfigMgr 2012 SP1. You can read more on that here: http://technet.microsoft.com/en-us/library/gg682067.aspx



    Tim Nilimaa | Blog: http://infoworks.tv | Twitter: @timnilimaa

    So by going into my client settings and changing the setting 'Disable deadline randomization' from YES to NO, and ensuring that my maintenance windows are at least 2 hrs long (default deadline randomization is up to 120 minutes), I can expect my machines to install updates at different times throughout the MW?

    Also, FYI, I am using an ADR that deploys the updates and sets the deadline to "As soon as possible" but I make the machines wait until the MW.
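
    To make the question above concrete, here is a rough planning sketch in Python. It is a hypothetical model, not ConfigMgr behaviour: it assumes each machine picks a uniform random delay of up to 120 minutes after the window opens, which is exactly the point this thread leaves unconfirmed. The function names, the 10-minute bucket size and the seed are made up for illustration.

```python
import random
from collections import Counter


def simulated_start_offsets(machine_count, max_delay_minutes=120,
                            window_minutes=240, seed=None):
    """Draw one start offset (minutes after the window opens) per machine.

    Hypothetical model only: a uniform delay between 0 and
    max_delay_minutes, capped so no start falls outside the window.
    """
    rng = random.Random(seed)
    limit = min(max_delay_minutes, window_minutes)
    return [rng.randint(0, limit) for _ in range(machine_count)]


def starts_per_bucket(offsets, bucket_minutes=10):
    """Count how many machines start in each bucket of bucket_minutes."""
    return Counter(offset // bucket_minutes for offset in offsets)


if __name__ == "__main__":
    offsets = simulated_start_offsets(100, seed=1)
    buckets = starts_per_bucket(offsets)
    for bucket in sorted(buckets):
        first_minute = bucket * 10
        print(f"{first_minute:3d}-{first_minute + 9:3d} min: {buckets[bucket]} machines")
    print("Busiest 10-minute bucket:", max(buckets.values()))
```

    If randomization does not apply inside a maintenance window, every offset is effectively 0 and all machines start when the window opens, so treat the output as a best case to verify in a test collection.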

    Wednesday, September 10, 2014 1:57 PM
  • The best course of action here is to create multiple collections that each have their own distinct maintenance windows and then separate your servers between these collections appropriately. This will control and stage the installation of updates in an orderly fashion. Randomness with server patching and reboots can only lead to bad things IMO.

    Note that this doesn't change the target collection for your update deployment, these collections are strictly to set MWs.

    Also, ADRs do *not* deploy updates. ADRs create or update three other object types: Software Update Groups, Update Deployments, and Update Packages. The parameters defined in the ADR are used when creating or updating these objects.


    Jason | http://blog.configmgrftw.com | @jasonsandys

    Wednesday, September 10, 2014 3:00 PM
  • The scenario I am working on would be something like this:

    Non-Prod servers updated on X weekend - split between round 1, round 2, and round 3

    Prod servers updated on Y weekend - split between round 1, round 2, and round 3

    Rounds would be something like this: 1 = 6p-8p; 2 = 8p-10p; 3 = 10p-midnight

    This splits them up some and allows me to patch machines in a certain order.
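
    As a quick way to sanity-check the calendar for a plan like this, here is a small Python sketch that works out the 2nd Saturday of a month and three back-to-back two-hour rounds starting at 6p. The function names are made up for illustration, and it only computes dates; it does not create anything in ConfigMgr.

```python
from datetime import date, datetime, timedelta


def second_saturday(year: int, month: int) -> date:
    """Return the date of the second Saturday in the given month."""
    first = date(year, month, 1)
    # weekday(): Monday is 0, Saturday is 5
    days_until_saturday = (5 - first.weekday()) % 7
    return first + timedelta(days=days_until_saturday + 7)


def round_windows(day: date, start_hour: int = 18, rounds: int = 3,
                  hours_each: int = 2):
    """Split the evening into back-to-back rounds starting at start_hour."""
    start = datetime(day.year, day.month, day.day, start_hour)
    return [
        (start + timedelta(hours=i * hours_each),
         start + timedelta(hours=(i + 1) * hours_each))
        for i in range(rounds)
    ]


if __name__ == "__main__":
    day = second_saturday(2014, 9)
    for number, (begin, end) in enumerate(round_windows(day), start=1):
        print(f"Round {number}: {begin:%Y-%m-%d %H:%M} to {end:%Y-%m-%d %H:%M}")
```

    For September 2014 the second Saturday is the 13th, so round 3 ends at midnight going into the 14th. Each round would map to its own collection with its own MW.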

    My concern with this approach is that even splitting them up between 3 "rounds" there still might be 40 machines patching and potentially rebooting right at the beginning of the "round" (or MW). By enabling deadline randomization and setting the installation deadline on the ADR to ASAP - does this ensure that not all 40 machines will patch and reboot right at 6pm during Round 1 (etc)?

    *EDIT* - A little more info.... We currently have a bunch of groups in WSUS split between every weekend of the month and every hour of the evening. I am trying to simplify that setup when we move to SCCM while still accomplishing ordered patching when needed and not having everything patch and reboot at the same time (I don't want to bring our VMware hosts to their knees).

    • Edited by atom_acres Wednesday, September 10, 2014 3:23 PM
    Wednesday, September 10, 2014 3:13 PM
  • My concern with this approach is that even splitting them up between 3 "rounds" there still might be 40 machines patching and potentially rebooting right at the beginning of the "round" (or MW).

    Why is that a concern? It's common practice. Also, systems will have different updates queued up for them and will also take different amounts of time to install the updates in the first place so there already is some inherent randomness as to when they actually perform their reboot.

    Jason | http://blog.configmgrftw.com | @jasonsandys

    Wednesday, September 10, 2014 3:23 PM
  • This is going to sound long-winded....

    Let's say we only have 2-3 different OSs, most of our machines are set up very similarly, and we aren't getting too crazy with what updates we are pulling in - there's a good chance most of the machines will be pulling in similar updates. If they require a reboot and they all started the patch process right at 6pm, they could all be rebooting at or close to the same time. With most of our servers in VMware, I have concerns about 30-40 VMs rebooting together.

    Now, on one hand, this is a lot of IFs and VMware may be able to handle this.

    On the other hand, if this deadline randomization works like I think it does (no one has answered that question yet), then I won't have to worry about all those IFs.

    It's my job as an analyst to be paranoid :]

    *EDIT* 

    FYI: I spoke with our VMware admin and he said his main concern is our backend storage can't handle a bunch of VMs rebooting at or around the same time. (typically around 20+ that reside on the same storage pool)
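
    One way to turn that storage constraint into collection membership is sketched below in Python. It assumes you can export a list of (VM name, storage pool) pairs from your inventory; the function name, the pool names and the cap of 15 (picked only because it is below the 20 mentioned above) are placeholders. It also ignores any required reboot order, which would still need its own staggered MWs.

```python
from collections import defaultdict


def assign_rounds(vms, max_per_pool_per_round=15):
    """Assign each VM to a round so that no round holds more than
    max_per_pool_per_round VMs from the same storage pool.

    vms: iterable of (vm_name, storage_pool) pairs.
    Returns a dict mapping round number (starting at 1) to a list of VM names.
    """
    seen_per_pool = defaultdict(int)
    rounds = defaultdict(list)
    for name, pool in vms:
        # VMs on the same pool fill round 1 up to the cap, then round 2, and so on.
        round_number = seen_per_pool[pool] // max_per_pool_per_round + 1
        rounds[round_number].append(name)
        seen_per_pool[pool] += 1
    return dict(rounds)


if __name__ == "__main__":
    # Made-up inventory: 40 VMs on one pool, 10 on another.
    inventory = [(f"vm-a{i:02d}", "pool-a") for i in range(40)]
    inventory += [(f"vm-b{i:02d}", "pool-b") for i in range(10)]
    for round_number, names in sorted(assign_rounds(inventory).items()):
        print(f"Round {round_number}: {len(names)} VMs")
```

    Each resulting round would become one collection with its own MW, so the number of MWs needed follows from the busiest storage pool rather than from the total server count.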

    • Edited by atom_acres Wednesday, September 10, 2014 3:59 PM
    Wednesday, September 10, 2014 3:38 PM
  • So yes, disk IO and shared resources can certainly be a concern in a virtual server farm, but to me random reboot times and random patch installation times are the greater of those two evils. Remember that this randomization will also affect all other deployments targeted to those systems.

    To directly answer the question, I don't remember explicitly since I don't disable randomization (it was enabled pre-SP1 and couldn't be disabled in the UI so this was discussed multiple times). I seem to remember that deadline randomization had no effect within a maintenance window but can't find that documented anywhere anymore -- not that it isn't, I just can't find it right now (or I could be wrong).

    So, ultimately though, what's to stop you from creating more granular collections with staggered MWs? It's a one-time activity that will mitigate the risk you are concerned about.


    Jason | http://blog.configmgrftw.com | @jasonsandys

    Wednesday, September 10, 2014 4:06 PM
  • Yes, that would be an alternative, and it would be a one-time setup. I am looking to simplify the process and setup if possible. If I can't do what I outlined above, I will have to use this setup. It just seems like there should be an easier way than having to manage a bunch of hourly patch windows.

    I agree I do not want my servers 'randomly' rebooting but if I know it will only happen during the maintenance window I define and that window is only 2 hrs or so - then I am OK with that. 

    Thanks for all the answers and help. I will go ahead and re-mark the answer. However, if anyone has the answer to the following question, please respond below!

    By utilizing maintenance windows, enabling deadline randomization, and setting the installation deadline on my ADR to ASAP - does this ensure that not all machines will patch and reboot right at the beginning of a maintenance window? AKA, will they spread out their patches/reboots across the deadline randomization period but within the constraints of the maintenance window?

    *EDIT*

    decided to create a new post for the question above - you can find it here

    • Edited by atom_acres Wednesday, September 10, 2014 4:57 PM
    Wednesday, September 10, 2014 4:36 PM