Configuration Manager Site and Component Failure Scenario

  • Question

  • I've been scouring the web and sifting through as many documents as I can find to answer the questions below. I have also been working in a lab to figure some of this out myself. Before launching into my next round of experiments, I wanted to pose a few questions to the experts in this forum in hopes of guiding my journey down the SCCM 2012 rabbit hole.

    So far I recognize that the best way to approach HA in an SCCM environment is to provision multiple instances of the following Site System Roles:

    • Management Point
    • Distribution Point
    • Software Update Point

    There are a few others but I am not concerned with them at this point.

    I also recognize the following (straight from the docs and key facts highlighted):

    Client operations are autonomous

    Configuration Manager client autonomy includes the following:

    • Clients do not require continuous contact with any specific site system servers. They use known configurations to perform preconfigured actions on a schedule.
    • Clients can use any available instance of a site system role that provides services to clients, and they will attempt to contact known servers until an available server is located.
    • Clients can run inventory, software deployments, and similar scheduled actions independent of direct contact with site system servers.
    • Clients that are configured to use a fallback status point can submit details to the fallback status point when they cannot communicate with a management point.
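    The fallback behavior described in the list above can be sketched in miniature (illustrative Python only, not actual client code; the server names and availability check are made up for the example):

    ```python
    def contact_any_mp(known_mps, is_available):
        """Try known management points in order until an available one is found.

        known_mps: list of MP hostnames the client has cached.
        is_available: callable returning True if the given MP responds.
        Returns the first available MP, or None if none are reachable.
        """
        for mp in known_mps:       # clients walk their known server list
            if is_available(mp):
                return mp
        return None                # no MP reachable: work continues from cache

    # Illustration: only the second MP responds.
    mps = ["MP01.contoso.com", "MP02.contoso.com"]
    up = {"MP02.contoso.com"}
    print(contact_any_mp(mps, lambda mp: mp in up))  # MP02.contoso.com
    ```

    The point is simply that no single MP is special to the client; any reachable instance of the role will do.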

    Clients cache information to use in the future

    When a client communicates with a management point, the client can obtain and cache the following information:

    • Client settings.
    • Client schedules.
    • Information about software deployments and a download of the software the client is scheduled to install, when the deployment is configured for this action.

    When a client cannot contact a management point, the following actions are taken:

    • Clients locally cache the status, state, and client information they report to the site, and transfer this data after they establish contact with a management point.
    • Clients continue to run deployments that they have already received, even when they cannot contact a management point.
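    That cache-and-forward behavior can be sketched roughly as follows (an illustrative Python toy, assuming a simple queue-and-flush model; this is not how the agent is actually implemented):

    ```python
    class OfflineClient:
        """Caches status messages locally while no MP is reachable,
        then flushes the backlog once contact is re-established."""

        def __init__(self):
            self.pending = []   # locally cached status/state messages
            self.sent = []      # messages delivered to a management point

        def report(self, message, mp_online):
            if mp_online:
                self.flush()                  # transfer cached data first
                self.sent.append(message)
            else:
                self.pending.append(message)  # MP down: cache locally

        def flush(self):
            self.sent.extend(self.pending)
            self.pending.clear()

    client = OfflineClient()
    client.report("inventory", mp_online=False)      # cached
    client.report("update-state", mp_online=False)   # cached
    client.report("heartbeat", mp_online=True)       # backlog flushed, then sent
    print(client.sent)     # ['inventory', 'update-state', 'heartbeat']
    print(client.pending)  # []
    ```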

    So assuming that I am employing the following site design:

    • Single Primary Site
    • Clustered SQL database on disparate system
    • Multiple Distribution Points

    Now consider the scenario where I have scheduled an update deployment and a few hours later the Primary Site Server goes down. Let's also assume that the updates were successfully pushed to the DPs.

    What happens at the client level with regards to the scheduled updates now that the Primary is down (along with its associated MP)?

    So my initial thoughts are that some of the clients will have pulled down information about the update schedule and cached it. Can I assume that these clients will successfully install the updates?

    Can I mitigate this issue (make sure that all clients pick up the directive to install said updates) by provisioning another MP on a different Site System?

    What does having an additional SUP buy me? Isn't this just a mechanism to interface with Microsoft in order to pull down specific updates?

    • Edited by SecOpsGuy Thursday, March 12, 2015 6:06 PM
    Thursday, March 12, 2015 5:25 PM

Answers

  • First, note that using a separate, clustered SQL DB generally adds complexity and latency with no true advantage. Most folks, including very large enterprises, use a co-located SQL Server. Because the primary site server itself is always a single point of failure, clustering SQL adds no real high availability.

    What happens at the client level with regards to the scheduled updates now that the Primary is down (along with its associated MP)?

    Nothing. The clients know nothing about the primary site server and, as you've already pointed out, they work autonomously based on the policy they previously downloaded. Thus, assuming the clients already downloaded the policy, which happens every 60 minutes by default, all they need access to after that is a DP to download the content. They will start doing this at the start time of the deployment (plus a random interval of up to an hour or two so the DP isn't overwhelmed) if the deployment is required, or at whatever time the user initiates the updates if the deployment is available. Note that this is a reason that, for high-availability purposes, you should use a SQL replica on your MPs; that way the back-end DB is not needed for clients to download policy.
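    The timing of that content download can be sketched like so (illustrative Python; the two-hour maximum offset is an assumption for the example, not a documented constant):

    ```python
    import random
    from datetime import datetime, timedelta

    def effective_start(deadline: datetime, max_offset_hours: float = 2.0) -> datetime:
        """Each client adds a random delay after the deployment start time,
        spreading the load so the DP isn't hit by every client at once."""
        offset = timedelta(hours=random.uniform(0, max_offset_hours))
        return deadline + offset

    start = datetime(2015, 3, 12, 20, 0)
    t = effective_start(start)
    # The effective start always falls within the randomization window.
    print(start <= t <= start + timedelta(hours=2))  # True
    ```

    The key point for the failure scenario is that none of this depends on the primary site server being up, only on the DP holding the content.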

    What does having an additional SUP buy me? Isn't this just a mechanism to interface with Microsoft in order to pull down specific updates?

    Partly. The SUP is really just a thin layer of ConfigMgr control over the local WSUS instance. This WSUS instance is used to download the update metadata from Microsoft and sync it to the DB, but it is also used by clients to download that same metadata as well as EULAs. Thus, for clients to be able to scan for update compliance/applicability, they must have access to a WSUS instance corresponding to a SUP in the site.


    Jason | http://blog.configmgrftw.com | @jasonsandys

    • Proposed as answer by Jeff Poling Thursday, March 12, 2015 6:49 PM
    • Marked as answer by SecOpsGuy Thursday, March 12, 2015 7:17 PM
    • Unmarked as answer by SecOpsGuy Thursday, March 12, 2015 8:23 PM
    • Marked as answer by SecOpsGuy Thursday, March 12, 2015 8:25 PM
    Thursday, March 12, 2015 6:43 PM

All replies

  • Ok, great. Thanks for your feedback! Please allow me to probe a little further...

    With a clustered DB the additional MPs would still be able to talk to the site's DB, right? Wouldn't this be an advantage of using a clustered DB and save you the hassle of running a replica? Am I missing an important caveat?

    So let's say that the additional MPs do leverage the Clustered DB and the Primary Site Server has gone down. Will the clients that have yet to check-in and grab the latest policy (with the scheduled updates we're most concerned about) be able to check-in to one of the MPs still up and grab what it needs to install the updates?

    Thanks again!


    • Edited by SecOpsGuy Thursday, March 12, 2015 7:03 PM
    Thursday, March 12, 2015 7:03 PM
  • For the clustered DB, kind of. Some of what happens, like inventory, must go through the primary site server. Using SQL replicas provides the same advantage without the overhead, complexity, licensing, and headache of either failover clustering or a remote SQL instance. Here's a great post from Steve Thompson on why not to use a remote SQL instance: http://stevethompsonmvp.wordpress.com/2014/12/20/why-you-should-not-use-remote-sql-server-with-configmgr-2012/

    For the question in the second paragraph, yes.


    Jason | http://blog.configmgrftw.com | @jasonsandys

    Thursday, March 12, 2015 8:18 PM
  • Booyah! Thanks again, man. Much appreciated.
    Thursday, March 12, 2015 8:26 PM