High availability Architecture question

  • Question

  • Hi all,

    We are currently designing a SCCM infrastructure solution for our customer and would like to ask the experts about the most suitable architecture.

    The company has fewer than 500 employees, one principal site (headquarters), and one remote site (fewer than 100 employees). The WAN link between them is fairly slow.

    That said, we clearly don't need a CAS (central administration site); we will use a standalone primary site with 2 or 3 DPs across the principal and remote sites.

    But for high availability, should we use secondary sites, add another standalone primary site, or use a CAS with two or more child primary sites?

    Thank you in advance for your help.

    Thursday, December 19, 2013 10:54 AM

Answers


  • For acceptable downtime it's 4 to 6h. 

    I wouldn't call that "highly available" then. You can easily restore from backup in that time.

    Torsten Meringer | http://www.mssccmfaq.de

    • Marked as answer by Adil Faiz Monday, December 23, 2013 3:00 PM
    Thursday, December 19, 2013 12:39 PM

All replies

  • How about the growth of the environment? If you don't expect much growth, one standalone primary site server and one secondary site server are enough for 600 employees. Otherwise, we need a CAS with primary and secondary site servers.

    • Proposed as answer by Joyce L Friday, December 20, 2013 3:15 AM
    Thursday, December 19, 2013 11:07 AM
  • I would not say that growth from 500 clients to more than 100,000 clients is that likely, but I could be wrong :)

    You state that you would like to see some HA in ConfigMgr. Is the WAN link between the sites HA? Are you planning to run the site server as a VM on a clustered virtualization platform? Redundant switches? What I'm trying to get at is: do you really need HA? Would it be good enough to have a VM that you can snapshot or replicate?

    You state that the WAN link is slow, so I would recommend a standalone primary site server at the HQ and a secondary site server with a DP at the branch office (BO).

    This is, however, based only on the information you have provided. Other factors could weigh in, so it is not guaranteed to be the best solution.


    Tim Nilimaa | Blog: http://infoworks.tv | Twitter: @timnilimaa

    Thursday, December 19, 2013 11:19 AM
  • How about the growth of the environment? If you don't expect much growth, one standalone primary site server and one secondary site server are enough for 600 employees. Otherwise, we need a CAS with primary and secondary site servers.

    Thanks Gan for your reply,

    The environment won't grow above 1000 employees.

    So one primary site server and one secondary site server will do. Suppose the primary site server is down: will the secondary site server carry on?


    • Edited by Adil Faiz Thursday, December 19, 2013 11:48 AM
    Thursday, December 19, 2013 11:47 AM
  • Yes, but why do you want "high availability"? What features should be highly available? What downtime would be acceptable?

    Torsten Meringer | http://www.mssccmfaq.de

    Thursday, December 19, 2013 11:52 AM
  • I would not say that growth from 500 clients to more than 100,000 clients is that likely, but I could be wrong :)

    You state that you would like to see some HA in ConfigMgr. Is the WAN link between the sites HA? Are you planning to run the site server as a VM on a clustered virtualization platform? Redundant switches? What I'm trying to get at is: do you really need HA? Would it be good enough to have a VM that you can snapshot or replicate?

    You state that the WAN link is slow, so I would recommend a standalone primary site server at the HQ and a secondary site server with a DP at the branch office (BO).

    This is, however, based only on the information you have provided. Other factors could weigh in, so it is not guaranteed to be the best solution.


    Tim Nilimaa | Blog: http://infoworks.tv | Twitter: @timnilimaa

    Thanks tim for your input,

    Most likely the servers will be VMs. The reason for my question is that the customer wants to be sure that the infrastructure will stay up if the primary site server goes down. How can we provide this HA?

    Thursday, December 19, 2013 12:01 PM
  • Yes, but why do you want "high availability"? What features should be highly available? What downtime would be acceptable?

    Torsten Meringer | http://www.mssccmfaq.de

    Hi Torsten, thanks for your interest :)

    HA is the customer's request. Currently, the most important feature that needs to be highly available is the SUP, as we will replace the customer's existing WSUS with the SCCM patch management feature.

    For acceptable downtime it's 4 to 6h. 

    Thank you in advance.

    Thursday, December 19, 2013 12:06 PM
  • If the primary site server is down, the secondary site server cannot carry on all the activities. We can manage availability with backups, but it's up to you to decide the acceptable downtime.
    Thursday, December 19, 2013 12:10 PM

  • For acceptable downtime it's 4 to 6h. 

    I wouldn't call that "highly available" then. You can easily restore from backup in that time.

    Torsten Meringer | http://www.mssccmfaq.de

    • Marked as answer by Adil Faiz Monday, December 23, 2013 3:00 PM
    Thursday, December 19, 2013 12:39 PM
  • You're right. But let's suppose that acceptable downtime is 1h or less, what solution can we suggest?
    Thursday, December 19, 2013 12:42 PM
  • You can go with a hierarchy of sites such as CAS, primary site, and secondary site.

    Thursday, December 19, 2013 1:01 PM
  • You can go with a hierarchy of sites such as CAS, primary site, and secondary site.

    How will creating a CAS / primary / secondary hierarchy give you any HA?


    http://www.enhansoft.com/

    Thursday, December 19, 2013 1:09 PM
  • You can go with a hierarchy of sites such as CAS, primary site, and secondary site.


    That will *not* provide high availability!

    Torsten Meringer | http://www.mssccmfaq.de

    Thursday, December 19, 2013 1:11 PM
  • You're right. But let's suppose that acceptable downtime is 1h or less, what solution can we suggest?

    If the primary site is down, so what? All this means is that you can't deploy any NEW SU during that time frame. (This does assume that your SUP and DP are on a different server from your primary site.)

    But honestly, for 1000 PCs, what is the cost to the company if the server is down for 6 to 24 hours and they can't deploy NEW SU automatically? Now, what is the cost of creating an HA solution? Is it cost effective? I would suggest that (for most companies) no, it is not worth it, particularly when you say the WAN is slow. If they wanted a real HA solution, the WAN would be FAST and they would have a redundant setup at the second location.
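
The cost question above can be turned into a rough back-of-the-envelope check. This is only a sketch: the function names are made up for illustration, and every number passed in is a placeholder, not a figure from this thread or from any real environment.

```python
def expected_annual_outage_cost(outages_per_year, hours_per_outage, cost_per_hour):
    """Expected yearly cost of the site server being unavailable."""
    return outages_per_year * hours_per_outage * cost_per_hour


def ha_pays_off(outage_cost_without_ha, outage_cost_with_ha, annual_ha_cost):
    """HA is worth it only if the outage cost it avoids exceeds what it costs."""
    avoided = outage_cost_without_ha - outage_cost_with_ha
    return avoided > annual_ha_cost


# Placeholder inputs (assumptions): replace with the customer's own estimates.
without_ha = expected_annual_outage_cost(outages_per_year=1, hours_per_outage=6, cost_per_hour=50)
with_ha = expected_annual_outage_cost(outages_per_year=1, hours_per_outage=1, cost_per_hour=50)
print(ha_pays_off(without_ha, with_ha, annual_ha_cost=5000))
```

The comparison is deliberately crude: it ignores the operational cost of running a more complex design, which the thread suggests matters for a shop of this size.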


    http://www.enhansoft.com/

    Thursday, December 19, 2013 1:17 PM
  • Thanks for the reply. The documentation below states that this will provide high availability. Please correct me if I am wrong.

    http://technet.microsoft.com/en-us/library/hh846246.aspx

    At each site, you deploy site system roles to provide the services that you want clients to use at that site. The site database contains the configuration information for the site and for all clients. Use one or more of the available options to provide for high availability of the site database, and the recovery of the site and site database if needed.

    The following table provides information about the available options for Configuration Manager sites that support high availability.

    Option: Use a SQL Server cluster to host the site database
    More information: When you use a SQL Server cluster for the database at a central administration site or primary site, you use the fail-over support built into SQL Server. Secondary sites cannot use a SQL Server cluster, and do not support backup or restoration of their site database. You recover a secondary site by reinstalling the secondary site from its parent primary site.

    Option: Deploy a hierarchy of sites with a central administration site, and one or more child primary sites
    More information: This configuration can provide fault tolerance when your sites manage overlapping segments of your network. In addition, this configuration offers an additional recovery option to use the information in the shared database available at another site, to rebuild the site database at the recovered site. You can use this option to replace a failed or unavailable backup of the failed sites database.

    Option: Create regular backups at central administration sites and primary sites
    More information: When you create and test a regular site backup, you can ensure that you have the data necessary to recover a site, and the experience to recover a site in the minimal amount of time.

    Option: Install multiple instances of site system roles
    More information: When you install multiple instances of critical site system roles such as the management point and distribution point, you provide redundant points of contact for clients in the event that a specific site system server is off-line.

    Option: Install multiple instances of the SMS Provider at a site
    More information: The SMS Provider provides the point of administrative contact for one or more Configuration Manager consoles. When you install multiple SMS Providers, you can provide redundancy for contact points to administer your site and hierarchy.
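
To put a number on what "multiple instances of site system roles" buys, here is a minimal sketch of the usual parallel-redundancy estimate. It assumes each instance fails independently and has the same availability, which real servers sharing a host, rack, or WAN link do not strictly satisfy. The function name and the sample values are illustrative assumptions.

```python
def redundant_availability(per_instance_availability, instances):
    """Probability that at least one of several independent instances is up."""
    if not 0.0 <= per_instance_availability <= 1.0:
        raise ValueError("availability must be between 0 and 1")
    if instances < 1:
        raise ValueError("need at least one instance")
    # All instances must be down at once for the role to be unavailable.
    return 1.0 - (1.0 - per_instance_availability) ** instances


# Hypothetical per-server availability of 0.9, chosen only to keep the arithmetic visible.
for count in (1, 2, 3):
    print(count, redundant_availability(0.9, count))
```

Each added instance helps less than the previous one, which is one way to see why a second management point or distribution point is usually the big step and a fourth rarely is.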

    Thursday, December 19, 2013 1:26 PM
  • Either offload some roles and duplicate them or (and I like option 1 way better) put the primary on a virtual machine and use VMware or Hyper-V Replica as your HA solution. It's not officially supported, but I've been doing it for years and it works wonderfully. Just put everything on 1 server and make that server HA.

    Frankly, I think you are overthinking this. Don't make a complicated design for such a small shop; they will never be able to manage it. K.I.S.S. I've managed sites over 10x that size with a single primary site server and remote DPs.


    John Marcum | http://myitforum.com/myitforumwp/author/johnmarcum/



    • Edited by John Marcum Thursday, December 19, 2013 2:21 PM
    Thursday, December 19, 2013 2:19 PM
  • A CAS will not provide HA, nor will adding any extra primaries.

    Overlapping content boundaries will help if you have two primaries and the DPs at one site are offline. However, since clients need to talk to an MP, this will NOT help if the MP is down, because the client will not automatically fail over to the second primary site. By the time you have failed over all 1000 clients, you could have rebuilt the site server.

    Hyper-V clustering is a supported method to achieve HA. However, to be truly effective, the nodes of the cluster should be in two different data centers.


    http://www.enhansoft.com/

    Friday, December 20, 2013 2:42 AM
  • I can give you my layout, for a company of 100,000 computers and growing

    CAS - DB on standalone SQL instance, non-clustered

    2 Primaries - Both DBs on standalone SQL instances, non-clustered

    3 SUPs on each primary.  3-4 DPs on each primary, 3 MPs on each primary.

    Lots of remote DPs and secondaries.

    Everything except the CAS/Primaries/SQL is virtualized.  Disk is mirrored on my physical servers between two datacenters.

    Could I have made it more fault tolerant?  Sure.  At what cost?  Clustering would have at least doubled my cost.  Having multiple SUPs/MPs/DPs covers an array of possible issues, and they are located on different physical VM hosts in different datacenters.

    Multiple Primary sites can aid in recovery, not HA.  Big difference. 

    Without knowing more about your customer's layout, I would recommend:

    1 Primary with SQL standard colocated.  All roles installed except the ones below, if possible.

    2 additional servers (virtual) with MP/SUP/DP role

    1 workstation DP at your smaller site
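
The recommended layout can be written down as data and checked for roles that exist on only one server. This is a rough sketch: the server names are hypothetical, and the primary is reduced to the two roles that the list does not duplicate (site server and SQL), which is an assumption about how "all roles installed except the ones below" plays out.

```python
from collections import Counter

# Hypothetical server names; roles follow the recommendation above.
layout = {
    "PRIMARY01": ["SiteServer", "SQL"],   # primary with SQL colocated
    "SYSTEM01": ["MP", "SUP", "DP"],      # additional virtual server 1
    "SYSTEM02": ["MP", "SUP", "DP"],      # additional virtual server 2
    "BRANCH-WKS01": ["DP"],               # workstation DP at the smaller site
}


def single_instance_roles(layout):
    """Return the roles that are hosted on exactly one server."""
    counts = Counter(role for roles in layout.values() for role in roles)
    return sorted(role for role, n in counts.items() if n == 1)


print(single_instance_roles(layout))
```

Under these assumptions the client-facing roles (MP, SUP, DP) are duplicated, while the site server and its database remain single instances, which matches the thread's position that this design aims at quick recovery rather than full HA.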

    Friday, December 20, 2013 6:05 PM
  • But Andy, if your SQL is not clustered, how does this help with HA? If SQL goes down, so do the MPs, and therefore it is not HA.

    BTW, it sounds like SQL is remote from CM. Is it?
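
The dependency described here can be sketched as a series calculation: the management point service is up only when the site database is up and at least one MP is up. The sketch assumes independent failures, MPs that read directly from a single site database (no MP database replicas), and made-up availability values; the function name is illustrative.

```python
def mp_service_availability(sql_availability, mp_availability, mp_count):
    """Availability of the MP service when every MP depends on one SQL instance."""
    # At least one MP must be up...
    any_mp_up = 1.0 - (1.0 - mp_availability) ** mp_count
    # ...and the shared site database must be up as well.
    return sql_availability * any_mp_up


# Hypothetical values: adding MPs approaches, but never exceeds, the SQL availability.
for count in (1, 2, 3):
    print(count, mp_service_availability(0.99, 0.9, count))
```

However many MPs are added, the result is capped by the availability of the single SQL instance, which is the point being made about non-clustered SQL.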


    http://www.enhansoft.com/

    Friday, December 20, 2013 6:28 PM
  • But Andy, if your SQL is not clustered, how does this help with HA? If SQL goes down, so do the MPs, and therefore it is not HA.

    BTW, it sounds like SQL is remote from CM. Is it?


    http://www.enhansoft.com/

    Correct.

    My example above was more of a "here's how I'm doing it", not a pristine example of HA.  While I am not against using all of the scenarios for making SCCM more redundant, I'm also not convinced they are worth the cost and effort for our environment.

    If it came down to making the MPs more fault tolerant, I'd be more inclined to use MP replicas to buffer this rather than SQL clustering.  We can take the occasional SQL server reboot for patching or upgrades with no issues. 

    SQL is indeed remote.  Just works better for our DBAs. 

    Friday, December 20, 2013 11:15 PM
  • Dang Garth.... He said "remote SQL" and "better for DBA's" in the same post. :-)


    John Marcum | http://myitforum.com/myitforumwp/author/johnmarcum/

    Friday, December 20, 2013 11:56 PM
  • Dang Garth.... He said "remote SQL" and "better for DBA's" in the same post. :-)

    I saw that.. Run away screaming..

    Andy must like pain or drinks a lot! ;-)


    http://www.enhansoft.com/

    Saturday, December 21, 2013 12:06 AM
  • Dang Garth.... He said "remote SQL" and "better for DBA's" in the same post. :-)

    I saw that.. Run away screaming..

    Andy must like pain or drinks a lot! ;-)


    http://www.enhansoft.com/


    LOL! Me too!

    John Marcum | http://myitforum.com/myitforumwp/author/johnmarcum/

    Saturday, December 21, 2013 12:08 AM
  • I can give you my layout, for a company of 100,000 computers and growing


    That might be ok for a company with 100k+ but the original question was about 500 clients.

    Torsten Meringer | http://www.mssccmfaq.de

    Saturday, December 21, 2013 4:07 PM
  • Thank you all for your help, guys :)
    Monday, December 23, 2013 3:01 PM
  • I can give you my layout, for a company of 100,000 computers and growing


    That might be ok for a company with 100k+ but the original question was about 500 clients.

    Torsten Meringer | http://www.mssccmfaq.de

    For which I also made a recommendation.  Again, it was an example showing that even a "large" company didn't need all of the HA bells and whistles.

    With regard to remote SQL, of course I like to drink a lot.  But I have to say, when using a SQL cluster (for "HA"), you're forced to use the same config.

    Monday, December 23, 2013 3:17 PM