HA and DR

  • Question


  • We need to implement SCOM in our environment.  We plan to use a two-node SQL Server cluster for the database.  We have requirements for a highly available and disaster-recovery-tolerant solution and have the following questions:

    1.  Is it possible to have HA and DR of SCOM Server Components?
    2.  If so, how many minimum servers would be required for HA Setup?
    3.  If so, how many minimum servers would be required for DR Setup?
    4.  If so, what would be recommendation with respect to Instances? Separate instance or Shared would also work?
    5.  What would be bare minimum hardware recommendations for a small to medium size organization? any sizing help would be great help!
    6.  If possible then what should be considered while implementing this solution?
    7.  Is HA and DR really necessary for such systems?  Is it a norm in the industry?
    8.  If we do not go for HA and/or DR, how much recovery time is needed in case of a single server?
    9.  Any link to article/blog/website would be appreciated?
    Monday, May 27, 2019 5:43 PM

Answers

  • Hey

    1.  Is it possible to have HA and DR of SCOM Server Components?

    Of course it is, and it is well documented here by Microsoft:

    High Availability and Disaster Recovery
    2.  If so, how many minimum servers would be required for HA Setup?

    SCOM by design is built for HA, meaning your Management Servers will take over the load of the failed one. You technically need at least two MS for HA. Please read this:

    SCOM Windows Agents Failover
    3.  If so, how many minimum servers would be required for DR Setup?

    Please read the doc linked in first point, and if you still have this question, we can discuss.
    4.  If so, what would be recommendation with respect to Instances? Separate instance or Shared would also work?

    I'm sorry, an instance of what? The database? The SCOM DB follows the same HA/DR practices as any other DB.
    5.  What would be bare minimum hardware recommendations for a small to medium size organization? any sizing help would be great help!

    There's a sizing tool that might help, please see here:

    Sizing SCOM 2012R2 and 2016
    6.  If possible then what should be considered while implementing this solution?

    Can you elaborate on your question in more detail?
    7.  Is HA and DR really necessary for such systems?  Is it a norm in the industry?

    It really depends. Some organizations do it, while others don't. It comes down to whether server monitoring is critical enough for you to justify investing in more infrastructure for SCOM DR.
    8.  If we do not go for HA and/or DR, how much recovery time is needed in case of a single server?

    Like I said, HA can largely be taken care of with little to no downtime.
    9.  Any link to article/blog/website would be appreciated?

    Please give this a read, it has great info about this.

    SCOM High Availability and DR
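
    As a rough illustration of the failover behaviour described in point 2, here is a minimal Python sketch. It is a toy model only, not SCOM code: the `Agent` class, the server names `MS1`/`MS2`, and the agent names are hypothetical, and it simply assumes each agent has one primary management server plus an ordered list of failover servers.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Agent:
    """Toy model: an agent with a primary and ordered failover management servers."""
    name: str
    primary: str
    failovers: list[str] = field(default_factory=list)

    def reporting_target(self, online: set[str]) -> str | None:
        """Return the first reachable management server, or None if all are down."""
        for server in [self.primary, *self.failovers]:
            if server in online:
                return server
        return None


def distribute(agents: list[Agent], online: set[str]) -> dict[str, list[str]]:
    """Group agent names by the management server they would report to."""
    load: dict[str, list[str]] = {server: [] for server in sorted(online)}
    for agent in agents:
        target = agent.reporting_target(online)
        if target is not None:
            load[target].append(agent.name)
    return load


# Hypothetical two-management-server setup (the minimum mentioned for HA).
agents = [
    Agent("web01", primary="MS1", failovers=["MS2"]),
    Agent("sql01", primary="MS2", failovers=["MS1"]),
]

normal_load = distribute(agents, online={"MS1", "MS2"})
failed_load = distribute(agents, online={"MS2"})  # MS1 is down

print(normal_load)
print(failed_load)
```

    With both servers online each one carries its own agent. With `MS1` down, `MS2` carries both, which is why a second management server is the minimum for HA.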

    Hope this helps

    Cheers


    Sam (Blog: AnalyticOps Insights, Twitter: Sameer Mhaisekar)

    Monday, May 27, 2019 6:50 PM
  • Hi,

    1. Is it possible to have HA and DR of SCOM Server Components?

    Answer: Yes, everything except the SCOM Reporting server.


    2.  If so, how many minimum servers would be required for HA Setup?

    Answer: At least two (2) SCOM management servers, and a SQL Server cluster for high availability of the SCOM database.


    3. If so, how many minimum servers would be required for DR Setup?

    Answer: None. DR is a simple backup and restore.

    4. If so, what would be recommendation with respect to Instances? Separate instance or Shared would also work?

    Answer: Create a SQL server cluster or AlwaysOn for your SCOM database.

    5. What would be bare minimum hardware recommendations for a small to medium size organization? any sizing help would be great help!

    Answer: Please refer to the official Microsoft documentation in the link below:

    System requirements for System Center Operations Manager

    6. If possible then what should be considered while implementing this solution?

    Answer: Plan your design according to the size of your environment. You can refer to the link below for more information.

    Planning a Management Group Design

    7. Is HA and DR really necessary for such systems?  Is it a norm in the industry?

    Answer: SCOM is an IT tool which monitors infrastructure and application health. If you want to monitor your production and get notified if something goes bad, you should consider SCOM as an important component and therefore high availability becomes important.

    8. If we do not go for HA and/or DR, how much recovery time is needed in case of a single server?

    Answer: If you don't have a DR, you can't restore anything.

    Everything is stored in the SCOM database and data warehouse, so make sure to have backups of them. It is difficult to say how long a recovery will take, but installing a new SCOM management server is pretty straightforward and quick.

    9. Any link to article/blog/website would be appreciated?

    Answer: You can use your favorite search engine; there are many blogs and guides out there.

    The official Microsoft documentation for SCOM has many step-by-step guides as well, and you should definitely check Kevin Holman's blog: https://kevinholman.com

    You can also check my blog for some guides: https://thesystemcenterblog.com
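
    To make the backup advice in point 8 concrete, here is a minimal Python sketch that flags SCOM databases whose latest backup is older than a chosen limit. It is an illustration under assumptions: the database names are the setup defaults (yours may differ), the timestamps and the 24-hour limit are made up, and in practice the "last backup" times would come from your own backup tooling.

```python
from __future__ import annotations

from datetime import datetime, timedelta

# Default names of the operational database and the data warehouse (assumption:
# the defaults were kept during setup).
SCOM_DATABASES = ("OperationsManager", "OperationsManagerDW")


def stale_backups(
    last_backup: dict[str, datetime],
    now: datetime,
    max_age: timedelta,
) -> list[str]:
    """Return the databases with no backup or with a backup older than max_age."""
    stale = []
    for db in SCOM_DATABASES:
        taken = last_backup.get(db)
        if taken is None or now - taken > max_age:
            stale.append(db)
    return stale


# Hypothetical timestamps for illustration only.
now = datetime(2019, 5, 28, 6, 0)
last_backup = {
    "OperationsManager": datetime(2019, 5, 28, 1, 0),    # 5 hours old
    "OperationsManagerDW": datetime(2019, 5, 26, 1, 0),  # 53 hours old
}

stale = stale_backups(last_backup, now, max_age=timedelta(hours=24))
print(stale)
```

    Here only the data warehouse would be flagged. Without a recent backup of both databases, a restore-based recovery loses everything since the last good copy.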


    Best regards,
    Leon


    Blog: https://thesystemcenterblog.com



    Monday, May 27, 2019 6:50 PM

All replies

  • Thanks; appreciate it.

    1.  Thanks.  What about the Reporting Server.  Can it be made HA?

    2.  Thanks.

    3.  If Backup/Restore is the only option for DR then can we have Management Server or Servers already installed and in place in the DR Site and once there is failure at primary site, one can start the Recovery Process?

    4.  Actually was trying to ask what is recommended i.e. shared sql instance, separate sql instance, etc.  And can it share SQL Instance or SQL Server with other System Center products?

    5.  Thanks.

    6.  That is with respect to HA and DR Solution

    7.  Noted.  Thanks.

    8.  Thanks.

    9.  Thanks.

    Monday, May 27, 2019 8:16 PM
  • Thanks; appreciate it.

    1.  Thanks.  So, no solution for Reporting Server?  I guess would need to reconfigure if in a 2 node scenario the one managing reporting goes down.  Please, correct me.

    2.  Thanks.

    3.   If Backup/Restore is the only option for DR then can we have Management Server or Servers already installed and in place in the DR Site and once there is failure at primary site, one can start the Recovery Process and save time?

    4.  Thanks.  Would you recommend local or remote SQL?  And also, can it share Instance and/or Server with other System Center Products?

    5.  Thanks.

    6.  Thanks.

    7.  Thanks.

    8.  Thanks.  Would having them installed already in advance be a good strategy?

    9.  Thanks.

    Monday, May 27, 2019 8:21 PM
  • Hi

    1. The reporting server is not really considered critical, so many customers choose to keep it simple with no HA. If you lose it, you only lose the ability to fetch reports during that time, not the data. You can also re-install it at any time afterwards. If you absolutely must, see here - great info:

    High Availability for OM 2012 Reporting Server?

    SCOM 2012 R2 Reporting Services High Availability Configuration Setting

    3. You can, and it is mentioned in the 2nd method in the doc I posted, please check it out.

    4. You can, but I strongly advise against it. The SCOM DB is quite IO intensive and it's better to have it on its own. In addition, I've actually seen some issues caused by sharing SSRS among different SC products.

    I also recommend remote SQL if your SCOM monitors more than, say, a hundred servers. That also makes it more robust from an HA/DR point of view.

    Hope that helps

    Cheers


    Sam (Blog: AnalyticOps Insights, Twitter: Sameer Mhaisekar)


    Monday, May 27, 2019 8:26 PM
  • 1. Thanks. What about the Reporting Server.  Can it be made HA?

    Answer:
    Unfortunately no. Reporting is not that critical: even if you lose the server you aren't losing any data, only the engine to retrieve it. In the worst case you are not able to create reports for a couple of hours.

    3. If Backup/Restore is the only option for DR then can we have Management Server or Servers already installed and in place in the DR Site and once there is failure at primary site, one can start the Recovery Process?

    Answer: Yes you "can" have an additional Management server on your DR site, if you need to perform a recovery, you can refer to Kevin's blog post HERE about recovering a SCOM management server.

    4. Actually was trying to ask what is recommended i.e. shared sql instance, separate sql instance, etc.  And can it share SQL Instance or SQL Server with other System Center products?

    Answer: I would recommend having it on a separate SQL instance. You "can" share the SCOM instance with other System Center products, but I don't think it's recommended.

    You can read the blog post below about this:
    Can the SCOM databases share a SQL instance with other databases?


    Blog: https://thesystemcenterblog.com

    Monday, May 27, 2019 8:28 PM
  • Please refer to Paul Keely's "SCOM 2012 R2 HA options with Server 2012 and SQL 2012" when designing HA for SCOM. Although it was written for SCOM 2012, it may also apply to SCOM 2016 and SCOM 2019.
    https://gallery.technet.microsoft.com/SCOM-2012-R2-HA-options-540beb95

    1. What about the Reporting Server.  Can it be made HA?
    Build two reporting servers and use Network Load Balancing to share the SCOM reporting request workload.

    3. A DR solution for SCOM comes down to how we recover the Operations Manager operational database and the Operations Manager data warehouse database.
    There are options for HA of SQL:
    a) Using a SQL cluster: this only protects against the failure of a single SQL node.
    b) Using a SQL cluster in the main site and AlwaysOn in the remote site.

    Roger
    Tuesday, May 28, 2019 3:22 AM
  • 1.  Thanks.  So, recovery process should be in place in case of a failure to follow.

    3.  Thanks.

    4.  Noted.  Thanks.

    Tuesday, May 28, 2019 6:08 PM
  • 1.  Is it practical or overkill?

    3.  Thanks.

    Tuesday, May 28, 2019 6:09 PM
  • Well, you should have a backup of everything, but the Reporting service can easily be reinstalled.

    You can go for the extra redundancy if you want to or have spare resources.


    Blog: https://thesystemcenterblog.com

    Tuesday, May 28, 2019 6:17 PM
  • Thanks for the reply.

    Just to clarify last thing about Reporting Service.  It would be installed on the Server itself i.e. SCOM Server, only its DB would be present at SQL Server/Cluster?

    Also, the NLB-ing of Reporting Services that you often talk about also happens at the SCOM-level where two-2 servers have Reporting Services installed and are NLB-ed together?

    Please, correct my understanding.

    Thanks in advance.

    Thursday, June 20, 2019 7:16 AM