SCOM 2019 - What if we cross the 3000 agent limit per Management Server?

  • Question

  • Hi everyone,

    I'm currently working on a SCOM design for a client with a HUGE load of servers to monitor - 25k+. I have questions regarding SCOM's limits:

    1. What happens if a Management Server monitors more than 3000 agents? Do we "only" face performance problems, or do we have a supportability problem with Microsoft?
    2. Same question regarding Management Group capacity: the official documentation states a maximum of 15k agents. What if I give a Management Group more than this, say 20k agents?

    I'm thinking of a multiple Management Group architecture anyway, but I'd like/need to know what really happens if we cross the limits mentioned in the official documentation.

    Thanks in advance and stay safe :).

    Regards,

    P.

    Wednesday, April 8, 2020 1:16 PM

Answers

  • Hi,

    Every environment is unique, so it's very difficult to carve in stone exactly what the limits are. The sizing figures given in the documentation are "recommendations": they usually reflect what has been tested by Microsoft. That doesn't mean your SCOM environment won't work if you go over them.

    1. what happens if a Management Server monitors more than 3000 agents? Do we "only" face performance problems or do we have a supportability problem with Microsoft ?

    Performance will most likely be affected. If your SCOM environment is slow, you might not be able to monitor your infrastructure properly, and workloads may start timing out.
    2. Same question regarding a Management Group capacity: official documentation states a max of 15k agents. What if I give a Management Group more than this ? Like 20k agents ?

    Same as above: performance will most likely be affected, but that still doesn't mean it won't work. These are simply recommendations, and they should be followed to get the best possible SCOM monitoring environment.

    What usually happens is that SCOM starts "lagging": it becomes less responsive, views take a long time to load, and workloads time out more often, which in the end results in a poor monitoring experience.
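
    To put rough numbers on the multiple Management Group idea, here is a minimal arithmetic sketch. It assumes only the two figures quoted in this thread (3000 agents per Management Server, 15k agents per Management Group) and treats them as planning targets rather than hard limits. The function name `minimum_layout` and the `headroom_percent` parameter are made up for illustration; this is not Microsoft's sizing method, and it ignores gateways, failover capacity, and workload differences.

```python
# Figures quoted in the thread; treated here as planning targets, not hard limits.
AGENTS_PER_MANAGEMENT_SERVER = 3000
AGENTS_PER_MANAGEMENT_GROUP = 15000


def ceil_div(a: int, b: int) -> int:
    """Integer ceiling division, avoiding floating-point rounding."""
    return -(-a // b)


def minimum_layout(total_agents: int, headroom_percent: int = 0):
    """Return (groups, agents_per_group, servers_per_group).

    Hypothetical helper: spreads agents evenly over the fewest Management
    Groups that stay under the group figure, then over the fewest
    Management Servers per group that stay under the server figure.
    headroom_percent reserves that share of each figure as spare capacity.
    """
    if total_agents <= 0:
        raise ValueError("total_agents must be positive")
    if not 0 <= headroom_percent < 100:
        raise ValueError("headroom_percent must be in [0, 100)")

    group_cap = AGENTS_PER_MANAGEMENT_GROUP * (100 - headroom_percent) // 100
    server_cap = AGENTS_PER_MANAGEMENT_SERVER * (100 - headroom_percent) // 100

    groups = ceil_div(total_agents, group_cap)
    agents_per_group = ceil_div(total_agents, groups)
    servers_per_group = ceil_div(agents_per_group, server_cap)
    return groups, agents_per_group, servers_per_group


if __name__ == "__main__":
    print(minimum_layout(25000))      # (2, 12500, 5)
    print(minimum_layout(25000, 20))  # (3, 8334, 4)
```

    With no headroom, 25k agents come out as two Management Groups of about 12.5k agents and five Management Servers each. Reserving 20% spare capacity pushes that to three groups of four servers, which is the kind of margin that keeps you away from the "lagging" behaviour described above.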

    Best regards,
    Leon


    Blog: https://thesystemcenterblog.com

    Wednesday, April 8, 2020 1:28 PM