We have a large SCCM 2007 environment (over 2000 sites, 80,000+ clients) and we are finding the Central Site doesn't appear to be able to keep up with the influx of messages coming into COMPSUMM.BOX. We can move all of the files out and feed them back in slowly, but even if we do nothing but clear out the files the inbox gets backlogged again.
In this inbox there is a combination of SVF and SUM files. The SVF files are small and process without any issues. The problem is the SUM files. SUM files up to about 500KB appear to process OK and don't hold up the overall inbox processing too much, but when the 2MB, 3MB, 4MB and 5MB SUM files come in, the inbox slows to a crawl. Each of these larger files takes a significant amount of time to process (15+ minutes each), and while one is processing, more arrive, so by the end of the day we can end up with as many as 300,000 files in this folder. Eventually it processes a lot of them and occasionally even gets caught up (down to only 10,000 files), but in busier periods it needs help from us moving some of the files out. Also, the large SUM files seem to process at a higher priority, leaving other items sitting there while the system works on the larger files. What process generates the larger files flowing through here, and how often?
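A rough way to quantify the mix described above is to tally the files in the inbox by extension and size bucket. The following is only a minimal sketch: the function names and bucket labels are made up for illustration, and the bucket edges simply mirror the 500KB and 2MB sizes mentioned in the post.

```python
import os
from collections import Counter

# Bucket edges in bytes, mirroring the sizes mentioned in the post (assumption).
BUCKETS = [(500 * 1024, "<=500KB"), (2 * 1024 * 1024, "500KB-2MB")]

def bucket_for(size):
    """Return the label of the first bucket whose upper limit covers size."""
    for limit, label in BUCKETS:
        if size <= limit:
            return label
    return ">2MB"

def tally_inbox(inbox_path):
    """Count files directly inside inbox_path by (extension, size bucket)."""
    counts = Counter()
    with os.scandir(inbox_path) as entries:
        for entry in entries:
            if not entry.is_file():
                continue  # skip subfolders
            ext = os.path.splitext(entry.name)[1].upper()
            counts[(ext, bucket_for(entry.stat().st_size))] += 1
    return counts
```

Calling `tally_inbox` with the path to compsumm.box on the site server (the exact path depends on the install location) returns a counter keyed by pairs such as `(".SUM", ">2MB")`, which makes it easier to see how many large SUM files are present at any point in the day.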
My question is, is there anything we can do to reduce the amount of data flowing through this specific inbox so it is able to better keep up with the load? Can we change the processing priority so that the larger SUM files don't automatically jump to the front of the queue? I know we can change the schedule for the "Site System Status Summarizer", but there doesn't seem to be such a schedule for the "Component Status Summarizer". I don't want to turn off replicating these messages from the child primary sites, but we also don't want to be dealing with a constant backlog situation either.
Any suggestions are much appreciated.
So, just to be clear, large files don't have a higher priority than smaller ones.
What do you mean by 2000 sites? Do you mean CM07 primary/secondary sites?
Are there any errors in the Compsumm.log?
What settings do you have for HW inv., SW inv., SW metering, Heartbeat discovery?
How and where are you doing AD discovery?
What is your Disk IO like?
What is your Memory (pages/sec) like?
What is your CPU % like?
What is SQL like? Is it overloaded?
Have you performed any of the DBCC commands on SQL?
I wasn't saying that the larger files process first because they are larger, but rather that SUM files appear to process at a higher priority than SVF files in this inbox. Once one of the large SUM files hits its turn in the queue, compsumm.box starts growing. There were 88,000 files in here when I came in this morning, and it still had files from the 17th that hadn't processed. I moved out all the large files (1000+ of them) and all of the other files in the inbox processed. Move the 1000 back in... and the backlog begins again. We can have 300 SVF files come in and one SUM file, and the SUM file will begin processing immediately while the SVF files don't appear to process until the SUM file is done. I was thinking more along the lines that the content of the SUM files may be prioritized higher than others.
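The manual workaround described above (move the large SUM files out, then feed them back in slowly) could be scripted along these lines. This is a sketch under stated assumptions, not a supported procedure: the function names, the 2MB threshold, the batch size and the `max_pending` limit are all hypothetical, and it does not handle files that are locked because the summarizer is reading them.

```python
import os
import shutil

def park_large_sums(inbox, holding, min_bytes=2 * 1024 * 1024):
    """Move .SUM files of at least min_bytes from inbox to holding.

    Returns the list of file names that were moved.
    """
    os.makedirs(holding, exist_ok=True)
    moved = []
    for name in sorted(os.listdir(inbox)):
        src = os.path.join(inbox, name)
        if (name.lower().endswith(".sum") and os.path.isfile(src)
                and os.path.getsize(src) >= min_bytes):
            shutil.move(src, os.path.join(holding, name))
            moved.append(name)
    return moved

def feed_back(holding, inbox, batch_size=10, max_pending=1000):
    """Return up to batch_size parked files to inbox.

    Nothing is moved while inbox already holds max_pending or more entries,
    so the parked files only trickle back in when the backlog is low.
    """
    if len(os.listdir(inbox)) >= max_pending:
        return []
    batch = sorted(os.listdir(holding))[:batch_size]
    for name in batch:
        shutil.move(os.path.join(holding, name), os.path.join(inbox, name))
    return batch
```

Running `feed_back` from a scheduled task every few minutes would approximate feeding the files back in by hand, with the batch size and pending limit tuned to whatever the Central Site can actually absorb.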
By 2000 Sites, yes CM07 Primary and Secondary Sites. 19 Primary and 2286 Secondary to be exact.
No errors in COMPSUMM.LOG. Just very busy processing.
HINV = Every 4 Days
SINV = Every 7 Days
SWM = Every 7 Days
Heartbeat = Every 7 Days
Discovery/Inventory isn't an issue... No DDR or MIF backlogs. We don't run AD System Discovery, only Group Discovery, which runs on the tier two primary sites once a month, with the runs staggered by region.
CPU is steady at about 50% with SMSEXEC using about 30%.
Memory is steady at around 10GB of the 36GB Total
SQL is interesting... SQL is a cluster on a dedicated server with 48GB physical memory, running SQL Server 2008 SP3 64-bit. SQL is configured to be able to use up to 42GB of memory, leaving 6GB for the OS. In Task Manager, SQLServr.exe shows about 800-900MB in use most of the time. However, the status bar in Task Manager consistently shows Physical Memory: 98%. I ran a TASKLIST and exported the results to view in Excel, but when I total everything there, it is only about 2.1GB. Hmm... RAMMap.exe shows 43GB of memory allocated under AWE, even though AWE is NOT enabled in SQL. From another Google search, this appears to be something others have seen as well, but I'm not finding any good solution, or any really clear indication that this is actually a problem as opposed to how Win2K8 is managing memory. I don't like seeing the server showing 98% memory usage though. I'm going to continue to look at this further and feed the info back to the team.
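The TASKLIST total mentioned above can also be computed without Excel. The sketch below assumes the output was captured with `tasklist /FO CSV` on an English-language system, so that the memory column is named "Mem Usage" and holds values such as "12,345 K"; the function name is hypothetical.

```python
import csv
import io

def total_mem_kb(tasklist_csv):
    """Sum the 'Mem Usage' column of `tasklist /FO CSV` output, in KB."""
    reader = csv.DictReader(io.StringIO(tasklist_csv))
    total = 0
    for row in reader:
        # Values look like "12,345 K"; keep only the digits.
        digits = "".join(ch for ch in row["Mem Usage"] if ch.isdigit())
        if digits:
            total += int(digits)
    return total
```

Note that "Mem Usage" reports each process's working set, so memory that is not counted in any process's working set would not appear in this total.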
I have not performed the DBCC commands on the SQL database (yet) but will do so.