Failed to store data in the Data Warehouse
-
Wednesday, October 07, 2009 9:36 PM
Hello,
We are running SCOM 2007 R2 and just started receiving these alerts on our RMS and also on one of our management servers. I saw one blog site saying this was caused by the KMS management pack, but we don't have this MP installed. Any suggestions on how to go about troubleshooting this?
Thanks,
Tom
Log Name: Operations Manager
Source: Health Service Modules
Date: 10/7/2009 2:14:56 PM
Event ID: 31552
Task Category: Data Warehouse
Level: Error
Keywords: Classic
User: N/A
Computer: LA-SCOMRMS01.xxx.com
Description:
Failed to store data in the Data Warehouse.
Exception 'SqlException': Timeout expired. The timeout period elapsed prior to completion of the operation or the server is not responding.One or more workflows were affected by this.
Workflow name: Microsoft.SystemCenter.DataWarehouse.StandardDataSetMaintenance
Instance name: State data set
Instance ID: {CA524D86-58D7-9C71-5B71-466EC8AF7F5A}
Management group: XXX
Log Name: Operations Manager
Source: Health Service Modules
Date: 10/7/2009 2:15:28 PM
Event ID: 31553
Task Category: Data Warehouse
Level: Error
Keywords: Classic
User: N/A
Computer: LA-SCOMRMS01.XXX.com
Description:
Data was written to the Data Warehouse staging area but processing failed on one of the subsequent operations.
Exception 'SqlException': Timeout expired. The timeout period elapsed prior to completion of the operation or the server is not responding.One or more workflows were affected by this.
Workflow name: Microsoft.SystemCenter.DataWarehouse.CollectEntityHealthStateChange
Instance name: LA-SCOMRMS01.XXX.com
Instance ID: {1458430E-9F73-1C41-601F-D87FA689D18C}
Management group: XXX
Answers
-
Friday, October 23, 2009 9:17 PM
Ok, so we closed our pss case with the resolution, but we didn't find root cause. Below are the note of what the pss engineer found.
"I determined that the StandardDataSetMaintenance stored procedure was timing out when trying to aggregate the data in the State.vStateRaw table. Unfortunately, I was not able to see why the aggregation was taking so long, the data in the table was not corrupt and there wasn’t an excessive amount of it. To correct the issue I disabled the rule "Standard Data Warehouse Data Set maintenance rule" for the State data set. I then ran the StandardDataSetMaintenance stored procedure manually and passed it the DataSetId for the State data set. After the stored procedure completed the issue was resolved. We reenabled the rule and no critical alerts have returned."- Marked As Answer by martit01 Friday, October 23, 2009 9:17 PM
All Replies
-
Thursday, October 08, 2009 2:10 AMModeratorIt looks like your state staging tables are stuck..... can you count the records in the state staging tables in the datawarehouse database?
-
Thursday, October 08, 2009 7:39 PMHello Kevin,
I'm not a DBA so please be patient. Are you referring to the "State.StateStage" Table? If so, that one has a Row count of 123.
Thanks,
Tom -
Friday, October 09, 2009 6:51 AMModeratorThat's not bad then... wait a little time - and then see if it increases or goes to zero.
Is your warehouse database server size appropriately? Is it under load? Have you examined performance counters for the basic memoty, CPU, and disk I/O? -
Friday, October 09, 2009 7:58 AMHave you looked that the discoveries from the imported MP's are not set to update too frequently?
I'm here now thinking of MP's like IIS or such which had 60min interval discoveries for example.
If I remember right, Kevin has also blogged about those best practices what to do to the discoveries.
-Tero
MCT | MCSE | MCITP | MCTS SCOM & SCCM -
Monday, October 12, 2009 6:58 PMI've been watching the State.StateStage table, I've seen it go up to 245 row count, but it then drops back down to 0 and started to grow again.
Warehouse database is sized to 250Gb, currently with 93Gb of space available.
I've ran performance reports on the SQL server and do not see any performance issue within the past 30 days. -
Monday, October 12, 2009 11:20 PMModeratorHow often are you getting these alerts? Your staging tables are behaving normally - staging some data then processing it - so this looks like it might be a transient issue?
Does this only happen at night? Or constantly all day? During backups?
One thing to try (without much deeper digging) - is to run a sp_updatestats and a full manual reindex of the database.... it might be that the standard dataset maintenance is struggling to complete.
Performance reports wont tell you what you need to know. You need to run a perfmon - logical disks - on the database disk volumes, for avg disk sec per read, and avg disk sec per write - and collect this EVERY SECOND, for each disk. THen analyze that data looking for sustained periods above 15ms. -
Tuesday, October 13, 2009 10:58 PMThanks Kevin,
How do I run the full manual reindex of the OpsMgrDW database?
Thanks,
Tom -
Friday, October 16, 2009 6:20 PMI have the similar messages all the time, day and night, almost anytime, RMS and 2 MS all have the similar problem, and in DW, once a while, some processes which I do not know the name had been locked (time out), DW transaction log keeps full, RMS and MS all turned red which is very ugly. that is right after SP1 to R2 upgrade, we have no KMS either, not sure if Dell 4.0 MP or Exchange MP, Unix MP we just added during the R2 upgrade will cause the issue.
But I am still able to run report
the full alert is:
Data Warehouse event data dedicated maintenance process failed to perform maintenance operation. Failed to store data in the Data Warehouse.
Exception 'SqlException': Timeout expired. The timeout period elapsed prior to completion of the operation or the server is not responding.
One or more workflows were affected by this. Workflow name: Microsoft.SystemCenter.DataWarehouse.StandardDataSetMaintenance
Instance name: Event data set
Instance ID: {4C8320EB-234A-F8DA-02BB-82BE4C469FD6}" -
Saturday, October 17, 2009 12:27 AMI have a PSS case opened with Microsoft to resolve this. I should have an update by early next week.
-
Saturday, October 17, 2009 3:24 AM
I have a PSS case opened with Microsoft to resolve this. I should have an update by early next week.
Martit, please keep us updated what Microsoft finds.
I doubt some latest MPs may cause what it happened, have you recently uploaded any MP?
Regards
Lisa -
Friday, October 23, 2009 9:17 PM
Ok, so we closed our pss case with the resolution, but we didn't find root cause. Below are the note of what the pss engineer found.
"I determined that the StandardDataSetMaintenance stored procedure was timing out when trying to aggregate the data in the State.vStateRaw table. Unfortunately, I was not able to see why the aggregation was taking so long, the data in the table was not corrupt and there wasn’t an excessive amount of it. To correct the issue I disabled the rule "Standard Data Warehouse Data Set maintenance rule" for the State data set. I then ran the StandardDataSetMaintenance stored procedure manually and passed it the DataSetId for the State data set. After the stored procedure completed the issue was resolved. We reenabled the rule and no critical alerts have returned."- Marked As Answer by martit01 Friday, October 23, 2009 9:17 PM
-
Tuesday, October 27, 2009 3:08 PM
Thank you Martit!
It seems that my problem is resolved too.
1. I disabled the rule "Standard Data Warehouse Data Set maintenance rule" for the State data set.
2. I then ran the StandardDataSetMaintenance stored procedure manually 3 times and passed it the DataSetId for the State data set.
a. first time it finished after 24 minutes;
b. second time it finished after 58 minutes;
c. third time it finished after 02 seconds.
3. I reenabled the rule and no critical alerts have returned.
Best regards,
Aleksandrs
-
Wednesday, October 28, 2009 7:27 AM
Hello!
The same error appeared again in my case after approximately 10 hours. I think it is necessary to find the root of this error.
Best regards,
Aleksandrs -
Saturday, November 28, 2009 12:25 AM
Hello lisa,
Just wondering if you do store your overrides in the default Management pack? -
Thursday, December 03, 2009 1:27 AM
good point that I will check, thanks, Maher
I do not store overrides in the the default MP, however I am very new and not sure if someone else did or not -
Monday, October 25, 2010 12:41 PM
1. I disabled the rule "Standard Data Warehouse Data Set maintenance rule" for the State data set.
2. I then ran the StandardDataSetMaintenance stored procedure manually 3 times and passed it the DataSetId for the State data set.
a. first time it finished after 24 minutes;
We are experiencing the same issue. StandardDataSet has been started manually after disabling the rule as above, and has been running for four days, four hours and ten minutes (and counting), generating a tempdb log of 12GB so far.
Are there any indicators as to how long this will take? I guess the procedure is doing *something* as long as the tempdb log keeps growing...
LastOptimizationActionSuccessfulCompletionDateTime for 'State data set' is 2010-10-03 01:00:27.657
Regards,
Magnus
-
Wednesday, December 08, 2010 12:00 PM
hmmm...
StandardDatasetMaintenance has been running for 48 days, 2 hours.
TempDB is 147GB.
It is running on 16 cores / 32GB RAM and there does not appear to be any significant load on the server.
Any clues?
Magnus
-
Wednesday, December 08, 2010 1:49 PMModeratorMagnus - are you saying your MANUAL run of the job has been running for 48 days straight????
Kevin Holman http://blogs.technet.com/b/kevinholman -
Thursday, January 27, 2011 11:17 PM
Kevin, yep.
Eventually we had to cancel the query and reboot the server due to pending patches.
Magnus
-
Thursday, February 17, 2011 7:36 PM
I finally got around to finding the "how to" steps the pss engineer used to resolve this issue. Just a heads up, for step #5 you will need to know the dataset, so I've included the following query to get that information:
SELECT SchemaName
FROM StandardDataset
1.Using the instance name section in the 31552 event, find the data set that is causing the problem. (See the highlighted portion in the event below)
Log Name: Operations Manager
Source: Health Service Modules
Event ID: 31552
Task Category: Data Warehouse
Computer: LA-SCOMRMS01.caa.com
Description:
Failed to store data in the Data Warehouse.
Exception 'SqlException': Timeout expired. The timeout period elapsed prior to completion of the operation or the server is not responding.
One or more workflows were affected by this.
Workflow name: Microsoft.SystemCenter.DataWarehouse.StandardDataSetMaintenance
Instance name: State data set
Instance ID: {CA524D86-58D7-9C71-5B71-466EC8AF7F5A}
Management group: CAA-LA
2.Create an override to disable the maintenance procedure for this data set:
a.In the OpsMgr console go to Authoring-> Rules-> Change Scope to “Standard Data Set”
b.Right click the rule “Standard Data Warehouse Data Set maintenance rule” -> Overrides -> Override the rule -> For a specific object of class: Standard Data Set”
c.Select the data set that you found from the event in step 1.
d.Check the box next to Enabled and change the override value to “False”, then apply the changes.
3.Restart the “System Center Management” service on the RMS.
4.Wait approximately 10 minutes and then connect to the SQL server that hosts the OperationsManagerDW database and open SQL Management Studio.
5.Run the query below replacing the highlighted portion with the name of the data set from step 1.
USE [OperationsManagerDW]
DECLARE @DataSet uniqueidentifier
SET @DataSet = (SELECT DatasetId FROM StandardDataset WHERE SchemaName = 'Event ')
EXEC StandardDatasetMaintenance @DataSet
6.Once the query finishes follow steps 1-5 again, only this time set the rule to Enabled = True.
7.Monitor the event log for any timeout events.
Tom Martin Email: tmartin@caa.com -
Monday, February 28, 2011 11:09 AMSeems the problem was resolved by applying CU4
-
Monday, April 18, 2011 12:43 PM
Hi Martin,
I tried your solution. In my case the dataset is for the event data set. i changed it and ran the query. But I get the result in the first column as 0 and the second column as null.
Is this the expexted result ?
Thanks,
Dhanraj
-
Thursday, October 13, 2011 4:31 PMHypothetical, if the Standard Data Warehouse Data Set maintenance rule was disabled for a month and then this process was executed. Would the state data from a month ago be available or does is it removed in some fashion?

