none
SCOM 2016 - DB Average Wait Time is too high

    Question

  • Hi,

    we use SQL 2016 for SCOM Datawarehouse DB and get very often following warnings:

    DB Average Wait Time is too high

    The Average Wait Time of SQL instance "SQLINSTANZ" on computer "SQLSERVER.mydomain.de" is too high. See "Alert Context" tab for more details.

    Time Sampled: 04.10.2017 09:08:32
    Object Name: MSSQL$SQLINSTANZ:Locks
    Counter Name: Average Wait Time (ms)
    Instance Name: _Total
    Last Sampled Value: 20.2857151031494
    Number of Samples: 0

    Can somebody help related to Object Name MSSQL$SQLINSTANZ:Locks?

    Best Regards

    Birdal


    • Edited by _Birdal Friday, October 6, 2017 1:09 PM
    Wednesday, October 4, 2017 11:23 AM

All replies

  • Hi,

    "
    The SQLServer:Locks object in Microsoft SQL Server provides information about SQL Server locks on individual resource types. Locks are held on SQL Server resources, such as rows read or modified during a transaction, to prevent concurrent use of resources by different transactions. 
    "

    https://docs.microsoft.com/en-us/sql/relational-databases/performance-monitor/sql-server-locks-object

    Hope it is useful to you .

    Best Regards,

    Elton


    Please remember to mark the replies as answers if they help.
    If you have feedback for TechNet Subscriber Support, contact tnmff@microsoft.com.

    Sunday, October 8, 2017 3:37 PM
    Moderator
  • Hi,

    althoug the SQL server in the SCOM infrastructure has no resource problem, these alerts are listed in SCOM related to Dataware hous DB.

    What is the solution?

    Best Regards

    Birdal

    Monday, October 9, 2017 7:07 AM
  • Hi Birdal,

    This warniung indicates some performance problems (well, in the most cases at least).

    Please make sure your tempdb is sized properly (I would say about 20 of the DW DB itself). It should be pre-sized so that it does not constantly autgrow. It still should be able to autogrow.

    Also, ensure you have the same number of data file of the tempdb as the number of processors. (4x CPUs --> 4 tempdb .mdf files with the same size). So you take the DW DB, calcualte those 20%, and then the value devide by the number of processors and you get the size of each data file. Ensure the temdb has a simople recovery model.

    Ensure the "optmize for ad-hoc workloads" settings is enabled also.

    What is the memory config of your DW instance?

    Regards,


    Stoyan (Please take a moment to "Vote as Helpful" and/or "Mark as Answer" where applicable. This helps the community, keeps the forums tidy, and recognizes useful contributions. Thanks!) Blog: https://blog.pohn.ch/ Twitter: @StoyanChalakov

    Monday, October 9, 2017 8:11 AM
    Moderator
  • Hi Stoyan,

    thank you for your feedback. The database administrator and I have checked the settings.

    Here are my answers:

    1) Autogrow of tempdb

    tempdb has the possibility to autogrow, but NOT continiously.

    2) The number of tempdb

    2 Processors exists > 2 tempdb files are available

    Each trempdb file has the same size: Initial size 1024 MB and

    3) Recovery Mode

    tempdb has simple recory mode.

    4) ad-hoc workloads

    This parameter was NOT enabled, I set it to true (enabled).

    5) Memory of DW instance

    Total memory for the virtual machine is 8 GB.

    6 GB for DW instance reserved. 2 GB for other (OS, etc.).

    Best Regards

    Birdal

    Monday, October 9, 2017 2:14 PM
  • Hi Birdal,

    I now saw that you actually observe a high avarage db wait time, because of Locks, so my db related stuff are not that relevant to the particular issue, still it doesn't harm that we went over them.

    I've seen the behavior only on rere cases and as far as I remeber it was because of some rules, part of some custom Management Pack. Since when dou you observe them? Do you refer to the alerts or to the events themselves? I ask this, because it could be that the alert threshold is somehow misconfigured.

    Can you trace back those alert/events?

    Regards,


    Stoyan (Please take a moment to "Vote as Helpful" and/or "Mark as Answer" where applicable. This helps the community, keeps the forums tidy, and recognizes useful contributions. Thanks!) Blog: https://blog.pohn.ch/ Twitter: @StoyanChalakov

    Monday, October 9, 2017 2:53 PM
    Moderator
  • Hi Stoyan,

    the change (ad hoc workloads > TRUE) did not solve the issue. I got today 7 warnings again.

    We alert only some monitors/rules, especially memory, cpu, logical disk free space, but also some services, such as WSUS, AntiVirus software service, some custom services.

    All other warnings/errors in the monitoring will be not alerted (only events).

    What is the tool to trace any SCOM alert: "starttracing.cmd VER?

    Or any other tracing method/tool?

    Best regards

    Birdal

    Tuesday, October 10, 2017 9:23 AM
  • Hi Birdal,

    I can give you a guide in regards to the verbose tracing, but I think that in this case the issue is more on the DB side. Unfortunately I haven't encountered this issue, haven't seen such issues here in the forum also.

    So if the rest of the guys/girls do not have an idea, maybe a support case would be the way to go. What do you think?

    Regards,


    Stoyan (Please take a moment to "Vote as Helpful" and/or "Mark as Answer" where applicable. This helps the community, keeps the forums tidy, and recognizes useful contributions. Thanks!) Blog: https://blog.pohn.ch/ Twitter: @StoyanChalakov


    Tuesday, October 10, 2017 9:19 PM
    Moderator
  • Hi Birdal,

    I now saw that you actually observe a high avarage db wait time, because of Locks, so my db related stuff are not that relevant to the particular issue, still it doesn't harm that we went over them.

    I've seen the behavior only on rere cases and as far as I remeber it was because of some rules, part of some custom Management Pack. Since when dou you observe them? Do you refer to the alerts or to the events themselves? I ask this, because it could be that the alert threshold is somehow misconfigured.

    Can you trace back those alert/events?

    Regards,


    Stoyan (Please take a moment to "Vote as Helpful" and/or "Mark as Answer" where applicable. This helps the community, keeps the forums tidy, and recognizes useful contributions. Thanks!) Blog: https://blog.pohn.ch/ Twitter: @StoyanChalakov


    Wednesday, October 11, 2017 12:16 PM
  • Hi Stoyan,

    thank you for your help.

    I will speak about this issue with the database admin.

    Perhaps he has any idea.

    Best Regards

    Birdal

    Wednesday, October 11, 2017 12:19 PM
  • Hi Stoyan,

    unfortunately I could not find the issue sorce for "DB Average Wait Time is too high" on datawarehouse DB.

    CPU / Memory is OK. TempDB ist O, and, and...

    I don't know if this monitor has any bug: "false positives".

    Best Regards

    Birdal

    Wednesday, February 28, 2018 7:28 AM
  • Hi Stoyan,

    I deployed now more CPU & memory ressources on all DB servers (Report Server DB, Opertion DB, Datawarehouse DB).

    I changed also TemDB so that auto-growing is disabled. TepDB has one file for each CPU node. I defined the size of each temp file 2 GB.

    Unfortunately, these changes did not solve the problem "DB Average Wait Time is too high ".

    I want to trace performance counters. But the issue is temporary. I cannot enable any performance monitoring tracing for an "undefined" time.

    Best Regards

    Birdal

    Thursday, March 8, 2018 1:24 PM
  • Hi Birdal,

    sorry to hear that it is not resolved. Have you considered opening a support call in regards to this?

    Stoyan


    Stoyan (Please take a moment to "Vote as Helpful" and/or "Mark as Answer" where applicable. This helps the community, keeps the forums tidy, and recognizes useful contributions. Thanks!) Blog: https://blog.pohn.ch/ Twitter: @StoyanChalakov

    Thursday, March 8, 2018 2:16 PM
    Moderator
  • Hi Stoyan,

    no, we did not do it. I wanted first that I will try "all" possibilities to solve the issue.

    What ist the best TeampDB file size for each file, for example Operations Manager DB has 25 GB capacity, 4 CPU core, 16 GB memory?

    How about with TempDB log file size?

    Best Regards

    Birdal

    Friday, March 9, 2018 9:03 AM
  • Hi

    It seems that i have the same issue. I can't detect any locks, but SCOM reports average wait time (up to 12000, average about 900)

    Tempfiles are ok, cpu and memory also

    Andy

    Thursday, April 26, 2018 1:19 PM