SLA Calculation time lag

  • Question

  • Hi there,

    I find that the workflows behind the SLA calculations are sometimes quite unpredictable. I know I can go to the MP and change the interval from 60 seconds to something lower, but is that the solution? I sometimes see that it can take hours before an incident has a calculated resolution date, and I am not sure why. Has anyone else experienced this, and is there a good explanation of the calculations and the embedded workflows? Thanks - Lasse

    Monday, June 17, 2013 4:39 PM

Answers

All replies

  • In this thread Travis Wright explained how the workflows are processed in SCSM: Workflow starting too slow

    Begin of quote:

    A couple of things to note:

    The <PollingInterval> element does control how often we query the database.  Once the query timer fires (controlled by <PollingInterval>) the workflow (including the query and the rest of the workflow that actually does something) is put in a queue to the workflow service.  The amount of time it takes for the query to be started, finish, and the rest of the workflow to run is therefore dependent on how long the queue is.  The busier the system is the longer it will take for the workflow to move through the queue.  This is one reason that the PollingInterval is set fairly high at 100 seconds out of the box.  If you submit workflows to the workflow queue more frequently by lowering the PollingInterval then you will potentially lengthen the queue and adversely affect not just that workflow but also other workflows.  Definitely be careful what you do here!

    Another thing to note is that the watermark used by the subscription queries is updated every 10 seconds, so setting your polling interval to anything less than 10 seconds is pointless.


    Travis Wright Senior Program Manager Microsoft

    End of quote
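
    To make the trade-off in the quote concrete, here is a rough toy model in Python. It is only a sketch: the function names, the single-worker queue, and all of the numbers (50 subscriptions, 1 second per job, a one-hour window) are made-up assumptions for illustration and do not describe how SCSM actually schedules workflows. Only the 10-second watermark interval comes from the quote.

```python
# Toy model only: names and numbers are illustrative assumptions,
# not SCSM internals.

WATERMARK_INTERVAL_S = 10  # from the quote: the watermark is updated every 10 seconds


def effective_polling_interval(polling_interval_s):
    """Polling more often than the watermark moves cannot surface changes sooner."""
    return max(polling_interval_s, WATERMARK_INTERVAL_S)


def backlog_after(polling_interval_s, n_subscriptions, service_time_s, duration_s):
    """Jobs still waiting after duration_s in a single-worker queue.

    Assumes every subscription enqueues one job per polling interval and the
    worker finishes one job every service_time_s seconds.
    """
    submitted = n_subscriptions * (duration_s // polling_interval_s)
    processed = duration_s // service_time_s
    return max(0, submitted - processed)


if __name__ == "__main__":
    # Compare the out-of-the-box interval from the quote with a much lower one.
    for interval in (100, 10):
        waiting = backlog_after(
            interval, n_subscriptions=50, service_time_s=1, duration_s=3600
        )
        print(f"interval={interval}s backlog={waiting} jobs")
```

    With these assumed numbers the 100-second interval leaves no backlog after an hour, while the 10-second interval leaves 14400 jobs waiting, which is the kind of queue growth the quote warns about. Real behaviour depends on how busy the workflow service is.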

    Hope this helps.


    Andreas Baumgarten | H&D International Group

    Monday, June 17, 2013 8:50 PM
  • I have a very similar problem, except the delay is not just on SLA calculations. It's ALL workflows!!

    The workflows queue up as expected, but can take as long as 45 minutes to execute (typically around 6 to 10 minutes), during which time no workflows run at all. When the workflows do kick off, they run very quickly. I can check the start and finish times in the database and see them all starting together, but delayed relative to the scheduled time. It's as if the workflow process is unable to start or is hung, and then kicks back into life.

    It doesn't matter if there are a small or large number of workflows in the queue and I don't see high CPU usage on the workflows server or SQL database server.

    Any ideas?


    • Edited by snedster Friday, June 28, 2013 10:24 AM typo
    Friday, June 28, 2013 10:23 AM
  • Maybe this blog post is helpful: Troubleshooting Workflow Performance and Delays

    Andreas Baumgarten | H&D International Group

    Sunday, June 30, 2013 7:36 PM
  • Hi There,

    I looked through the articles. Thanks Andreas, they are good reading. My issue is slightly different, though, as I am trying to understand what lies behind the SLA calculations.

    What I would like to understand is: are queues calculated all the time, or are they calculated upon creation, with the SLA calculation then just referencing the already calculated queue?

    Are there any workflows related to calculating the SLA targets that I can take a look at? I can see the entries in the history log, but no related workflows show up as running in the workflow status. I can see the polling interval in the management pack, but I am not too keen on changing the criteria, as I am not sure whether it would make the calculation go faster or slower, or whether it would then never complete before the next incident needs to be calculated.

    I have made some changes, though, since I wrote this question. The queue I used in the SLAs was based on a custom User property called Region. It was imported through CSV jobs and PowerShell scripts. I think it might have been too much to relate a queue to a user property, related to the incident in a custom MP. So instead I have pushed the Region value into the User.Notes property. This seems to work better, and I have also limited the scope of users by cleaning up the AD connector to add fewer objects. This is all trial and error, which is time consuming; that is why I was initially looking to understand the SLA calculation. But I will have a look at the performance queries provided by Travis and see if they bring up anything new.

    I am leaving the thread open a bit longer to see if anyone can shed some more light on this matter.

    Thanks

    Lasse

    Tuesday, July 2, 2013 6:55 AM