Performance Monitor and Avg. Disk Queue Length

    Question

  • I need a link to a Microsoft website that explains how "Avg. Disk Sec/Transfer" or "Avg. Disk Queue Length" is calculated as an "average", as the name implies (or an expanded definition beyond the explanation inside perfmon).  Data points are gathered at set intervals in perfmon, so I am a little confused about the averaging part.  The only explanation I could find was at http://www.demandtech.com/FAQsdisk.htm, which states: "The formula used in the calculation to derive % Disk Time corresponds to Little's Law, a well-known equivalence relation that shows the number of requests in the system as a function of the arrival rate and service time. According to Little's Law, Avg Disk sec/Transfer * Disk transfers/sec properly yields the average number of requests in the system, more formally known as the average Queue length. The average Queue length value calculated in this fashion includes both IRPs queued for service and actually in service.........For example, if % Disk Time is 78.3%, Ave Disk Queue Length is 0.783. When % Disk Time is equal to 100%, then Ave Disk Queue Length shows the actual value before capping. We recently had a customer reporting values like 2.63 in this field. That's a busy disk! The interpretation of this Counter is the average number of disk requests that are active and queued – the average Queue Length."

    Can anyone direct me to a Microsoft site that has the same explanation and reference to "Little's Law"?  Or is his explanation totally wrong?

    Friday, July 24, 2009 6:53 PM
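
The Little's Law relationship quoted in the question can be checked with a few lines of arithmetic. The following is only a sketch: the request durations and the 10-second interval are made-up numbers, chosen so the result matches the 0.783 figure in the quote, and `little_queue_length` is a hypothetical helper, not anything perfmon exposes.

```python
import math

def little_queue_length(avg_sec_per_transfer, transfers_per_sec):
    """Little's Law: average requests in the system = service time * arrival rate."""
    return avg_sec_per_transfer * transfers_per_sec

# Hypothetical workload: 100 transfers, each taking 0.0783 s, observed over 10 s.
durations = [0.0783] * 100
interval_sec = 10.0

avg_sec = sum(durations) / len(durations)   # plays the role of Avg. Disk sec/Transfer
rate = len(durations) / interval_sec        # plays the role of Disk Transfers/sec
queue_len = little_queue_length(avg_sec, rate)

# The same number falls out of "total request time / elapsed time".
direct = sum(durations) / interval_sec

print(round(queue_len, 3), math.isclose(queue_len, direct))
```

Under these assumptions the product and the direct ratio agree, which is why the average can be derived from totals accumulated over the interval rather than from individual readings.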

Answers

  • You will find it here:

    Examining and Tuning Disk Performance

    Avg. Disk Sec/Transfer:
    Indicates how fast data is being moved (in seconds). Measures the average time of each data transfer, regardless of the number of bytes read or written. Shows the total time of the read or write, from the moment it leaves the Diskperf.sys driver to the moment it is complete.
    A high value for this counter might mean that the system is retrying requests due to lengthy queuing or, less commonly, disk failures.

    Avg. Disk Queue Length:
    Tracks the number of requests that are queued and waiting for a disk during the sample interval, as well as requests in service. As a result, this might overstate activity. If more than two requests are continuously waiting on a single-disk system, the disk might be a bottleneck. To analyze queue length data further, use Avg. Disk Read Queue Length and Avg. Disk Write Queue Length.

    Does this answer your question?


    Certifications: MCSA 2003 MCSE 2003
    Saturday, July 25, 2009 11:58 AM

All replies

  • Sort of...  Last week, I spent 30 minutes telling my supervisor that Performance Monitor does not average its statistics together; it only records the data from a point in time.  Example: if I have Perfmon take samples of % CPU usage every minute, and it records 12% at 4:01 PM, 17% at 4:02 PM, and 12% at 4:03 PM, those values are NOT being averaged together.  I explained to him until I was blue in the face that these are snapshots of a moment in time: at precisely 4:01 it was 12%, at 4:02 it was 17%, etc.  None of those data points are averages, just actual usage at the recorded time.  Then he challenged me: "What about all those average counters?"... good point.  An average needs at least two data points.  So, for "PhysicalDisk\Avg. Disk Queue Length", where it states "Tracks the number of requests that are queued and waiting for a disk during the sample interval", what does that mean?  It says "sample interval": are those the two data points?  For instance, if the actual reading is 1 at 4:01 and 2 at 4:02, does it average the 4:01 and 4:02 readings and write 1.5 for the 4:02 entry?  Or does "average" mean that at the sample time (say 4:01 PM) it looks at all the requests trying to reach the hard disk, some of which see a queue length of 1 and some a queue length of 2, averages those incoming requests, and records that as the value at 4:01 PM?

    Monday, July 27, 2009 5:43 PM
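
One way to picture an "average over the sample interval" is a running total that only ever grows, read at each sample time, with the reported value being the change in the total divided by the elapsed time. The sketch below assumes that model. The timestamps and totals are invented for illustration, and `interval_averages` is a hypothetical helper.

```python
def interval_averages(samples):
    """Turn (timestamp_sec, running_total) samples into per-interval averages.

    Each result is stamped with the time of the later sample, which is how a
    value "at 4:02" would describe the interval from 4:01 to 4:02.
    """
    averages = []
    for (t0, total0), (t1, total1) in zip(samples, samples[1:]):
        averages.append((t1, (total1 - total0) / (t1 - t0)))
    return averages

# Hypothetical running totals of request-seconds, read once per minute.
samples = [(0, 0.0), (60, 90.0), (120, 120.0)]
result = interval_averages(samples)
print(result)
```

In this model neither of the two guesses in the post is quite right: the value written at a sample time is not the mean of two instantaneous readings, and it is not a snapshot either. It summarizes everything that happened between the previous sample and the current one.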
  • It looks like nobody ever replied to your last question, and it's an answer I'd love to know.  Also, does anybody know where I can find a GOOD explanation of how to read/interpret WMI counter data?  I am using a script to retrieve from WMI the values for several objects and counters, and some of the values I get back make no sense at all, especially the disk ones.  For example:

    Physical Disk
    =============
    AvgDiskReadQueueLength: 37092584869000
    AvgDiskWriteQueueLength: 4655724278000
    DiskReadBytesPersec: 19185557504
    DiskWriteBytesPersec: 14854675968
    PercentDiskReadTime: 37092584869000
    PercentDiskWriteTime: 4655724278000

    My questions abound! 

    1. What do any of those numbers mean?  What unit are they measuring?  It would seem the first two measure the number of disk read/write actions, the next two are in bytes, and the last two are... seconds?  Milliseconds?  With the values returned being so astronomical, it's hard to even guess what they could be.

    2. How do I convert those numbers into something that means something?  For example, some of the web postings I've seen related to AvgDiskReadQueueLength indicate that values over 1 or 2 are a bad sign.  Well, in that case, I guess my machine is near death at 37092584869000.  But then at the same time if I open up Performance Monitor and start graphing AvgDiskReadQueueLength, it gives me values like .0010, .0008, .0015, etc.  How do I equate those values to a number like 37092584869000?

    3. All those numbers do nothing but go up....so at one point, 37092584869000 was my AvgDiskReadQueueLength, but right now it's 37092589010000....just going up and up.  So to find out how bad things were between my two readings, do I need to subtract one from the other and divide by some magic number and then take the inverse reciprocal of the square root?  It feels about that opaque to me, right now.

    Any help is GREATLY appreciated!!!

    Tuesday, April 12, 2011 8:07 PM
  • Did you ever get any clue about those values, SCOMinator?

    I am facing the same issue, trying to find a way to explain those high numbers...

    Thanks for the info.

    Thursday, March 22, 2012 6:57 PM
  • Yes and no.....sadly.  It's been almost a year, so my memory (and my own doc) are a bit fuzzy, but here's what I found out.  And it's little wonder my question stumped the internet!  Taking AvgDiskWriteQueueLength as an example, to get a "cooked" value from the raw value you have to jump through these hoops:

    1. Find a website where the parameter you are working with is listed (I found this one: http://msdn.microsoft.com/en-us/library/aa394308%28v=VS.85%29.aspx)
    2. From that website get the counter type and translate that counter type name into what appears to be a completely random numerical value (in this case it's 5571840)
    3. Go to yet a different website (http://msdn.microsoft.com/en-us/library/aa389383.aspx) to find out what Cooking Type the numerical Counter Type is (PERF_COUNTER_100NS_QUEUELEN_TYPE in this example)
    4. Go to yet again a different website (http://technet.microsoft.com/en-us/library/cc781696%28v=ws.10%29.aspx) and find the formula on that website

    5. But then you have to decipher what the formula means.  In the example, the formula is (TB(X1-X0)) / (Y1-Y0), but then you have to read and try to understand what the heck they mean when they then say "....where TB represents the performance frequency time base, the denominator (Y) represents the performance time measurement, and the numerator (X) represents counter data."

    Whoever at Microsoft developed this......and whoever documented this.......should both be given jobs in the Federal Gov't in the Treasury/IRS, writing tax code.  They'd be perfect for the job.

    Thursday, March 22, 2012 7:26 PM
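
The formula quoted in the post above, (TB(X1-X0)) / (Y1-Y0), can be tried on the two raw AvgDiskReadQueueLength readings posted earlier in the thread. This is a rough sketch, not a definitive cooking routine. It assumes the raw value and the timestamp are both in 100 ns ticks, so the time base TB is taken as 1, and it assumes the two readings were 60 seconds apart, which the thread does not state. `cook_100ns_queuelen` is a hypothetical name.

```python
def cook_100ns_queuelen(x0, x1, y0, y1, time_base=1.0):
    """Apply (TB * (X1 - X0)) / (Y1 - Y0) to two raw samples.

    x0, x1: raw counter values; y0, y1: timestamps of those samples.
    time_base=1.0 is an assumption that holds only if X and Y share a unit.
    """
    elapsed = y1 - y0
    if elapsed <= 0:
        raise ValueError("samples must be taken at increasing timestamps")
    return time_base * (x1 - x0) / elapsed

# Raw values quoted earlier in the thread.
raw_first = 37092584869000
raw_second = 37092589010000

# ASSUMPTION: readings taken 60 s apart; timestamps in 100 ns ticks.
ticks_per_second = 10_000_000
interval_ticks = 60 * ticks_per_second

cooked = cook_100ns_queuelen(raw_first, raw_second, 0, interval_ticks)
print(f"{cooked:.4f}")  # prints 0.0069
```

With that assumed interval the result is the same order of magnitude as the .0010 to .0015 values Performance Monitor was graphing, which suggests the huge raw numbers are running totals and only the difference between two readings is meaningful.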
  • Stumbled across this thread looking for something else, but I'm using PowerShell to get these cooked values automatically.

    Here is the page I found on how to do it.

    http://blogs.msdn.com/b/san/archive/2011/09/10/using-powershell-to-collect-information-for-storage-troubleshooting-part-2-performance-counters.aspx

    Monday, April 09, 2012 8:28 PM