File Protection very slow

  • Question

  • I have SCDPM 2010 running on Windows Server 2008 R2 on a physical machine. I use it to back up my Exchange, SQL and SharePoint servers, which are virtual, and these back up without issue.

    However, I am trying to back up data from a virtual machine's iSCSI SAN and it is taking forever. It's been running for 30 minutes this time and has only done 7.5 MB. There is over 1.5 TB on that SAN that I want to back up.

    Why is it taking so long?

    Monday, November 7, 2011 5:53 PM

Answers

  • Hello Andrew,

    This is a concern that has come up on this forum many times before. Because of the under-the-hood complexity of the DPM architecture and the components it relies on, the cause of slow syncs or replica creation has not always been the same.

    As a general rule of thumb the initial replica speed benchmarks can be found here: http://technet.microsoft.com/en-us/library/bb795811.aspx

    Working our way from DPM all the way to the SAN device....


    DPM
    ****
    Here are a few articles to help make sure the DPM is running as optimally as possible.

    DPM Managing Performance:  http://technet.microsoft.com/en-us/library/ff399471.aspx
    How DPM Operations Affect Performance:  http://technet.microsoft.com/en-us/library/ff399315.aspx
    DPM and Memory:  http://technet.microsoft.com/en-us/library/ff399244.aspx
    Performance Counters:  http://technet.microsoft.com/en-us/library/ff399366.aspx
    Improving Performance:  http://technet.microsoft.com/en-us/library/ff399571.aspx
    Managing DPM Performance on a WAN:  http://technet.microsoft.com/en-us/library/ff634226.aspx
    How Protection Group Changes Affect Jobs:  http://technet.microsoft.com/en-us/library/ff399316.aspx

    Next we need to look at the VSS health of the DPM server.  Do you see any VSS-related errors in the event logs?

    Make sure the pagefile is big enough. A "system managed" pagefile setting does not always meet the needs of the DPM server. The documentation linked above gives this guidance: "DPM requires a pagefile size that is 0.2 percent the size of all recovery point volumes combined, in addition to the recommended size (generally, 1.5 times the amount of RAM on the computer). For example, if the recovery point volumes on a DPM server total 3 TB, you should increase the pagefile size by 6 GB."
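
    As a rough sketch of that arithmetic: the function name and the 1 TB = 1000 GB rounding below are my own assumptions, chosen so the result matches the quoted "3 TB -> 6 GB" example. Treat it as a starting point rather than an official sizing tool.

```python
def recommended_pagefile_gb(ram_gb, recovery_point_volumes_tb):
    """Pagefile size in GB: 1.5 x RAM plus 0.2% of all recovery point volumes."""
    # Baseline recommendation quoted in the guidance: 1.5 times installed RAM.
    base_gb = 1.5 * ram_gb
    # Extra 0.2 percent of the combined recovery point volumes.
    # Assumption: 1 TB is treated as 1000 GB, which reproduces "3 TB -> 6 GB".
    extra_gb = recovery_point_volumes_tb * 1000 * 0.002
    return base_gb + extra_gb


if __name__ == "__main__":
    # Hypothetical server: 8 GB RAM, 3 TB of recovery point volumes.
    print(recommended_pagefile_gb(ram_gb=8, recovery_point_volumes_tb=3))
```

    Compare the result with the current pagefile size on the DPM server; if the current size is clearly smaller, set a custom size instead of "system managed".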

    Also try disabling TCP Chimney Offload and RSS (Receive Side Scaling) on the DPM server to see if it makes any difference in the data transfer rate.
         To determine the current status of TCP Chimney Offload or RSS:   netsh int tcp show global
         To disable RSS: netsh int tcp set global rss=disabled
         To disable chimney:   netsh int tcp set global chimney=disabled

    KB 951037: Information about the TCP Chimney Offload, Receive Side Scaling, and Network Direct Memory Access features in Windows Server 2008
    http://support.microsoft.com/default.aspx?scid=kb;EN-US;951037


    Is DPM performance sluggish in general?  Here are some basic perfmon counters that may help you figure that out:
    Logical Disk/Physical Disk
    ******************
    \% Idle Time
    • 100% idle to 50% idle = Healthy
    • 49% idle to 20% idle = Warning or Monitor
    • 19% idle to 0% idle = Critical or Out of Spec

    \Avg. Disk sec/Read or \Avg. Disk sec/Write (reported in seconds)
    • 0.001 to 0.015 (1 ms to 15 ms) = Healthy
    • 0.015 to 0.025 (15 ms to 25 ms) = Warning or Monitor
    • 0.026 (26 ms) or greater = Critical or Out of Spec

    \Current Disk Queue Length (all instances)
    • 80 requests for more than 6 minutes = Possibly excessive disk queue length

    Memory
    *******
    \Pool Nonpaged Bytes*
    • Less than 60% of pool consumed = Healthy
    • 61% - 80% of pool consumed = Warning or Monitor.
    • Greater than 80% pool consumed = Critical or Out of Spec.

    \Pool Paged Bytes*
    • Less than 60% of pool consumed = Healthy
    • 61% - 80% of pool consumed = Warning or Monitor.
    • Greater than 80% pool consumed = Critical or Out of Spec.

    \Available MBytes
    • 50% of free memory available or more = Healthy
    • 25% of free memory available = Monitor.
    • 10% of free memory available = Warning
    • Less than 100MB or 5% of free memory available = Critical or Out of Spec.

    Processor
    *******
    \% Processor Time (all instances)
    • Less than 60% consumed = Healthy
    • 60% - 90% consumed = Monitor or Caution
    • 91% - 100% consumed = Critical
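
    If you log these counters to a file, the thresholds above can be applied mechanically. Here is a minimal sketch, assuming the disk latency samples are in seconds (which is how perfmon reports Avg. Disk sec/Read and sec/Write) and that pool and memory values have already been converted to percentages. The function names and the exact boundary handling between bands are my own choices, not part of any DPM tooling.

```python
def classify_disk_idle(pct_idle):
    """Disk % Idle Time: higher is better."""
    if pct_idle >= 50:
        return "Healthy"
    if pct_idle >= 20:
        return "Warning"
    return "Critical"


def classify_disk_latency(seconds):
    """Avg. Disk sec/Read or sec/Write, in seconds (0.015 = 15 ms)."""
    if seconds <= 0.015:
        return "Healthy"
    if seconds <= 0.025:
        return "Warning"
    return "Critical"


def classify_pool_usage(pct_consumed):
    """Paged or nonpaged pool, as a percentage of the pool consumed."""
    if pct_consumed <= 60:
        return "Healthy"
    if pct_consumed <= 80:
        return "Warning"
    return "Critical"


def classify_available_memory(pct_free, free_mb):
    """Available memory as a percentage of RAM plus the absolute MB free."""
    if free_mb < 100 or pct_free <= 5:
        return "Critical"
    if pct_free <= 10:
        return "Warning"
    # Assumption: anything between the 10% and 50% marks counts as Monitor.
    if pct_free < 50:
        return "Monitor"
    return "Healthy"


def classify_cpu(pct_processor_time):
    """% Processor Time; 60% is treated as the Healthy/Monitor boundary."""
    if pct_processor_time < 60:
        return "Healthy"
    if pct_processor_time <= 90:
        return "Monitor"
    return "Critical"


if __name__ == "__main__":
    # Hypothetical samples, not measurements from a real DPM server.
    print(classify_disk_idle(35), classify_disk_latency(0.030), classify_cpu(45))
```

    A single sample does not mean much; look at averages over the period when the slow job is running.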


    SAN
    ****
    Is the SAN volume heavily fragmented?
    Has any test been run on the SAN itself to determine whether the read/write speed is acceptable?
    If you just copy a large amount of data (copy and paste, without using DPM), is that also slow?
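
    If you want a number rather than a feeling for that copy test, something along these lines would do. It is only a sketch: the helper names are hypothetical, the demo copies a small scratch file between temp folders so it runs anywhere, and OS file caching can inflate the result for small files. For a real test, point src at a large file on the SAN volume and dst at a local disk.

```python
import os
import shutil
import tempfile
import time


def measure_copy_mb_per_s(src, dst):
    """Copy src to dst and return the observed throughput in MB/s."""
    size_mb = os.path.getsize(src) / (1024 * 1024)
    start = time.perf_counter()
    shutil.copyfile(src, dst)
    elapsed = time.perf_counter() - start
    # Guard against a zero interval on very fast copies of tiny files.
    return size_mb / max(elapsed, 1e-9)


def hours_to_transfer(total_gb, mb_per_s):
    """Estimate the hours needed to move total_gb at a steady mb_per_s."""
    return (total_gb * 1024) / mb_per_s / 3600


if __name__ == "__main__":
    # Self-contained demo with a 16 MB scratch file in a temp directory.
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "sample.bin")
        dst = os.path.join(tmp, "copy.bin")
        with open(src, "wb") as f:
            f.write(os.urandom(16 * 1024 * 1024))
        rate = measure_copy_mb_per_s(src, dst)
        print(f"{rate:.1f} MB/s")
        print(f"{hours_to_transfer(1536, rate):.2f} hours for 1.5 TB at that rate")
```

    For comparison, 7.5 MB in 30 minutes is about 0.004 MB/s, which would put 1.5 TB on the order of 100,000 hours. If a plain file copy from the SAN is dramatically faster than that, the bottleneck is more likely on the DPM or network side than on the SAN itself.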


    Yes, this is a long laundry list of things to check, and there is even more to look at and discuss, but let's stop here for now and see what we find.


    Thanks,
    Shane

     

     

    Tuesday, November 15, 2011 2:42 PM
    Moderator

All replies

  • Hello,

    Have you had a chance to look at any of the above?


    Thanks,
    Shane

    Friday, November 18, 2011 9:29 PM
    Moderator