DPM failing and slowness during the day

  • Question

  • I am having an issue in DPM where synchronizations fail between 8:00 AM and 6:00 PM, which happens to be when our users are in. The syncs take forever and eventually fail. The current one is at 30 minutes with 23.19 MB of data transferred. At night the syncs finish in just a few minutes. I tried recovering a file to its original location today and it timed out, saying it lost communication with the agent on the server, or something to that effect. However, I then recovered the file to the C drive of the DPM server and it was instant.
    Ideas? 

    Thanks  
    Tuesday, April 26, 2011 6:35 PM

Answers

  • Hello,

    What version of DPM is this? 2007 or 2010?
    What build is this server running? DPM 2010 7707, for example. You can get this by clicking the "i" at the top right-hand corner of the DPM console.

    Can you give me information about the jobs in place on this server?
       a.) How many protection groups?
       b.) Are the protection groups staggered on their synchronizations?
       c.) On the job that is taking so long, what is the data source? Exchange storage groups? SQL? Hyper-V? File server?
       d.) How big is the data source that is failing?
       e.) Are you using compression on the wire?
       f.) As far as the OS is concerned, are Chimney and/or RSS enabled?
            From a command prompt, type: "netsh int tcp show global"
            This should show whether they are or not.
       g.) If you go to the target server and look at the event logs, do you see any errors in the system or application event logs?
       h.) Do you also have bandwidth throttling enabled?
       i.) What is the link speed between the DPM and target server?


    Thanks
    Shane

    • Marked as answer by JustusIV Friday, April 29, 2011 5:01 PM
    Wednesday, April 27, 2011 1:02 PM

All replies

  • So I created a filter to see further back in the job history. I have hundreds of sync failures with this error:

    Product: Data Protection Manager
    ID: 32519
    Source:
    Version: 3.0
    Symbolic Name:
    Message: The backup job failed because another task exists for the same data source.

    So what I take from this error is that my first job is still running, and my second job an hour later is failing.

    Thoughts?

     

    In addition, I ran a job this morning at 7:00 AM and the sync took 00:28.

    I ran a second sync at 7:24 AM, only 24 minutes later, and that sync took 08:35.

    Can I dig down and see why the heck the sync takes so long one time and not another?


    Wednesday, April 27, 2011 12:21 PM
  • Thanks for the response.

    It is DPM 2010, version 3.0.7696.0.

    I will be more than happy to provide what I can; I just need to figure this out.

    a) There is 1 protection group. No bare metal at all.
    b) N/A
    c) It is a file server.
    d) 9 TB on an MD3000 attached via iSCSI.
    e) It is disabled.
    f) The OS of the DPM server? 2008 R2 SP1.
       Receive-Side Scaling State          : enabled
       Chimney Offload State               : automatic
    g) Not seeing anything too unusual.
    h) Yes, this is enabled. Maybe my problem? Work hours 7AM - 6PM: 128 Kbps. Non-work hours: 9999 Mbps. I didn't set this up, but one of our guys must have.
    i) Network link is gigabit.
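
For anyone who wants to check the RSS and Chimney settings from item f) on several servers, here is a minimal sketch that turns the "Name : value" lines printed by "netsh int tcp show global" into a dictionary. The function name and the sample text are illustrative only; the sample is just the two lines quoted in this post, and real output contains more settings.

```python
def parse_netsh_globals(text):
    """Turn 'Name : value' lines from netsh output into a dict."""
    settings = {}
    for line in text.splitlines():
        # Split on the first " : " separator; lines without it are skipped.
        name, sep, value = line.partition(" : ")
        if sep:
            settings[name.strip()] = value.strip()
    return settings


# Sample taken from the two lines quoted in this post (not full netsh output).
SAMPLE = """\
Receive-Side Scaling State          : enabled
Chimney Offload State               : automatic
"""

settings = parse_netsh_globals(SAMPLE)
print(settings["Receive-Side Scaling State"])
print(settings["Chimney Offload State"])
```

On the server itself, the text could instead come from Python's subprocess.run(["netsh", "int", "tcp", "show", "global"], capture_output=True, text=True).stdout, assuming Python is installed there.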

     

     

    The throttle appears to be wrong; I will play with that first. I believe it got set because DPM just hammered our file server when it was first set up.

    *UPDATE*

    I cranked up the throttle to 100 Mbps and it is now finishing in under 5 minutes every time. That must have been my problem. However, I am a little worried about our MD3000. It can have up to about 350 users on it, and now I am really hitting it with DPM. Anything I should be careful of?
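
As a rough sanity check on the throttle numbers in this thread, the sketch below estimates how long the 23.19 MB sync from the original question would take at the 128 Kbps work-hours limit and at the raised 100 Mbps limit. It assumes decimal units (1 MB = 1,000,000 bytes, 1 Kbps = 1,000 bits per second), ignores protocol overhead, and assumes syncs are scheduled hourly, which the thread implies but does not state outright. How DPM actually enforces its throttle is not modeled here.

```python
def transfer_seconds(size_bytes, rate_bits_per_sec):
    """Ideal transfer time for size_bytes at a fixed line rate."""
    return size_bytes * 8 / rate_bits_per_sec


# Assumption: decimal units, no protocol overhead.
MB = 1_000_000
KBPS = 1_000
MBPS = 1_000_000

sync_size = 23.19 * MB  # data transferred in the slow sync from the question

work_hours = transfer_seconds(sync_size, 128 * KBPS)
raised = transfer_seconds(sync_size, 100 * MBPS)

# Assumption: syncs run hourly, so anything that cannot finish in 3600 s
# is still running when the next sync starts.
hourly_cap_mb = 128 * KBPS * 3600 / 8 / MB

print(f"At 128 Kbps: {work_hours / 60:.1f} minutes")
print(f"At 100 Mbps: {raised:.1f} seconds")
print(f"Most data one hour at 128 Kbps can carry: {hourly_cap_mb:.1f} MB")
```

Under these assumptions the 128 Kbps limit alone accounts for roughly 24 minutes of the 30-minute sync. Any hour with more than about 57.6 MB of changes could not finish before the next sync started, which would be consistent with the "another task exists for the same data source" failures (ID 32519).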


    • Proposed as answer by ShaneB. _ Thursday, April 28, 2011 5:23 PM
    • Unproposed as answer by ShaneB. _ Thursday, April 28, 2011 5:23 PM
    Wednesday, April 27, 2011 2:12 PM
  • Hello,


    What time do you have your recovery points created? What time do your synchronizations take place? If you use compression on the wire, the jobs will run a little faster at the cost of some CPU on both the DPM server and the file server. I've never seen it cause the CPU to spike more than 2%-5% above what would be considered the norm, and of course the CPU spike diminishes after the job is complete.

    If the server is getting hammered by a large amount of user traffic as well as DPM, and the performance hit is noticeable, you may want to consider moving some of the files to another server and creating a separate protection group in DPM for that file server.

    I'd just monitor both servers for the time being and see how they behave.

    Articles that may be of use to you.
    *****************************
    DPM Managing Performance: http://technet.microsoft.com/en-us/library/ff399471.aspx
    How DPM Operations Affect Performance: http://technet.microsoft.com/en-us/library/ff399315.aspx
    DPM and Memory: http://technet.microsoft.com/en-us/library/ff399244.aspx
    Performance Counters: http://technet.microsoft.com/en-us/library/ff399366.aspx
    Improving Performance: http://technet.microsoft.com/en-us/library/ff399571.aspx
    Managing DPM Performance on a WAN: http://technet.microsoft.com/en-us/library/ff634226.aspx
    How Protection Groups Changes Affect Jobs: http://technet.microsoft.com/en-us/library/ff399316.aspx

     

    Side note: DPM 2010 version 7707 is out: http://support.microsoft.com/kb/2465832

    Thanks,
    Shane

     

    Thursday, April 28, 2011 5:41 PM