DPM 2016 File Server Backup SLOW

    Question

  • Hello,

    My setup is a DPM 2016 VM running Server 2016 on a Server 2016 Hyper-V host.  The host also holds the storage VHDX files in Storage Spaces.

    I can back up Hyper-V VMs incredibly fast.  For instance, one protection group of ~20 servers with about 3TB of data finished in about 20 hours.  All of these VMs reside on my SAN (CSV connected).

    Now I'm attempting to back up 2 large file servers.  One is a 2008 R2 physical server with about 1.3 TB of data that resides on the SAN (iSCSI connected).  The second is a 2012 R2 VM, but its data is also connected over iSCSI to my SAN.  This is the same SAN that my VMs use.

    The backup of these servers is PAINFULLY slow.  For instance, the 2012 R2 file server backup is 2 hours in and has backed up 6GB.  I've performed copy jobs between these servers and they all go much faster.  NOT blazingly fast, but they average about 400 Mbps. -Edit- I should add that a copy job of SMALL files runs at about 400Mbps.  A copy of LARGE files is easily 800Mbps.  This is using Gb NICs and during production hours (other contention is possible).  The point is, it behaves as expected outside of DPM.
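To put the figures above on the same scale, the DPM progress can be converted to a rate and compared with the copy jobs. This is a minimal sketch, assuming decimal units (1 GB = 10**9 bytes) and that "Mbps" in the post means megabits per second; the function name `rate_mbps` is made up for illustration.

```python
def rate_mbps(gigabytes: float, hours: float) -> float:
    """Average throughput in megabits per second (assumes 1 GB = 10**9 bytes)."""
    bits = gigabytes * 1e9 * 8
    seconds = hours * 3600
    return bits / seconds / 1e6

dpm = rate_mbps(6, 2)    # DPM job from the post: 6 GB in 2 hours
copy_small = 400.0       # Mbps, small-file copy reported in the post
copy_large = 800.0       # Mbps, large-file copy reported in the post

print(f"DPM backup:      {dpm:.1f} Mbps")
print(f"Small-file copy: {copy_small / dpm:.0f}x faster than DPM")
print(f"Large-file copy: {copy_large / dpm:.0f}x faster than DPM")
```

Under those assumptions the DPM job averages under 7 Mbps, roughly 60 times slower than the small-file copy over the same network.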

    Any ideas?  With this performance, DPM is DOA.

    Thx,

    DML


    DLovitt


    Thursday, October 27, 2016 9:37 PM

Answers

  • For anyone following or stumbling upon this...

    It indeed appears that having the source deduped causes this perceived (actual?) slowness.  After running Resource Monitor and monitoring the files getting processed by the DPMRA process (thanks for the suggestion, Mike), I determined that it appears to work like this:

    1. DPMRA goes through the entire file structure, alphabetically, beginning to end, and processes the UNDEDUPED portions of the hundreds of thousands or millions of files.  This process is indeed painfully slow... but all is not lost - see step 2.  In my case, the amount of actual data left in the file system (as opposed to the chunk store) after dedup was really small: about 40GB out of 2.1TB worth of files (that is the rehydrated size; actual disk usage is 1.3TB).  This processing took ~34 hours.  That's 34 hours to write ~40GB to my DPM disk storage.

    2. DPMRA then processes the chunk store.  For the uninitiated, the chunk store is where Server 2012/R2 with deduped volumes writes duplicate data found in any files on the volume, leaving behind a reparse point to that chunk of data in the file system.  At this point, DPM starts REALLY backing up the files, so to speak.  Performance wasn't STELLAR, but at least I could see IO throughput going north of 400Mbps.  It was still pretty bursty with loads of speed fluctuations, but the whole backup finished within 12 more hours or so.

    So, it appears to me to be at least tolerable at this point, and not so dismal as I was originally thinking.
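The two phases described above can be compared as average rates. This is a rough sketch, not a measurement: it assumes decimal units, and it assumes the chunk-store pass moved roughly the 1.3TB on disk minus the ~40GB handled in the first pass, which the post does not state exactly. The `Phase` class is a made-up helper.

```python
from dataclasses import dataclass

@dataclass
class Phase:
    label: str
    gigabytes: float  # data written during the phase (1 GB = 10**9 bytes)
    hours: float      # wall-clock duration of the phase

    @property
    def mbps(self) -> float:
        """Average throughput in megabits per second."""
        return self.gigabytes * 1e9 * 8 / (self.hours * 3600) / 1e6

phases = [
    Phase("file-system pass", 40, 34),         # ~40 GB in ~34 hours
    Phase("chunk-store pass", 1300 - 40, 12),  # assumed remainder in ~12 hours
]
overall = Phase("overall",
                sum(p.gigabytes for p in phases),
                sum(p.hours for p in phases))

for p in phases + [overall]:
    print(f"{p.label}: {p.mbps:.1f} Mbps average")
```

With these assumed sizes, the first pass averages under 3 Mbps while the chunk-store pass averages a bit over 230 Mbps, which fits the bursty peaks above 400Mbps reported for step 2.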


    DLovitt

    • Marked as answer by Darius Lovitt Tuesday, November 01, 2016 10:35 PM
    Tuesday, November 01, 2016 10:31 PM

All replies

  • Hi,

    If the file servers contain millions of small files, then the initial replica will definitely take longer, but I agree that this seems excessively slow.  You can use Resource Monitor or Performance Monitor to look at disk IO performance for DPMRA reads on the file servers and see if there is a bottleneck there.
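If a scripted check is preferred over the Resource Monitor GUI, the same idea can be sketched in Python. This is only an illustration under assumptions: it relies on the third-party `psutil` package being installed on the file server, it assumes the agent process is named `dpmra.exe`, and the function names are hypothetical. It samples each matching process's cumulative read bytes twice and reports the difference as MB/s.

```python
import time

def read_rate_mb_s(bytes_before: int, bytes_after: int, interval_s: float) -> float:
    """Read throughput in MB/s from two cumulative byte counters."""
    return (bytes_after - bytes_before) / interval_s / 1e6

def sample_agent_reads(interval_s: float = 5.0, name: str = "dpmra.exe") -> dict:
    """Return {pid: MB/s read} for processes whose name matches `name`."""
    import psutil  # third-party; imported lazily so the helper above works without it

    before = {}
    for proc in psutil.process_iter(["name"]):
        if (proc.info["name"] or "").lower() == name:
            try:
                before[proc.pid] = (proc, proc.io_counters().read_bytes)
            except psutil.Error:
                continue  # process exited or access was denied

    time.sleep(interval_s)

    rates = {}
    for pid, (proc, start_bytes) in before.items():
        try:
            now_bytes = proc.io_counters().read_bytes
        except psutil.Error:
            continue
        rates[pid] = read_rate_mb_s(start_bytes, now_bytes, interval_s)
    return rates

if __name__ == "__main__":
    for pid, rate in sample_agent_reads().items():
        print(f"pid {pid}: {rate:.2f} MB/s read")
```

A low read rate here while the disks and network are otherwise idle would suggest the agent itself, not the storage, is the limiting factor.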


    Regards, Mike J. [MSFT]

    Thursday, October 27, 2016 9:52 PM
    Moderator
  • 20 hours later - 30 GB completed.  I've looked at Resource Monitor and I don't see anything causing a bottleneck.  This is a screenshot of my traffic pattern on the DPM server while this initial backup is performed.  The 2012 R2 file server currently getting backed up is deduped, BTW.  I'm wondering if DPM is processing all of the non-deduped data first and will then go through the chunk store QUICKLY.

    Any other suggestions?

    Thank you,


    DLovitt

    Friday, October 28, 2016 3:01 PM
  • Aaaand still... it's extremely slow! I came from DPM 2012 R2.

    Can you imagine a 15TB deduped volume?

    I'm using DPM 2016 with Modern Storage. A consistency check that lasts up to 150 hours? And after that, a sync process that lasts up to 5 hours! After that, a recovery point.

    And here we go: the backup of your file server is 6 days old!!! Great!

    After that, the replica becomes inconsistent again! Infinite loop.

    Tuesday, September 19, 2017 5:41 PM
  • Is there any progress on this issue? We're backing up 6-7 TB of data, and it's painfully slow, running DPM 2016 UR2...

    Anybody running UR4? Are there any improvements? From what I read in the release notes, it looks like it doesn't address this issue.

    Thursday, February 15, 2018 7:28 PM
  • After months of babysitting, I updated to UR5 and... more of the same.  I had to blow away the existing backup sets and recreate -- I've got 4x 40TB deduped NTFS volumes and 40x 10TB volumes.  Especially on the ones with a very large file count, it just drags.  They're all virtual drives running on a Dell MD3460, backing up to an identical MD3460, connected via 12Gbps HBA.

    I have a separate DPM instance backing up my VMs, and it's respectably fast.  Maybe I'll point this DPM instance at the vhost instead of the file server VMs...  Except it's over 75TB of storage in one backup set and DPM says no dice.

    What's really frustrating is it's not for lack of speed -- from the DPM box, I can load test the file servers and the results come back R/W @ 1.5GBps/0.8GBps -- fast!  DPM backs up at... 5MBps (for a few hours, until it decides to speed up to ~200MBps).
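To show what those two rates mean for volumes of the size mentioned above, here is a small sketch. It assumes "MBps" means megabytes per second, uses decimal units (1 TB = 10**12 bytes), and ignores the time DPM spends at each speed, which the post does not quantify; `days_to_copy` is a made-up helper.

```python
def days_to_copy(terabytes: float, mb_per_s: float) -> float:
    """Days needed to move `terabytes` at a sustained `mb_per_s` (decimal units)."""
    seconds = terabytes * 1e12 / (mb_per_s * 1e6)
    return seconds / 86400

for size_tb in (10, 40):
    slow = days_to_copy(size_tb, 5)    # the initial ~5 MBps phase
    fast = days_to_copy(size_tb, 200)  # the later ~200 MBps phase
    print(f"{size_tb} TB: {slow:.1f} days at 5 MBps, {fast:.1f} days at 200 MBps")
```

At a sustained 5 MBps a single 40TB volume would take about three months, versus a little over two days at 200 MBps, so how long the job stays in the slow phase dominates the total.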

    Tuesday, May 22, 2018 11:04 AM