Data transferred with BMR / System State backups

  • Question

  • I had a problem with VM backups recently, which led me to investigate how DPM performs BMR/System State backups.

    I know that DPM uses Windows Server Backup for System State/BMR backups of Windows 2008/2008 R2 servers (running it with the appropriate switches in a .CMD file).

    For BMR backups, DPM instructs wsbackup to write the backup file directly to a share on the DPM server (linked to the partition allocated for it in the Storage Pool)?

    For System State, DPM instructs wsbackup to write the backup file to a local drive on the protected server before transferring this to the DPM using the DPM agent?

    My understanding was that DPM only ever transfers changes since the last backup (i.e. incremental). Does DPM do this for BMR/System State backups (if it depends on wsbackup)? For example, the help info for wsbackup says that it cannot create incremental snapshots when the destination is a UNC path, which is what DPM instructs it to use for a BMR backup.

    So what I am asking is: whenever DPM performs a BMR or System State backup, does it transfer the entire backup each time? For a BMR backup of, say, 50GB, would the full 50GB have to be transferred to the DPM server for each backup (i.e. a lot of data!)?

    What had led me to ask this question is an apparently unrelated issue:-

    I noticed that backups of some of my VMs were occasionally failing due to DPM losing connection to the agent on the Hyper-V server.

    I could not work out why, and when I tried pinging the protected server from DPM (at the time of the backup) it would not respond (not even using its IP!). So it looked as if there was some underlying IP connectivity problem which only occurred at night and resulted in a temporary loss of basic IP connectivity between DPM and the Hyper-V host.

    On checking the event logs, there was no indication of any problems on either server. The loss of connectivity only occurred at night and was OK by the morning. It never happened during the day.

    When I checked what was happening on DPM before the first VM failed to back up, I noticed that DPM had started a BMR backup of another server 15 minutes earlier (which takes over an hour, but DPM doesn't show how much data is transferred).

    I am now working on the assumption that the problem may be due to the high volume (and transfer rate) of the BMR backup saturating the 100Mbit WAN link (that the DPM server sits at the end of), causing temporary loss of IP connectivity between DPM and the Hyper-V host. Hence my question about the amount of data transferred for the BMR backup.
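    As a rough sanity check on that assumption, here is a minimal Python sketch of the arithmetic. The function name and the `efficiency` parameter are hypothetical, and it assumes decimal units (1 GB = 1000 MB) and that the BMR job has the whole link to itself.

```python
def transfer_seconds(size_gb: float, link_mbit_per_s: float, efficiency: float = 1.0) -> float:
    """Seconds needed to move size_gb (decimal gigabytes) over a link.

    efficiency is an assumed fraction of the nominal line rate that is
    actually usable (protocol overhead, competing traffic).
    """
    if link_mbit_per_s <= 0 or not 0 < efficiency <= 1:
        raise ValueError("link speed must be positive and efficiency in (0, 1]")
    size_megabits = size_gb * 1000 * 8  # GB -> MB -> megabits
    return size_megabits / (link_mbit_per_s * efficiency)


if __name__ == "__main__":
    full_rate = transfer_seconds(50, 100)
    derated = transfer_seconds(50, 100, efficiency=0.8)
    print(f"50 GB at full line rate: {full_rate / 60:.0f} minutes")
    print(f"50 GB at 80% of line rate: {derated / 60:.0f} minutes")
```

    At full line rate a 50GB backup would need roughly 67 minutes, which is at least consistent with a job that "takes over an hour" while keeping the link busy for the whole time.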

    I have changed the start time of the BMR backup to 5am so that it doesn't overlap with the backup of the VMs, to see if that makes any difference.

    As an aside: DPM's complete dependence on wsbackup.exe and ntbackup.exe for System State/BMR backups really isn't good. Why can't the DPM agent have this functionality built-in?

    Regards,

    Bruce.




    • Edited by Redbruce Thursday, April 12, 2012 4:16 PM
    Thursday, April 12, 2012 4:05 PM

All replies

  • Hi,

    You are correct in your findings. Each time Windows Backup creates a backup, the entire output file(s) get transmitted to the DPM server as part of that job. In the case of BMR, Windows Backup does the transmission, so DPMRA cannot show you the amount transferred. For System State, DPMRA will transmit the whole contents of the WindowsImageBackup folder and can show the amount transferred.

    As for the block-level changes question, the DPM file system filter driver on the protected server just tracks writes. It does not do a read/compare operation for every block written, so it cannot skip tracking blocks whose content is unchanged. Doing so would cause excess I/O. Consequently, files like System State VHDs that get overwritten during each backup may have lots of duplicate blocks every time, but those overwritten blocks are added to the changed block bitmap and need to be transferred to the DPM server each time.
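    The write-tracking behaviour described above can be illustrated with a small Python sketch. This is only a toy model under stated assumptions: `WriteTracker`, `overwrite_file` and `blocks_that_differ` are hypothetical names, and nothing here reflects the actual filter driver's implementation.

```python
from typing import List


class WriteTracker:
    """Hypothetical changed-block bitmap: one flag per block, set on any write."""

    def __init__(self, block_count: int) -> None:
        self.dirty: List[bool] = [False] * block_count

    def on_write(self, block_index: int) -> None:
        # No read or compare: the block is marked even if the bytes are identical.
        self.dirty[block_index] = True

    def blocks_to_transfer(self) -> List[int]:
        return [i for i, flag in enumerate(self.dirty) if flag]


def overwrite_file(volume: List[bytes], new_blocks: List[bytes], tracker: WriteTracker) -> None:
    """Rewrite the file in place, block by block, as a fresh backup would."""
    for index, data in enumerate(new_blocks):
        volume[index] = data
        tracker.on_write(index)


def blocks_that_differ(old: List[bytes], new: List[bytes]) -> List[int]:
    """What a compare-before-track scheme would have reported instead."""
    return [i for i, (a, b) in enumerate(zip(old, new)) if a != b]


if __name__ == "__main__":
    previous = [b"boot", b"registry", b"drivers", b"config"]
    current = [b"boot", b"registry", b"drivers", b"config-v2"]

    volume = list(previous)
    tracker = WriteTracker(len(volume))
    overwrite_file(volume, current, tracker)

    print("tracked as changed:", tracker.blocks_to_transfer())
    print("actually different:", blocks_that_differ(previous, current))
```

    In this toy run every block is flagged for transfer even though only one differs, which is the trade-off described: cheaper tracking on the protected server at the cost of more data on the wire.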

    As for the side question of why the DPM agent can't have this built in: why reinvent the wheel? Windows Backup knows what needs to be backed up for BMR and System State, and creates it in a way that can be restored by the WinRE environment and is fully supported by the Windows group.


    Please remember to click “Mark as Answer” on the post that helps you, and to click “Unmark as Answer” if a marked post does not actually answer your question. This can be beneficial to other community members reading the thread. Regards, Mike J. [MSFT] This posting is provided "AS IS" with no warranties, and confers no rights.

    Thursday, April 12, 2012 7:37 PM
    Moderator
  • Hi,

    .... 

    As the side question why can't DPM agent have this built in ?   Why re-invent the wheel, Windows backup knows what is needed to backup for BMR and systemstate and creates it in a way that can be restored by WinRe environment and is fully supported by the Windows group. 



    DPM's dependence on Windows Backup causes several different problems...

    Being dependent on Windows Backup means...

    1. DPM is unable to determine how big the replica volume needs to be when setting it up for the first time, and has to guess at 30GB. In my experience this is very often too small, resulting in constant and inexplicable "replica inconsistent" error messages. And believe me, when you're setting up DPM for the first time it can be very demoralising/frustrating trying to work out why this is happening. It took me ages to work this one out, and I strongly suspect that if any users give up on DPM early (out of sheer frustration), this will be the primary reason why.
    2. DPM cannot report errors related to System State/BMR backups. When there are issues you have to go onto the server itself and search for errors in the Event log reported by Windows Backup. This is slow, cumbersome and counter-intuitive.
    3. DPM cannot do VSS snapshots of System State/BMR files using the DPMRA agent. Consequently, it has to transfer potentially huge amounts of data for every backup (which is ironic, because Windows Server Backup can do VSS snapshots when working by itself to a locally attached disk). Not very helpful to customers desperately trying to squeeze their nightly backups over a WAN link. And (going back to point two), DPM cannot report to the user what progress is being made.
    4. If you select System State, the agent seems at liberty to designate any drive it sees fit to hold the temporary backup file, which could cause problems if you have the disk set up for a particular purpose (DPM doesn't ask). On one occasion, DPM selected a USB drive that happened to be plugged in when I installed the agent and set up protection. After I unplugged it, the System State backups stopped working and I had no idea why. I had to fix it by manually editing the .xml file, but it required a lot of research/time to resolve.
    5. If you select BMR instead, it solves the problem above, but at the same time introduces another, potentially serious, security problem. Creating shares on the DPM server (which it does to allow Windows Backup to connect) requires that you enable "file sharing" on DPM. File sharing is probably the most serious security weakness on any server, leaving the DPM server far more vulnerable to malicious hackers/worms. You could potentially have a nightmare situation where all the production servers and all DPM server(s) are infected simultaneously. Being able to disable file sharing on DPM would, I believe, significantly reduce the risk of the DPM server being infected/hacked, and provide some assurance that the DPM server will be better protected in the event of a serious worm/hacker attack, so that it can be used for DR.

    None of these problems would need to exist if the SS/BMR backup functionality were implemented within the DPMRA agent.

    The problems related to this have been so serious that, had I known that DPM was dependent on Windows Backup, I might not have chosen it over the other backup systems I looked at.

    How difficult would it be for the DPM team to make use of the .DLLs used by Windows Backup to implement SS/BMR within the agent? Or, if no DLL is available, to obtain the specs from the Windows team and implement SS/BMR directly in the DPM agent EXE? I suspect that there is some internal politics going on here.

    Sorry if this sounds negative, but I thought it best to be honest about the issues it has caused me.

    Regards,

    Bruce.


    • Edited by Redbruce Thursday, April 12, 2012 11:13 PM
    Thursday, April 12, 2012 11:08 PM
  • Hi Bruce,

    Thanks for the feedback, I can certainly pass this on to the product group for consideration.

    I would like to comment on some of the feedback.

    1. Yes, DPM 2010 does a poor job of estimating the replica size for BMR, and I agree we could probably use some Windows APIs to query how large BMR would be (basically it's the used space on the critical volume(s) minus excluded files). However, I don't think even Windows Backup knows how large System State would be ahead of time, because added roles and features vary between servers and there is no way of knowing the dataset size for that, so an in-box DPM solution would not help.

    2. Agreed, it's a different troubleshooting method, but it's troubleshooting just the same. Most of the time the failures are going to be space or VSS infrastructure related, both of which DPM reports to you today, and it would not be able to offer much more than that.

    3. Windows Backup takes advantage of local snapshots for incremental backup, but cannot do that for remote backups. DPM does use VSS to take a local snapshot on the protected server for System State backups before transmission, but it is deleted when completed. We can't do incrementals because the snapshots that are maintained are on the DPM server, and VSS maintains previous versions for DPM. I suppose we could bake in some logic when making the System State backup ourselves so that we only overwrite blocks that are different, and our filter would then only track the block-level changes.

    4. When the DPM agent is installed, DPM chooses the disk with the most free space to use for System State backup. That never changes once configured unless you manually change it yourself. I suppose we should be smarter and eliminate removable media as a possible candidate.

    5. DPM cannot function without file sharing enabled, so file sharing is already turned on. The BMR share(s) themselves pose no additional risk.
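    To make the estimate in point 1 concrete, here is a minimal Python sketch of "critical volume(s) used space minus excluded files". The function name and the sample figures are hypothetical, the 30GB constant is the initial guess mentioned earlier in the thread, and working out which volumes are critical or how much is excluded is assumed to happen elsewhere.

```python
from typing import Dict

DEFAULT_REPLICA_GB = 30.0  # the initial guess mentioned earlier in the thread


def estimate_bmr_gb(critical_used_gb: Dict[str, float], excluded_gb: float = 0.0) -> float:
    """Used space on the critical volumes minus excluded files, never negative."""
    estimate = sum(critical_used_gb.values()) - excluded_gb
    return max(estimate, 0.0)


if __name__ == "__main__":
    # Illustrative numbers only: one critical volume with 40 GB used,
    # of which 4 GB is assumed to be excluded from the backup.
    estimate = estimate_bmr_gb({"C:": 40.0}, excluded_gb=4.0)
    print(f"estimated BMR size: {estimate:.0f} GB")
    if estimate > DEFAULT_REPLICA_GB:
        print("the default replica allocation would be too small")
```

    This only covers the BMR part of the estimate. As noted in point 1, the System State size is not known ahead of time, so the sketch makes no attempt at it.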



    Friday, April 13, 2012 12:15 AM
    Moderator
  • Hi Mike,

    I just wanted to follow up on Bruce's question about BMR backups.

    1. My understanding of DPM BMR, and of your comments, is that it copies the full amount of data to the DPM server every time a DPM BMR runs, i.e. there are no incremental or block-level-only transfers. So a BMR with the -allCritical flag, which would include for example C: with 40GB, E: with 80GB and F: with 100GB of used space, will do a BMR of 220GB every time it runs.

    Also, with backup to disk, not tape, is weekly the least frequent schedule on which a BMR can be performed, i.e. 1am every Sunday, with no option for fortnightly or monthly?

    2. How does a secondary DPM handle BMRs? Does it also copy the full amount, or just the block-level changes between the primary and secondary?

    3. Also, with the throttling between primary and secondary DPM servers, if I've set it to, say, 1MB between 7am and 7pm, and I've got jobs that start at 5am and run over into the throttled zone, does it dynamically throttle the bandwidth for that backup? I've monitored this and it seems that if the job starts outside the throttled zone it will continue at the unthrottled speed, therefore only throttling jobs that start inside the throttled time zone. Is this true?

    Thanks for your help Mike

    regards,

    Michael.

    Tuesday, May 8, 2012 11:50 AM
  • Hi,

    Answer-1) Correct. For disk-based backups, the GUI requires a recovery point to be scheduled at least once per week. Nothing prevents you from canceling the job once scheduled.

    Answer-2) DPM copies the full amount each time to the secondary DPM Server.

    DPM relies on the Windows Volume Shadow Copy Service (VSS) to maintain previous versions of file data. It does this by performing a "copy-on-write" (COW) operation. When a write comes in to overwrite a block on the replica volume, VSS reads that block and calculates a CRC of the block. It compares the CRC with the data to be written; if they are different, VSS knows that it needs to preserve that block, so it writes it to the recovery point volume. It does this read, compare and copy (if necessary) for every block write done on the replica volume. This is why the recovery point volume does not have to keep a full copy of the VHD coming over from the protected server.

    The DPM file system filter driver on the primary DPM server just tracks writes. It does not do a read/compare operation for every block written, so it cannot skip tracking blocks whose content is unchanged. Doing so would cause excess I/O and not add any real benefit. Consequently, files like BMR VHDs, which may have lots of duplicate blocks every time, are added to the changed block bitmap and need to be transferred to the secondary each time.
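    The read/compare/copy sequence described above for the replica volume can be sketched in Python as follows. This is a toy model of the description only, not of how VSS is implemented: `ReplicaVolume` and its attributes are hypothetical names, and `zlib.crc32` stands in for whatever checksum is actually used.

```python
import zlib
from typing import Dict, List


class ReplicaVolume:
    """Toy replica volume with a compare-before-preserve copy-on-write step."""

    def __init__(self, blocks: List[bytes]) -> None:
        self.blocks = list(blocks)
        # Stand-in for the recovery point volume: block index -> old content.
        self.preserved: Dict[int, bytes] = {}

    def write(self, index: int, data: bytes) -> None:
        old = self.blocks[index]
        # Compare checksums of the existing block and the incoming data.
        # Note: equal CRCs are treated as "unchanged" here, which ignores
        # the (small) possibility of a checksum collision.
        if zlib.crc32(old) == zlib.crc32(data):
            return
        # Keep only the oldest version of each block for this recovery point.
        self.preserved.setdefault(index, old)
        self.blocks[index] = data


if __name__ == "__main__":
    replica = ReplicaVolume([b"boot", b"registry", b"drivers", b"config"])
    incoming = [b"boot", b"registry", b"drivers", b"config-v2"]

    # The whole file arrives again, but only one block really changed.
    for i, block in enumerate(incoming):
        replica.write(i, block)

    print("blocks preserved for the recovery point:", sorted(replica.preserved))
```

    Even though all four blocks are written, only the block whose content changed is preserved, which illustrates why the recovery point volume stays small while the transfer itself does not.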

    Answer-3) DPM uses QoS services for throttling, and once a session is started at the negotiated speed, it will remain at that speed until completed. New jobs adhere to the throttling limit configured at the time the job starts.
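    The behaviour in Answer-3 can be modelled with a short Python sketch: the limit is looked up once, at job start, and is not re-evaluated while the job runs. The function name is hypothetical, the 7am to 7pm window and the limit of 1 are taken from the example in the question, and the unit of the limit is left unspecified as in the original.

```python
from datetime import time
from typing import Optional


def limit_at_job_start(
    start: time,
    window_start: time = time(7, 0),
    window_end: time = time(19, 0),
    throttled_limit: float = 1.0,
) -> Optional[float]:
    """Return the limit a job keeps for its whole run, or None if unthrottled.

    Only the start time matters in this model; a job that starts outside
    the window keeps running unthrottled even after the window begins.
    """
    if window_start <= start < window_end:
        return throttled_limit
    return None


if __name__ == "__main__":
    for label, started in [("05:00 job", time(5, 0)), ("07:30 job", time(7, 30))]:
        limit = limit_at_job_start(started)
        state = "unthrottled" if limit is None else f"throttled to {limit}"
        print(f"{label}: {state} until it completes")
```

    Under this model, the practical workaround is the one already used earlier in the thread: move the start time of large jobs so that they either finish before the throttled window or start inside it.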



    Tuesday, May 8, 2012 4:13 PM
    Moderator
  • Hi Mike,

    I thought I'd better tag my question to the end of this thread as it directly relates to the questions posed by the OP.

    My question is this:

    Does adding the C: drive of a server to a protection group as a volume to be protected as well as choosing BMR effectively mean that the C: drive is backed up twice?

    E.g. I have a server with a C: drive containing 100GB of data, I choose it as a volume to be protected and also choose to have the server protected by BMR. By my reading of things the DPM server will then contain 2 copies of the C: drive in different data sources as follows:

    Data source - Volume C: - 100GB

    Data source - BMR (System state + C: drive) - 100GB

    Is this analysis correct?

    Many thanks.

    Tuesday, June 4, 2013 1:45 PM
  • Hi Alexrose,

    For the most part, your understanding is correct. What will be missing from the C: volume backup are files that have hardlinks, as DPM will not protect them. If you wanted to recover individual files from a BMR backup, you could restore it on a Windows 2008 (or later) OS, then manually mount the .vhd in Disk Management and copy out individual files. So it depends on what the most likely recovery scenario is: having to restore the whole BMR backup, or doing individual file recoveries. Then decide whether the extra disk space required for doing both is worth the convenience of doing item-level recovery via the DPM console, rather than mounting the BMR .vhd to get to individual files.



    • Proposed as answer by AlexRose Wednesday, June 5, 2013 10:02 AM
    Tuesday, June 4, 2013 3:10 PM
    Moderator
  • Hi Mike,

    Thanks for the swift reply. This is my 4th install of DPM, and it's just lucky that at the previous 3 sites all the servers have had small system drives with data kept on separate drives, so I've managed to dodge a bullet there!

    Many thanks,

    Alex

    Wednesday, June 5, 2013 10:01 AM